Matrix Lie Groups

Definition. A group $(G,\cdot,{}^{-1},e)$ is a Lie group if $G$ is also a differentiable manifold and the binary operation $\cdot: G\times G\longrightarrow G$ and the unary operation (inverse) ${}^{-1}: G\longrightarrow G$ are smooth maps.

A subgroup of a Lie group is not necessarily a Lie subgroup.

Theorem. [C. Chevalley] Every closed subgroup of a Lie group is a Lie subgroup.

Examples of Lie Groups.

  1. Let $M(m,n)=\{m\times n-\mbox{matrices over}\ \mathbb{R}\}\cong\mathbb{R}^{mn}$. Let $A=(a_{ij})\in M(m,n)$. Define an identification map\begin{align*}M(m,n)&\longrightarrow\mathbb{R}^{mn}\\(a_{ij})&\longmapsto(a_{11},\cdots,a_{1n};\cdots;a_{m1},\cdots,a_{mn}).\end{align*} This identification naturally defines a topology on $M(m,n)$: $M(m,n)$ is covered by a single chart, and the identification map serves as the coordinate map.
  2. The General Linear Group ${\rm GL}(n)$: Let $\mathrm{GL}(n)=\{\mbox{non-singular}\ n\times n-\mbox{matrices}\}$. Define a map\begin{align*}\det: M(n,n)&\longrightarrow\mathbb{R}\\A&\longmapsto\det A.\end{align*} This map is onto and continuous since $\det A$ is a polynomial in the entries $a_{ij}$ of $A$. Hence $\mathrm{GL}(n)=\det^{-1}(\mathbb{R}-\{0\})$ is an open subset of $\mathbb{R}^{n^2}$, so it is a submanifold of $\mathbb{R}^{n^2}$. This group is called the general linear group. The set of all $n\times n$ non-singular real (complex) matrices is denoted by $\mathrm{GL}(n;\mathbb{R})$ ($\mathrm{GL}(n;\mathbb{C})$, resp.). More generally, the set of $n\times n$ non-singular matrices whose entries are elements of a field $F$ is denoted by $\mathrm{GL}(n;F)$ or $\mathrm{GL}(V)$, where $V$ is a vector space isomorphic to $F^n$. Note that $\mathrm{GL}(V)$ is also the set of all linear isomorphisms of $V$.
  3. The Orthogonal Group $\mathrm{O}(n)$: The orthogonal group $\mathrm{O}(n)$ is defined to be the set $$\mathrm{O}(n)=\{n\times n-\mbox{orthogonal matrices}\},$$ i.e., $$A\in\mathrm{O}(n)\Longleftrightarrow A\cdot{}^tA=I,$$ where ${}^tA$ is the transpose of $A$ and $I$ is the $n\times n$ identity matrix.
  4. The Special Orthogonal Group $\mathrm{SO}(n)$: The special orthogonal group is defined to be the following subgroup of $\mathrm{O}(n)$: $$\mathrm{SO}(n)=\{A\in\mathrm{O}(n): \det A=1\}.$$
  5. The Special Linear Group $\mathrm{SL}(n)$: The special linear group is defined to be the following subgroup of $\mathrm{GL}(n)$: $$\mathrm{SL}(n)=\{A\in\mathrm{GL}(n): \det A=1\}.$$
  6. The Unitary Group $\mathrm{U}(n)$: The unitary group $\mathrm{U}(n)$ is the set of all $n\times n$-unitary matrices, i.e. $$\mathrm{U}(n)=\{U\in\mathrm{GL}(n;\mathbb{C}): UU^\ast=I\},$$ where $U^\ast={}^t\bar U$. Physicists often write $U^\ast$ as $U^\dagger$. $\mathrm{U}(n)$ is a Lie subgroup of $\mathrm{GL}(n;\mathbb{C})$.
  7. The Special Unitary Group $\mathrm{SU}(n)$: The special unitary group $\mathrm{SU}(n)$ is a Lie subgroup of both $\mathrm{U}(n)$ and $\mathrm{SL}(n;\mathbb{C})$: $$\mathrm{SU}(n)=\{U\in\mathrm{SL}(n;\mathbb{C}):UU^\ast=I\}.$$
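
These defining conditions are easy to check numerically. Below is a minimal sketch assuming NumPy is available; the sample matrices (a plane rotation and a diagonal phase matrix) are illustrative choices, not taken from the text.

```python
import numpy as np

theta = 0.3
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# O(2)/SO(2): R^t R = I and det R = 1
print(np.allclose(R.T @ R, np.eye(2)), np.isclose(np.linalg.det(R), 1.0))

U = np.array([[np.exp(1j * theta), 0],
              [0, np.exp(-1j * theta)]])

# U(2)/SU(2): U U* = I and det U = 1
print(np.allclose(U @ U.conj().T, np.eye(2)), np.isclose(np.linalg.det(U), 1.0))
```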

Proposition. For any $n\times n$ real or complex matrix $X$,
$$e^X:=\sum_{m=0}^\infty\frac{X^m}{m!}$$ converges and is a continuous function.

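A quick numerical illustration of the proposition, assuming SciPy is available: the truncated series $\sum_{m=0}^{M}\frac{X^m}{m!}$ rapidly approaches `scipy.linalg.expm(X)`.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 3))      # an arbitrary real matrix

partial = np.zeros((3, 3))
term = np.eye(3)                     # X^0 / 0!
for m in range(30):                  # 30 terms is ample for a matrix of this size
    partial += term
    term = term @ X / (m + 1)        # X^{m+1} / (m+1)!

print(np.max(np.abs(partial - expm(X))))  # difference at rounding level
```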

Definition. Let $G$ be a matrix Lie group. The Lie algebra of $G$, denoted by $\mathfrak{g}$, is the set of all matrices $X$ such that $e^{tX}\in G$ for all $t\in\mathbb{R}$.

Definition. A function $A:\mathbb{R}\longrightarrow\mathrm{GL}(n;\mathbb{C})$ is called a one-parameter subgroup of $\mathrm{GL}(n;\mathbb{C})$ if

  1. $A$ is continuous;
  2. $A(0)=I$;
  3. $A(t+s)=A(t)A(s)$ for all $t,s\in\mathbb{R}$.

Theorem. If $A$ is a one-parameter subgroup of $\mathrm{GL}(n;\mathbb{C})$, then there exists a unique $n\times n$ complex matrix $X$ such that $A(t)=e^{tX}$ for all $t\in\mathbb{R}$.

In differential geometry, the Lie algebra $\mathfrak{g}$ is defined to be the tangent space $T_eG$ to $G$ at the identity $e$. The two definitions coincide if $G$ is $\mathrm{GL}(n;\mathbb{C})$ or a Lie subgroup of it. If $X\in\mathfrak{g}$, then by definition $e^{tX}\in G$ for all $t\in\mathbb{R}$. The one-parameter subgroup $\{e^{tX}:t\in\mathbb{R}\}$ of $G$ can be regarded as a differentiable curve $\gamma:\mathbb{R}\longrightarrow G$ with $\gamma(0)=e$, where $e$ is the $n\times n$ identity matrix $I$. Thus $\dot\gamma(0)=X$ is a tangent vector to $G$ at the identity $e$, i.e. $X\in T_eG$. Conversely, let $X\in T_eG$ and let $\{\phi_t:G\longrightarrow G\}_{t\in\mathbb{R}}$ be the flow generated by the left-invariant vector field determined by $X$, i.e.
$$\frac{d}{dt}\phi_t(p)=X_{\phi_t(p)}.$$ Set $A(t)=\phi_t(e)$. Then $A$ is smooth, $A(0)=e$, and $A(t+s)=A(t)A(s)$; that is, $A$ is a one-parameter subgroup of $\mathrm{GL}(n;\mathbb{C})$. Hence, by the above Theorem, there exists a unique $n\times n$ complex matrix $Y$ such that $A(t)=e^{tY}$. Since $\dot A(0)=Y$ and $\dot A(0)=X$, we have $Y=X$, i.e. $A(t)=e^{tX}\in G\leq\mathrm{GL}(n;\mathbb C)$. Therefore $X\in\mathfrak{g}$.
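This correspondence can be tested numerically for $G=\mathrm{SO}(3)$, whose Lie algebra consists of the $3\times 3$ skew-symmetric matrices (a standard fact we take for granted here). A minimal sketch assuming SciPy; the particular matrix $X$ is an illustrative choice.

```python
import numpy as np
from scipy.linalg import expm

X = np.array([[ 0.0, -1.0,  0.5],
              [ 1.0,  0.0, -0.2],
              [-0.5,  0.2,  0.0]])   # skew-symmetric: X^t = -X

for t in (-2.0, 0.7, 3.1):
    A = expm(t * X)
    assert np.allclose(A.T @ A, np.eye(3))      # A is orthogonal
    assert np.isclose(np.linalg.det(A), 1.0)    # det A = 1
print("e^{tX} stays in SO(3) for every sampled t")
```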

Physicists’ convention: In the physics literature, the exponential map $\exp:\mathfrak{g}\longrightarrow G$ is usually given by $X\longmapsto e^{iX}$ instead of $X\longmapsto e^X$. The reason comes from quantum mechanics and will be discussed later.
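One small check related to this convention, assuming SciPy: when $X$ is Hermitian, as observables in quantum mechanics are, $e^{iX}$ is unitary.

```python
import numpy as np
from scipy.linalg import expm

X = np.array([[1.0, 2.0 - 1.0j],
              [2.0 + 1.0j, -0.5]])             # Hermitian: X = X*
U = expm(1j * X)
print(np.allclose(U @ U.conj().T, np.eye(2)))  # True: e^{iX} lies in U(2)
```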


Electrostatic Potential in a Hollow Cylinder

An electrostatic field $E$ (i.e. an electric field produced only by a static charge) is a conservative field, i.e. there exists a scalar potential $\psi$ such that $E=-\nabla\psi$. This is clear from Maxwell’s equations. Since there is no change of the magnetic field $B$ in time, $\nabla\times E=0$. If there is no charge present in a region, $\nabla\cdot E=0$. Together with $E=-\nabla\psi$, we obtain the Laplace equation $\nabla^2\psi=0$. Thus the Laplace equation can be used to find the electrostatic potential $\psi(\rho,\varphi,z)$ in a hollow cylinder with radius $a$ and height $l$ ($0\leq z\leq l$).

Using separation of variables, we find the modes
\begin{align*}
\psi_{km}(\rho,\varphi,z)&=P_{km}(\rho)\Phi_m(\varphi)Z_k(z)\\
&=J_m(k\rho)[a_m\sin m\varphi+b_m\cos m\varphi][c_1e^{kz}+c_2e^{-kz}].
\end{align*}
The boundary conditions are:
$$\psi(\rho,\varphi,l)=\psi(\rho,\varphi),$$
where $\psi(\rho,\varphi)$ is a prescribed potential distribution on the top face $z=l$; elsewhere on the surface $\psi=0$. Now we find the electrostatic potential
$$\psi(\rho,\varphi,z)=\sum_{k,m}\psi_{km}$$
inside the cylinder. From the boundary condition $\psi(\rho,\varphi,0)=0$, we find $c_1+c_2=0$. So we choose $c_1=-c_2=\frac{1}{2}$, and thereby $c_1e^{kz}+c_2e^{-kz}=\sinh kz$. Since $\psi=0$ on the lateral surface of the cylinder, $\psi(a,\varphi,z)=0$. This implies that $J_m(ka)=0$. If we write the $n$-th zero of $J_m$ as $\alpha_{mn}$, then $k_{mn}a=\alpha_{mn}$, or $k_{mn}=\frac{\alpha_{mn}}{a}$. Hence,
$$\psi(\rho,\varphi,z)=\sum_{m=0}^\infty\sum_{n=1}^\infty J_m\left(\alpha_{mn}\frac{\rho}{a}\right)[a_{mn}\sin m\varphi+b_{mn}\cos m\varphi]\sinh\left(\alpha_{mn}\frac{z}{a}\right).$$
Finally using the boundary condition
$$\psi(\rho,\varphi)=\sum_{m=0}^\infty\sum_{n=1}^\infty J_m\left(\alpha_{mn}\frac{\rho}{a}\right)[a_{mn}\sin m\varphi+b_{mn}\cos m\varphi]\sinh\left(\alpha_{mn}\frac{l}{a}\right)$$ and the orthogonality of $\sin m\varphi$ and $\cos m\varphi$, we can determine the coefficients $a_{mn}$ and $b_{mn}$ as
\begin{align*}\left\{\begin{aligned}a_{mn}\\b_{mn}\end{aligned}\right\}=\frac{2}{\pi a^2\sinh\left(\alpha_{mn}\frac{l}{a}\right)J_{m+1}^2(\alpha_{mn})}\int_0^{2\pi}\int_0^a\psi(\rho,\varphi)&J_m\left(\alpha_{mn}\frac{\rho}{a}\right)\\
&\left\{\begin{aligned}
\sin m\varphi\\
\cos m\varphi
\end{aligned}\right\}\rho d\rho d\varphi.\end{align*}
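
As a concrete illustration, here is a minimal numerical sketch of this series, assuming SciPy is available and assuming a constant top-face distribution $\psi(\rho,\varphi)=V_0$ (an illustrative choice; the text leaves $\psi(\rho,\varphi)$ general). For constant $V_0$ only the $m=0$ cosine terms survive the angular integration, and the coefficient formula reduces to the standard Fourier-Bessel expansion of a constant, $b_{0n}=\frac{2V_0}{\alpha_{0n}J_1(\alpha_{0n})\sinh(\alpha_{0n}l/a)}$.

```python
import numpy as np
from scipy.special import jv, jn_zeros

a, l, V0, N = 1.0, 2.0, 1.0, 40   # illustrative geometry; keep N moderate so sinh does not overflow
alpha = jn_zeros(0, N)            # zeros alpha_{0n} of J_0

# m = 0 coefficients for a constant top-face distribution
b = 2.0 * V0 / (alpha * jv(1, alpha) * np.sinh(alpha * l / a))

def psi(rho, z):
    """Truncated series for psi(rho, phi, z); no phi-dependence when m = 0."""
    return np.sum(b * jv(0, alpha * rho / a) * np.sinh(alpha * z / a))

print(psi(0.0, l))    # tends to V0 as N grows (slowly: Fourier-Bessel series of a constant)
print(psi(0.5, 0.0))  # exactly 0 on the bottom face
```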

Bessel Functions of the First Kind $J_n(x)$ II: Orthogonality

To accommodate boundary conditions on a finite interval $[0,a]$, we need to consider Bessel functions of the form $J_\nu\left(\frac{\alpha_{\nu m}}{a}\rho\right)$, where $\alpha_{\nu m}$ denotes the $m$-th zero of $J_\nu$. For $x=\frac{\alpha_{\nu m}}{a}\rho$, Bessel's equation ((9) in the previous note) can be written as
\begin{equation}\label{eq:bessel10}\rho\frac{d^2}{d\rho^2}J_\nu\left(\frac{\alpha_{\nu m}}{a}\rho\right)+\frac{d}{d\rho}J_\nu\left(\frac{\alpha_{\nu m}}{a}\rho\right)+\left(\frac{\alpha_{\nu m}^2\rho}{a^2}-\frac{\nu^2}{\rho}\right)J_\nu\left(\frac{\alpha_{\nu m}}{a}\rho\right)=0.\end{equation} Changing $\alpha_{\nu m}$ to $\alpha_{\nu n}$, $J_\nu\left(\frac{\alpha_{\nu n}}{a}\rho\right)$ satisfies
\begin{equation}\label{eq:bessel11}\rho\frac{d^2}{d\rho^2}J_\nu\left(\frac{\alpha_{\nu n}}{a}\rho\right)+\frac{d}{d\rho}J_\nu\left(\frac{\alpha_{\nu n}}{a}\rho\right)+\left(\frac{\alpha_{\nu n}^2\rho}{a^2}-\frac{\nu^2}{\rho}\right)J_\nu\left(\frac{\alpha_{\nu n}}{a}\rho\right)=0.\end{equation}
Multiply \eqref{eq:bessel10} by $J_\nu\left(\frac{\alpha_{\nu n}}{a}\rho\right)$ and \eqref{eq:bessel11} by $J_\nu\left(\frac{\alpha_{\nu m}}{a}\rho\right)$ and subtract:
\begin{align*}
J_\nu\left(\frac{\alpha_{\nu n}}{a}\rho\right)\frac{d}{d\rho}&\left[\rho\frac{d}{d\rho}J_\nu\left(\frac{\alpha_{\nu m}}{a}\rho\right)\right]-J_\nu\left(\frac{\alpha_{\nu m}}{a}\rho\right)\frac{d}{d\rho}\left[\rho\frac{d}{d\rho}J_\nu\left(\frac{\alpha_{\nu n}}{a}\rho\right)\right]\\&=\frac{\alpha_{\nu n}^2-\alpha_{\nu m}^2}{a^2}\rho J_\nu\left(\frac{\alpha_{\nu m}}{a}\rho\right)J_\nu\left(\frac{\alpha_{\nu n}}{a}\rho\right).\end{align*}
Integrate this equation with respect to $\rho$ from $\rho=0$ to $\rho=a$:
\begin{equation}\begin{aligned}\int_0^a J_\nu\left(\frac{\alpha_{\nu n}}{a}\rho\right)\frac{d}{d\rho}&\left[\rho\frac{d}{d\rho}J_\nu\left(\frac{\alpha_{\nu m}}{a}\rho\right)\right]d\rho\\&-\int_0^a J_\nu\left(\frac{\alpha_{\nu m}}{a}\rho\right)\frac{d}{d\rho}\left[\rho\frac{d}{d\rho}J_\nu\left(\frac{\alpha_{\nu n}}{a}\rho\right)\right]d\rho\\&=\frac{\alpha_{\nu n}^2-\alpha_{\nu m}^2}{a^2}\int_0^a J_\nu\left(\frac{\alpha_{\nu m}}{a}\rho\right)J_\nu\left(\frac{\alpha_{\nu n}}{a}\rho\right)\rho d\rho.\end{aligned}\label{eq:bessel12}\end{equation}
Using Integration by Parts, we have
\begin{align*}
\int_0^a &J_\nu\left(\frac{\alpha_{\nu n}}{a}\rho\right)\frac{d}{d\rho}\left[\rho\frac{d}{d\rho}J_\nu\left(\frac{\alpha_{\nu m}}{a}\rho\right)\right]d\rho\\&=\left[J_\nu\left(\frac{\alpha_{\nu n}}{a}\rho\right)\rho\frac{d}{d\rho}J_\nu\left(\frac{\alpha_{\nu m}}{a}\rho\right)\right]_0^a-\int_0^a \rho\frac{d}{d\rho}J_\nu\left(\frac{\alpha_{\nu m}}{a}\rho\right)dJ_\nu\left(\frac{\alpha_{\nu n}}{a}\rho\right).\end{align*}
Thus \eqref{eq:bessel12} can be written as
\begin{equation}\begin{aligned}\left[J_\nu\left(\frac{\alpha_{\nu n}}{a}\rho\right)\rho\frac{d}{d\rho}J_\nu\left(\frac{\alpha_{\nu m}}{a}\rho\right)\right]_0^a-\left[J_\nu\left(\frac{\alpha_{\nu m}}{a}\rho\right)\rho\frac{d}{d\rho}J_\nu\left(\frac{\alpha_{\nu n}}{a}\rho\right)\right]_0^a\\=\frac{\alpha_{\nu n}^2-\alpha_{\nu m}^2}{a^2}\int_0^a J_\nu\left(\frac{\alpha_{\nu m}}{a}\rho\right)J_\nu\left(\frac{\alpha_{\nu n}}{a}\rho\right)\rho d\rho.\end{aligned}\label{eq:bessel13}\end{equation}
Clearly the LHS of \eqref{eq:bessel13} vanishes at $\rho=0$. (Here we consider only the case where $\nu$ is an integer.) It also vanishes at $\rho=a$ since $\alpha_{\nu n}$ and $\alpha_{\nu m}$ are the $n$-th and $m$-th zeros of $J_\nu$. Therefore, for $m\ne n$,
\begin{equation}\label{eq:bessel14}\int_0^a J_\nu\left(\frac{\alpha_{\nu m}}{a}\rho\right)J_\nu\left(\frac{\alpha_{\nu n}}{a}\rho\right)\rho d\rho=0.\end{equation}
Equation \eqref{eq:bessel14} expresses the orthogonality of the functions $J_\nu\left(\frac{\alpha_{\nu m}}{a}\rho\right)$ over the interval $[0,a]$ with respect to the weight $\rho$.

For $m=n$, we have the normalization integral
\begin{equation}\int_0^a\left[J_\nu\left(\frac{\alpha_{\nu m}}{a}\rho\right)\right]^2\rho d\rho=\frac{a^2}{2}[J_{\nu+1}(\alpha_{\nu m})]^2.\end{equation}
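
Both the orthogonality relation \eqref{eq:bessel14} and the normalization integral are easy to verify numerically. A minimal sketch, assuming SciPy is available; the values of $\nu$ and $a$ below are arbitrary illustrative choices.

```python
import numpy as np
from scipy.special import jv, jn_zeros
from scipy.integrate import quad

nu, a = 2, 1.5                 # arbitrary illustrative choices
alpha = jn_zeros(nu, 3)        # first three zeros of J_nu

def inner(m, n):
    """int_0^a J_nu(alpha_m rho/a) J_nu(alpha_n rho/a) rho d rho (0-based indices)."""
    f = lambda rho: jv(nu, alpha[m] * rho / a) * jv(nu, alpha[n] * rho / a) * rho
    return quad(f, 0.0, a)[0]

print(inner(0, 1))                                       # ~ 0: orthogonality
print(inner(1, 1), a**2 / 2 * jv(nu + 1, alpha[1])**2)   # both equal the normalization
```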

Analyzing Graphs of Quadratic Functions

There are two important topics in this section: graphing the quadratic function $f(x)=ax^2+bx+c$ and finding the (absolute) maximum or the minimum value of $f(x)=ax^2+bx+c$.

First the sign of the leading coefficient $a$ tells us some information about the graph. If $a>0$ then the tail of the graph goes up, i.e. the graph is a smiling face $\smile$. If $a<0$ then the tail of the graph goes down, i.e. the graph is a frowning face $\frown$.

By completing the square, $f(x)=ax^2+bx+c$ can be written as
$$f(x)=a(x-h)^2+k,$$
where $h=-\frac{b}{2a}$ and $k=f(h)=f\left(-\frac{b}{2a}\right)$. The ordered pair $\left(-\frac{b}{2a},f\left(-\frac{b}{2a}\right)\right)$ is called the vertex of the parabola $f(x)$ and the vertical line $x=-\frac{b}{2a}$ is called the axis of symmetry (this is the vertical line that divides the graph of $f(x)$ into two halves). If $a>0$, then $f\left(-\frac{b}{2a}\right)$ is the absolute minimum value of $f(x)$. If $a<0$, then $f\left(-\frac{b}{2a}\right)$ is the absolute maximum value of $f(x)$.
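
These formulas translate directly into code. A minimal sketch in Python (the helper name `vertex` is ours):

```python
# Compute the vertex (h, k) = (-b/(2a), f(-b/(2a))) of f(x) = ax^2 + bx + c.
def vertex(a, b, c):
    h = -b / (2 * a)
    k = a * h**2 + b * h + c
    return h, k

print(vertex(1, 7, -8))  # (-3.5, -20.25), i.e. (-7/2, -81/4)
```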

How to sketch the graph of $f(x)=a(x-h)^2+k$?

Your textbook tells you to sketch the graph of $f(x)=a(x-h)^2+k$ using the transformations that you learned in section 1.7. In principle it is fine to use transformations, but in practice there is an easier way. All you need are the sign of $a$, the vertex $\left(-\frac{b}{2a},f\left(-\frac{b}{2a}\right)\right)$, and the $y$-intercept $c$. (Although not required, it is even better if you also know the $x$-intercepts.)

Example. Let $f(x)=x^2+7x-8$.

(a) Find the vertex.

Solution. $-\frac{b}{2a}=-\frac{7}{2}$ and
\begin{align*}
f\left(-\frac{b}{2a}\right)&=f\left(-\frac{7}{2}\right)\\
&=\left(-\frac{7}{2}\right)^2+7\left(-\frac{7}{2}\right)-8\\
&=-\frac{81}{4}.
\end{align*}
Thus the vertex is $\left(-\frac{7}{2},-\frac{81}{4}\right)$.

(b) Find the axis of symmetry.

Solution. The axis of symmetry is the vertical line $x=-\frac{b}{2a}=-\frac{7}{2}$.

(c) Determine whether there is a maximum or minimum value and find that value.

Solution. Since $a=1>0$, there is a minimum and the minimum value is the $y$-coordinate of the vertex $f\left(-\frac{7}{2}\right)=-\frac{81}{4}$.

(d) Graph the function.

Solution. Since $a=1>0$, the graph is a parabola that opens up (smiling face). Also note that the $y$-intercept of $f(x)$ is $-8$. In fact, we can extract more information since $f(x)$ can be easily factored as $(x+8)(x-1)$, so the $x$-intercepts are $x=-8,1$.
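
A short plotting sketch, assuming matplotlib (an illustrative choice of library), marking the vertex, the axis of symmetry, and the intercepts found above:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-10, 3, 400)
f = x**2 + 7 * x - 8

plt.plot(x, f)
plt.scatter([-7 / 2], [-81 / 4], color="red", label="vertex")
plt.scatter([-8, 1, 0], [0, 0, -8], color="black", label="intercepts")
plt.axvline(-7 / 2, linestyle="--", linewidth=0.8)   # axis of symmetry
plt.axhline(0, linewidth=0.8)
plt.legend()
plt.show()
```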

Quadratic Equations

In this lecture note we study how to solve a quadratic equation $ax^2+bx+c=0$. There are three ways to solve a quadratic equation. The first one is

1. By Factoring: This is a typical method to solve a quadratic equation whenever the polynomial $ax^2+bx+c$ can be easily factored. Here is an example.

Example. Solve the quadratic equation $x^2-3x-4=0$ by factoring.

Solution. The polynomial $x^2-3x-4$ is factored as $(x-4)(x+1)$. So the equation is $(x-4)(x+1)=0$. This means that $x-4=0$ or $x+1=0$, i.e. we obtain two real solutions $x=-1$ or $x=4$.

Example. Solve the quadratic equation $x^2-3=0$.

Solution 1. Recall the factorization formula $a^2-b^2=(a+b)(a-b)$. Now
\begin{align*}
x^2-3&=x^2-(\sqrt{3})^2\\
&=(x+\sqrt{3})(x-\sqrt{3}).
\end{align*}
Thus our equation becomes $(x+\sqrt{3})(x-\sqrt{3})=0$ whose solutions are $x=\pm\sqrt{3}$.

Solution 2. The quadratic equation can be written as $x^2=3$. Solving this equation for $x$, we obtain $x=\pm\sqrt{3}$.

The next method is

2. By Completing the Square:

This is a method that can be used to solve any quadratic equation. First note that \begin{equation}\label{eq:cts}x^2+bx+\left(\frac{b}{2}\right)^2=\left(x+\frac{b}{2}\right)^2.\end{equation}

Example. Solve the equation $x^2-6x-10=0$ by completing the square.

Solution. By adding 10 to each side of the equation, we obtain
\begin{equation}\label{eq:cthex1}x^2-6x=10.\end{equation} Note that half of the coefficient of $x$ is $\frac{-6}{2}=-3$. Add $(-3)^2$ to each side of \eqref{eq:cthex1}:
\begin{equation}\label{eq:cthex1a}x^2-6x+(-3)^2=10+(-3)^2.\end{equation} Now notice that the LHS of \eqref{eq:cthex1a} is exactly of the same form as the LHS of the formula \eqref{eq:cts}. Hence, the equation \eqref{eq:cthex1a} becomes
$$(x-3)^2=19.$$ Solving this for $x-3$, we obtain $x-3=\pm\sqrt{19}$. That is, $x=3\pm\sqrt{19}$.
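
A quick numerical check in Python that $x=3\pm\sqrt{19}$ indeed satisfy the original equation:

```python
import math

# Substitute both solutions back into x^2 - 6x - 10; residuals are ~ 0 up to rounding.
for x in (3 + math.sqrt(19), 3 - math.sqrt(19)):
    print(x, x**2 - 6 * x - 10)
```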

While completing the square can be a useful tool for other purposes, I do not strongly recommend this method here because there is a more convenient way to solve quadratic equations.

3. By the Quadratic Formula:

If we apply the method of completing the square to the general quadratic equation $ax^2+bx+c=0$, we obtain the quadratic formula
\begin{equation}\label{eq:quadform}x=\frac{-b\pm\sqrt{b^2-4ac}}{2a}.\end{equation}

Example. Solve the quadratic equation $3x^2+2x-7=0$.

Solution. $a=3$, $b=2$, and $c=-7$. Thus
\begin{align*}
x&=\frac{-b\pm\sqrt{b^2-4ac}}{2a}\\
&=\frac{-2\pm\sqrt{2^2-4(3)(-7)}}{2(3)}\\
&=\frac{-2\pm\sqrt{88}}{6}\\
&=\frac{-1\pm\sqrt{22}}{3}.
\end{align*}

The expression under the radical, $b^2-4ac$, is called the discriminant. Using the discriminant, we can tell the following without solving the equation itself.

Theorem. For $ax^2+bx+c=0$ with $a\ne 0$,

  • If $b^2-4ac>0$, then the equation has two distinct real solutions.
  • If $b^2-4ac=0$, then the equation has only one real solution (which is $x=-\frac{b}{2a}$).
  • If $b^2-4ac<0$, then the equation has two complex solutions that are conjugate of each other.
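
This classification translates directly into a short solver. A minimal sketch in Python (the name `solve_quadratic` is ours), using the `cmath` module for the complex case:

```python
import math, cmath

def solve_quadratic(a, b, c):
    d = b * b - 4 * a * c                                 # the discriminant
    if d > 0:
        r = math.sqrt(d)
        return ((-b + r) / (2 * a), (-b - r) / (2 * a))   # two distinct real roots
    if d == 0:
        return (-b / (2 * a),)                            # one real root
    r = cmath.sqrt(d)
    return ((-b + r) / (2 * a), (-b - r) / (2 * a))       # complex conjugate pair

print(solve_quadratic(3, 2, -7))   # (-1 ± sqrt(22))/3, as in the example above
print(solve_quadratic(1, 2, 5))    # -1 ± 2i
```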

Update: There is a convenient formula for quadratic equations of the form $ax^2+bx+c=0$ with $b=2b'$, i.e. with $b$ a multiple of 2. I wrote about it in a forum entry.