In this note, we discuss the relationship between a system of linear equations and the determinant of its coefficients. For simplicity, I am considering only a system of two linear equations in two variables, but a similar argument can be used for more general cases. Let us consider the system of linear equations $$\left\{\begin{aligned}ax+by&=e\\cx+dy&=f\end{aligned}\right.,$$ where none of $a,b,c,d,e,f$ is zero. The two linear equations are equations of lines in the plane, so we know there are three possibilities: the system has no solution, in which case the two lines are parallel (so they do not meet); the system has a unique solution, in which case the two lines meet at exactly one point; or the system has infinitely many solutions, in which case the two lines are identical.

This system can be written in terms of matrices as $$\begin{pmatrix}a & b\\c & d\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix}=\begin{pmatrix}e\\f\end{pmatrix}$$ Let $A=\begin{pmatrix}a & b\\c & d\end{pmatrix}$. If $\det A\ne 0$, then the system has a unique solution and it can be found using Cramer's rule as follows: $$x=\frac{\begin{vmatrix}e & b\\f & d\end{vmatrix}}{\det A},\ y=\frac{\begin{vmatrix}a & e\\c & f\end{vmatrix}}{\det A}$$ Note that $\det A=0$ if and only if the two lines have the same slope.

Suppose that $\det A=0$. Then one can easily show that $\begin{vmatrix}e & b\\f & d\end{vmatrix}=0$ if and only if $\begin{vmatrix}a & e\\c & f\end{vmatrix}=0$. From $\det A=0$ and $\begin{vmatrix}e & b\\f & d\end{vmatrix}=0$, we have the system of equations: \begin{align}\label{eqn1}ad-bc&=0\\\label{eqn2}ed-fb&=0\end{align} Subtracting $a$ times \eqref{eqn2} from $e$ times \eqref{eqn1} yields $b(af-ec)=0$. Since $b\ne 0$, $af-ec=\begin{vmatrix}a & e\\c & f\end{vmatrix}=0$. Moreover, $\begin{vmatrix}e & b\\f & d\end{vmatrix}=ed-fb=0$ means $\frac{e}{b}=\frac{f}{d}$, i.e., the two lines have the same $y$-intercept. Together with having the same slope, this is the case when the two lines coincide, and hence the system has infinitely many solutions (all the points on the line are solutions).
Lastly, we know $\begin{vmatrix}e & b\\f & d\end{vmatrix}\ne0$ if and only if $\begin{vmatrix}a & e\\c & f\end{vmatrix}\ne0$. If $\begin{vmatrix}e & b\\f & d\end{vmatrix}\ne0$ while $\det A=0$, the system has no solution: $\begin{vmatrix}e & b\\f & d\end{vmatrix}\ne0$ means that the two lines have different $y$-intercepts, while $\det A=0$ means they have the same slope, so this is the case when the two lines are parallel, i.e., they do not meet. A system of homogeneous linear equations $$\left\{\begin{aligned}ax+by&=0\\cx+dy&=0\end{aligned}\right.$$ comes down to only two cases: the system has the unique solution $x=y=0$ (if $\det A\ne 0$) or has infinitely many solutions (if $\det A=0$). This is also obvious from considering two lines passing through the origin.
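The whole discussion can be packaged into a short program. The following Python sketch (not part of the original note) classifies a 2×2 system using the three determinants above and, when $\det A\ne 0$, solves it by Cramer's rule:

```python
# Classify the system ax+by=e, cx+dy=f via the three determinants and,
# when det A != 0, solve it by Cramer's rule. The exact-zero comparisons
# suit exact (integer/rational) coefficients, not floating-point data.

def classify_and_solve(a, b, c, d, e, f):
    det_A = a * d - b * c
    det_x = e * d - b * f   # | e b ; f d |
    det_y = a * f - e * c   # | a e ; c f |
    if det_A != 0:
        # Unique solution by Cramer's rule.
        return ("unique", (det_x / det_A, det_y / det_A))
    if det_x == 0:  # equivalently det_y == 0: the lines coincide
        return ("infinitely many", None)
    return ("no solution", None)

print(classify_and_solve(1, 1, 1, -1, 3, 1))   # lines cross at (2, 1)
print(classify_and_solve(1, 1, 2, 2, 3, 6))    # same line
print(classify_and_solve(1, 1, 2, 2, 3, 7))    # parallel lines
```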

# Should the sign of Coulomb potential be positive or negative?

In classical physics, the sign of a potential (say, gravitational potential or electric potential) is merely a convention. For example, since the electrostatic field $\mathbf{E}$ is conservative, it is the gradient of some scalar potential, which we call the electric potential or Coulomb potential $V$, so it can be written as $\mathbf{E}=\nabla V$ or $\mathbf{E}=-\nabla V$. Mathematically, it doesn't matter which one you use. If you choose $\mathbf{E}=\nabla V$, the Coulomb potential should be $V=-\frac{1}{4\pi\epsilon_0}\frac{Q}{r}$, and if you choose $\mathbf{E}=-\nabla V$, then $V=\frac{1}{4\pi\epsilon_0}\frac{Q}{r}$. In physics, the usual convention is $\mathbf{E}=-\nabla V$, which is interpreted as the electric field pointing downhill, towards lower voltages.
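As a quick numerical illustration (a sketch, not part of the original note; the charge value and radius are arbitrary), one can check by a finite difference that with $V=\frac{1}{4\pi\epsilon_0}\frac{Q}{r}$ and the convention $\mathbf{E}=-\nabla V$, the radial field of a positive charge is positive, i.e., it points away from the charge and downhill in $V$:

```python
# Finite-difference check: with V(r) = Q/(4*pi*eps0*r) and E_r = -dV/dr,
# the field of a positive charge points outward (E_r > 0) and matches
# Coulomb's law Q/(4*pi*eps0*r^2). Values of Q and r are illustrative.
from math import pi

eps0 = 8.8541878128e-12  # vacuum permittivity, SI units
Q = 1e-9                 # a 1 nC positive charge

def V(r):
    return Q / (4 * pi * eps0 * r)

r, h = 0.5, 1e-6
E_r = -(V(r + h) - V(r - h)) / (2 * h)   # central difference for -dV/dr
print(E_r > 0)  # the field points away from the positive charge
print(abs(E_r - Q / (4 * pi * eps0 * r**2)) / E_r < 1e-6)
```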

On the other hand, in quantum mechanics the sign of the Coulomb potential matters, and it depends on the problem you are studying. In a hydrogen atom or a hydrogen-like atom (for example, an ionized helium atom $\mathrm{He}^+$), an electron is trapped in the atom by the Coulomb force between the nucleus and the electron. In this case, we say the electron is in a bound state. The bound state with the minimum energy is called the ground state, and the minimum energy is called the ground state energy. A bound state with energy higher than the ground state energy is said to be unstable, and an electron in an unstable bound state always tends to move to the ground state by emitting photons. Modeling the bound states of a hydrogen atom or a hydrogen-like atom requires the negative Coulomb potential $V(r)=-\frac{Z\alpha}{r}$, as seen in the following figure.

In this simple figure, the electron is trapped in the region $0<r<1$. The figure also clearly shows that the bound state energy must be negative. By drawing a picture, one can easily see that a bound state cannot be modeled with a positive Coulomb potential; a positive Coulomb potential is instead used to model the scattering of a particle.

# The Momentum Representation

This note is based on my friend Khin Maung’s short lecture.

Let us begin with the Schrödinger equation

$$\left(\frac{\hat p^2}{2m}+\hat V(r)\right)|\psi\rangle=E|\psi\rangle$$

Use the completeness relation

$$1=\int|\vec{p}\rangle\langle\vec{p}|d\vec{p}$$

to get the momentum space representation of the Schrödinger equation

\begin{equation}

\label{eq:schrodingerms}

\frac{p^2}{2m}\psi(\vec{p})+\int\langle\vec{p}|\hat V(r)|\vec{p'}\rangle\psi(\vec{p'})d\vec{p'}=E\psi(\vec{p})

\end{equation}

where $\psi(\vec{p}):=\langle\vec{p}|\psi\rangle$. \eqref{eq:schrodingerms} is called the *Schrödinger equation in momentum space*. Using the completeness relation

$$1=\int|\vec{r}\rangle\langle\vec{r}|d\vec{r}$$

we obtain

$$\langle\vec{p}|\hat V(r)|\vec{p'}\rangle=\frac{1}{(2\pi\hbar)^3}\int e^{\frac{i}{\hbar}(\vec{p'}-\vec{p})\cdot\vec{r}}V(r)d\vec{r}$$

Here, recall that $\langle\vec{p}|\vec{r}\rangle=\frac{1}{(2\pi\hbar)^{\frac{3}{2}}}e^{-\frac{i}{\hbar}\vec{p}\cdot\vec{r}}$. Let $\vec{q}=\vec{p'}-\vec{p}$ and $V(\vec{q})=\langle\vec{p}|\hat V(r)|\vec{p'}\rangle$. Then we have

$$V(\vec{q})=\frac{1}{(2\pi\hbar)^3}\int e^{\frac{i}{\hbar}\vec{q}\cdot\vec{r}}V(r)d\vec{r}$$

This is just the Fourier transform of $V(r)$. For the Yukawa potential

$$V(r)=V_0\frac{e^{-\mu r}}{r}$$

\begin{align*} V(\vec{q})&=\frac{1}{(2\pi\hbar)^3}\int_0^{2\pi}\int_0^\pi\int_0^\infty e^{\frac{i}{\hbar}qr\cos\theta}V(r)r^2\sin\theta drd\theta d\phi\\ &=\frac{1}{(2\pi)^2\hbar^3}\int_0^\infty V(r)r^2\int_{-1}^1e^{\frac{i}{\hbar}qru}dudr\\ &=\frac{1}{(2\pi\hbar)^2iq}\int_0^\infty V(r)r(e^{\frac{i}{\hbar}qr}-e^{-\frac{i}{\hbar}qr})dr\\ &=\frac{V_0}{(2\pi\hbar)^2iq}\int_0^\infty e^{-\mu r}(e^{\frac{i}{\hbar}qr}-e^{-\frac{i}{\hbar}qr})dr\\ &=\frac{V_0}{2\pi^2\hbar^3}\frac{1}{\mu^2+\frac{q^2}{\hbar^2}} \end{align*} Here $\theta$ is the angle between $\vec{q}$ and $\vec{r}$, and we substituted $u=\cos\theta$ in the second line.

From here on, we assume that $\hbar=1$ for simplicity.
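The closed form above is easy to check numerically. The derivation reduces $V(\vec{q})$ to the 1D radial integral $V(q)=\frac{1}{2\pi^2 q}\int_0^\infty rV(r)\sin(qr)\,dr$ (with $\hbar=1$); the following sketch compares that integral against the closed form for arbitrary illustrative values of $V_0$ and $\mu$:

```python
# Numerical check (hbar = 1): the Fourier transform of the Yukawa potential
# computed as the radial integral V(q) = 1/(2*pi^2*q) * int_0^inf r V(r) sin(qr) dr
# should match the closed form V0/(2*pi^2*(mu^2 + q^2)).
from math import pi, exp, sin
from scipy.integrate import quad

V0, mu = 2.0, 1.5   # illustrative strength and screening mass

def V_of_q_numeric(q):
    # r*V(r)*sin(qr) with V(r) = V0*exp(-mu*r)/r
    integrand = lambda r: exp(-mu * r) * sin(q * r)
    val, _ = quad(integrand, 0, 50)  # e^{-mu r} decays fast; 50 is an ample cutoff
    return V0 * val / (2 * pi**2 * q)

def V_of_q_closed(q):
    return V0 / (2 * pi**2 * (mu**2 + q**2))

for q in (0.3, 1.0, 4.0):
    print(abs(V_of_q_numeric(q) - V_of_q_closed(q)) < 1e-6)
```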

Let

$$\psi(\vec{p})=\psi_l(p)Y_l^m(\hat p)$$

Here, $\hat p$ stands for the unit vector in momentum space in spherical coordinates, $\hat p=\frac{\vec{p}}{p}=(\theta_p,\phi_p)$, where $\theta_p$ is the polar angle and $\phi_p$ is the azimuthal angle corresponding to the momentum vector $\vec{p}$. $V(\vec{q})$ can be written as the series

\begin{align*} V(\vec{q})&=\sum_{l=0}^\infty\sum_{m=-l}^lV_l(p,p')Y_l^m(\hat p)Y_l^{m*}(\hat p')\\

&=\sum_{l=0}^\infty\frac{2l+1}{4\pi}V_l(p,p')P_l(x)

\end{align*} The last line is obtained by the *addition theorem for spherical harmonics* $$\frac{4\pi}{2l+1}\sum_{m=-l}^lY_l^m(\hat p)Y_l^{m*}(\hat p')=P_l(x)$$

where $x=\cos\theta_{pp'}$ and $\theta_{pp'}$ is the angle between $\vec{p}$ and $\vec{p'}$.

By the orthonormality of spherical harmonics

$$\int_0^{2\pi}\int_0^\pi Y_{l_1}^{m_1*}(\hat p)Y_{l_2}^{m_2}(\hat p)\sin\theta d\theta d\phi=\delta_{l_1l_2}\delta_{m_1m_2}$$

the Schrödinger equation in momentum space \eqref{eq:schrodingerms} yields

$$\frac{p^2}{2m}\psi_l(p)+\int_0^\infty V_l(p,p')\psi_l(p')p'^2dp'=E\psi_l(p)$$

Using the orthogonality of Legendre polynomials

$$\int_{-1}^1P_{l'}(x)P_l(x)dx=\frac{2}{2l+1}\delta_{ll'}$$

we obtain

$$V_l(p,p')=2\pi\int_{-1}^1V(\vec{q})P_l(x)dx$$

For the Yukawa potential, we have

$$V(\vec{q})=\frac{V_0}{2\pi^2}\frac{1}{\mu^2+q^2}=\frac{V_0}{2\pi^2}\frac{1}{p^2+p'^2+\mu^2-2pp'x}$$

Neumann's formula (see the next note) then yields

$$V_l(p,p')=\frac{V_0}{pp'\pi}Q_l\left(\frac{p^2+p'^2+\mu^2}{2pp'}\right)$$

Note here that the formula requires $\left|\frac{p^2+p'^2+\mu^2}{2pp'}\right|>1$, which indeed holds for $\mu\ne0$ since $p^2+p'^2\geq 2pp'$. Recall that

$$Q_0(z)=\frac{1}{2}\ln\frac{z+1}{z-1}$$

for $|z|>1$. Hence for the Yukawa potential, $V_0(p,p')$ can be written as

$$V_0(p,p')=\frac{V_0}{2pp'\pi}\ln\frac{(p+p')^2+\mu^2}{(p-p')^2+\mu^2}$$

With $V_0=Z\alpha$ and $\mu=0$, the Yukawa potential reduces to the Coulomb potential $V(r)=\frac{Z\alpha}{r}$. $V_l(p,p')$ is then given by$$V_l(p,p')=\frac{Z\alpha}{pp'\pi}Q_l\left(\frac{p^2+p'^2}{2pp'}\right)$$

and accordingly,$$V_0(p,p')=\frac{Z\alpha}{pp'\pi}\ln\left|\frac{p+p'}{p-p'}\right|$$

# Neumann’s Formula

Today I learned a pretty cool formula called *Neumann’s formula* while reading a paper by Maurice Lévy, *Wave equations in momentum space*, Proceedings of the Royal Society of London. Series A, Vol. 204, No. 10 (7 December 1950), pp. 145-169. When $n$ is a positive integer and $|z|>1$, the Legendre function of the second kind $Q_n(z)$ can be expressed in terms of the Legendre function of the first kind $P_n(x)$ as

$$Q_n(z)=\frac{1}{2}\int_{-1}^1\frac{P_n(x)}{z-x}dx$$

A derivation of this formula can be found on p. 320 of E. T. Whittaker and G. N. Watson, A Course in Modern Analysis, 4th edition, Cambridge University Press, 1927, as cited in Lévy's paper (the page number is incorrectly cited there as p. 330). The formula originally appeared in a paper by J. Neumann (the author's name is incorrectly cited as F. Neumann in Whittaker & Watson), *Entwicklung der in elliptischen Coordinaten ausgedrückten reciproken Entfernung zweier Puncte in Reihen, welche nach den Laplace'schen $Y^{(n)}$ fortschreiten; und Anwendung dieser Reihen zur Bestimmung des magnetischen Zustandes eines Rotations-Ellipsoïds, welcher durch vertheilende Kräfte erregt ist* (roughly, "Expansion of the reciprocal distance of two points, expressed in elliptic coordinates, in series that progress in the Laplace $Y^{(n)}$; and application of these series to the determination of the magnetic state of an ellipsoid of revolution excited by distributed forces"), Journal für die reine und angewandte Mathematik (Crelle's journal), de Gruyter, 1848, pp. 21-50 (the formula appears on p. 22). The title is unusually long; it reads like an abstract rather than a title. Maybe that was not unusual back then.

The formula can be used to evaluate the following integral

$$I=2\pi\int_0^\pi\frac{P_l(\cos\theta)\sin\theta d\theta}{|\vec{p}-\vec{p'}|^2+\mu^2}$$

which appears in the momentum representation of the Schrödinger equation with the Yukawa potential. With $x=\cos\theta$, the integral $I$ can be written as

\begin{align*} I&=2\pi\int_{-1}^1\frac{P_l(x)dx}{p^2+p'^2-2pp'x+\mu^2}\\ &=\frac{2\pi}{pp'}\frac{1}{2}\int_{-1}^1\frac{P_l(x)dx}{\frac{p^2+p'^2+\mu^2}{2pp'}-x}\\ &=\frac{2\pi}{pp'}Q_l\left(\frac{p^2+p'^2+\mu^2}{2pp'}\right) \end{align*}
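Neumann's formula is also easy to verify numerically. The following sketch (illustrative values; not from Lévy's paper) compares $\frac{1}{2}\int_{-1}^1\frac{P_n(x)}{z-x}dx$ against the closed forms $Q_0(z)=\frac{1}{2}\ln\frac{z+1}{z-1}$ and $Q_1(z)=\frac{z}{2}\ln\frac{z+1}{z-1}-1$:

```python
# Numerical check of Neumann's formula: for |z| > 1,
#   Q_n(z) = (1/2) * int_{-1}^{1} P_n(x) / (z - x) dx,
# compared against the closed forms for Q_0 and Q_1.
from math import log
from scipy.integrate import quad
from scipy.special import eval_legendre

def Q_neumann(n, z):
    val, _ = quad(lambda x: eval_legendre(n, x) / (z - x), -1, 1)
    return 0.5 * val

z = 1.8                            # any z > 1 works
Q0 = 0.5 * log((z + 1) / (z - 1))  # closed form for Q_0
Q1 = z * Q0 - 1.0                  # closed form for Q_1
print(abs(Q_neumann(0, z) - Q0) < 1e-7)
print(abs(Q_neumann(1, z) - Q1) < 1e-7)
```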

# Maxwell-Boltzmann Statistics

From here, we continue to consider the problem of distributing $N$ particles into $K$ boxes. Assume that the probability of a particle going into box $i$ is the same as the probability of a particle going into box $j$ for all $i,j$, i.e., we assume equal probability for a single particle going into any box. Let us call this probability $p$ (since the boxes are equally likely, $p=1/K$). Then the probability of finding $n_1$ particles in box 1, $n_2$ particles in box 2, ..., $n_K$ particles in box $K$ is

\begin{equation}

\begin{aligned}

P&=N!\prod_{i=1}^K\frac{1}{n_i!}p^{n_1}p^{n_2}\cdots p^{n_K}\\

&=N!\prod_{i=1}^K\frac{1}{n_i!}p^N

\end{aligned}\label{eq:maxwell-boltzmann}

\end{equation}

We want to find the distribution of particles into different boxes by maximizing the probability \eqref{eq:maxwell-boltzmann}. Since $p^N$ is constant, maximizing the probability is the same as maximizing $W:=N!\prod_{i=1}^K\frac{1}{n_i!}$. Boltzmann defined entropy corresponding to a distribution of particles by

$$S=k\log W$$

where $k$ is the Boltzmann constant. By Stirling's formula, we can write $S$ as

$$S\approx k[N\log N-N-\sum_i(n_i\log n_i-n_i)]$$
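Stirling's approximation is the only approximation made so far, and its quality improves rapidly with $N$. A quick numerical sketch (sample values of $N$ chosen for illustration):

```python
# Check of Stirling's approximation used above: log N! ~ N log N - N,
# with the relative error shrinking as N grows.
from math import lgamma, log

for N in (10, 100, 10000):
    exact = lgamma(N + 1)          # lgamma(N+1) = log(N!)
    approx = N * log(N) - N
    print(N, abs(exact - approx) / exact)
```

For thermodynamic particle numbers ($N\sim 10^{23}$) the relative error is utterly negligible.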

We are going to neglect $N\log N -N$ from this entropy, so the form of entropy we are considering is

\begin{equation}

\label{eq:maxwell-boltzmann2}

S\approx -k\sum_i(n_i\log n_i-n_i)

\end{equation}

This amounts to dropping $N!$ from $W$. The reason for this mysterious step is to avoid the so-called *Gibbs paradox*. For details about Gibbs paradox see, for example, [1] of the references at the end of this note.

Let $\epsilon_i$ denote the single-particle energy for box $i$. We have two conserved quantities that we want to keep fixed: the particle number $N=\sum_i n_i$ and the energy $U=\sum_i n_i\epsilon_i$. So we add these constraint terms to $S$ with Lagrange multipliers $\beta$ and $\beta\mu$:

$$S\approx k\left[-\sum_i(n_i\log n_i-n_i)+\beta(U-\sum_i n_i\epsilon_i)-\beta\mu (N-\sum_i n_i)\right]$$

Setting $\frac{\partial S}{\partial n_i}=0$ yields the critical point $n_i=e^{-\beta(\epsilon_i-\mu)}$. This is the value at which the probability and the entropy attain their maximum, and it is called the *Maxwell-Boltzmann distribution*.
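The constrained maximization can also be done numerically. The following sketch (toy energies and totals, and a generic SLSQP optimizer rather than the analytic Lagrange-multiplier calculation above) maximizes $-\sum_i(n_i\log n_i-n_i)$ at fixed $N$ and $U$, and then checks that $\log n_i$ is linear in $\epsilon_i$, as the Maxwell-Boltzmann form $n_i=e^{-\beta(\epsilon_i-\mu)}$ demands:

```python
# Maximize the entropy S/k = -sum(n_i log n_i - n_i) subject to fixed
# particle number and energy, and check that the maximizer has the
# Maxwell-Boltzmann form (log n_i linear in eps_i). Toy numbers.
import numpy as np
from scipy.optimize import minimize

eps = np.array([0.0, 1.0, 2.0, 3.0])   # single-particle energies
N, U = 100.0, 80.0                      # fixed totals

def neg_entropy(n):
    return np.sum(n * np.log(n) - n)    # minimize -S/k

constraints = [
    {"type": "eq", "fun": lambda n: np.sum(n) - N},
    {"type": "eq", "fun": lambda n: np.sum(n * eps) - U},
]
n0 = np.full(4, N / 4)                  # uniform starting guess
res = minimize(neg_entropy, n0, constraints=constraints,
               bounds=[(1e-9, None)] * 4, method="SLSQP")

# log n_i should be linear in eps_i: constant successive differences (-beta)
diffs = np.diff(np.log(res.x))
print(np.allclose(diffs, diffs[0], atol=1e-3))
```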

From the constraints, we obtain

\begin{align*} N&=\sum_i e^{-\beta(\epsilon_i-\mu)}\\ U&=\sum_i \epsilon_ie^{-\beta(\epsilon_i-\mu)} \end{align*}

Substituting $n_i=e^{-\beta(\epsilon_i-\mu)}$ in \eqref{eq:maxwell-boltzmann2}, the value of the entropy at the maximum is given by

\begin{equation}

\begin{aligned}

S&\approx k[\beta\sum_i\epsilon_i e^{-\beta(\epsilon_i-\mu)}-\beta\mu\sum_i e^{-\beta(\epsilon_i-\mu)}+\sum_i e^{-\beta(\epsilon_i-\mu)}]\\

&=k[\beta U-\beta\mu N+N]

\end{aligned}\label{eq:maxwell-boltzmann3}

\end{equation}

We are going to determine $\mu$ and $\beta$. The single particle kinetic energy is $\epsilon=\frac{p^2}{2m}$. The summation covers all possible states of each particle. This means that we may replace the summation by an integration over momentum and position:

$$N\to e^{\beta\mu}\int d^3xd^3p e^{-\beta\frac{p^2}{2m}},\ U\to e^{\beta\mu}\int d^3xd^3p \frac{p^2}{2m}e^{-\beta\frac{p^2}{2m}}$$

However, note that the number of states cannot be given only by $d^3x d^3p$ because of its dimension. To make it dimensionless, we make the following quantum mechanical correction:

\begin{equation}

\label{eq:maxwell-boltzmann4}

\frac{d^3x d^3p}{h^3}=\frac{d^3 xd^3p}{(2\pi\hbar)^3}

\end{equation}

Recall that the Planck constant $h$ has the dimension of length $\times$ momentum. With this correction, we have

\begin{equation}

\begin{aligned}

N&=\frac{e^{\beta\mu}}{h^3}\int d^3xd^3p e^{-\beta\frac{p^2}{2m}}\\

&=\frac{e^{\beta\mu}}{h^3}V\left(\frac{2m\pi}{\beta}\right)^{\frac{3}{2}},\\

U&=\frac{e^{\beta\mu}}{h^3}\int d^3xd^3p \frac{p^2}{2m}e^{-\beta\frac{p^2}{2m}}\\

&=\frac{e^{\beta\mu}}{h^3}V\frac{3}{2\beta}\left(\frac{2m\pi}{\beta}\right)^{\frac{3}{2}}

\end{aligned}\label{eq:maxwell-boltzmann5}

\end{equation}

From \eqref{eq:maxwell-boltzmann5}, we obtain

\begin{align*} \beta&=\frac{3N}{2U},\\ \beta\mu&=\log\left[\frac{Nh^3}{V}\left(\frac{\beta}{2\pi m}\right)^{\frac{3}{2}}\right]=\log\left[\frac{Nh^3}{V}\left(\frac{3N}{4\pi mU}\right)^{\frac{3}{2}}\right] \end{align*}
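The Gaussian momentum integrals behind \eqref{eq:maxwell-boltzmann5} can be checked directly. In spherical coordinates $d^3p=4\pi p^2dp$, and the sketch below (arbitrary illustrative $m$ and $\beta$) confirms $\int d^3p\,e^{-\beta p^2/2m}=\left(\frac{2\pi m}{\beta}\right)^{3/2}$ and $\int d^3p\,\frac{p^2}{2m}e^{-\beta p^2/2m}=\frac{3}{2\beta}\left(\frac{2\pi m}{\beta}\right)^{3/2}$:

```python
# Numerical check of the Gaussian momentum integrals:
#   int d^3p exp(-beta p^2/2m)          = (2*pi*m/beta)^{3/2}
#   int d^3p (p^2/2m) exp(-beta p^2/2m) = (3/(2*beta)) * (2*pi*m/beta)^{3/2}
# using spherical coordinates, d^3p = 4*pi*p^2 dp.
from math import pi, exp
from scipy.integrate import quad

m, beta = 2.0, 0.7  # illustrative values

norm, _ = quad(lambda p: 4 * pi * p**2 * exp(-beta * p**2 / (2 * m)), 0, 50)
energy, _ = quad(lambda p: 4 * pi * p**2 * (p**2 / (2 * m))
                 * exp(-beta * p**2 / (2 * m)), 0, 50)

print(abs(norm - (2 * pi * m / beta)**1.5) < 1e-4)
print(abs(energy - (3 / (2 * beta)) * (2 * pi * m / beta)**1.5) < 1e-4)
```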

Consequently, the entropy in \eqref{eq:maxwell-boltzmann3} can be written as

\begin{equation}

\label{eq:maxwell-boltzmann6}

S=kN\left[\frac{5}{2}+\log\left(\frac{V}{N}\right)+\frac{3}{2}\log\left(\frac{U}{N}\right)+\frac{3}{2}\log\left(\frac{4\pi m}{3h^2}\right)\right]

\end{equation}

\eqref{eq:maxwell-boltzmann6} is known in statistical mechanics as the *Sackur-Tetrode formula* for the entropy of a classical ideal gas (we will soon see its relationship with an ideal gas). According to Huang's book [1], this formula has been experimentally verified as the correct entropy of an ideal gas at high temperatures.
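The algebra leading from $S=k[\beta U-\beta\mu N+N]$ to the Sackur-Tetrode formula can be checked numerically. A sketch with arbitrary illustrative values (units chosen so that $k=h=1$):

```python
# Consistency check: substituting beta = 3N/(2U) and the expression for
# beta*mu into S = k*(beta*U - beta*mu*N + N) should reproduce the
# Sackur-Tetrode formula. Illustrative values; units with k = h = 1.
from math import log, pi

k, h = 1.0, 1.0
N, V, U, m = 100.0, 50.0, 75.0, 1.0

beta = 3 * N / (2 * U)
beta_mu = log((N * h**3 / V) * (3 * N / (4 * pi * m * U))**1.5)

S_from_max = k * (beta * U - beta_mu * N + N)
S_sackur_tetrode = k * N * (2.5 + log(V / N) + 1.5 * log(U / N)
                            + 1.5 * log(4 * pi * m / (3 * h**2)))

print(abs(S_from_max - S_sackur_tetrode) < 1e-9)
```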

Differentiating $S$ in \eqref{eq:maxwell-boltzmann6} and solving for $dU$, we obtain

\begin{align*} dU&=\frac{\partial U}{\partial S}dS-\frac{\partial U}{\partial S}\frac{\partial S}{\partial V}dV-\frac{\partial U}{\partial S}\frac{\partial S}{\partial N}dN\\ &=\frac{1}{k\beta}dS-\frac{NkT}{V}dV+\mu dN \end{align*}

Comparing this with

$$dU=TdS-pdV+\mu dN$$ from the first law of thermodynamics, we have

\begin{align*} \beta&=\frac{1}{kT},\\ p&=\frac{NkT}{V} \end{align*}

The second equation is the well-known *ideal gas equation of state*. The chemical potential $\mu$ and the internal energy $U$ can be expressed as functions of the temperature $T$ as

\begin{align*} \mu&=kT\log\left[\frac{h^3N}{V}\frac{1}{(2\pi mkT)^{\frac{3}{2}}}\right],\\ U&=\frac{3}{2}NkT \end{align*}

*References*:

- [1] Kerson Huang, Statistical Mechanics, John Wiley & Sons, 1987
- [2] V. P. Nair, Lectures on Thermodynamics and Statistical Mechanics