Dirac Sea and Antiparticles

Here, I mentioned that one of the issues of the Klein-Gordon equation is that it admits solutions yielding negative energies. This issue persists in the Dirac equation, which is a relativistic generalization of the Schrödinger equation. (I will discuss the Dirac equation later in a different note.) The possibility that an electron can keep falling to ever lower negative energy levels indefinitely seems unphysical, and initially the suggestion faced a huge backlash from physicists including Wolfgang Pauli. P.A.M. Dirac came up with a brilliant idea, based on the notion of an electron hole, that each negative energy state is already filled by an electron (remember that electrons are fermions, so no more than one electron can occupy the same state due to Pauli’s exclusion principle). This idea is called the Dirac sea of infinitely many electrons. Since all negative energy states are already occupied by electrons, an electron cannot fall below the zero energy level. This argument may not, however, be airtight: as David Hilbert’s paradox of the Grand Hotel shows, even after the negative energy states are all occupied, an infinite sea may still accommodate additional electrons. Dirac also suggested that all negative energy levels may be filled by electrons except for one. This would leave a hole with a negative energy. This hole was interpreted as a positron, the antiparticle of an electron. While it is a brilliant idea, the Dirac sea also appears to be unphysical. Regardless, the positron was discovered by Carl Anderson in 1932 and no one raised an issue about it afterwards. Move along, nothing to see here. Still, it seems that many physicists are not very comfortable with the notion of the Dirac sea and do not believe that it is an actual physical reality. Nowadays the Dirac sea is introduced more for pedagogical purposes than for the purpose of defining antiparticles.
In modern quantum field theory, antiparticles are defined by wave functions traveling backward in time. If I remember correctly, this definition of antiparticles is due to John Archibald Wheeler. Note that those wave functions traveling backward in time do have negative energies.

Here is a thought. Physically an electron would have the minimum energy at rest, and the rest energy is given by $E_0=m_0c^2$. This can be obtained by putting ${\bf p}\cdot{\bf p}=0$ in the relativistic energy-momentum relation $E^2=c^2\,{\bf p}\cdot{\bf p}+m_0^2c^4$. In fact, since ${\bf p}\cdot{\bf p}\geq 0$, we have $$E^2\geq m_0^2c^4$$ which implies that either $E\geq m_0c^2$ or $E\leq -m_0c^2$. So we could say that the energy of an electron cannot be negative and that the negative energy branch is merely a mathematical fluke. But if that were the case, what about antiparticles? That is a big question which seems to have no apparent answer within conventional quantum theory, and for this reason physicists are still sticking to the negative energies.
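As a quick numerical illustration of the two branches, here is a minimal sketch. The function name `energy_branches` and the rounded SI constants are my own assumptions and are not part of the discussion above. It tabulates $E=\pm\sqrt{c^2p^2+m_0^2c^4}$ in units of the rest energy, so one can see that no value falls strictly between $-m_0c^2$ and $m_0c^2$.

```python
import math

C = 299_792_458.0        # speed of light in m/s
M_E = 9.1093837e-31      # electron rest mass in kg (rounded, assumed value)

def energy_branches(p, m0=M_E, c=C):
    """Return (E_minus, E_plus) solving E^2 = c^2 p^2 + m0^2 c^4."""
    e = math.sqrt((p * c) ** 2 + (m0 * c ** 2) ** 2)
    return -e, e

REST_ENERGY = M_E * C ** 2

# Momenta in kg m/s; energies are printed in units of the rest energy,
# so the positive branch never drops below 1 and the negative never rises above -1.
for p in (0.0, 1e-22, 1e-21):
    e_minus, e_plus = energy_branches(p)
    print(p, e_minus / REST_ENERGY, e_plus / REST_ENERGY)
```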

I am currently working on an unconventional quantum theory, and this might shed light on an alternative account of antiparticles. Those who are curious can read about its brief idea and motivation here. In that quantum theory, antiparticles, while having positive energies, are described by wave functions that have negative probabilities. Of course the notion of negative probabilities sounds unphysical, but it actually isn’t in this case. According to the theory, antiparticles do not live in our universe but in its twin parallel universe, where the roles of the time coordinate and a spatial coordinate are switched from those in our universe. While the wave function of an antiparticle is seen to have a negative probability in our universe, it actually has a positive probability in its own universe. I will write more details about it elsewhere in the very near future.

Matter and Waves

Since Huygens and Newton, physicists have known that light is described by (electromagnetic) waves and particles (photons). Such a peculiar nature of light is called wave-particle duality. What about material particles such as electrons? De Broglie proposed a bold hypothesis that what is true for light is also true for material particles, i.e. they will also exhibit wave nature. (This was indeed confirmed by experiments.) So how do we mathematically model such a wave? What physicists thought of using to study material particles was a complex plane wave called the de Broglie wave. It looks like $$\psi(x,t)=Ae^{i(kx-\omega t)}$$ for the 1-dimensional case. For the 3-dimensional case, it would be \begin{equation}\label{eq:planewave}\psi({\bf r},t)=Ae^{i({\bf k}\cdot{\bf r}-\omega t)}\end{equation} Before we continue, one may wonder how physicists came up with this kind of wave. I can only speculate, but such a complex plane wave was already well known to physicists as it is a solution of Maxwell’s equations in electromagnetism. Hence, a complex plane wave may describe an electromagnetic wave, and naturally it became the first candidate for modeling material particles. In fact, it worked out well as we shall see, and consequently complex numbers played a crucial role in building quantum mechanics.

Let us first study some properties of plane waves. The plane wave \eqref{eq:planewave} describes a free particle, more accurately a free particle in a state of definite momentum. In order for a plane wave to behave like a particle, we want it to be localized, i.e. the wave is defined only in a tiny region. (There is a more mathematically subtle reason why we require this.) We can achieve this by redefining $\psi({\bf r},t)$ as $$\psi({\bf r},t)=\left\{\begin{array}{ccc}Ae^{i({\bf k}\cdot{\bf r}-\omega t)} & \mbox{for} & {\bf r}\ \mbox{within a volume}\ V=L^3\\0 & \mbox{for} & {\bf r}\ \mbox{outside a volume}\ V=L^3\end{array}\right.$$ Physicists call this box normalization. Physically the state of a particle must not depend on a particular location of the tiny box, so we require the periodicity condition $$\psi(x,y,z,t)=\psi(x+L,y,z,t)=\psi(x,y+L,z,t)=\psi(x,y,z+L,t)$$ Here, $L$ is the side length of the box and hence the spatial period of the wave function. The periodicity condition implies that ${\bf k}$ is quantized as $${\bf k}=\frac{2\pi}{L}{\bf n}$$ where ${\bf k}=(k_x,k_y,k_z)$, ${\bf n}=(n_x,n_y,n_z)$, and $n_i=0,\pm 1,\pm 2,\cdots$, $i=x,y,z$. The vector ${\bf k}$ is called the wave vector, and for the 1-dimensional case $k$ is called the wave number. If the wave is periodic in time, say $\psi(x,t)=\psi(x,t+T)$, then we obtain $e^{-i\omega T}=1$. The minimum nonzero value of $T$ is $T=\frac{2\pi}{\omega}$, and $\omega=\frac{2\pi}{T}$ is called the angular frequency. The quantity $kx-\omega t$ is called the phase, and a point of constant phase $kx-\omega t$ moves at the speed $v_p=\frac{dx}{dt}=\frac{\omega}{k}$. This $v_p$ is called the phase velocity. For the 3-dimensional case, \begin{align*}\psi({\bf r},t)&=Ae^{i({\bf k}\cdot{\bf r}-\omega t)}\\&=Ae^{i{\bf k}\cdot\left({\bf r}-\frac{\omega t}{|{\bf k}|^2}{\bf k}\right)}\\&=Ae^{i{\bf k}\cdot\left({\bf r}-\frac{\omega t}{|{\bf k}|}\hat{\bf k}\right)}\end{align*}where $\hat{\bf k}={\bf k}/|{\bf k}|$. So, the phase velocity would be $${\bf v}_p=\frac{d{\bf r}}{dt}=\frac{\omega}{|{\bf k}|}\hat{\bf k}$$
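The periodicity condition and the phase velocity can be checked numerically in the 1-dimensional case. The following is only a sketch under my own assumptions: the helper names `wave_number`, `psi` and `phase_velocity`, the box size $L=2$ and the angular frequency $\omega=3$ are arbitrary illustrative choices. It confirms that $k=2\pi n/L$ with integer $n$ makes the wave repeat after a shift by $L$, and that a point moving at $v_p=\omega/k$ keeps a constant phase.

```python
import cmath
import math

def wave_number(n, L):
    """Quantized wave number k = 2*pi*n/L for an integer n."""
    return 2 * math.pi * n / L

def psi(x, t, k, omega, A=1.0):
    """1-dimensional complex plane wave A*exp(i(kx - omega*t))."""
    return A * cmath.exp(1j * (k * x - omega * t))

def phase_velocity(k, omega):
    """Speed of a point of constant phase, v_p = omega/k (k must be nonzero)."""
    return omega / k

L, OMEGA = 2.0, 3.0

# Integer n: shifting x by L leaves psi unchanged (up to rounding).
for n in (0, 1, 2, 3):
    k = wave_number(n, L)
    print(n, k, abs(psi(0.37 + L, 0.5, k, OMEGA) - psi(0.37, 0.5, k, OMEGA)))

# Following the wave at speed v_p keeps the value of psi fixed.
k = wave_number(1, L)
v_p = phase_velocity(k, OMEGA)
print(abs(psi(0.37 + v_p * 0.8, 0.8, k, OMEGA) - psi(0.37, 0.0, k, OMEGA)))
```

A non-integer value such as $n=1/2$ fails the periodicity test, which is the sense in which ${\bf k}$ is quantized by the box.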

The image of the wave function $\psi(x,t)=Ae^{i(kx-\omega t)}$ is a circle. We are in fact quite familiar with this kind of wave. On a beautiful day, you go to a lake. You would then be tempted to throw a rock into the calm water. When you do, you would see circular water waves spreading out from the point of impact.


[1] Walter Greiner, Quantum Mechanics, An Introduction, 4th Edition, Springer, 2001

[2] H.-S. Song, Quantum Mechanics (in Korean)

Related Rates

Related rates problems often involve (context-wise) real-life applications of the chain rule/implicit differentiation. Here are some of the examples that are commonly seen in calculus textbooks.

Example. Car A is traveling west at 50 mi/h and car B is traveling north at 60 mi/h. Both are headed for the intersection of the two roads. At what rate are the cars approaching each other when car A is 0.3 mi and car B is 0.4 mi from the intersection?


Solution. Denote by $x$ and $y$ the distances from the intersection to car A and to car B, respectively. Since both distances are decreasing, we have $\frac{dx}{dt}=-50$ mi/h and $\frac{dy}{dt}=-60$ mi/h. Let us denote by $z$ the distance between A and B. Then by the Pythagorean theorem we have $$z^2=x^2+y^2$$ Differentiating this with respect to $t$, we obtain $$z\frac{dz}{dt}=x\frac{dx}{dt}+y\frac{dy}{dt}$$ When $x=0.3$ and $y=0.4$, we have $z=\sqrt{0.3^2+0.4^2}=0.5$ and thus \begin{align*}\frac{dz}{dt}&=\frac{1}{z}\left[x\frac{dx}{dt}+y\frac{dy}{dt}\right]\\&=\frac{1}{0.5}[0.3(-50)+0.4(-60)]=-78\ \mathrm{mi/h}\end{align*} That is, the cars are approaching each other at a rate of 78 mi/h.
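The result can be sanity-checked numerically. In the sketch below (the function names and the step size `H` are my own choices, not part of the textbook problem), the distance between the cars is written as an explicit function of time, with $t=0$ the instant in question, and its derivative is approximated by a central difference. The estimate is then compared with the chain-rule formula.

```python
import math

def distance(t, x0=0.3, y0=0.4, vx=-50.0, vy=-60.0):
    """Distance in miles between the cars, t hours after the given instant."""
    return math.hypot(x0 + vx * t, y0 + vy * t)

def closing_rate(x, y, dxdt, dydt):
    """dz/dt from z*dz/dt = x*dx/dt + y*dy/dt."""
    return (x * dxdt + y * dydt) / math.hypot(x, y)

H = 1e-7  # small time step in hours
numeric = (distance(H) - distance(-H)) / (2 * H)
exact = closing_rate(0.3, 0.4, -50.0, -60.0)
print(exact, numeric)  # both should be close to -78 mi/h
```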

Example. Air is being pumped into a spherical balloon so that its volume increases at a rate of $100\mathrm{cm}^3/\mathrm{s}$. How fast is the radius of the balloon increasing when the diameter is 50 cm?

Solution. Let $V$ and $r$ denote the volume and the radius of the spherical balloon. Then $V=\frac{4}{3}\pi r^3$. Differentiating this with respect to $t$, we obtain $$\frac{dV}{dt}=4\pi r^2\frac{dr}{dt}$$ So, \begin{align*}\frac{dr}{dt}&=\frac{1}{4\pi r^2}\frac{dV}{dt}\\&=\frac{1}{4\pi(25)^2}100\\&=\frac{1}{25\pi}\mathrm{cm/s}\end{align*}

Example. Gravel is being dumped from a conveyor belt at a rate of $30 \mathrm{ft}^3/\mathrm{min}$ and its coarseness is such that it forms a pile in the shape of a cone whose base diameter and height are the same. How fast is the height of the pile increasing when the pile is 10 ft high?

Solution. The cross section of the gravel pile is shown in the figure below.

The amount of gravel dumped is the same as the volume of the cone. Let us denote the volume by $V$, its base radius by $r$, and its height by $h$. Then $V=\frac{1}{3}\pi r^2h$. Since $h=2r$, $V$ can be written as $$V=\frac{1}{12}\pi h^3$$ Differentiating this with respect to $t$, we obtain $$\frac{dV}{dt}=\frac{1}{4}\pi h^2\frac{dh}{dt}$$ So, we have \begin{align*}\frac{dh}{dt}&=\frac{4}{\pi h^2}\frac{dV}{dt}\\&=\frac{4}{\pi(10)^2}(30)=\frac{1.2}{\pi}\mathrm{ft/min}\approx 0.38\mathrm{ft/min}\end{align*}

Example. A ladder 10 ft long rests against a vertical wall. If the bottom of the ladder slides away from the wall at a rate of 1 ft/s, how fast is the top of the ladder sliding down the wall when the bottom of the ladder is 6 ft from the wall?


Solution. Let us denote by $x$ and $y$ the distance from the wall to the bottom of the ladder and the distance from the top of the ladder to the floor, respectively. By the Pythagorean theorem, we have $x^2+y^2=100$. Differentiating this with respect to $t$, we obtain $$x\frac{dx}{dt}+y\frac{dy}{dt}=0$$ When $x=6$, we have $y=\sqrt{100-36}=8$. Hence, \begin{align*}\frac{dy}{dt}&=-\frac{x}{y}\frac{dx}{dt}\\&=-\frac{6}{8}(1)=-\frac{3}{4}\ \mathrm{ft/s}\end{align*} That is, the top of the ladder is sliding down the wall at a rate of $\frac{3}{4}$ ft/s.
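The formula $\frac{dy}{dt}=-\frac{x}{y}\frac{dx}{dt}$ also shows how the speed of the top changes as the ladder slides. The short sketch below evaluates it at a few positions. The function name `top_speed` and the sample positions other than $x=6$ are my own illustrative choices. The top falls faster and faster as the bottom approaches 10 ft from the wall.

```python
import math

LADDER = 10.0  # ladder length in ft

def top_speed(x, dxdt=1.0, length=LADDER):
    """dy/dt in ft/s when the bottom is x ft from the wall."""
    y = math.sqrt(length ** 2 - x ** 2)  # height of the top of the ladder
    return -x / y * dxdt

for x in (6.0, 8.0, 9.9):
    print(x, top_speed(x))
```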

Example. A water tank has the shape of an inverted circular cone with base radius 2 m and height 4 m. If water is being pumped into the tank at a rate of $2\ \mathrm{m}^3/\mathrm{min}$, find the rate at which the water level is rising when the water is 3 m deep.

Solution. The cross section of the water tank is shown in the figure below.

The amount of water $V$ when the water level is $h$ and the surface radius is $r$ is $V=\frac{1}{3}\pi r^2h$. From the above figure, by similar triangles, the following ratio holds $$\frac{2}{4}=\frac{r}{h}$$ i.e. $r=\frac{h}{2}$. So $V$ can be written as $$V=\frac{1}{12}\pi h^3$$ Differentiating this with respect to $t$, we obtain $$\frac{dV}{dt}=\frac{1}{4}\pi h^2\frac{dh}{dt}$$ Hence, \begin{align*}\frac{dh}{dt}&=\frac{4}{\pi h^2}\frac{dV}{dt}\\&=\frac{4}{\pi(3)^2}(2)\\&=\frac{8}{9\pi}\ \mathrm{m/min}\approx 0.28\ \mathrm{m/min}\end{align*}
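Since the gravel pile and the water tank both satisfy $V=\frac{1}{12}\pi h^3$, one small helper covers both examples. This is only a sketch: the function names are mine, and the numerical check inverts $V(h)$ and uses a central difference in time with an arbitrarily chosen step.

```python
import math

def cone_volume(h):
    """Volume of a cone whose radius is half its height: V = pi*h^3/12."""
    return math.pi * h ** 3 / 12

def level_rate(h, dVdt):
    """dh/dt from dV/dt = (pi/4) h^2 dh/dt."""
    return 4 * dVdt / (math.pi * h ** 2)

def level_rate_numeric(h, dVdt, dt=1e-6):
    """Central-difference estimate of dh/dt obtained by inverting V(h)."""
    v = cone_volume(h)
    height = lambda vol: (12 * vol / math.pi) ** (1 / 3)
    return (height(v + dVdt * dt) - height(v - dVdt * dt)) / (2 * dt)

print(level_rate(3.0, 2.0), level_rate_numeric(3.0, 2.0))      # water tank, m/min
print(level_rate(10.0, 30.0), level_rate_numeric(10.0, 30.0))  # gravel pile, ft/min
```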

The Klein-Gordon Equation

For the Schrödinger equation $$i\hbar\frac{\partial\psi}{\partial t}=\hat H\psi({\bf x},t),$$ the Hamiltonian $$\hat H=-\frac{\hbar^2}{2m_0}\nabla^2+V({\bf x})$$ corresponds to the nonrelativistic energy-momentum relation $$\hat E=\frac{\hat p^2}{2m_0}+V({\bf x})$$ where $$\hat E=i\hbar\frac{\partial}{\partial t},\ \hat p=-i\hbar\nabla$$ So, naturally considering the relativistic energy-momentum relation \begin{equation}\label{eq:e-m}\frac{E^2}{c^2}-{\bf p}\cdot{\bf p}=m_0^2c^2\end{equation} would be the starting point to obtain a relativistic generalization of the Schrödinger equation. Replacing $E$ and ${\bf p}\cdot{\bf p}$ in \eqref{eq:e-m} by operators $$\hat E=i\hbar\frac{\partial}{\partial t}\ \mbox{and}\ \hat p\cdot\hat p=-\hbar^2\nabla^2$$ acting on a wave function $\psi$, we obtain the Klein-Gordon equation for a free particle \begin{equation}\label{eq:k-g}\left(\Box-\frac{m_0^2c^2}{\hbar^2}\right)\psi=0\end{equation} where $$\Box=-\frac{1}{c^2}\frac{\partial^2}{\partial t^2}+\nabla^2$$

Free solutions of the Schrödinger equation with $V({\bf x})=0$ are of the form $$\psi=\exp\left[\frac{i}{\hbar}(-Et+{\bf p}\cdot{\bf x})\right]$$ They are also free solutions of the Klein-Gordon equation \eqref{eq:k-g}, provided that the energy satisfies the condition $$E=\pm c\sqrt{m_0^2c^2+p^2}$$ The solutions yielding negative energies appear to be unphysical and were initially considered so by physicists, but later they were interpreted as antiparticles. Antiparticles are indeed seen in nature. In reality, antiparticles also have positive energies. Treating antiparticles as wave functions with negative energies is merely an interpretation of the mathematical representation of the energy condition. If antiparticles had not been discovered, the negative energy condition would still have been thought to be unphysical.
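One can verify numerically that the plane wave solves the free Klein-Gordon equation for both signs of $E$. The sketch below makes several assumptions of my own: one spatial dimension, units with $\hbar=c=1$, $m_0=1$, and second derivatives approximated by central differences with step `h`. With $p=0.75$ the energy condition gives $E=\pm 1.25$, and the residual of $\left(\Box-\frac{m_0^2c^2}{\hbar^2}\right)\psi$ should vanish up to discretization error, while an energy that violates the condition leaves a clearly nonzero residual.

```python
import cmath

HBAR = 1.0  # assumed units: hbar = c = 1
C = 1.0
M0 = 1.0

def psi(x, t, p, E):
    """Plane wave exp[(i/hbar)(-E t + p x)] in one spatial dimension."""
    return cmath.exp(1j * (-E * t + p * x) / HBAR)

def kg_residual(x, t, p, E, h=1e-3):
    """|(Box - m0^2 c^2/hbar^2) psi| with Box = -(1/c^2) d^2/dt^2 + d^2/dx^2."""
    center = psi(x, t, p, E)
    d2t = (psi(x, t + h, p, E) - 2 * center + psi(x, t - h, p, E)) / h ** 2
    d2x = (psi(x + h, t, p, E) - 2 * center + psi(x - h, t, p, E)) / h ** 2
    box = -d2t / C ** 2 + d2x
    return abs(box - (M0 * C / HBAR) ** 2 * center)

P = 0.75
for E in (1.25, -1.25, 1.0):  # the last value violates the energy condition
    print(E, kg_residual(0.2, 0.4, P, E))
```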

Other than allowing solutions with negative energies, there was another issue with the Klein-Gordon equation noted by physicists. The conservation of the four-current density $$j_\mu=\frac{i\hbar}{2m_0}(\psi^\ast\nabla_\mu\psi-\psi\nabla_\mu\psi^\ast),$$ where $\psi^\ast$ denotes the complex conjugate of $\psi$ and $\nabla_\mu=\left(-\frac{1}{c^2}\frac{\partial}{\partial t},\nabla\right)$, implies that the quantity $$\rho=\frac{i\hbar}{2m_0c^2}\left(\psi^\ast\frac{\partial\psi}{\partial t}-\psi\frac{\partial\psi^\ast}{\partial t}\right)$$ can be considered as a probability density. However, the problem is that $\rho$ can be negative. This is due to the appearance of the first-order partial derivative $\frac{\partial\psi}{\partial t}$, which is a consequence of the Klein-Gordon equation being second order in time. Because of this, the Klein-Gordon equation was not regarded as a physically viable relativistic generalization of the Schrödinger equation, and physicists were instead looking for a relativistic generalization that is first order in time like the Schrödinger equation. Such an equation was finally discovered by P. A. M. Dirac and is called the Dirac equation. On the other hand, the Klein-Gordon equation drew the attention of physicists again after they realized that $\rho$ can be interpreted as a charge density, and indeed charged pions $\pi^+$ and $\pi^-$ were discovered. Today, the Klein-Gordon equation is an important relativistic equation that describes charged spin-0 particles.
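To see concretely how $\rho$ becomes negative, one can evaluate it on a plane wave $\psi=\exp\left[\frac{i}{\hbar}(-Et+px)\right]$, for which the definition gives $\rho=\frac{E}{m_0c^2}|\psi|^2$. The sketch below is a numerical illustration under my own assumptions: units with $\hbar=c=m_0=1$, one spatial dimension, and a central difference for the time derivative. The sign of $\rho$ simply follows the sign of $E$.

```python
import cmath

HBAR = 1.0  # assumed units: hbar = c = m0 = 1
C = 1.0
M0 = 1.0

def psi(x, t, p, E):
    """Plane wave exp[(i/hbar)(-E t + p x)] in one spatial dimension."""
    return cmath.exp(1j * (-E * t + p * x) / HBAR)

def rho(x, t, p, E, h=1e-5):
    """rho = (i hbar / 2 m0 c^2)(psi* dpsi/dt - psi dpsi*/dt), by central difference."""
    f = psi(x, t, p, E)
    dfdt = (psi(x, t + h, p, E) - psi(x, t - h, p, E)) / (2 * h)
    value = 1j * HBAR / (2 * M0 * C ** 2) * (f.conjugate() * dfdt - f * dfdt.conjugate())
    return value.real  # the imaginary part vanishes up to rounding

print(rho(0.2, 0.4, 0.75, 1.25))   # positive-energy wave: rho is about +1.25
print(rho(0.2, 0.4, 0.75, -1.25))  # negative-energy wave: rho is about -1.25
```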


[1] Walter Greiner, Relativistic Quantum Mechanics, 3rd Edition, Springer-Verlag, 2000

The Curvature

In this note, we study different notions of curvatures of a Riemannian or a pseudo-Riemannian $n$-manifold $M$ with metric tensor $g_{ij}$. This note is intended mainly for students of physics. Hence, we will discuss only local expressions of curvatures as those are the ones we mostly use for doing physics in general relativity.

First we need to introduce the Christoffel symbols $\Gamma_{ij}^k$. The Christoffel symbols are associated with the differentiation of vector fields in a Riemannian or a pseudo-Riemannian manifold $M$, called the Levi-Civita connection. The Levi-Civita connection $\nabla$ is a generalization of the covariant derivative of vector fields in the Euclidean space. Locally the Levi-Civita connection is defined by $$\nabla_{\frac{\partial}{\partial x^i}}\frac{\partial}{\partial x^j}=\sum_{k}\Gamma_{ij}^k\frac{\partial}{\partial x^k}$$ and the Christoffel symbol is given by $$\Gamma_{ij}^k=\frac{1}{2}\sum_\ell g^{k\ell}\left\{\frac{\partial g_{j\ell}}{\partial x^i}+\frac{\partial g_{\ell i}}{\partial x^j}-\frac{\partial g_{ij}}{\partial x^\ell}\right\}$$ where $g^{k\ell}$ is the inverse of the metric tensor.
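The formula for $\Gamma_{ij}^k$ is easy to evaluate numerically. The following is a minimal sketch, not a general-purpose implementation: it is restricted to two dimensions, uses central differences for the derivatives of the metric, and all function names are my own. As a test metric it uses the hyperbolic plane metric $ds^2=\frac{dx^2+dy^2}{y^2}$ that appears in the example later in this note. Indices are 0-based in the code, so `gamma[k][i][j]` stands for $\Gamma_{i+1\,j+1}^{k+1}$.

```python
def metric(x, y):
    """Metric tensor (g_ij) of the hyperbolic plane at the point (x, y), y > 0."""
    return [[1.0 / y ** 2, 0.0], [0.0, 1.0 / y ** 2]]

def inverse_2x2(g):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]

def d_metric(point, axis, h=1e-6):
    """Matrix of partial derivatives of g_ab along the coordinate `axis`."""
    plus, minus = list(point), list(point)
    plus[axis] += h
    minus[axis] -= h
    gp, gm = metric(*plus), metric(*minus)
    return [[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(2)] for a in range(2)]

def christoffel(point):
    """gamma[k][i][j] = (1/2) sum_l g^{kl} (d_i g_jl + d_j g_li - d_l g_ij)."""
    g_inv = inverse_2x2(metric(*point))
    dg = [d_metric(point, axis) for axis in range(2)]  # dg[i][a][b] = d_i g_ab
    gamma = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    for k in range(2):
        for i in range(2):
            for j in range(2):
                gamma[k][i][j] = 0.5 * sum(
                    g_inv[k][l] * (dg[i][j][l] + dg[j][l][i] - dg[l][i][j])
                    for l in range(2)
                )
    return gamma

gamma = christoffel((0.3, 2.0))
# At y = 2: Gamma^2_11 = 1/y, Gamma^1_12 = -1/y, Gamma^2_22 = -1/y
print(gamma[1][0][0], gamma[0][0][1], gamma[1][1][1])
```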

Locally the Riemann curvature tensor $R_{ijk}^\ell$ is given by $$R_{ijk}^\ell=\frac{\partial}{\partial x^j}\Gamma_{ik}^\ell-\frac{\partial}{\partial x^k}\Gamma_{ij}^\ell+\sum_p\left\{\Gamma_{jp}^\ell\Gamma_{ik}^p-\Gamma_{kp}^\ell\Gamma_{ij}^p\right\}$$

Locally the sectional curvature $K(X,Y)$ of $M$ with respect to the plane spanned by tangent vectors $X,Y\in T_pM$ is given by \begin{equation}\label{eq:sectcurv}K_p(X,Y)=g^{ii}R_{iji}^j\end{equation} assuming that $X,Y\in\mathrm{span}\left\{\frac{\partial}{\partial x^i},\frac{\partial}{\partial x^j}\right\}$. The sectional curvature is a generalization of the Gaußian curvature of a surface in 3-space. To see this, let $\varphi: M^2\longrightarrow M^3$ be a conformal parametric surface $M^2$ immersed in 3-space $M^3$ with metric $e^{u(x,y)}(dx^2+dy^2)$. The Gaußian curvature $K$ of $\varphi$ can be calculated using the formula (due to Carl Friedrich Gauß) $$K=\frac{\ell n-m^2}{EG-F^2}$$ where \begin{align*}E&=\langle\varphi_x,\varphi_x\rangle,\ F=\langle\varphi_x,\varphi_y\rangle,\ G=\langle\varphi_y,\varphi_y\rangle,\\\ell&=\langle\varphi_{xx},N\rangle,\ m=\langle\varphi_{xy},N\rangle,\ n=\langle\varphi_{yy},N\rangle\end{align*} Here, $\langle\ ,\ \rangle$ stands for the inner product induced by the conformal metric $e^{u(x,y)}(dx^2+dy^2)$ and $N$ is the unit normal vector field on $\varphi$. The Gaußian curvature is then found to satisfy Liouville’s partial differential equation \begin{equation}\label{eq:liouville}\nabla^2 u=-2Ke^u\end{equation} On the other hand, using \eqref{eq:sectcurv} with $g^{11}=e^{-u(x,y)}$, we find the sectional curvature of $\varphi$ to be $$g^{11}R_{121}^2=-\frac{e^{-u(x,y)}}{2}\nabla^2u$$ which coincides with the Gaußian curvature $K$ from \eqref{eq:liouville}.
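Solving Liouville’s equation \eqref{eq:liouville} for $K$ gives $K=-\frac{1}{2}e^{-u}\nabla^2u$, which is convenient for quick numerical checks. The sketch below approximates the Laplacian by a five-point stencil. The function names, the step `h` and the two test metrics are my own choices: $e^u=\frac{1}{y^2}$ is the hyperbolic plane, and $e^u=\frac{4}{(1+x^2+y^2)^2}$ is the unit sphere in stereographic coordinates, whose curvatures should come out as $-1$ and $+1$ respectively.

```python
import math

def gaussian_curvature(u, x, y, h=1e-3):
    """K = -(1/2) exp(-u) Laplacian(u) for the metric exp(u)(dx^2 + dy^2)."""
    laplacian = (u(x + h, y) + u(x - h, y) + u(x, y + h) + u(x, y - h)
                 - 4 * u(x, y)) / h ** 2
    return -0.5 * math.exp(-u(x, y)) * laplacian

def u_hyperbolic(x, y):
    """Conformal factor of the hyperbolic plane: exp(u) = 1/y^2."""
    return -2.0 * math.log(y)

def u_sphere(x, y):
    """Unit sphere in stereographic coordinates: exp(u) = 4/(1+x^2+y^2)^2."""
    return math.log(4.0) - 2.0 * math.log(1.0 + x * x + y * y)

print(gaussian_curvature(u_hyperbolic, 0.3, 2.0))  # close to -1
print(gaussian_curvature(u_sphere, 0.3, 0.5))      # close to +1
```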

Example. Let us compute the sectional curvature of the hyperbolic plane $$\mathbb{H}^2=\{(x,y)\in\mathbb{R}^2: y>0\}$$ with metric $$ds^2=\frac{dx^2+dy^2}{y^2}$$

The metric tensor is $(g_{ij})=\begin{pmatrix}\frac{1}{y^2} & 0\\0 & \frac{1}{y^2}\end{pmatrix}$. The Riemann curvature tensor $R_{121}^2$ is \begin{align*}R_{121}^2&=\frac{\partial}{\partial y}\Gamma_{11}^2-\frac{\partial}{\partial x}\Gamma_{12}^2+\sum_p\{\Gamma_{2p}^2\Gamma_{11}^p-\Gamma_{1p}^2\Gamma_{12}^p\}\\&=\frac{\partial}{\partial y}\Gamma_{11}^2-\frac{\partial}{\partial x}\Gamma_{12}^2+\Gamma_{21}^2\Gamma_{11}^1-\Gamma_{11}^2\Gamma_{12}^1+\Gamma_{22}^2\Gamma_{11}^2-\Gamma_{12}^2\Gamma_{12}^2\end{align*} We find the Christoffel symbols $$\Gamma_{11}^2=\frac{1}{y},\ \Gamma_{12}^1=-\frac{1}{y},\ \Gamma_{12}^2=0,\ \Gamma_{21}^2=0,\ \Gamma_{22}^2=-\frac{1}{y}$$ Thus we obtain $R_{121}^2=-\frac{1}{y^2}+\frac{1}{y^2}-\frac{1}{y^2}=-\frac{1}{y^2}$ and hence $\mathbb{H}^2$ has the constant negative sectional curvature $$K=g^{11}R_{121}^2=y^2\left(-\frac{1}{y^2}\right)=-1$$ What is the shortest path connecting two points $(x_1,y_1)$ and $(x_2,y_2)$ in $\mathbb{H}^2$? Such shortest paths are called geodesics in differential geometry. To find out what a geodesic in $\mathbb{H}^2$ looks like, let $$J=\int_{(x_1,y_1)}^{(x_2,y_2)}ds=\int_{(x_1,y_1)}^{(x_2,y_2)}\frac{\sqrt{1+y_x^2}}{y}dx$$ where $y_x=\frac{dy}{dx}$. The shortest path would satisfy the Euler-Lagrange equation \begin{equation}\label{eq:E-L}\frac{\partial f}{\partial x}-\frac{d}{dx}\left(f-y_x\frac{\partial f}{\partial y_x}\right)=0\end{equation}with $f(y,y_x,x)=\frac{\sqrt{1+y_x^2}}{y}$. Since $f$ does not depend on $x$ explicitly, $\frac{\partial f}{\partial x}=0$ and the Euler-Lagrange equation \eqref{eq:E-L} becomes $$\frac{d}{dx}\left[\frac{1}{y\sqrt{1+y_x^2}}\right]=0$$ i.e. \begin{equation}\label{eq:E-L2}\frac{1}{y\sqrt{1+y_x^2}}=C\end{equation} where $C$ is a constant. The equation \eqref{eq:E-L2} results in a separable differential equation $$\frac{dy}{dx}=\pm\frac{\sqrt{r^2-y^2}}{y}$$ where $r^2=\frac{1}{C^2}$. The solution of this equation is $$(x-a)^2+y^2=r^2$$ where $a$ is a constant.
Since $y>0$, the solution represents an upper semicircle centered at $(a,0)$ with radius $r$; that is, the shortest path (geodesic) between two points $(x_1,y_1)$ and $(x_2,y_2)$ in $\mathbb{H}^2$ is a part of an upper semicircle joining them. In particular, if $x_1=x_2$, the geodesic between $(x_1,y_1)$ and $(x_2,y_2)$ is the vertical line passing through the two points. Such a vertical line can still be considered as an upper semicircle with radius $\infty$.
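Two quick numerical checks make the semicircle result more tangible. This is a sketch with my own choice of points and function names. First, along the upper semicircle through $(-1,1)$ and $(1,1)$ centered at the origin, the quantity $\frac{1}{y\sqrt{1+y_x^2}}$ from \eqref{eq:E-L2} should stay constant. Second, the hyperbolic length of that arc, computed by summing $\frac{\sqrt{dx^2+dy^2}}{y}$ over a fine polyline, should be shorter than that of the Euclidean straight segment joining the same two points.

```python
import math

R = math.sqrt(2.0)  # radius of the semicircle through (-1, 1) and (1, 1)

def first_integral(x, a=0.0, r=R):
    """1/(y sqrt(1 + y_x^2)) evaluated on the semicircle (x-a)^2 + y^2 = r^2."""
    y = math.sqrt(r * r - (x - a) ** 2)
    y_x = -(x - a) / y
    return 1.0 / (y * math.sqrt(1.0 + y_x ** 2))

def hyperbolic_length(path, n=20000):
    """Approximate the integral of sqrt(dx^2 + dy^2)/y along path(s), 0 <= s <= 1."""
    total = 0.0
    x0, y0 = path(0.0)
    for i in range(1, n + 1):
        x1, y1 = path(i / n)
        total += math.hypot(x1 - x0, y1 - y0) / (0.5 * (y0 + y1))
        x0, y0 = x1, y1
    return total

def arc(s):
    """Semicircular arc from (-1, 1) to (1, 1), parametrized by the polar angle."""
    theta = 0.75 * math.pi - 0.5 * math.pi * s
    return R * math.cos(theta), R * math.sin(theta)

def segment(s):
    """Euclidean straight segment from (-1, 1) to (1, 1)."""
    return -1.0 + 2.0 * s, 1.0

print([first_integral(x) for x in (-1.0, 0.0, 0.5)])  # the same value each time
print(hyperbolic_length(arc), hyperbolic_length(segment))
```

The arc comes out shorter than the straight segment, as expected of a geodesic.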

Geodesics in Hyperbolic Plane

Two other notions of curvature are the Ricci and scalar curvatures. The Ricci curvature tensor is given by $$\mathrm{Ric}_p\left(\frac{\partial}{\partial x^i},\frac{\partial}{\partial x^j}\right)=\sum_kR_{ikj}^k$$ We usually denote $\mathrm{Ric}_p\left(\frac{\partial}{\partial x^i},\frac{\partial}{\partial x^j}\right)$ simply by $R_{ij}$. The scalar curvature $\mathrm{Scal}(p)$ is the trace $\sum_{i,j}g^{ij}R_{ij}$ of the Ricci curvature, which for a diagonal metric reduces to $$\mathrm{Scal}(p)=\sum_{i}g^{ii}R_{ii}$$ The scalar curvature can be given, in terms of the sectional curvature, by $$\mathrm{Scal}(p)=\sum_{i\ne j}K_p\left(\frac{\partial}{\partial x^i},\frac{\partial}{\partial x^j}\right)$$ The scalar curvature is usually denoted by $R$ in general relativity.

Definition. A Riemannian or a pseudo-Riemannian manifold $(M,g)$ is said to be maximally symmetric if $(M,g)$ has constant sectional curvature $\kappa$.

Theorem. If a Riemannian or a pseudo-Riemannian manifold $(M,g)$ is maximally symmetric, then $$R_{ii}=\kappa(n-1)g_{ii}$$ where $\kappa$ is the constant sectional curvature of $(M,g)$ and $n=\dim M$.

Corollary. If $(M,g)$ has the constant sectional curvature $\kappa$, then $$\mathrm{Scal}(p)=n(n-1)\kappa$$ where $n=\dim M$.