Author Archives: Sung Lee

About Sung Lee

I am a mathematician and a physics buff. I am interested in studying mathematics inspired by theoretical physics, as well as theoretical physics itself. I am also interested in theoretical computer science.

Simple Epidemics: Deterministic Model

Suppose that we have a homogeneously mixing group of individuals of total size $n+1$ and that the epidemic is started off at $t=0$ by one individual becoming infectious. The remaining $n$ individuals are susceptible. Denote by $x$ and $y$ the numbers of susceptibles and infectives, respectively. Then $x+y=n+1$. Also suppose that the rate of occurrence of new infections is proportional to the number of infectives as well as the number of susceptibles. Then we obtain the following differential equation \begin{equation}\label{eq:infectrate}\frac{dx}{dt}=-\beta xy=-\beta x(n-x+1)\end{equation} where $\beta$ is the constant of proportionality, called the infection rate. Let $\tau=\beta t$. Then \eqref{eq:infectrate} becomes \begin{equation}\label{eq:infectrate2}\frac{dx}{d\tau}=-x(n-x+1)\end{equation} With the initial condition $x(0)=n$, the solution of \eqref{eq:infectrate2} is given by $$x(\tau)=\frac{n(n+1)}{n+e^{(n+1)\tau}}$$ The number of infectives at time $\tau$ is then given by $$y(\tau)=\frac{n+1}{1+ne^{-(n+1)\tau}}$$ The rate at which new infectives accrue, $$w(\tau)=\frac{dy}{d\tau},$$ is called the epidemic curve. It follows from \eqref{eq:infectrate2} that $$w(\tau)=-\frac{dx}{d\tau}=xy=\frac{n(n+1)^2e^{(n+1)\tau}}{[n+e^{(n+1)\tau}]^2}$$ Using a standard calculus argument, we find that $w(\tau)$ attains its maximum when $x=y$, i.e. when the number of susceptibles equals the number of infectives; this happens when $\tau=\frac{\ln n}{n+1}$, and the maximum rate at which new infections accrue is $w=\frac{1}{4}(n+1)^2$.

Figure 1. Deterministic Epidemic Curve for n=20 (red) and for n=30 (blue).
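The closed-form solution above is easy to check numerically; a minimal sketch (the function names are mine, not from Bailey's papers):

```python
import math

def x_susceptible(tau, n):
    """Number of susceptibles: x(tau) = n(n+1) / (n + e^{(n+1)tau})."""
    return n * (n + 1) / (n + math.exp((n + 1) * tau))

def y_infective(tau, n):
    """Number of infectives: y(tau) = (n+1) / (1 + n e^{-(n+1)tau})."""
    return (n + 1) / (1 + n * math.exp(-(n + 1) * tau))

def epidemic_curve(tau, n):
    """w(tau) = x(tau) y(tau), the rate at which new infections accrue."""
    return x_susceptible(tau, n) * y_infective(tau, n)

n = 20
tau_peak = math.log(n) / (n + 1)     # where w attains its maximum
print(x_susceptible(0.0, n))         # x(0) = n
print(x_susceptible(tau_peak, n))    # x = y = (n+1)/2 at the peak
print(epidemic_curve(tau_peak, n))   # maximum rate, (n+1)^2 / 4
```

Evaluating `epidemic_curve` on a grid of $\tau$ values reproduces the curves in Figure 1, and `x_susceptible(tau, n) + y_infective(tau, n)` stays equal to $n+1$, as it must.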


[1] Norman T. J. Bailey, A Simple Stochastic Epidemic, Biometrika, Vol. 37, No. 3/4 (Dec., 1950), 193-202.

[2] Norman T. J. Bailey, The Mathematical Theory of Infectious Diseases and Its Applications, 2nd Edition, Charles Griffin & Company, 1975.

The Tensor Product

Let $V$ and $W$ be two vector spaces of dimensions $m$ and $n$, respectively. The tensor product of $V$ and $W$ is a space $V\otimes W$ of dimension $mn$ together with a bilinear map $$\varphi: V\times W\longrightarrow V\otimes W;\ \varphi(v,w)=v\otimes w$$ which satisfies the following universal property: for any vector space $X$ and any bilinear map $\psi: V\times W\longrightarrow X$, there exists a unique linear map $\gamma : V\otimes W\longrightarrow X$ such that $\psi(v,w)=\gamma(v\otimes w)$ for all $v\in V$ and $w\in W$.

$$\begin{array}[c]{ccc}V\otimes W & & \\\uparrow\scriptstyle{\varphi} & \scriptstyle{\gamma}\searrow& \\V\times W & \stackrel{\psi}\rightarrow & X\end{array}\ \ \ \gamma\circ\varphi=\psi$$

Often, we use a more down-to-earth definition of the tensor product. Let $\{e_1,\cdots,e_m\}$ and $\{f_1,\cdots,f_n\}$ be bases of $V$ and $W$, respectively. The tensor product $V\otimes W$ is a vector space of dimension $mn$ spanned by the basis $\{e_i\otimes f_j: i=1,\cdots,m,\ j=1,\cdots,n\}$. Let $v\in V$ and $w\in W$. Then $$v=\sum_i v_ie_i\ \mbox{and}\ w=\sum_j w_jf_j$$ The tensor product of $v$ and $w$ is then given by $$v\otimes w=\sum_{i,j}v_iw_je_i\otimes f_j$$ It can easily be shown that this definition of the tensor product in terms of prescribed bases satisfies the universal property. Although this definition uses a choice of bases of $V$ and $W$, the tensor product $V\otimes W$ must not depend on that particular choice, i.e. regardless of the choice of bases the resulting tensor product must be the same. This, too, can be shown easily using some basic facts from linear algebra. I will leave both verifications as exercises for the reader.
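In coordinates, the basis definition amounts to forming all products $v_iw_j$, with the coefficient of $e_i\otimes f_j$ stored at flat index $in+j$. A small sketch (the helper name is mine), which agrees with NumPy's built-in `np.kron`:

```python
import numpy as np

def tensor_product(v, w):
    """v ⊗ w = sum_{i,j} v_i w_j (e_i ⊗ f_j); the coefficient of
    e_i ⊗ f_j is stored at flat index i*n + j."""
    m, n = len(v), len(w)
    out = np.zeros(m * n)
    for i in range(m):
        for j in range(n):
            out[i * n + j] = v[i] * w[j]
    return out

v = np.array([1.0, 2.0])
w = np.array([3.0, 4.0, 5.0])
print(tensor_product(v, w))                               # the six products v_i w_j
print(np.allclose(tensor_product(v, w), np.kron(v, w)))   # True
```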

The tensor product can be used to describe the state of a quantum memory register. A quantum memory register consists of many 2-state systems (Hilbert spaces of qubits). Let $|\psi^{(1)}\rangle$ and $|\psi^{(2)}\rangle$ be qubits associated with two different 2-state systems. In terms of the standard orthogonal basis $\begin{pmatrix}1\\0\end{pmatrix}$ and $\begin{pmatrix}0\\1\end{pmatrix}$ for each 2-state system, we have \begin{align*}|\psi^{(1)}\rangle&=\begin{pmatrix}\omega_0^{(1)}\\\omega_1^{(1)}\end{pmatrix}=\omega_0^{(1)}\begin{pmatrix}1\\0\end{pmatrix}+\omega_1^{(1)}\begin{pmatrix}0\\1\end{pmatrix}\\|\psi^{(2)}\rangle&=\begin{pmatrix}\omega_0^{(2)}\\\omega_1^{(2)}\end{pmatrix}=\omega_0^{(2)}\begin{pmatrix}1\\0\end{pmatrix}+\omega_1^{(2)}\begin{pmatrix}0\\1\end{pmatrix}\end{align*} Define $\otimes$ on the basis members as follows: \begin{align*}|00\rangle&=\begin{pmatrix}1\\0\end{pmatrix}\otimes\begin{pmatrix}1\\0\end{pmatrix}=\begin{pmatrix}1\\0\\0\\0\end{pmatrix},\ |01\rangle=\begin{pmatrix}1\\0\end{pmatrix}\otimes\begin{pmatrix}0\\1\end{pmatrix}=\begin{pmatrix}0\\1\\0\\0\end{pmatrix}\\|10\rangle&=\begin{pmatrix}0\\1\end{pmatrix}\otimes\begin{pmatrix}1\\0\end{pmatrix}=\begin{pmatrix}0\\0\\1\\0\end{pmatrix},\ |11\rangle=\begin{pmatrix}0\\1\end{pmatrix}\otimes\begin{pmatrix}0\\1\end{pmatrix}=\begin{pmatrix}0\\0\\0\\1\end{pmatrix}\end{align*} These four vectors form a basis for a 4-dimensional Hilbert space (a 2-qubit memory register). 
It follows that \begin{align*}|\psi^{(1)}\rangle\otimes|\psi^{(2)}\rangle&=\omega_0^{(1)}\omega_0^{(2)}|00\rangle+\omega_0^{(1)}\omega_1^{(2)}|01\rangle+\omega_1^{(1)}\omega_0^{(2)}|10\rangle+\omega_1^{(1)}\omega_1^{(2)}|11\rangle\\&=\begin{pmatrix}\omega_0^{(1)}\omega_0^{(2)}\\\omega_0^{(1)}\omega_1^{(2)}\\\omega_1^{(1)}\omega_0^{(2)}\\\omega_1^{(1)}\omega_1^{(2)}\end{pmatrix}\end{align*}Similarly, to describe the state of a 3-qubit memory register, one performs the tensor product $|\psi^{(1)}\rangle\otimes|\psi^{(2)}\rangle\otimes|\psi^{(3)}\rangle$.
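The two-qubit construction above can be reproduced with `np.kron`; a small sketch with arbitrarily chosen (normalized) amplitudes:

```python
import numpy as np

# single-qubit computational basis states |0> and |1>
ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])

# two-qubit basis states |00>, |01>, |10>, |11> via the tensor product
basis = [np.kron(a, b) for a in (ket0, ket1) for b in (ket0, ket1)]

# two normalized single-qubit states (amplitudes chosen for illustration)
psi1 = np.array([0.6, 0.8])
psi2 = np.array([1.0, 1.0]) / np.sqrt(2)

register = np.kron(psi1, psi2)   # state of the 2-qubit memory register
# components are exactly w0(1)w0(2), w0(1)w1(2), w1(1)w0(2), w1(1)w1(2)
print(register)
print(np.linalg.norm(register))  # ≈ 1: tensor products of unit vectors are unit vectors
```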

Quantum memory registers can store an exponential amount of classical information in only a polynomial number of qubits, thanks to the quantum-mechanical principle of superposition. For example, consider two classical memory registers storing complementary sequences of bits $$\begin{array}{|c|c|c|c|c|c|c|}\hline1 & 0 & 1 & 1 & 0 & 0 &1\\\hline 0 & 1 & 0 & 0 & 1 & 1 & 0\\\hline\end{array}$$ A single quantum memory register can store both sequences simultaneously, in an equally weighted superposition of the two states representing each 7-bit input $$\frac{1}{\sqrt{2}}(|1011001\rangle+|0100110\rangle)$$

A matrix can be considered as a vector. For example, a $2\times 2$ matrix $\begin{pmatrix}a & b\\c & d\end{pmatrix}$ can be identified with the vector $(a, b, c, d) \in \mathbb{R}^4$. Hence one can define the tensor product of two matrices in a similar manner to that of two vectors. For example, $$\begin{pmatrix}a_{11} & a_{12}\\a_{21} & a_{22}\end{pmatrix}\otimes\begin{pmatrix}b_{11} & b_{12}\\b_{21} & b_{22}\end{pmatrix}:=\begin{pmatrix}a_{11}b_{11} & a_{11}b_{12} & a_{11}b_{21} & a_{11}b_{22}\\a_{12}b_{11} & a_{12}b_{12} & a_{12}b_{21} & a_{12}b_{22}\\a_{21}b_{11} & a_{21}b_{12} & a_{21}b_{21} & a_{21}b_{22}\\a_{22}b_{11} & a_{22}b_{12} & a_{22}b_{21} & a_{22}b_{22}\end{pmatrix}$$
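A sketch of this definition (the helper name `matrix_tensor` is mine): the $(i,j)$ entry is the product of the $i$-th entry of the flattened $A$ and the $j$-th entry of the flattened $B$, i.e. the outer product of the two flattened matrices. Note that NumPy's built-in Kronecker product `np.kron` arranges the same 16 products in a different (block) layout.

```python
import numpy as np

def matrix_tensor(A, B):
    """Tensor product of two 2x2 matrices viewed as vectors in R^4, as
    defined above: the outer product of the row-major flattenings."""
    return np.outer(A.flatten(), B.flatten())

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
T = matrix_tensor(A, B)
print(T)   # first row is a11 * (b11, b12, b21, b22) = (5, 6, 7, 8)
```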


[1] A. Yu. Kitaev, A. H. Shen and M. N. Vyalyi, Classical and Quantum Computation, Graduate Studies in Mathematics Volume 47, American Mathematical Society, 2002

[2] Colin P. Williams and Scott H. Clearwater, Explorations in Quantum Computing, Springer TELOS, 1998

What is an angle?

The notion of an angle is so basic (and indeed we use it every day) that anyone with basic math knowledge should know what it is, right? It turns out that is not the case. I was often surprised to find that even math majors in upper-level math courses can’t explain what an angle is. A common answer goes like this: first draw two rays that intersect at a point, and then draw an arc between them. They would call this arc an angle.

Figure 1. An angle

But there can be many different choices of arc between two such rays. So which arc is the angle? The arc that defines an angle is the one that is part of the unit circle.

Figure 2. An angle

A measurement of an angle requires a unit. It is speculated that degree measurement was invented by the ancient Sumerians. Sumer, located in southern Mesopotamia (modern-day southern Iraq), is the earliest known civilization, dating back to before 3,000 B.C. The biblical Genesis appears to have originated from the Mesopotamian creation myth Enûma Eliš, which predates the Torah (Hebrew Bible); in it the first man’s name is Adamu, and Mesopotamian literature also contains a story of the Great Deluge. The Sumerians had their own writing system using cuneiform (wedge-shaped marks on clay tablets), and they built massive structures called ziggurats. Amazingly, some of those ziggurats still stand to this day in modern-day Iraq. The Sumerians possessed a highly advanced level of knowledge in math and science, including astronomy. Here is how they introduced the degree unit of angle measurement: divide the unit circle into 360 equal sections (imagine 360 equal pizza slices). The length of the arc of each section is then defined to be $1^\circ$, and the circumference of the unit circle is $360^\circ$.

Figure 3. Degree angle measurement

Why the number 360? A common speculation is that the Sumerians thought 1 year = 360 days, one complete revolution. Oh yes, they appear to have known that Earth revolves around the Sun on a circular (actually elliptical) orbit. I have a different theory, though. I believe that the Sumerians actually knew 1 year = 365 days, but that if they had used the number 365 to define $1^\circ$, it would have been an awfully ugly and inconvenient angle measurement. I believe they knew this and used 360 instead.

On the other hand, you know from elementary geometry that the circumference of a circle with radius $r$ is $2\pi r$, so the unit circle has circumference $2\pi$. Hence we must have $$2\pi=360^\circ$$ However, something looks awkward here: the right-hand side has a unit while the left-hand side doesn’t. So the left-hand side was given a unit called the radian (denoted rad), i.e. $$2\pi\ \mathrm{rad}=360^\circ$$ or $$\pi\ \mathrm{rad}=180^\circ$$ Note that 1 rad is merely a number (it is the same as the number 1) while $1^\circ$ is not. This is why in calculus we don’t use degree angle measurement but only radian angle measurement.
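Since $\pi\ \mathrm{rad}=180^\circ$, converting between the two units is a simple rescaling; a tiny sketch (the function names are mine, and the standard library provides the same conversions):

```python
import math

# pi rad = 180 degrees, so converting is a simple rescaling
def deg_to_rad(deg):
    return deg * math.pi / 180.0

def rad_to_deg(rad):
    return rad * 180.0 / math.pi

print(deg_to_rad(360.0))          # the full circle, 2*pi rad
print(rad_to_deg(math.pi / 2))    # a right angle, 90 degrees
print(math.radians(60.0))         # the stdlib performs the same conversion
```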

Given an angle $\theta$, there is a corresponding point $(x,y)$ on the unit circle as shown in Figure 4.

Figure 4. Sine and cosine by unit circle

In order to make this correspondence more transparent, one may want to represent $x$ and $y$ in terms of the angle $\theta$. This is how the cosine and sine of an angle were introduced: \begin{equation}\label{eq:trig}x=\cos\theta,\ y=\sin\theta\end{equation} The unit circle centered at the origin satisfies the equation \begin{equation}\label{eq:circ}x^2+y^2=1\end{equation} Using \eqref{eq:trig}, \eqref{eq:circ} can be written as \begin{equation}\label{eq:circ2}(\cos\theta)^2+(\sin\theta)^2=1\end{equation} But this looks a bit ugly, so better-looking notation was introduced: $$\cos^2\theta:=(\cos\theta)^2,\ \sin^2\theta:=(\sin\theta)^2$$ Thereby, \eqref{eq:circ2} is written as $$\cos^2\theta+\sin^2\theta=1$$ For an obvious reason, $\cos\theta$ and $\sin\theta$, along with $\tan\theta$, $\cot\theta$, $\sec\theta$, and $\csc\theta$, are called circular functions. I bet, though, that most of you reading this have never heard the name circular functions. The reason is that nowadays these quantities are more commonly (and shamefully) defined using right triangles. (See Figure 5.) Hence they go by the more familiar name, trigonometric functions.
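The correspondence $\theta\mapsto(\cos\theta,\sin\theta)$ and the identity above can be illustrated in a few lines (the helper name is mine):

```python
import math

def unit_circle_point(theta):
    """Point (x, y) = (cos theta, sin theta) on the unit circle, theta in rad."""
    return math.cos(theta), math.sin(theta)

x, y = unit_circle_point(math.pi / 3)
print(x, y)           # (1/2, sqrt(3)/2), up to rounding
print(x**2 + y**2)    # cos^2 + sin^2 = 1, up to rounding
```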

Figure 5. Sine and cosine by right triangle

An ancient Babylonian clay tablet dating to 3,700 years ago, around the time when King Hammurabi ruled the Babylonian Empire, was recently decoded by scientists. It turns out that the tablet contains the oldest known trigonometric table. Trigonometry is generally thought to have originated with the Greek astronomer Hipparchus around 120 B.C. This remarkable discovery tells us that the ancient Babylonians already knew trigonometry 1,000 years before the Greeks did.

The Dirac Equation

The Schrödinger equation $$i\hbar\frac{\partial\psi}{\partial t}=\hat H\psi$$ is a non-relativistic approximation of what is supposed to be a more realistic, relativistic equation. The first place one would look for a relativistic generalization is the relativistic energy $$E=\sqrt{c^2p^2+m^2c^4}$$ Replacing $E$ and $p$ by the operators $i\hbar\frac{\partial}{\partial t}$ and $-i\hbar\nabla$, respectively, we obtain the square-root Klein-Gordon equation $$i\hbar\frac{\partial\psi(t,x)}{\partial t}=\sqrt{-c^2\hbar^2\nabla^2+m^2c^4}\,\psi(t,x)$$ This equation is, however, not a desirable one. Due to the appearance of the radical on the right-hand side, it is impossible to include external electromagnetic fields in a relativistically invariant way.

P.A.M. Dirac considered a linearization of the relativistic energy by writing \begin{equation}\label{eq:linenergy}E=c\sum_{i=1}^3\alpha_ip_i+\beta mc^2=c\alpha\cdot p+\beta mc^2\end{equation} where $\alpha=(\alpha_1,\alpha_2,\alpha_3)$ and $\beta$ have to be determined by comparing it with the relativistic energy.

Squaring \eqref{eq:linenergy}, we have \begin{equation}\begin{aligned}E^2=&c^2[\alpha_1^2p_1^2+\alpha_2^2p_2^2+\alpha_3^2p_3^2+(\alpha_1\alpha_2+\alpha_2\alpha_1)p_1p_2+\\&(\alpha_2\alpha_3+\alpha_3\alpha_2)p_2p_3+(\alpha_3\alpha_1+\alpha_1\alpha_3)p_3p_1]+\\&mc^3[(\alpha_1\beta+\beta\alpha_1)p_1+(\alpha_2\beta+\beta\alpha_2)p_2+(\alpha_3\beta+\beta\alpha_3)p_3]+\\&\beta^2m^2c^4\end{aligned}\label{eq:linenergy2}\end{equation} \eqref{eq:linenergy2} must coincide with $c^2p^2+m^2c^4$. For that to happen, we must require that \begin{align*}\alpha_1^2p_1^2+\alpha_2^2p_2^2+\alpha_3^2p_3^2&=p^2\\\alpha_i\alpha_j+\alpha_j\alpha_i&=0\ \mbox{for $i\ne j$}\\\alpha_i\beta+\beta\alpha_i&=0\\\beta^2m^2c^4&=m^2c^4\end{align*}If the $\alpha_i$’s and $\beta$ were numbers, we would have $\alpha_1=\alpha_2=\alpha_3=\beta=0$, which is not desirable. Since the $\alpha_i$’s and $\beta$ must anticommute, they cannot be ordinary numbers; we may instead assume that they are $n\times n$ matrices. Now the $\alpha_i$’s and $\beta$, as $n\times n$ matrices, are required to satisfy \begin{equation}\begin{aligned}\alpha_i\alpha_j+\alpha_j\alpha_i&=2\delta_{ij}{\bf 1},\ i,j=1,2,3\\\alpha_i\beta+\beta\alpha_i&=0,\ i=1,2,3\\\beta^2&={\bf 1}\end{aligned}\label{eq:linenergy3}\end{equation}where ${\bf 1}$ denotes the $n\times n$ identity matrix. In order for the Hamiltonian to be Hermitian, the $\alpha_i$’s and $\beta$ are required to be Hermitian. From \eqref{eq:linenergy3}, $$\mathrm{tr}\alpha_i=\mathrm{tr}\beta^2\alpha_i=\mathrm{tr}\beta(\beta\alpha_i)=-\mathrm{tr}\beta\alpha_i\beta=-\mathrm{tr}\alpha_i$$ Thus, $\mathrm{tr}\alpha_i=0$. Since $\alpha_i^2={\bf 1}$, $\alpha_i$ has eigenvalues $1,-1$. Together, we see that $n$ has to be an even number. The smallest even $n$ is $n=2$, but this can’t be right, as there are only three linearly independent anticommuting Hermitian $2\times 2$ matrices.
For example, the Pauli matrices $$\sigma_1=\begin{pmatrix}0 & 1\\1 & 0\end{pmatrix},\ \sigma_2=\begin{pmatrix}0 & -i\\i & 0\end{pmatrix},\ \sigma_3=\begin{pmatrix}1 & 0\\0 & -1\end{pmatrix}$$ together with ${\bf 1}$ form a basis for the space of $2\times 2$ Hermitian matrices. For $n=4$, if we choose \begin{equation}\label{eq:diracmat}\beta=\begin{pmatrix}{\bf 1} & {\bf 0}\\{\bf 0} & -{\bf 1}\end{pmatrix},\ \alpha_i=\begin{pmatrix}{\bf 0} & \sigma_i\\\sigma_i & {\bf 0}\end{pmatrix},\ i=1,2,3\end{equation} then \eqref{eq:linenergy3} is satisfied.
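The relations \eqref{eq:linenergy3} for this choice of matrices can be verified mechanically; a minimal sketch using NumPy:

```python
import numpy as np

I2 = np.eye(2)
Z2 = np.zeros((2, 2))
# the Pauli matrices
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)

# the 4x4 Dirac matrices in the representation chosen above
beta = np.block([[I2, Z2], [Z2, -I2]]).astype(complex)
alphas = [np.block([[Z2, s], [s, Z2]]) for s in (s1, s2, s3)]

anti = lambda A, B: A @ B + B @ A   # anticommutator
I4 = np.eye(4)

# alpha_i alpha_j + alpha_j alpha_i = 2 delta_ij 1
for i, ai in enumerate(alphas):
    for j, aj in enumerate(alphas):
        expected = 2 * I4 if i == j else np.zeros((4, 4))
        assert np.allclose(anti(ai, aj), expected)
    # alpha_i beta + beta alpha_i = 0
    assert np.allclose(anti(ai, beta), np.zeros((4, 4)))
assert np.allclose(beta @ beta, I4)   # beta^2 = 1
print("all anticommutation relations hold")
```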

Now, replacing $E$ and $p$ by the operators $i\hbar\frac{\partial}{\partial t}$ and $-i\hbar\nabla$, respectively, we obtain the Dirac equation $$i\hbar\frac{\partial\psi(t,x)}{\partial t}=H_0\psi(t,x)$$ where \begin{align*}H_0&=-i\hbar c\alpha\cdot\nabla+\beta mc^2\\&=\begin{pmatrix}mc^2{\bf 1} & -i\hbar c\sigma\cdot\nabla\\-i\hbar c\sigma\cdot\nabla & -mc^2{\bf 1}\end{pmatrix}\end{align*} Here, $\alpha=(\alpha_1,\alpha_2,\alpha_3)$ and $\sigma=(\sigma_1,\sigma_2,\sigma_3)$ are triplets of matrices. The Dirac equation acts on $\mathbb{C}^4$-valued wave functions $$\psi(t,x)=\begin{pmatrix}\psi_1(t,x)\\\psi_2(t,x)\\\psi_3(t,x)\\\psi_4(t,x)\end{pmatrix},\ \psi_i\in\mathbb{C},\ i=1,2,3,4$$

If $m=0$, then only the three anticommuting $\alpha_i$ are needed, so it suffices to use $2\times 2$ matrices. For example, one may choose $\alpha_i=\sigma_i$, $i=1,2,3$. Then we obtain the equation $$i\hbar\frac{\partial\psi(t,x)}{\partial t}=-i\hbar c\,\sigma\cdot\nabla\psi(t,x)$$ This equation is called the Weyl equation. The Weyl equation is thought to describe neutrinos. We will discuss this more later.

If the space dimension is two, then we can also use Pauli matrices instead of Dirac matrices \eqref{eq:diracmat}. In this case, $H$ has the form $$H=-i\hbar c\left(\sigma_1\frac{\partial}{\partial x_1}+\sigma_2\frac{\partial}{\partial x_2}\right)+\sigma_3 mc^2$$


[1] Walter Greiner, Relativistic Quantum Mechanics, 3rd Edition, Springer-Verlag, 2000

[2] Bernd Thaller, The Dirac Equation, Springer-Verlag, 1992

Energy-Momentum Relation

In this note, we obtain the energy-momentum relation \begin{equation}\label{eq:e-m2}E^2=p^2c^2+m_0^2c^4\end{equation} a cornerstone of special relativity which led P.A.M. Dirac to his relativistic equation for the electron. Before we get to that, let us study some basic quantities of mechanics in special relativity. Suppose the world vector $\vec{r}=(ct,x,y,z)$ is timelike. Then $$dr^2=-c^2dt^2+dx^2+dy^2+dz^2<0$$ Define $d\tau:=\sqrt{-\frac{dr^2}{c^2}}$. Then \begin{equation}\label{eq:proptime}d\tau=\sqrt{1-\frac{v^2}{c^2}}dt\end{equation} $d\tau$ is the actual time measured by a clock in the system moving at speed $v$. At rest ($v=0$), $d\tau$ coincides with the coordinate time $dt$. $d\tau$ is called the proper time of the moving frame. The derivative of the world vector $\vec{r}=(ct,x,y,z)$ with respect to the proper time $\tau$ is called the four-velocity $$\vec{v}=\frac{d\vec{r}}{d\tau}=\left(c\frac{dt}{d\tau},\frac{dx}{d\tau},\frac{dy}{d\tau},\frac{dz}{d\tau}\right)$$ Using \eqref{eq:proptime}, the four-velocity can be written as \begin{equation}\label{eq:4-velocity}\vec{v}=\frac{1}{\sqrt{1-\frac{v^2}{c^2}}}(c,{\bf v})\end{equation} where ${\bf v}=\left(\frac{dx}{dt},\frac{dy}{dt},\frac{dz}{dt}\right)$, and so $$\vec{v}\cdot\vec{v}=\frac{1}{1-\frac{v^2}{c^2}}(-c^2+v^2)=-c^2$$ The four-momentum $\vec{p}$, a natural generalization of the Newtonian momentum, is defined by \begin{equation}\label{eq:4-momentum}\vec{p}:=m_0\vec{v}=\frac{m_0}{\sqrt{1-\frac{v^2}{c^2}}}(c,{\bf v})\end{equation} so that $$\vec{p}\cdot\vec{p}=-m_0^2c^2$$ On the other hand, $\vec{p}$ can also be written as \begin{equation}\label{eq:4-momentum2}\vec{p}=(mc,m{\bf v})=\left(\frac{E}{c},{\bf p}\right)\end{equation} where $m=\frac{m_0}{\sqrt{1-\frac{v^2}{c^2}}}$ and ${\bf p}=(p_x,p_y,p_z)$.

Remark. It is customary in the physics literature to write the world vector as $(ict,x,y,z)$ so that the spacetime metric takes the appearance of a Euclidean metric, as seen in [1]. Likewise, the four-momentum is written as $\vec{p}=\left(i\frac{E}{c},{\bf p}\right)$. I am a geometer, not a physicist, and I don’t personally like this convention, so I am not using it.

Now, from \eqref{eq:4-momentum} and \eqref{eq:4-momentum2}, we obtain the energy-momentum relation \eqref{eq:e-m2}. At rest (${\bf p}=0$), we retrieve Einstein’s famous mass-energy equivalence $E=m_0c^2$. The energy-momentum relation \eqref{eq:e-m2} leads to two very important equations in relativistic quantum mechanics, the Klein-Gordon equation for charged spin-0 particles and the Dirac equation for spin-1/2 fermions [2].
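The derivation above can be sanity-checked numerically for a concrete four-momentum; a small sketch in which the values of $c$, $m_0$, and $v$ are arbitrary test values:

```python
import math

c = 1.0    # speed of light (test value; any consistent units work)
m0 = 2.0   # rest mass (arbitrary test value)
v = 0.6    # speed, |v| < c

gamma = 1.0 / math.sqrt(1 - v**2 / c**2)
E = gamma * m0 * c**2   # energy: c times the time component of the four-momentum
p = gamma * m0 * v      # magnitude of the spatial momentum

# Minkowski norm of the four-momentum: p.p = -m0^2 c^2
print(-(E / c)**2 + p**2)                     # -m0^2 c^2, up to rounding
# the energy-momentum relation E^2 = p^2 c^2 + m0^2 c^4
print(E**2 - (p**2 * c**2 + m0**2 * c**4))    # 0, up to rounding
```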

Analogous to the Newtonian force, the four-force is defined by $$\vec{F}=\frac{d\vec{p}}{d\tau}=\frac{1}{\sqrt{1-\frac{v^2}{c^2}}}\frac{d\vec{p}}{dt}$$

Food for thought: For a tachyon, the energy-momentum relation is given by $$E^2=p^2c^2-m_0^2c^4$$ as its rest mass is purely imaginary. Can we obtain a physically sound equation for tachyons from this energy-momentum relation?


[1] Walter Greiner, Classical Mechanics, Point Particles and Relativity, Springer-Verlag, 2004

[2] Walter Greiner, Relativistic Quantum Mechanics, 3rd Edition, Springer-Verlag, 2000