The Poisson Process: Use of Generating Functions

In here, we obtained the Poisson distribution by successively solving a differential-difference equation. While getting successive solutions of the Poisson distribution was easy and simple, the same thing can’t be said in general. In this note, we discuss another method of obtaining probability distributions. Let $P(x,t)=\sum_{n=0}^\infty p_n(t)x^n$. $P(x,t)$ is called a probability generating function for the probability distribution $p_n(t)$. Recall the differential-difference equation (2) in here: $$\frac{dp_n(t)}{dt}=\lambda\{p_{n-1}(t)-p_n(t)\},\ n\geq 0$$ with $p_{-1}(t):=0$ and $p_0(0)=1$. Multiplying the equation by $x^n$ and then summing over $n$ from $n=1$ to $\infty$, we obtain the differential equation \begin{equation}\label{eq:genfn}\frac{\partial P(x,t)}{\partial t}=\lambda(x-1)P(x,t)\end{equation} with $P(x,0)=1$. \eqref{eq:genfn} only contains derivative with respect to $t$, so it is a separable equation. Its solution is given by $$P(x,t)=e^{\lambda t(x-1)}$$ From this we easily obtain the Poisson distribution $$p_n(t)=\frac{e^{-\lambda t}(\lambda t)^n}{n!}$$


  1. Norman T. J. Bailey, The Elements of Stochastic Processes with Applications to the Natural Sciences, John Wiley & Sons, Inc., 1964.

Quantum Gates: Single Qubit Gates

A classical computer is built from electrical circuits that consist of logic gates and wires. The logic gates perform intended logical operations of information and wires pass information between logic gates. Similarly, a quantum computer would be built from quantum circuits that consist of quantum gates and wires. However, when it comes down to comparing how operations are performed at elementary gates level, there is no similarity between classical logic gates and quantum gates. Unlike classical logic gates, quantum gates operate not according to logical rules but to physical (quantum mechanical) laws. A single qubit gate acts on not only on qubits $|0\rangle=\begin{bmatrix}1\\0\end{bmatrix}$ and $|1\rangle=\begin{bmatrix}0\\1\end{bmatrix}$ but also on a superposition of $|0\rangle$ and $|1\rangle$, $\alpha|0\rangle+\beta|1\rangle$, where $\alpha$ and $\beta$ are complex numbers. For an obvious reason, single qubit gates are $2\times 2$ matrices. Furthermore, a quantum gate operation is performed through a unitary time evolution, and this means that they are required to be unitary matrices. Recall that a matrix $U$ of complex entries is unitary if $UU^\dagger=I$ where $\dagger$ denotes transpose conjugate i.e. $U^\dagger = \bar U^t$. If we write $U=\begin{bmatrix}a & b\\c & d\end{bmatrix}$, then $U$ is unitary if and only if $|a|^2+|b|^2=|c|^2+|d|^2=1$ and $a\bar c+b\bar d=0$. Any unitary matrix is qualified to be a quantum gate but not all unitary matrices would be interesting or useful gates. For practical purposes, it is important to see which gates are universal if there are any and that which gates are physically implementable. The following are some examples of well-known single cubit gates.

  1. Pauli-X, Pauli-Y, Pauli-Z gates. A matrix $H$ is Hermitian if $H=H^\dagger$. Hermitian matrices play a crucial role in quantum mechanics as Hamiltonians are required to be Hermitian. The eigenvalues of a Hamiltonian are the energies of the corresponding quantum system and of course they all have to be real. Hermitian matrices are guaranteed to have all real eigenvalues. A $2\times 2$ Hermitian matrix can be written in the form $H=\begin{bmatrix}x_0+x_3 & x_1-ix_2\\x_1+ix_2 & x_0-x_3\end{bmatrix}$ where the $x_i$ are all reals. The matrices \begin{equation}\label{eq:pauli}\sigma_0=\begin{bmatrix}1 & 0\\0 & 1\end{bmatrix},\ \sigma_1=\begin{bmatrix}0 & 1\\1 & 0\end{bmatrix},\ \sigma_2=\begin{bmatrix}0 & -i\\i & 0\end{bmatrix},\ \sigma_3=\begin{bmatrix}1 & 0\\0 & -1\end{bmatrix}\end{equation} form a basis for the space of all Hermitian matrices. The $\sigma_i$ in \eqref{eq:pauli} are called the Pauli spin matrices in physics. $\sigma_0$ is just the identity matrix but notice that $\sigma_1,\sigma_2,\sigma_3$ are also unitary matrices. They are called Pauli-X, Pauli-Y, Pauli-Z gates, respectively. Pauli-X gate is also called NOT gate as it acts like the classical NOT gate. \begin{align*}\begin{bmatrix}0 & 1\\1 & 0\end{bmatrix}\begin{bmatrix}1\\0\end{bmatrix}&=\begin{bmatrix}0\\1\end{bmatrix}\\\begin{bmatrix}0 & 1\\1 & 0\end{bmatrix}\begin{bmatrix}0\\1\end{bmatrix}&=\begin{bmatrix}0\\1\end{bmatrix}\end{align*}
  2. Square root of NOT gate. The gate $$\sqrt{\mathrm{NOT}}=\frac{1}{2}\begin{bmatrix}1+i & 1-i\\1-i & 1+i\end{bmatrix}$$ is called square root of NOT gate because $$\sqrt{\mathrm{NOT}}\cdot\sqrt{\mathrm{NOT}}=\begin{bmatrix}0 & 1\\1 & 0\end{bmatrix}$$ $\sqrt{\mathrm{NOT}}$ is one of the gates that are truly quantum because there is no classical counterpart.
  3. Hadamard gate. $$H=\frac{1}{\sqrt{2}}\begin{bmatrix}1 & 1\\1 & -1\end{bmatrix}$$ is called the Hadamard gate. It turns $|0\rangle$ to $\frac{|0\rangle+|1\rangle}{\sqrt{2}}$ and $|1\rangle$ to $\frac{|0\rangle-|1\rangle}{\sqrt{2}}$.


  1. Michael A. Nielsen and Isaac L. Chung, Quantum Computation and Quantum Information, Cambridge University Press, 2000.
  2. Colin P. Williams and Scott H. Clearwater, Explorations in Quantum Computing, Spinger Telos, 1998.

Markov Processes: The Poisson Process

A Markov process is a stochastic process such that given the present state of the system, the future behavior is independent of the past history.

Suppose that the size of the population under consideration at time $t$ is represented by a discrete random variable $X(t)$ with probability $$P\{X(t)=n\}=p_n(t),\ n=0,1,2,\cdots$$ At certain instants of time, there will be discrete changes in the population size, due to the loss or death of an individual, or to the appearance or birth of new individuals.

We now consider an example of a stochastic process, which will give rise to a simple model of a Markov process called the Poisson process. Suppose that we are measuring the radiation of a certain radioactive substance. Let $X(t)$ be the total number of particles recorded up to time $t$. Also suppose that the chance of a new particle being recorded is any short time interval is independent of not only the previous states but also the present state. Then the chance of a new addition to the total count during a very short time interval $\Delta t$ can be written as $\lambda\Delta t+o(\Delta t)$ where $\lambda$ is a constant and $o(\Delta t)$ is the chance of two or more simultaneous emissions. The chance of no change in the total count would be $1-\Delta t-o(\Delta t)$. Thus, we obtain \begin{equation}\label{eq:poisson}p_n(t+\Delta t)=p_{n-1}(t)\lambda\Delta t+p_n(t)(1-\lambda\Delta t)\end{equation} where terms that are small compared with $\Delta t$ have been disregarded. \eqref{eq:poisson} results in the following differential-difference equation \begin{equation}\begin{aligned}\frac{dp_n(t)}{dt}&=\lim_{\Delta t\to 0}\frac{p_n(t+\Delta t)-p_n(t)}{\Delta t}\\&=\lambda\{p_{n-1}(t)-p_n(t)\},\ n>0\end{aligned}\label{eq:poisson2}\end{equation}

Note that we have $n=0$ at time $t+\Delta t$ only if $n=0$ at time $t$ and no new particles are emitted in $\Delta t$. This means that we have $$p_0(t+\Delta t)=p_0(t)(1-\lambda\Delta t)$$ and subsequently the differential equation $$\frac{dp_0}{dt}=-\lambda p_0(t)$$ whose solution is $$p_0(t)=e^{-\lambda t}$$ with the initial condition $p_0(0)=1$. This initial condition can be obtained by setting the Geiger counter to $0$ in the beginning of the process. Starting off with $p_0(t)$, iterative applications of \eqref{eq:poisson2} will result in $$p_n(t)=\frac{(\lambda t)^ne^{-\lambda t}}{n!},\ n=0,1,2,\cdots$$ which is a Poisson distribution with parameter $\lambda t$.


  1. Norman T. J. Bailey, The Elements of Stochastic Processes with Applications to the Natural Sciences, John Wiley & Sons, Inc., 1964.

Simple Epidemics: Deterministic Model

Suppose that we have a homogeneously mixing group of individuals of total size $n+1$ and that the epidemics is started off at $t=0$ by one individual becoming infectious. The remaining $n$ individuals are susceptible. Denote by $x$ and $y$ the numbers of susceptibles and infectives, respectively. Then $x+y=n+1$. Also suppose that the rate of occurrence of new infections is proportional to the number of infectives as well the number of susceptibles. Then we obtain the following differential equation \begin{equation}\label{eq:infectrate}\frac{dx}{dt}=-\beta xy=-\beta x(n-x+1)\end{equation} where $\beta$ is the constant of proportionality called the infection-rate. Let $\tau=\beta t$. \eqref{eq:infectrate} becomes \begin{equation}\label{eq:infectrate2}\frac{dx}{d\tau}=-x(n-x+1)\end{equation} With initial condition $x(0)=n$, the solution of \eqref{eq:infectrate2} is given by $$x(\tau)=\frac{n(n+1)}{n+e^{(n+1)\tau}}$$ The number of infectives at $\tau$ is then given by $$y(\tau)=\frac{n+1}{1+ne^{-(n+1)\tau}}$$ The rate at which new infectives accrue $$w(\tau)=\frac{dy}{d\tau}$$ is called the epidemic curve. It follows from \eqref{eq:infectrate2} that $$w(\tau)=-\frac{dx}{d\tau}=-xy=\frac{n(n+1)^2e^{(n+1)\tau}}{[n+e^{(n+1)\tau}]^2}$$ Using the standard argument from calculus, we find that $w(\tau)$ assumes its maximum when $x=y$ i.e. when the number of susceptibles and the number of infectives are the same and that happens when $\tau=\frac{\ln n}{n+1}$, and the maximum rate at which new infections accrue is $w=\frac{1}{4}(n+1)^2$.

Figure 1. Deterministic Epidemic Curve for n=20 (red) and for n=30 (blue).

As a reminder the model we discussed above assumed that the epidemic is started off by one individual (patient zero). Now, we want to generalized the model by assuming that the epidemic is started off by several infectious individuals. Let $x(0)=n$ and $y(0)=a$. Then the equation \eqref{eq:infectrate2} is modified to \begin{equation}\label{eq:infectrate3}\frac{dx}{dt}=-x(n-x+a)\end{equation} The solution of \eqref{eq:infectrate3} is $$x(\tau)=\frac{n(n+a)}{n+ae^{(n+a)\tau}}$$ The corresponding epidemic curve is given by $$w=\frac{an(n+a)^2e^{(n+a)\tau}}{[n+ae^{(n+a)\tau}]^2}$$ Let $W=\frac{w}{n}=\frac{a(n+a)^2e^{(n+a)\tau}}{[n+ae^{(n+a)\tau}]^2}$, $0\leq\tau<\infty$. Then \begin{align*}\int_0^\infty Wd\tau&=\int_{n+a}^\infty\frac{(n+a)}{u^2}du\ (u=n+ae^{(n+a)\tau})\\&=1\end{align*} So, $W$ can be interpreted as the probability density of a new infection occurring. The mean value of the variable $\tau$ is \begin{align*}\bar\tau&=\int_0^\infty\tau Wd\tau\\&=-\frac{1}{n}\int_0^\infty\tau\frac{dx}{d\tau}d\tau\\&=\frac{1}{n}\ln\left(1+\frac{n}{a}\right)\end{align*}


  1. Norman T. J. Bailey, A Simple Stochastic Epidemic, Biometrika, Vol. 37, No. 3/4 (Dec., 1950), 193-202.
  2. The Mathematical Theory of Infectious Diseases and Its Applications, Norman T. J. Bailey, Second Edition, Charles Griffin & Company LTD, 1975.

The Tensor Product

Let $V$ and $W$ be two vector spaces of dimensions $m$ and $n$, respectively. The tensor product of $V$ and $W$ is a space $V\otimes W$ of dimension $mn$ together with a bilinear map $$\varphi: V\times W\longrightarrow V\otimes W;\ \varphi(v,w)=v\otimes w$$ which satisfy the following universal property: For any vector space $X$ and any bilinear map $\psi: V\times W\longrightarrow X$, there exists uniquely a linear map $\gamma : V\otimes W\longrightarrow X$ such that $\psi(v,w)=\gamma(u\otimes w)$ for all $v\in V$ and $w\in W$.

$$\begin{array}[c]{ccc}V\otimes W & & \\\uparrow\scriptstyle{\varphi} & \scriptstyle{\gamma}\searrow& \\V\times W & \stackrel{\psi}\rightarrow & X\end{array}\ \ \ \gamma\circ\varphi=\psi$$

Often, we use more down to earth definition of the tensor product. Let $\{e_1,\cdots,e_m\}$ and $\{f_1,\cdots,f_n\}$ be bases of $V$ and $W$, respectively. The tensor product $V\otimes W$ is a vector space of dimension $mn$ spanned by the basis $\{e_i\otimes f_j: i=1,\cdots,m,\ j=1,\cdots,n\}$. Let $v\in V$ and $w\in W$. Then $$v=\sum_i v_ie_i\ \mbox{and}\ w=\sum_j w_jf_j$$ The tensor product $v$ and $w$ is then given by $$v\otimes w=\sum_{i,j}v_iw_je_i\otimes f_j$$ It can be easily shown that this definition of the tensor product in terms of prescribed bases satisfies the universality property. Although this definition uses a choice of bases of $V$ and $W$, the tensor product $V\otimes W$ must not depend on a particular choice of bases, i.e. regardless of the choice of bases the resulting tensor product must be the same. This also can be shown easily using some basic properties from linear algebra. I will leave them for exercise for readers.

The tensor product can be used to describe the state of a quantum memory register. A quantum memory register consists of many 2-state systems (Hilbert spaces of qubits). Let $|\psi^{(1)}\rangle$ and $|\psi^{(2)}\rangle$ be qubits associated with two different 2-state systems. In terms of the standard orthogonal basis $\begin{pmatrix}1\\0\end{pmatrix}$ and $\begin{pmatrix}0\\1\end{pmatrix}$ for each 2-state system, we have \begin{align*}|\psi^{(1)}\rangle&=\begin{pmatrix}\omega_0^{(1)}\\\omega_1^{(1)}\end{pmatrix}=\omega_0^{(1)}\begin{pmatrix}1\\0\end{pmatrix}+\omega_1^{(1)}\begin{pmatrix}0\\1\end{pmatrix}\\|\psi^{(2)}\rangle&=\begin{pmatrix}\omega_0^{(2)}\\\omega_1^{(2)}\end{pmatrix}=\omega_0^{(2)}\begin{pmatrix}1\\0\end{pmatrix}+\omega_1^{(2)}\begin{pmatrix}0\\1\end{pmatrix}\end{align*} Define $\otimes$ on the basis members as follows: \begin{align*}|00\rangle&=\begin{pmatrix}1\\0\end{pmatrix}\otimes\begin{pmatrix}1\\0\end{pmatrix}=\begin{pmatrix}1\\0\\0\\0\end{pmatrix},\ |01\rangle=\begin{pmatrix}1\\0\end{pmatrix}\otimes\begin{pmatrix}0\\1\end{pmatrix}=\begin{pmatrix}0\\1\\0\\0\end{pmatrix}\\|10\rangle&=\begin{pmatrix}0\\1\end{pmatrix}\otimes\begin{pmatrix}1\\0\end{pmatrix}=\begin{pmatrix}0\\0\\1\\0\end{pmatrix},\ |11\rangle=\begin{pmatrix}0\\1\end{pmatrix}\otimes\begin{pmatrix}0\\1\end{pmatrix}=\begin{pmatrix}0\\0\\0\\1\end{pmatrix}\end{align*} These four vectors form a basis for a 4-dimensional Hilbert space (a 2-qubit memory register). It follows that \begin{align*}|\psi^{(1)}\rangle\otimes|\psi^{(2)}\rangle&=\omega_0^{(1)}\omega_0^{(2)}|00\rangle+\omega_0^{(1)}\omega_1^{(2)}|01\rangle+\omega_1^{(1)}\omega_0^{(2)}|10\rangle+\omega_1^{(1)}\omega_1^{(2)}|11\rangle\\&=\begin{pmatrix}\omega_0^{(1)}\omega_0^{(2)}\\\omega_0^{(1)}\omega_1^{(2)}\\\omega_1^{(1)}\omega_0^{(2)}\\\omega_1^{(1)}\omega_1^{(2)}\end{pmatrix}\end{align*}Similarly, to describe the state of a 3-qubit memory register, one performs the tensor product $|\psi^{(1)}\rangle\otimes|\psi^{(2)}\rangle\otimes|\psi^{(3)}\rangle$.

Quantum memory registers can store an exponential amount of classical information in only a polynomial number of qubits using the quantum property the Principle of Superposition. For example, consider the two classical memory registers storing complimentary sequences of bits $$\begin{array}{|c|c|c|c|c|c|c|}\hline1 & 0 & 1 & 1 & 0 & 0 &1\\\hline 0 & 1 & 0 & 0 & 1 & 1 & 0\\\hline\end{array}$$ A single quantum memory register can store both sequences simultaneously in an equally weighted superposition of the two states representing each 7-bit input $$\frac{1}{\sqrt{2}}(|1011001\rangle+|0100110\rangle)$$

A matrix can be considered as a vector. For example, a $2\times 2$ matrix $\begin{pmatrix}a & b\\c & d\end{pmatrix}$ can be identified with the vector $(a, b, c, d) \in \mathbb{R}^4$. Hence one can define the tensor product of two matrices in a similar manner to that of two vectors. For example, $$\begin{pmatrix}a_{11} & a_{12}\\a_{21} & a_{22}\end{pmatrix}\otimes\begin{pmatrix}b_{11} & b_{12}\\b_{21} & b_{22}\end{pmatrix}:=\begin{pmatrix}a_{11}b_{11} & a_{11}b_{12} & a_{11}b_{21} & a_{11}b_{22}\\a_{12}b_{11} & a_{12}b_{12} & a_{12}b_{21} & a_{12}b_{22}\\a_{21}b_{11} & a_{21}b_{12} & a_{21}b_{21} & a_{21}b_{22}\\a_{22}b_{11} & a_{22}b_{12} & a_{22}b_{21} & a_{22}b_{22}\end{pmatrix}$$


[1] A. Yu. Kitaev, A. H. Shen and M. N. Vyalyi, Classical and Quantum Computation, Graduate Studies in Mathematics Volume 47, American Mathematical Society, 2002

[2] Colin P. Williams and Scott H. Clearwater, Explorations in Quantum Computing, Springer TELOS, 1998