Divisibility

Definition. We say $a$ divides $b$ and write $a|b$ if there exists $d\in\mathbb{Z}$ such that $b=ad$. We say that $a$ is a divisor of $b$ and that $b$ is a multiple of $a$. If $a|b$ is false, we write $a\nmid b$.

Definition. We say $d$ is the greatest common divisor (gcd for short) of $a$ and $b$ if $d$ is the largest of all the integers dividing both $a$ and $b$. We write $d=(a,b)$.

Example. Let $a=4$ and $b=6$. The divisors of $4$ are $1$, $-1$, $2$, $-2$, $4$, $-4$. The divisors of $6$ are $1$, $-1$, $2$, $-2$, $3$, $-3$, $6$, $-6$. So the common divisors of $4$ and $6$ are $1$, $-1$, $2$, $-2$ and $2=(4,6)$.
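The brute-force computation in this example translates directly into code. A minimal Python sketch (the helper names `divisors` and `gcd` are illustrative, not from any library):

```python
def divisors(n):
    """Return all positive and negative divisors of a nonzero integer n."""
    ds = [d for d in range(1, abs(n) + 1) if n % d == 0]
    return ds + [-d for d in ds]

def gcd(a, b):
    """Greatest common divisor, found naively as the largest common divisor."""
    return max(set(divisors(a)) & set(divisors(b)))

print(gcd(4, 6))  # -> 2
```

For large inputs one would of course use the Euclidean algorithm (as in Python's `math.gcd`) instead of listing divisors.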

Definition. We say $m$ is the least common multiple (lcm for short) of $a$ and $b$ if $m$ is the smallest of all the positive integers that are multiples of both $a$ and $b$. We write $m=[a,b]$.

Example. Let $a=4$ and $b=6$. The positive multiples of $4$ are $4$, $8$, $12$, $16$, $20$, $24$, $28$, $\cdots$ and the positive multiples of $6$ are $6$, $12$, $18$, $24$, $30$, $\cdots$. Common positive multiples of $4$ and $6$ are $12$, $24$, $\cdots$ and $[4,6]=12$.
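Rather than listing multiples, the lcm can be computed from the gcd using the identity $(a,b)\,[a,b]=|ab|$. A short Python sketch:

```python
import math

def lcm(a, b):
    """Least common multiple via the identity gcd(a,b) * lcm(a,b) = |ab|."""
    return abs(a * b) // math.gcd(a, b)

print(lcm(4, 6))  # -> 12
```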

Theorem. Given integers $a$, $b$, and $c$,

  1. if $a|b$ then $a|bc$.
  2. if $a|b$ and $b|c$ then $a|c$.
  3. if $a|b$ and $a|c$ then $a|bx+cy$ for any $x,y\in\mathbb{Z}$.

Proof.

  1. If $a|b$ then there exists $d\in\mathbb{Z}$ such that $b=ad$. Now $bc=(ad)c=a(dc)$ and $dc\in\mathbb{Z}$ and hence $a|bc$.
  2. Let $a|b$ and $b|c$. Then there exist $d_1,d_2\in\mathbb{Z}$ such that $b=ad_1$ and $c=bd_2$. Now we have
    $$c=bd_2=(ad_1)d_2=a(d_1d_2)$$
    and $d_1d_2\in\mathbb{Z}$. Hence, $a|c$.
  3. Let $a|b$ and $a|c$. Then there exist $d_1,d_2\in\mathbb{Z}$ such that $b=ad_1$ and $c=ad_2$. For any $x,y\in\mathbb{Z}$ \begin{align*} bx+cy&=(ad_1)x+(ad_2)y\\&=a(d_1x+d_2y) \end{align*}
    and $d_1x+d_2y\in\mathbb{Z}$. Hence, $a|bx+cy$.
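The three properties are easy to spot-check numerically; a small Python sketch (the helper `divides` is illustrative):

```python
def divides(a, b):
    """True if a | b, i.e. b = a*d for some integer d."""
    return a != 0 and b % a == 0

# Property 1: a|b implies a|bc
assert divides(3, 6) and divides(3, 6 * 7)
# Property 2: a|b and b|c imply a|c
assert divides(2, 4) and divides(4, 12) and divides(2, 12)
# Property 3: a|b and a|c imply a | bx + cy for any integers x, y
a, b, c, x, y = 5, 10, 15, 3, -4
assert divides(a, b) and divides(a, c) and divides(a, b * x + c * y)
```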

Simple Epidemics: Stochastic Model 1

Suppose that initially there are $n$ susceptibles and one infective. Let $X(t)$ denote the random variable representing the number of susceptibles still uninfected at time $t$, and let $p_r(t)$ be the probability that $X(t)$ takes the value $r$. At time $t$, there are $X(t)$ susceptibles and $n-X(t)+1$ infectives. Assume that the chance of a new infection in a short time interval is proportional to the product of the number of susceptibles, the number of infectives, and the length of the interval. Then the chance of an infection in $\Delta t$ is given by $\beta X(t)(n-X(t)+1)\Delta t$, where $\beta$ is the contact rate. For simplicity, we change the time scale to $\tau=\beta t$; the chance then becomes $X(n-X+1)\Delta\tau$. Denote by $p_r(\tau)$ the probability that there are still $r$ susceptibles remaining uninfected at time $\tau$.

There are two ways to have $r$ susceptibles at time $\tau+\Delta\tau$: either there are $r+1$ susceptibles at $\tau$ followed by a new infection, with probability $(r+1)(n-r)\Delta\tau$, or there are $r$ susceptibles at $\tau$ followed by no infection, with probability $1-r(n-r+1)\Delta\tau$. Thus $p_r(\tau+\Delta\tau)$, the probability of $r$ susceptibles remaining at time $\tau+\Delta\tau$, is given by $$p_r(\tau+\Delta\tau)=(r+1)(n-r)\Delta\tau\, p_{r+1}(\tau)+\{1-r(n-r+1)\Delta\tau\}p_r(\tau)$$ from which we obtain the differential-difference equation \begin{align*}\frac{dp_r(\tau)}{d\tau}&=\lim_{\Delta\tau\to 0}\frac{p_r(\tau+\Delta\tau)-p_r(\tau)}{\Delta\tau}\\&=(r+1)(n-r)p_{r+1}(\tau)-r(n-r+1)p_r(\tau)\end{align*} where $0\leq r\leq n-1$. For there to be $r=n$ susceptibles at $\tau+\Delta\tau$, there must be $n$ susceptibles at $\tau$ followed by no new infection, which has probability $1-n\Delta\tau$. Thus we have $$p_n(\tau+\Delta\tau)=(1-n\Delta\tau)p_n(\tau)$$ and $$\frac{dp_n}{d\tau}=-np_n(\tau)$$ At $\tau=0$, there is only one infective, so we have the initial condition $p_n(0)=1$.
We now have the system of differential-difference equations \begin{equation}\label{eq:stocepidemic}\frac{dp_r(\tau)}{d\tau}=(r+1)(n-r)p_{r+1}(\tau)-r(n-r+1)p_r(\tau),\ 0\leq r\leq n\end{equation} with the convention $p_{n+1}(\tau):=0$ and initial condition $p_n(0)=1$.

One may attempt to solve the system \eqref{eq:stocepidemic} of differential-difference equations successively, similarly to the case of the Poisson distribution (see here). The first three solutions are given by \begin{align*}p_n(\tau)&=e^{-n\tau}\\p_{n-1}(\tau)&=\frac{n}{n-2}\left[e^{-n\tau}-e^{-2(n-1)\tau}\right]\\p_{n-2}(\tau)&=\frac{n(n-1)}{(n-2)(n-3)}e^{-n\tau}-\frac{2n(n-1)}{(n-2)(n-4)}e^{-2(n-1)\tau}+\frac{n(n-1)}{(n-3)(n-4)}e^{-3(n-2)\tau}\end{align*} (for small $n$ the vanishing denominators require separate treatment). It is already difficult to see any recognizable pattern that may lead us to a general form of the solution. There are alternative ways to solve \eqref{eq:stocepidemic}, one of which is by using a generating function (see here). Let $P(x,\tau)=\sum_{r=0}^np_r(\tau)x^r$. Multiplying \eqref{eq:stocepidemic} by $x^r$ and summing the result over $r$ from $r=0$ to $r=n$, we obtain the partial differential equation $$\frac{\partial P}{\partial\tau}=(1-x)\left\{n\frac{\partial P}{\partial x}-x\frac{\partial^2 P}{\partial x^2}\right\}$$ with initial condition $P(x,0)=x^n$.
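As a sanity check, the system \eqref{eq:stocepidemic} can be integrated numerically by forward Euler and compared against the known solution $p_n(\tau)=e^{-n\tau}$; total probability should also remain $1$, since the gain and loss terms cancel when summed over $r$. A Python sketch (the function name and step count are arbitrary choices):

```python
import math

def epidemic_probs(n, tau, steps=20000):
    """Integrate dp_r/dτ = (r+1)(n-r) p_{r+1} - r(n-r+1) p_r
    with p_n(0) = 1 and p_{n+1} := 0, by forward Euler."""
    p = [0.0] * (n + 1)
    p[n] = 1.0
    h = tau / steps
    for _ in range(steps):
        new = p[:]
        for r in range(n + 1):
            inflow = (r + 1) * (n - r) * p[r + 1] if r < n else 0.0
            new[r] += h * (inflow - r * (n - r + 1) * p[r])
        p = new
    return p

n, tau = 5, 0.3
p = epidemic_probs(n, tau)
print(abs(p[n] - math.exp(-n * tau)))  # small Euler truncation error
print(abs(sum(p) - 1.0))               # probability stays normalized
```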

References:

  1. Norman T. J. Bailey, A Simple Stochastic Epidemic, Biometrika, Vol. 37, No. 3/4 (Dec., 1950), 193-202.
  2. Norman T. J. Bailey, The Mathematical Theory of Infectious Diseases and Its Applications, Second Edition, Charles Griffin & Company Ltd, 1975.

The Poisson Process: Use of Generating Functions

In here, we obtained the Poisson distribution by successively solving a differential-difference equation. While getting successive solutions of the Poisson distribution was easy and simple, the same cannot be said in general. In this note, we discuss another method of obtaining probability distributions. Let $P(x,t)=\sum_{n=0}^\infty p_n(t)x^n$. $P(x,t)$ is called a probability generating function for the probability distribution $p_n(t)$. Recall the differential-difference equation (2) in here: $$\frac{dp_n(t)}{dt}=\lambda\{p_{n-1}(t)-p_n(t)\},\ n\geq 0$$ with $p_{-1}(t):=0$ and $p_0(0)=1$. Multiplying the equation by $x^n$ and then summing over $n$ from $n=0$ to $\infty$, we obtain the differential equation \begin{equation}\label{eq:genfn}\frac{\partial P(x,t)}{\partial t}=\lambda(x-1)P(x,t)\end{equation} with $P(x,0)=1$. \eqref{eq:genfn} contains a derivative only with respect to $t$, so it can be solved as a separable ordinary differential equation in $t$. Its solution is given by $$P(x,t)=e^{\lambda t(x-1)}$$ Expanding $P(x,t)=e^{-\lambda t}e^{\lambda tx}=e^{-\lambda t}\sum_{n=0}^\infty\frac{(\lambda t)^n}{n!}x^n$ and reading off the coefficient of $x^n$, we obtain the Poisson distribution $$p_n(t)=\frac{e^{-\lambda t}(\lambda t)^n}{n!}$$
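The identity $\sum_{n=0}^\infty p_n(t)x^n=e^{\lambda t(x-1)}$ can be checked numerically by summing the Poisson series directly; a small Python sketch (the parameter values are arbitrary):

```python
import math

lam, t, x = 2.0, 1.5, 0.7
# Partial sum of the generating function: sum of p_n(t) x^n
# with p_n(t) = e^{-λt} (λt)^n / n!
series = sum(math.exp(-lam * t) * (lam * t) ** n / math.factorial(n) * x ** n
             for n in range(50))
closed = math.exp(lam * t * (x - 1))
print(abs(series - closed))  # essentially zero
```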

References:

  1. Norman T. J. Bailey, The Elements of Stochastic Processes with Applications to the Natural Sciences, John Wiley & Sons, Inc., 1964.

Quantum Gates: Single Qubit Gates

A classical computer is built from electrical circuits that consist of logic gates and wires. The logic gates perform intended logical operations on information and the wires pass information between logic gates. Similarly, a quantum computer would be built from quantum circuits that consist of quantum gates and wires. However, when it comes to comparing how operations are performed at the elementary gate level, there is no similarity between classical logic gates and quantum gates. Unlike classical logic gates, quantum gates operate not according to logical rules but according to physical (quantum mechanical) laws. A single qubit gate acts not only on the qubits $|0\rangle=\begin{bmatrix}1\\0\end{bmatrix}$ and $|1\rangle=\begin{bmatrix}0\\1\end{bmatrix}$ but also on any superposition $\alpha|0\rangle+\beta|1\rangle$ of $|0\rangle$ and $|1\rangle$, where $\alpha$ and $\beta$ are complex numbers. For this reason, single qubit gates are $2\times 2$ matrices. Furthermore, a quantum gate operation is performed through a unitary time evolution, which means quantum gates are required to be unitary matrices. Recall that a matrix $U$ with complex entries is unitary if $UU^\dagger=I$, where $\dagger$ denotes the conjugate transpose, i.e. $U^\dagger=\bar U^t$. If we write $U=\begin{bmatrix}a & b\\c & d\end{bmatrix}$, then $U$ is unitary if and only if $|a|^2+|b|^2=|c|^2+|d|^2=1$ and $a\bar c+b\bar d=0$. Any unitary matrix is qualified to be a quantum gate, but not all unitary matrices would be interesting or useful gates. For practical purposes, it is important to see which gates, if any, are universal and which gates are physically implementable. The following are some examples of well-known single qubit gates.

  1. Pauli-X, Pauli-Y, Pauli-Z gates. A matrix $H$ is Hermitian if $H=H^\dagger$. Hermitian matrices play a crucial role in quantum mechanics as Hamiltonians are required to be Hermitian. The eigenvalues of a Hamiltonian are the energies of the corresponding quantum system, and of course they all have to be real. Hermitian matrices are guaranteed to have all real eigenvalues. A $2\times 2$ Hermitian matrix can be written in the form $H=\begin{bmatrix}x_0+x_3 & x_1-ix_2\\x_1+ix_2 & x_0-x_3\end{bmatrix}$ where the $x_i$ are all real. The matrices \begin{equation}\label{eq:pauli}\sigma_0=\begin{bmatrix}1 & 0\\0 & 1\end{bmatrix},\ \sigma_1=\begin{bmatrix}0 & 1\\1 & 0\end{bmatrix},\ \sigma_2=\begin{bmatrix}0 & -i\\i & 0\end{bmatrix},\ \sigma_3=\begin{bmatrix}1 & 0\\0 & -1\end{bmatrix}\end{equation} form a basis for the space of all $2\times 2$ Hermitian matrices. The $\sigma_i$ in \eqref{eq:pauli} are called the Pauli spin matrices in physics. $\sigma_0$ is just the identity matrix, but notice that $\sigma_1,\sigma_2,\sigma_3$ are also unitary. They are called the Pauli-X, Pauli-Y, Pauli-Z gates, respectively. The Pauli-X gate is also called the NOT gate as it acts like the classical NOT gate: \begin{align*}\begin{bmatrix}0 & 1\\1 & 0\end{bmatrix}\begin{bmatrix}1\\0\end{bmatrix}&=\begin{bmatrix}0\\1\end{bmatrix}\\\begin{bmatrix}0 & 1\\1 & 0\end{bmatrix}\begin{bmatrix}0\\1\end{bmatrix}&=\begin{bmatrix}1\\0\end{bmatrix}\end{align*}
  2. Square root of NOT gate. The gate $$\sqrt{\mathrm{NOT}}=\frac{1}{2}\begin{bmatrix}1+i & 1-i\\1-i & 1+i\end{bmatrix}$$ is called the square root of NOT gate because $$\sqrt{\mathrm{NOT}}\cdot\sqrt{\mathrm{NOT}}=\begin{bmatrix}0 & 1\\1 & 0\end{bmatrix}$$ $\sqrt{\mathrm{NOT}}$ is one of the gates that are truly quantum, as it has no classical counterpart.
  3. Hadamard gate. $$H=\frac{1}{\sqrt{2}}\begin{bmatrix}1 & 1\\1 & -1\end{bmatrix}$$ is called the Hadamard gate. It turns $|0\rangle$ to $\frac{|0\rangle+|1\rangle}{\sqrt{2}}$ and $|1\rangle$ to $\frac{|0\rangle-|1\rangle}{\sqrt{2}}$.
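The defining properties of the gates above (unitarity, the NOT action of Pauli-X, $(\sqrt{\mathrm{NOT}})^2=X$, and the Hadamard action on $|0\rangle$) can be verified numerically. A sketch using NumPy (assumed available; the array names are illustrative):

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)            # Pauli-X (NOT)
Y = np.array([[0, -1j], [1j, 0]])                        # Pauli-Y
Z = np.array([[1, 0], [0, -1]], dtype=complex)           # Pauli-Z
SQRT_NOT = 0.5 * np.array([[1 + 1j, 1 - 1j], [1 - 1j, 1 + 1j]])
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)

I = np.eye(2)
for U in (X, Y, Z, SQRT_NOT, H):
    assert np.allclose(U @ U.conj().T, I)   # every gate is unitary
assert np.allclose(X @ ket0, ket1) and np.allclose(X @ ket1, ket0)
assert np.allclose(SQRT_NOT @ SQRT_NOT, X)  # (√NOT)² = NOT
assert np.allclose(H @ ket0, (ket0 + ket1) / np.sqrt(2))
assert np.allclose(H @ ket1, (ket0 - ket1) / np.sqrt(2))
```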

References:

  1. Michael A. Nielsen and Isaac L. Chuang, Quantum Computation and Quantum Information, Cambridge University Press, 2000.
  2. Colin P. Williams and Scott H. Clearwater, Explorations in Quantum Computing, Springer TELOS, 1998.

Markov Processes: The Poisson Process

A Markov process is a stochastic process such that given the present state of the system, the future behavior is independent of the past history.

Suppose that the size of the population under consideration at time $t$ is represented by a discrete random variable $X(t)$ with probability $$P\{X(t)=n\}=p_n(t),\ n=0,1,2,\cdots$$ At certain instants of time, there will be discrete changes in the population size, due to the loss or death of an individual, or to the appearance or birth of new individuals.

We now consider an example of a stochastic process which gives rise to a simple model of a Markov process called the Poisson process. Suppose that we are measuring the radiation of a certain radioactive substance. Let $X(t)$ be the total number of particles recorded up to time $t$. Also suppose that the chance of a new particle being recorded in any short time interval is independent not only of the previous states but also of the present state. Then the chance of a new addition to the total count during a very short time interval $\Delta t$ can be written as $\lambda\Delta t+o(\Delta t)$, where $\lambda$ is a constant and $o(\Delta t)$ accounts for the chance of two or more simultaneous emissions. The chance of no change in the total count would be $1-\lambda\Delta t-o(\Delta t)$. Thus, we obtain \begin{equation}\label{eq:poisson}p_n(t+\Delta t)=p_{n-1}(t)\lambda\Delta t+p_n(t)(1-\lambda\Delta t)\end{equation} where terms that are small compared with $\Delta t$ have been disregarded. \eqref{eq:poisson} results in the following differential-difference equation \begin{equation}\begin{aligned}\frac{dp_n(t)}{dt}&=\lim_{\Delta t\to 0}\frac{p_n(t+\Delta t)-p_n(t)}{\Delta t}\\&=\lambda\{p_{n-1}(t)-p_n(t)\},\ n>0\end{aligned}\label{eq:poisson2}\end{equation}

Note that we have $n=0$ at time $t+\Delta t$ only if $n=0$ at time $t$ and no new particles are emitted in $\Delta t$. This means that we have $$p_0(t+\Delta t)=p_0(t)(1-\lambda\Delta t)$$ and subsequently the differential equation $$\frac{dp_0}{dt}=-\lambda p_0(t)$$ whose solution, with the initial condition $p_0(0)=1$, is $$p_0(t)=e^{-\lambda t}$$ The initial condition corresponds to setting the Geiger counter to $0$ at the beginning of the process. Starting from $p_0(t)$, iterative applications of \eqref{eq:poisson2} result in $$p_n(t)=\frac{(\lambda t)^ne^{-\lambda t}}{n!},\ n=0,1,2,\cdots$$ which is a Poisson distribution with parameter $\lambda t$.
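As a check, the differential-difference equations can be integrated by forward Euler, truncating the count at some maximum $N$, and compared with the Poisson distribution; a Python sketch (the step count and truncation level are arbitrary):

```python
import math

lam, T, steps, N = 1.5, 2.0, 20000, 30
p = [0.0] * (N + 1)
p[0] = 1.0                  # p_0(0) = 1: the counter starts at zero
h = T / steps
for _ in range(steps):
    new = p[:]
    for n in range(N + 1):
        prev = p[n - 1] if n > 0 else 0.0   # p_{-1} := 0
        new[n] += h * lam * (prev - p[n])   # dp_n/dt = λ(p_{n-1} - p_n)
    p = new

exact = [math.exp(-lam * T) * (lam * T) ** n / math.factorial(n)
         for n in range(N + 1)]
print(max(abs(a - b) for a, b in zip(p, exact)))  # small truncation error
```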

References:

  1. Norman T. J. Bailey, The Elements of Stochastic Processes with Applications to the Natural Sciences, John Wiley & Sons, Inc., 1964.