Category Archives: Linear Algebra

System of Linear Equations and Determinant

In this note, we discuss the relationship between a system of linear equations and the determinant of its coefficients. For simplicity, I consider only a system of two linear equations in two variables, but a similar argument works for more general cases. Consider the system of linear equations $$\left\{\begin{aligned}ax+by&=e\\cx+dy&=f\end{aligned}\right.,$$ where none of $a,b,c,d,e,f$ is zero. The two linear equations are equations of lines in the plane, so there are three possibilities: the system has no solution, in which case the two lines are parallel (they do not meet); the system has a unique solution, in which case the two lines meet at exactly one point; or the system has infinitely many solutions, in which case the two lines are identical. This system can be written in terms of matrices as $$\begin{pmatrix}a & b\\c & d\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix}=\begin{pmatrix}e\\f\end{pmatrix}$$ Let $A=\begin{pmatrix}a & b\\c & d\end{pmatrix}$. If $\det A\ne 0$, then the system has a unique solution, and it can be found using Cramer's rule: $$x=\frac{\begin{vmatrix}e & b\\f & d\end{vmatrix}}{\det A},\ y=\frac{\begin{vmatrix}a & e\\c & f\end{vmatrix}}{\det A}$$

Note that $\det A=0$ if and only if the two lines have the same slope. Suppose that $\det A=0$. Then one can easily show that $\begin{vmatrix}e & b\\f & d\end{vmatrix}=0$ if and only if $\begin{vmatrix}a & e\\c & f\end{vmatrix}=0$. From $\det A=0$ and $\begin{vmatrix}e & b\\f & d\end{vmatrix}=0$, we have the system of equations: \begin{align}\label{eqn1}ad-bc&=0\\\label{eqn2}ed-bf&=0\end{align} Subtracting $a$ times \eqref{eqn2} from $e$ times \eqref{eqn1} yields $b(af-ec)=0$. Since $b\ne 0$, $af-ec=\begin{vmatrix}a & e\\c & f\end{vmatrix}=0$, which means that the two lines have the same $x$-intercept. Together with the equal slopes, this is the case when the two lines coincide, and hence the system has infinitely many solutions (all the points on the line are solutions).

Lastly, we know $\begin{vmatrix}e & b\\f & d\end{vmatrix}\ne0$ if and only if $\begin{vmatrix}a & e\\c & f\end{vmatrix}\ne0$. Suppose that $\begin{vmatrix}a & e\\c & f\end{vmatrix}\ne0$ while $\det A=0$. If the system had a solution $(x,y)$, then eliminating one variable at a time gives $x\det A=\begin{vmatrix}e & b\\f & d\end{vmatrix}$ and $y\det A=\begin{vmatrix}a & e\\c & f\end{vmatrix}$ (this is where Cramer's rule comes from), so $\det A=0$ would force $\begin{vmatrix}a & e\\c & f\end{vmatrix}=0$, a contradiction. Hence the system has no solution. Geometrically, $\begin{vmatrix}a & e\\c & f\end{vmatrix}\ne0$ means that the two lines have different $x$-intercepts, so this is the case when the two lines are parallel, i.e. they do not meet.

A system of homogeneous linear equations $$\left\{\begin{aligned}ax+by&=0\\cx+dy&=0\end{aligned}\right.$$ comes down to only two cases: the system has a unique solution $x=y=0$ (if $\det A\ne 0$) or infinitely many solutions (if $\det A=0$). This is also obvious from considering two lines passing through the origin.
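
Here is a small Python sketch of the three cases, assuming the helper classify below (a name made up for this sketch); it simply evaluates the determinants appearing in Cramer's rule.

```python
import numpy as np

def classify(a, b, c, d, e, f):
    """Classify the system ax + by = e, cx + dy = f."""
    det_A = a * d - b * c
    det_x = e * d - b * f          # first column replaced by (e, f)
    det_y = a * f - e * c          # second column replaced by (e, f)
    if not np.isclose(det_A, 0):
        return "unique solution", (det_x / det_A, det_y / det_A)
    if np.isclose(det_x, 0) and np.isclose(det_y, 0):
        return "infinitely many solutions", None
    return "no solution", None

print(classify(1, 2, 2, 4, 3, 6))   # coincident lines
print(classify(1, 2, 2, 4, 3, 7))   # parallel lines
print(classify(1, 2, 3, 4, 5, 6))   # unique solution: (-4.0, 4.5)
```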

Square Roots of Operators II

The determinant (and also trace) of a nilpotent matrix is always zero, so a nilpotent matrix cannot be invertible. However, if $N$ is nilpotent of index $m$, then $I+N$ and $I-N$ are invertible and the inverses are given by
\begin{align*} (I+N)^{-1}&=\sum_{k=0}^{m-1}(-N)^k=I-N+N^2-N^3+\cdots+(-N)^{m-1}\\ (I-N)^{-1}&=\sum_{k=0}^{m-1}N^k=I+N+N^2+N^3+\cdots+N^{m-1} \end{align*}
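
A quick numerical check of these formulas (a sketch, using the $3\times 3$ shift matrix below as the nilpotent $N$, so the index is $m=3$):

```python
import numpy as np

N = np.array([[0, 1, 0],
              [0, 0, 1],
              [0, 0, 0]], dtype=float)   # nilpotent: N^3 = 0
I = np.eye(3)

inv_plus  = I - N + N @ N     # (I + N)^{-1} = sum_{k=0}^{2} (-N)^k
inv_minus = I + N + N @ N     # (I - N)^{-1} = sum_{k=0}^{2} N^k
print(np.allclose((I + N) @ inv_plus,  I))   # True
print(np.allclose((I - N) @ inv_minus, I))   # True
```
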
We have shown here that invertible operators have square roots. By the same token, we see that the square roots of $(I+N)^{-1}$ and $(I-N)^{-1}$ exist. But how do we find them? Let us denote them by $(I+N)^{-\frac{1}{2}}$ and $(I-N)^{-\frac{1}{2}}$, respectively. Then, in the same manner in which we proved the existence of $\sqrt{I+N}$ here, we obtain
\begin{align*} (I+N)^{-\frac{1}{2}}&=I-\frac{1}{2}N+\frac{3}{8}N^2-\frac{5}{16}N^3+\cdots\\ (I-N)^{-\frac{1}{2}}&=I+\frac{1}{2}N+\frac{3}{8}N^2+\frac{5}{16}N^3+\cdots \end{align*}
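
The same kind of check works for the inverse square roots; since $N^3=0$ for the shift matrix, the truncated sums are exact. A sketch in Python:

```python
import numpy as np

N = np.array([[0, 1, 0],
              [0, 0, 1],
              [0, 0, 0]], dtype=float)   # nilpotent: N^3 = 0
I = np.eye(3)

S_plus  = I - N / 2 + 3 * (N @ N) / 8    # truncated series for (I + N)^{-1/2}
S_minus = I + N / 2 + 3 * (N @ N) / 8    # truncated series for (I - N)^{-1/2}
print(np.allclose(S_plus @ S_plus @ (I + N), I))    # True
print(np.allclose(S_minus @ S_minus @ (I - N), I))  # True
```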

Square Roots of Operators

Mathematics is full of weird stuff. One such thing is the square root of an operator. So far, I have seen only two books that discuss the square root of an operator. They are listed in the references below.

Definition. An operator $R$ is called a square root of an operator $T$ if $R^2=T$.

Example. In quantum computing, the $\sqrt{\mathrm{NOT}}$ gate is given by
$$\sqrt{\mathrm{NOT}}=\begin{bmatrix}
\frac{1+i}{2} & \frac{1-i}{2}\\
\frac{1-i}{2} & \frac{1+i}{2}
\end{bmatrix}$$
and
$$\sqrt{\mathrm{NOT}}\cdot\sqrt{\mathrm{NOT}}=\mathrm{NOT}=\begin{bmatrix}
0 & 1\\
1 & 0
\end{bmatrix}$$
As required by quantum mechanics, quantum gates are unitary matrices. There is no counterpart of the $\sqrt{\mathrm{NOT}}$ gate in classical computing, so the $\sqrt{\mathrm{NOT}}$ gate is a truly quantum gate. As far as I know, no one has come up with a physical implementation of the $\sqrt{\mathrm{NOT}}$ gate yet.
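
A direct check of this example with numpy (this only verifies the matrix identities above, of course):

```python
import numpy as np

sqrt_not = np.array([[(1 + 1j) / 2, (1 - 1j) / 2],
                     [(1 - 1j) / 2, (1 + 1j) / 2]])
NOT = np.array([[0, 1],
                [1, 0]])

print(np.allclose(sqrt_not @ sqrt_not, NOT))                  # True
print(np.allclose(sqrt_not @ sqrt_not.conj().T, np.eye(2)))   # True: unitary
```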

Example. An operator does not necessarily have a square root. For example, define $T:\mathbb{C}^3\longrightarrow\mathbb{C}^3$ by
$$T(z_1,z_2,z_3)=(z_2,z_3,0)$$
Then one can easily show that $T$ is linear. Suppose that $T$ has a square root $R$. Let $\begin{bmatrix}
r & s & t\\
u & v & w\\
x & y & z
\end{bmatrix}$ be the matrix associated with $R$. Since the matrix associated with $T$ is $\begin{bmatrix}
0 & 1 & 0\\
0 & 0 & 1\\
0 & 0 & 0
\end{bmatrix}$, we have the equation
$$\begin{bmatrix}
r & s & t\\
u & v & w\\
x & y & z
\end{bmatrix}\cdot\begin{bmatrix}
r & s & t\\
u & v & w\\
x & y & z
\end{bmatrix}=\begin{bmatrix}
0 & 1 & 0\\
0 & 0 & 1\\
0 & 0 & 0
\end{bmatrix}$$
This is a system of 9 scalar equations. One can attempt to solve this system using a CAS (computer algebra system), for example Maple, and see that it has no solution, i.e. $R$ does not exist.
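
For instance, the CAS computation mentioned above can be sketched with SymPy instead of Maple (the nine-equation polynomial system is small, but solve may still take a moment):

```python
import sympy as sp

r, s, t, u, v, w, x, y, z = sp.symbols('r s t u v w x y z')
R = sp.Matrix([[r, s, t], [u, v, w], [x, y, z]])
T = sp.Matrix([[0, 1, 0], [0, 0, 1], [0, 0, 0]])

# The nine scalar equations coming from R^2 = T
equations = [sp.Eq(lhs, rhs) for lhs, rhs in zip(R * R, T)]
solutions = sp.solve(equations, [r, s, t, u, v, w, x, y, z], dict=True)
print(solutions)  # expected: [] -- no solution, so R does not exist
```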

So, what kind of operators have square roots?

Theorem. If $N$ is a nilpotent operator, then $I+N$ has a square root.

Proof. Let $N$ be nilpotent. Then $N^m=0$ for some positive integer $m$. Consider the Taylor series for $\sqrt{1+x}$:
$$\sqrt{1+x}=1+a_1x+a_2x^2+a_3x^3+\cdots$$
Using the nilpotency of $N$, we may guess that $\sqrt{I+N}$ takes the form
$$I+a_1N+a_2N^2+\cdots+a_{m-1}N^{m-1}$$
Now,
\begin{align*} (I+a_1N+a_2N^2+\cdots+a_{m-1}N^{m-1})^2=I&+2a_1N+(2a_2+a_1^2)N^2\\&+(2a_3+2a_1a_2)N^3+\cdots\\ &+(2a_{m-1}+\cdots)N^{m-1}=I+N \end{align*}
and by comparing the coefficients
we obtain $2a_1=1$, so $a_1=\frac{1}{2}$; $2a_2+a_1^2=0$, so $a_2=-\frac{1}{8}$; $2a_3+2a_1a_2=0$, so $a_3=\frac{1}{16}$; and in general $2a_k+\sum_{i=1}^{k-1}a_ia_{k-i}=0$ for $2\le k\le m-1$, which determines all of the coefficients.
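
Here is a small computational sketch of the proof in Python: the coefficients $a_k$ are computed by the recursion read off above, and the resulting polynomial in $N$ is checked to square to $I+N$ for a concrete nilpotent $N$ (the matrix below is chosen only for illustration).

```python
from fractions import Fraction
import numpy as np

def sqrt_series_coeffs(m):
    """Coefficients a_0, ..., a_{m-1} with a_0 = 1, 2*a_1 = 1, and
    2*a_k + sum_{i=1}^{k-1} a_i * a_{k-i} = 0 for k >= 2."""
    a = [Fraction(1), Fraction(1, 2)]
    for k in range(2, m):
        a.append(-sum(a[i] * a[k - i] for i in range(1, k)) / 2)
    return a

N = np.array([[0, 1, 0],
              [0, 0, 1],
              [0, 0, 0]], dtype=float)   # nilpotent: N^3 = 0
m = 3
a = sqrt_series_coeffs(m)                # [1, 1/2, -1/8]
S = sum(float(a[k]) * np.linalg.matrix_power(N, k) for k in range(m))
print(np.allclose(S @ S, np.eye(3) + N))  # True: S is a square root of I + N
```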

Theorem. Suppose that $T: V\longrightarrow V$ is an invertible operator on a finite-dimensional complex vector space $V$. Then $T$ has a square root.

Proof. Let $\lambda_1,\lambda_2,\cdots,\lambda_m$ be the distinct eigenvalues of $T$. Then for each $j$, there exists a nilpotent operator $N_j: G(\lambda_j,T)\longrightarrow G(\lambda_j,T)$ such that $T|_{G(\lambda_j,T)}=\lambda_jI+N_j$. (See [1], p. 252, Theorem 8.21 for a proof.) Here, $G(\lambda_j,T)$ is the generalized eigenspace of $T$ corresponding to the eigenvalue $\lambda_j$. Since $T$ is invertible, no $\lambda_j$ is equal to 0, so we can write $$T|_{G(\lambda_j,T)}=\lambda_j\left(I+\frac{N_j}{\lambda_j}\right)$$
Since $\frac{N_j}{\lambda_j}$ is nilpotent, $I+\frac{N_j}{\lambda_j}$ has a square root. Let $R_j$ be $\sqrt{\lambda_j}$ times the square root of $I+\frac{N_j}{\lambda_j}$. Any vector $v\in V$ can be written uniquely in the form
$$v=u_1+u_2+\cdots+u_m,$$
where each $u_j$ is in $G(\lambda_j,T)$. Using this decomposition, define an operator $R:V\longrightarrow V$ by
$$Rv=R_1u_1+R_2 u_2+\cdots+R_m u_m$$
Then $R^2u_j=R_j^2 u_j=T|_{G(\lambda_j,T)}u_j$ and so \begin{align*} R^2v&=R_1^2u_1+R_2^2u_2+\cdots+R_m^2u_m\\ &= T|_{G(\lambda_1,T)}u_1+T|_{G(\lambda_2,T)}u_2+\cdots+T|_{G(\lambda_m,T)}u_m=Tv
\end{align*}
Therefore, $R$ is a square root of $T$.

Example. The $\mathrm{NOT}$ gate is invertible. It is its own inverse.

The proof of the above theorem suggests that an operator $T$ has a square root if there is a spectral decomposition of $T$ with respect to the standard inner product of $\mathbb{C}^n$. A normal operator is such an operator. Recall that an operator $T: V\longrightarrow V$ is normal if $T^\ast T=TT^\ast$. The following theorem guarantees the existence of such a spectral decomposition of a normal operator.

Theorem. A matrix $T$ is normal if and only if there exists a diagonal matrix $\Lambda$ and a unitary matrix $U$ such that $T=U\Lambda U^\ast$.

Here, the diagonal entries of $\Lambda$ are the eigenvalues of $T$ and the columns of $U$ are the eigenvectors of $T$. The eigenvalues in $\Lambda$ appear in the same order as their corresponding eigenvectors appear as columns of $U$.

Let $T: V\longrightarrow V$ be a normal operator and write it as its spectral decomposition, using Dirac’s braket notation,
$$T=\sum_j\lambda_j|v_j\rangle\langle v_j|$$
For a function $f$ defined on the eigenvalues of $T$, define
$$f(T)=\sum_j f(\lambda_j)|v_j\rangle\langle v_j|$$
Using this definition, one can define, for example, the exponential, the square root, and the logarithm of $T$.

Example. Let $T=\begin{bmatrix}
0 & 1\\
1 & 0
\end{bmatrix}$. Find the square root of $T$.

Solution. $T$ is Hermitian (symmetric), so it is normal. The spectral decomposition of $T$ is given by
$$T=|v_1\rangle\langle v_1|-|v_2\rangle\langle v_2|,$$
where
$$|v_1\rangle=\frac{1}{\sqrt{2}}\begin{bmatrix}
1\\
1
\end{bmatrix}\ \mathrm{and}\ |v_2\rangle=\frac{1}{\sqrt{2}}\begin{bmatrix}
1\\
-1
\end{bmatrix}$$
Now,
\begin{align*} \sqrt{T}&=|v_1\rangle\langle v_1|+i|v_2\rangle\langle v_2|\\ &=\frac{1}{2}\begin{bmatrix} 1 & 1\\ 1 & 1 \end{bmatrix}+\frac{i}{2}\begin{bmatrix} 1 & -1\\ -1 & 1 \end{bmatrix}\\ &=\begin{bmatrix} \frac{1+i}{2} & \frac{1-i}{2}\\ \frac{1-i}{2} & \frac{1+i}{2} \end{bmatrix} \end{align*}
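
A quick numerical check of this example, using numpy's eigh (which returns an orthonormal eigenbasis of the Hermitian matrix $T$) and the principal square root $\sqrt{-1}=i$:

```python
import numpy as np

T = np.array([[0, 1],
              [1, 0]], dtype=complex)
eigenvalues, U = np.linalg.eigh(T)            # columns of U are the |v_j>
sqrt_T = U @ np.diag(np.sqrt(eigenvalues.astype(complex))) @ U.conj().T
print(np.round(sqrt_T, 10))                   # [[0.5+0.5j, 0.5-0.5j], [0.5-0.5j, 0.5+0.5j]]
print(np.allclose(sqrt_T @ sqrt_T, T))        # True
```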

References:

  1. Sheldon Axler, Linear Algebra Done Right, Third Edition, Springer, 2015
  2. Michael A. Nielsen and Isaac L. Chuang, Quantum Computation and Quantum Information, Cambridge University Press, 2000

The Eigenvectors of a Hermitian Operator Are Mutually Orthogonal

In quantum mechanics, operators representing observables are required to be Hermitian. A reason for this is that the eigenvalues of a quantum mechanical operator are interpreted as physically measurable quantities such as positions, momenta, energies, etc., and therefore they are required to be real. As is well known, Hermitian operators have only real eigenvalues. Hermitian operators also have another nice property: eigenvectors corresponding to distinct eigenvalues are mutually orthogonal. Here, we prove this only for the case of matrix operators. Let $A$ be a Hermitian operator and let $|\lambda_i\rangle$ be the eigenvectors of $A$ with distinct eigenvalues $\lambda_i$. Then we have $A|\lambda_i\rangle=\lambda_i|\lambda_i\rangle$. So, $\langle\lambda_j|A|\lambda_i\rangle=\lambda_i\langle\lambda_j|\lambda_i\rangle$. On the other hand,
\begin{align*} \langle\lambda_j|A&=(A^\dagger|\lambda_j\rangle)^\dagger\\ &=(A|\lambda_j\rangle)^\dagger\ (A^\dagger=A)\\ &=(\lambda_j|\lambda_j\rangle)^\dagger\\ &=\bar\lambda_j\langle\lambda_j|\\ &=\lambda_j\langle\lambda_j|\ (\mbox{the eigenvalues are real}) \end{align*}
From this we also obtain $\langle\lambda_j|A|\lambda_i\rangle=\lambda_j\langle\lambda_j|\lambda_i\rangle$. This means that $\lambda_i\langle\lambda_j|\lambda_i\rangle=\lambda_j\langle\lambda_j|\lambda_i\rangle$, i.e. we have
$$(\lambda_i-\lambda_j)\langle\lambda_j|\lambda_i\rangle=0$$
If $i\ne j$ then $\lambda_i\ne\lambda_j$ and so $\langle\lambda_j|\lambda_i\rangle=0$.
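
A concrete numerical illustration, using an arbitrarily chosen $2\times 2$ Hermitian matrix with distinct eigenvalues:

```python
import numpy as np

A = np.array([[2, 1 - 1j],
              [1 + 1j, 3]])
assert np.allclose(A, A.conj().T)    # A is Hermitian

eigenvalues, V = np.linalg.eig(A)
print(eigenvalues)                   # real up to rounding: 1 and 4 (in some order)
print(np.vdot(V[:, 0], V[:, 1]))     # approximately 0: the eigenvectors are orthogonal
```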

Example. Let us consider the matrix $A=\begin{pmatrix}
3 & 1+i\\
-1+i & -3
\end{pmatrix}$. The adjoint (i.e. the conjugate transpose of this matrix) of $A$ is $A^\dagger=\begin{pmatrix}
3 & -1-i\\
1-i & -3
\end{pmatrix}$. Since $A\ne A^\dagger$, $A$ is not Hermitian. Nonetheless, $A$ has real eigenvalues $\pm\sqrt{7}$, and the eigenvectors corresponding to the eigenvalues $\sqrt{7}$ and $-\sqrt{7}$ are, respectively,
$$v_1=\begin{pmatrix}
-\frac{1+i}{3-\sqrt{7}}\\
1
\end{pmatrix},\ v_2=\begin{pmatrix}
-\frac{1+i}{3+\sqrt{7}}\\
1
\end{pmatrix}$$
$\langle v_1|v_2\rangle=2$, so they are not orthogonal. Interestingly, $v_1$ and $v_2$ are orthogonal with respect to the inner product
\begin{equation}
\label{eq:jproduct}
\langle v_1|J|v_2\rangle
\end{equation}
where
$$J=\begin{pmatrix}
1 & 0\\
0 & -1
\end{pmatrix}$$
The matrices of the form $$H=\begin{pmatrix}
a & b\\
-\bar b & d
\end{pmatrix},$$ where $a$ and $d$ are real and $(a-d)^2-4|b|^2>0$ (this condition is required to ensure that such a matrix has two distinct real eigenvalues), are self-adjoint with respect to the inner product \eqref{eq:jproduct} i.e. $H$ satisfies \begin{equation}\label{eq:jself-adjoint}\langle Hv_1|J|v_2\rangle=\langle v_1|J|Hv_2\rangle\end{equation} which is equivalent to $$H^\dagger J=JH$$

Exercise. Prove that eigenvectors corresponding to distinct eigenvalues of a matrix which satisfies \eqref{eq:jself-adjoint} are mutually orthogonal with respect to the inner product \eqref{eq:jproduct}.

The Tensor Product

Let $V$ and $W$ be two vector spaces of dimensions $m$ and $n$, respectively. The tensor product of $V$ and $W$ is a vector space $V\otimes W$ of dimension $mn$ together with a bilinear map $$\varphi: V\times W\longrightarrow V\otimes W;\ \varphi(v,w)=v\otimes w$$ satisfying the following universal property: for any vector space $X$ and any bilinear map $\psi: V\times W\longrightarrow X$, there exists a unique linear map $\gamma : V\otimes W\longrightarrow X$ such that $\psi(v,w)=\gamma(v\otimes w)$ for all $v\in V$ and $w\in W$.

$$\begin{array}[c]{ccc}V\otimes W & & \\\uparrow\scriptstyle{\varphi} & \scriptstyle{\gamma}\searrow& \\V\times W & \stackrel{\psi}\rightarrow & X\end{array}\ \ \ \gamma\circ\varphi=\psi$$

Often, we use a more down-to-earth definition of the tensor product. Let $\{e_1,\cdots,e_m\}$ and $\{f_1,\cdots,f_n\}$ be bases of $V$ and $W$, respectively. The tensor product $V\otimes W$ is the vector space of dimension $mn$ spanned by the basis $\{e_i\otimes f_j: i=1,\cdots,m,\ j=1,\cdots,n\}$. Let $v\in V$ and $w\in W$. Then $$v=\sum_i v_ie_i\ \mbox{and}\ w=\sum_j w_jf_j$$ The tensor product of $v$ and $w$ is then given by $$v\otimes w=\sum_{i,j}v_iw_je_i\otimes f_j$$ It can be easily shown that this definition of the tensor product in terms of prescribed bases satisfies the universal property. Although this definition uses a choice of bases of $V$ and $W$, the tensor product $V\otimes W$ does not depend on the particular choice of bases, i.e. regardless of the choice of bases the resulting tensor product is the same (up to isomorphism). This can also be shown easily using some basic facts from linear algebra. I will leave these as exercises for the reader.

The tensor product can be used to describe the state of a quantum memory register. A quantum memory register consists of many 2-state systems (Hilbert spaces of qubits). Let $|\psi^{(1)}\rangle$ and $|\psi^{(2)}\rangle$ be qubits associated with two different 2-state systems. In terms of the standard orthogonal basis $\begin{pmatrix}1\\0\end{pmatrix}$ and $\begin{pmatrix}0\\1\end{pmatrix}$ for each 2-state system, we have \begin{align*}|\psi^{(1)}\rangle&=\begin{pmatrix}\omega_0^{(1)}\\\omega_1^{(1)}\end{pmatrix}=\omega_0^{(1)}\begin{pmatrix}1\\0\end{pmatrix}+\omega_1^{(1)}\begin{pmatrix}0\\1\end{pmatrix}\\|\psi^{(2)}\rangle&=\begin{pmatrix}\omega_0^{(2)}\\\omega_1^{(2)}\end{pmatrix}=\omega_0^{(2)}\begin{pmatrix}1\\0\end{pmatrix}+\omega_1^{(2)}\begin{pmatrix}0\\1\end{pmatrix}\end{align*} Define $\otimes$ on the basis members as follows: \begin{align*}|00\rangle&=\begin{pmatrix}1\\0\end{pmatrix}\otimes\begin{pmatrix}1\\0\end{pmatrix}=\begin{pmatrix}1\\0\\0\\0\end{pmatrix},\ |01\rangle=\begin{pmatrix}1\\0\end{pmatrix}\otimes\begin{pmatrix}0\\1\end{pmatrix}=\begin{pmatrix}0\\1\\0\\0\end{pmatrix}\\|10\rangle&=\begin{pmatrix}0\\1\end{pmatrix}\otimes\begin{pmatrix}1\\0\end{pmatrix}=\begin{pmatrix}0\\0\\1\\0\end{pmatrix},\ |11\rangle=\begin{pmatrix}0\\1\end{pmatrix}\otimes\begin{pmatrix}0\\1\end{pmatrix}=\begin{pmatrix}0\\0\\0\\1\end{pmatrix}\end{align*} These four vectors form a basis for a 4-dimensional Hilbert space (a 2-qubit memory register). It follows that \begin{align*}|\psi^{(1)}\rangle\otimes|\psi^{(2)}\rangle&=\omega_0^{(1)}\omega_0^{(2)}|00\rangle+\omega_0^{(1)}\omega_1^{(2)}|01\rangle+\omega_1^{(1)}\omega_0^{(2)}|10\rangle+\omega_1^{(1)}\omega_1^{(2)}|11\rangle\\&=\begin{pmatrix}\omega_0^{(1)}\omega_0^{(2)}\\\omega_0^{(1)}\omega_1^{(2)}\\\omega_1^{(1)}\omega_0^{(2)}\\\omega_1^{(1)}\omega_1^{(2)}\end{pmatrix}\end{align*}Similarly, to describe the state of a 3-qubit memory register, one performs the tensor product $|\psi^{(1)}\rangle\otimes|\psi^{(2)}\rangle\otimes|\psi^{(3)}\rangle$.
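
A short illustration with numpy's kron, which implements exactly this tensor product of coordinate vectors (the two states below are arbitrary normalized qubits chosen for illustration):

```python
import numpy as np

psi1 = np.array([3/5, 4/5])                     # omega_0^(1), omega_1^(1)
psi2 = np.array([1/np.sqrt(2), 1j/np.sqrt(2)])  # omega_0^(2), omega_1^(2)

register = np.kron(psi1, psi2)                  # components ordered |00>, |01>, |10>, |11>
print(register)
print(np.isclose(np.linalg.norm(register), 1))  # True: the product state is still normalized
```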

Quantum memory registers can store an exponential amount of classical information in only a polynomial number of qubits, thanks to the quantum-mechanical principle of superposition. For example, consider two classical memory registers storing complementary sequences of bits $$\begin{array}{|c|c|c|c|c|c|c|}\hline1 & 0 & 1 & 1 & 0 & 0 &1\\\hline 0 & 1 & 0 & 0 & 1 & 1 & 0\\\hline\end{array}$$ A single quantum memory register can store both sequences simultaneously in an equally weighted superposition of the two states representing each 7-bit input: $$\frac{1}{\sqrt{2}}(|1011001\rangle+|0100110\rangle)$$

A matrix can be considered as a vector. For example, a $2\times 2$ matrix $\begin{pmatrix}a & b\\c & d\end{pmatrix}$ can be identified with the vector $(a, b, c, d) \in \mathbb{R}^4$. Hence one can define the tensor product of two matrices, called the Kronecker product, in a similar manner to that of two vectors. For example, $$\begin{pmatrix}a_{11} & a_{12}\\a_{21} & a_{22}\end{pmatrix}\otimes\begin{pmatrix}b_{11} & b_{12}\\b_{21} & b_{22}\end{pmatrix}:=\begin{pmatrix}a_{11}\begin{pmatrix}b_{11} & b_{12}\\b_{21} & b_{22}\end{pmatrix} & a_{12}\begin{pmatrix}b_{11} & b_{12}\\b_{21} & b_{22}\end{pmatrix}\\a_{21}\begin{pmatrix}b_{11} & b_{12}\\b_{21} & b_{22}\end{pmatrix} & a_{22}\begin{pmatrix}b_{11} & b_{12}\\b_{21} & b_{22}\end{pmatrix}\end{pmatrix}\\=\begin{pmatrix}a_{11}b_{11} & a_{11}b_{12} & a_{12}b_{11} & a_{12}b_{12}\\a_{11}b_{21} & a_{11}b_{22} & a_{12}b_{21} & a_{12}b_{22}\\a_{21}b_{11} & a_{21}b_{12} & a_{22}b_{11} & a_{22}b_{12}\\a_{21}b_{21} & a_{21}b_{22} & a_{22}b_{21} & a_{22}b_{22}\end{pmatrix}$$
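
numpy's kron implements this block formula for matrices as well; for example:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 5],
              [6, 7]])
print(np.kron(A, B))   # 4x4 block matrix [[a11*B, a12*B], [a21*B, a22*B]]
```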
