Category Archives: Linear Algebra

The Tensor Product

Let $V$ and $W$ be two vector spaces of dimensions $m$ and $n$, respectively. The tensor product of $V$ and $W$ is a space $V\otimes W$ of dimension $mn$ together with a bilinear map $$\varphi: V\times W\longrightarrow V\otimes W;\ \varphi(v,w)=v\otimes w$$ satisfying the following universal property: For any vector space $X$ and any bilinear map $\psi: V\times W\longrightarrow X$, there exists a unique linear map $\gamma : V\otimes W\longrightarrow X$ such that $\psi(v,w)=\gamma(v\otimes w)$ for all $v\in V$ and $w\in W$.

$$\begin{array}[c]{ccc}V\otimes W & & \\\uparrow\scriptstyle{\varphi} & \scriptstyle{\gamma}\searrow& \\V\times W & \stackrel{\psi}\rightarrow & X\end{array}\ \ \ \gamma\circ\varphi=\psi$$

Often we use a more down-to-earth definition of the tensor product. Let $\{e_1,\cdots,e_m\}$ and $\{f_1,\cdots,f_n\}$ be bases of $V$ and $W$, respectively. The tensor product $V\otimes W$ is a vector space of dimension $mn$ spanned by the basis $\{e_i\otimes f_j: i=1,\cdots,m,\ j=1,\cdots,n\}$. Let $v\in V$ and $w\in W$. Then $$v=\sum_i v_ie_i\ \mbox{and}\ w=\sum_j w_jf_j$$ The tensor product of $v$ and $w$ is then given by $$v\otimes w=\sum_{i,j}v_iw_je_i\otimes f_j$$ It can easily be shown that this definition of the tensor product in terms of prescribed bases satisfies the universal property. Although this definition uses a choice of bases of $V$ and $W$, the tensor product $V\otimes W$ does not depend on that particular choice, i.e. regardless of the choice of bases the resulting tensor product is the same. This also can be shown easily using some basic facts from linear algebra. I will leave these as exercises for the reader.
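For readers who like to compute, here is a quick numerical illustration in Python with NumPy (the vectors are arbitrary examples of mine, not from the construction above): np.kron lists exactly the products $v_iw_j$ in lexicographic order of $(i,j)$, which are the coordinates of $v\otimes w$ in the basis $\{e_i\otimes f_j\}$.

import numpy as np

# v has coordinates (v_1, v_2), w has coordinates (w_1, w_2, w_3)
v = np.array([2.0, 3.0])
w = np.array([1.0, 4.0, 5.0])

# np.kron lists the products v_i * w_j in lexicographic order of (i, j),
# i.e. the coordinates of v tensor w in the basis {e_i tensor f_j}
vw = np.kron(v, w)
print(vw)  # [ 2.  8. 10.  3. 12. 15.]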

The tensor product can be used to describe the state of a quantum memory register. A quantum memory register consists of many 2-state systems (Hilbert spaces of qubits). Let $|\psi^{(1)}\rangle$ and $|\psi^{(2)}\rangle$ be qubits associated with two different 2-state systems. In terms of the standard orthonormal basis $\begin{pmatrix}1\\0\end{pmatrix}$ and $\begin{pmatrix}0\\1\end{pmatrix}$ for each 2-state system, we have \begin{align*}|\psi^{(1)}\rangle&=\begin{pmatrix}\omega_0^{(1)}\\\omega_1^{(1)}\end{pmatrix}=\omega_0^{(1)}\begin{pmatrix}1\\0\end{pmatrix}+\omega_1^{(1)}\begin{pmatrix}0\\1\end{pmatrix}\\|\psi^{(2)}\rangle&=\begin{pmatrix}\omega_0^{(2)}\\\omega_1^{(2)}\end{pmatrix}=\omega_0^{(2)}\begin{pmatrix}1\\0\end{pmatrix}+\omega_1^{(2)}\begin{pmatrix}0\\1\end{pmatrix}\end{align*} Define $\otimes$ on the basis members as follows: \begin{align*}|00\rangle&=\begin{pmatrix}1\\0\end{pmatrix}\otimes\begin{pmatrix}1\\0\end{pmatrix}=\begin{pmatrix}1\\0\\0\\0\end{pmatrix},\ |01\rangle=\begin{pmatrix}1\\0\end{pmatrix}\otimes\begin{pmatrix}0\\1\end{pmatrix}=\begin{pmatrix}0\\1\\0\\0\end{pmatrix}\\|10\rangle&=\begin{pmatrix}0\\1\end{pmatrix}\otimes\begin{pmatrix}1\\0\end{pmatrix}=\begin{pmatrix}0\\0\\1\\0\end{pmatrix},\ |11\rangle=\begin{pmatrix}0\\1\end{pmatrix}\otimes\begin{pmatrix}0\\1\end{pmatrix}=\begin{pmatrix}0\\0\\0\\1\end{pmatrix}\end{align*} These four vectors form a basis for a 4-dimensional Hilbert space (a 2-qubit memory register). It follows that \begin{align*}|\psi^{(1)}\rangle\otimes|\psi^{(2)}\rangle&=\omega_0^{(1)}\omega_0^{(2)}|00\rangle+\omega_0^{(1)}\omega_1^{(2)}|01\rangle+\omega_1^{(1)}\omega_0^{(2)}|10\rangle+\omega_1^{(1)}\omega_1^{(2)}|11\rangle\\&=\begin{pmatrix}\omega_0^{(1)}\omega_0^{(2)}\\\omega_0^{(1)}\omega_1^{(2)}\\\omega_1^{(1)}\omega_0^{(2)}\\\omega_1^{(1)}\omega_1^{(2)}\end{pmatrix}\end{align*} Similarly, to describe the state of a 3-qubit memory register, one performs the tensor product $|\psi^{(1)}\rangle\otimes|\psi^{(2)}\rangle\otimes|\psi^{(3)}\rangle$.
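As a minimal sketch of this computation (the amplitudes below are arbitrary normalized choices of mine, not from the text), the same Kronecker product builds the state of a 2-qubit register from two single-qubit states:

import numpy as np

# two single-qubit states; the amplitudes are arbitrary normalized choices
psi1 = np.array([3.0, 4.0]) / 5.0           # (w_0^(1), w_1^(1))
psi2 = np.array([1.0, 1.0]) / np.sqrt(2.0)  # (w_0^(2), w_1^(2))

# state of the 2-qubit register, with components ordered |00>, |01>, |10>, |11>
psi = np.kron(psi1, psi2)
print(psi)  # [0.42426407 0.42426407 0.56568542 0.56568542]

# a 3-qubit register is one more Kronecker factor:
# np.kron(np.kron(psi1, psi2), psi3)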

Quantum memory registers can store an exponential amount of classical information in only a polynomial number of qubits, thanks to the quantum-mechanical Principle of Superposition. For example, consider two classical memory registers storing complementary sequences of bits $$\begin{array}{|c|c|c|c|c|c|c|}\hline1 & 0 & 1 & 1 & 0 & 0 &1\\\hline 0 & 1 & 0 & 0 & 1 & 1 & 0\\\hline\end{array}$$ A single quantum memory register can store both sequences simultaneously in an equally weighted superposition of the two states representing each 7-bit input $$\frac{1}{\sqrt{2}}(|1011001\rangle+|0100110\rangle)$$
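Here is a small sketch of this bookkeeping, assuming the usual convention that the basis state $|b\rangle$ of a 7-qubit register is the standard unit vector with a $1$ at index $b$ (read as a binary number) in a $2^7$-dimensional space; the helper basis_state is a hypothetical name of mine:

import numpy as np

def basis_state(bits):
    """Standard basis vector |bits> of a register of len(bits) qubits."""
    state = np.zeros(2 ** len(bits))
    state[int(bits, 2)] = 1.0
    return state

# equally weighted superposition of the two 7-bit sequences
register = (basis_state("1011001") + basis_state("0100110")) / np.sqrt(2.0)
print(register.shape)           # (128,) -- 7 qubits span a 2^7-dimensional space
print(np.nonzero(register)[0])  # [38 89]: 0100110_2 = 38, 1011001_2 = 89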

A matrix can be considered as a vector. For example, a $2\times 2$ matrix $\begin{pmatrix}a & b\\c & d\end{pmatrix}$ can be identified with the vector $(a, b, c, d) \in \mathbb{R}^4$. Hence one can define the tensor product of two matrices in a similar manner to that of two vectors. For example, $$\begin{pmatrix}a_{11} & a_{12}\\a_{21} & a_{22}\end{pmatrix}\otimes\begin{pmatrix}b_{11} & b_{12}\\b_{21} & b_{22}\end{pmatrix}:=\begin{pmatrix}a_{11}b_{11} & a_{11}b_{12} & a_{11}b_{21} & a_{11}b_{22}\\a_{12}b_{11} & a_{12}b_{12} & a_{12}b_{21} & a_{12}b_{22}\\a_{21}b_{11} & a_{21}b_{12} & a_{21}b_{21} & a_{21}b_{22}\\a_{22}b_{11} & a_{22}b_{12} & a_{22}b_{21} & a_{22}b_{22}\end{pmatrix}$$
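The $4\times 4$ array above lists the products $a_{ij}b_{kl}$ with the first factor's flattened entries indexing the rows. A one-line NumPy sketch reproduces it (the sample matrices are mine); note that NumPy's np.kron(A, B) applied to matrices instead returns the conventional Kronecker block layout $\begin{pmatrix}a_{11}B & a_{12}B\\ a_{21}B & a_{22}B\end{pmatrix}$, which contains the same 16 products arranged differently.

import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[5.0, 6.0],
              [7.0, 8.0]])

# flatten each matrix to the vector (a11, a12, a21, a22), tensor the two
# vectors, and arrange the 16 products a_ij * b_kl in a 4x4 array as above
T = np.outer(A.reshape(-1), B.reshape(-1))
print(T)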

References:

[1] A. Yu. Kitaev, A. H. Shen and M. N. Vyalyi, Classical and Quantum Computation, Graduate Studies in Mathematics Volume 47, American Mathematical Society, 2002

[2] Colin P. Williams and Scott H. Clearwater, Explorations in Quantum Computing, Springer TELOS, 1998

A Linear Algebra Problem on Twitter

Problem: Let $A$ and $B$ be $n\times n$ matrices such that their sum $A+B$ is invertible. Then show that $$A(A+B)^{-1}B=B(A+B)^{-1}A$$ (Hat tip: Sam Walters)

Solution. \begin{equation}\begin{aligned}I&=(A+B)(A+B)^{-1}\\&=A(A+B)^{-1}+B(A+B)^{-1}\end{aligned}\label{eq:matrix}\end{equation} Multiply \eqref{eq:matrix} by $B$ from the right \begin{equation}\label{eq:matrix2}B=A(A+B)^{-1}B+B(A+B)^{-1}B\end{equation} Also multiply \eqref{eq:matrix} by $A$ from the right \begin{equation}\label{eq:matrix3}A=A(A+B)^{-1}A+B(A+B)^{-1}A\end{equation} Subtract \eqref{eq:matrix3} from \eqref{eq:matrix2}. \begin{equation}\label{eq:matrix4}B-A=A(A+B)^{-1}B-B(A+B)^{-1}A+B(A+B)^{-1}B-A(A+B)^{-1}A\end{equation} In a similar manner, from $I=(A+B)^{-1}(A+B)$ we obtain \begin{equation}\label{eq:matrix5}A-B=A(A+B)^{-1}B-B(A+B)^{-1}A+A(A+B)^{-1}A-B(A+B)^{-1}B\end{equation} Adding \eqref{eq:matrix4} and \eqref{eq:matrix5} yields $$A(A+B)^{-1}B=B(A+B)^{-1}A$$

A mathematician whose Twitter username is Manifoldless beat me to it by a few minutes :). But not just that. His solution is shorter (so better) than mine: \begin{align*}A(A+B)^{-1}B&=(A+B-B)(A+B)^{-1}(A+B-A)\\&=[I-B(A+B)^{-1}](A+B-A)\\&=A+B-A-B+B(A+B)^{-1}A\\&=B(A+B)^{-1}A\end{align*}
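A quick numerical sanity check of the identity with random matrices (a sketch; any matrices with $A+B$ invertible will do):

import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))

M = np.linalg.inv(A + B)  # raises LinAlgError if A + B happens to be singular
print(np.allclose(A @ M @ B, B @ M @ A))  # True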

Determinants as Area and Volume

The Area of a Parallelogram

Let $v=(v_1,v_2)$ and $w=(w_1,w_2)$ be two linearly independent vectors in $\mathbb{R}^2$. Then they span a parallelogram as shown in Figure 1.

Figure 1. Parallelogram spanned by two vectors v and w.

The area $A$ of the parallelogram is
\begin{align*}
A&=||v||||w||\sin\theta\\
&=||v\times w||\\
&=\left|\begin{array}{cc}
v_1 & v_2\\
w_1 & w_2
\end{array}\right|,
\end{align*}
where $\theta$ is the angle between $v$ and $w$. (For the cross product, regard $v$ and $w$ as vectors in $\mathbb{R}^3$ with third component $0$; then $v\times w=(0,0,v_1w_2-v_2w_1)$.) In general, the resulting determinant is not necessarily positive. If it is negative, we need to take the absolute value of the determinant for the area.
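A short numerical check of the area formula (the vectors are arbitrary examples of mine):

import numpy as np

v = np.array([3.0, 1.0])
w = np.array([1.0, 2.0])

# signed area = det of the matrix with rows v and w; take |.| for the area
signed_area = np.linalg.det(np.array([v, w]))
print(abs(signed_area))  # 5.0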

Exercise. Given two vectors $v=(v_1,v_2,v_3)$ and $w=(w_1,w_2,w_3)$ in $\mathbb{R}^3$, show that $||v||||w||\sin\theta=||v\times w||$ where $\theta$ is the angle between $v$ and $w$.

Hint. Note that $||v||^2||w||^2\sin^2\theta=||v||^2||w||^2-(v\cdot w)^2$.
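A numerical spot check of the hint's identity (Lagrange's identity) for random vectors:

import numpy as np

rng = np.random.default_rng(1)
v = rng.standard_normal(3)
w = rng.standard_normal(3)

cross = np.cross(v, w)
lhs = np.dot(cross, cross)                             # ||v x w||^2
rhs = np.dot(v, v) * np.dot(w, w) - np.dot(v, w) ** 2  # Lagrange's identity
print(np.isclose(lhs, rhs))  # True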

The Volume of a Parallelepiped

Let $u=(u_1,u_2,u_3)$, $v=(v_1,v_2,v_3)$, $w=(w_1,w_2,w_3)$ be three linearly independent vectors in $\mathbb{R}^3$. Then they span a parallelepiped as shown in Figure 2.

Figure 2. Parallelepiped spanned by vectors u, v and w.

The volume $V$ of the parallelepiped is
\begin{align*}
V&=||u||\cos\theta\,||v\times w||\\
&=u\cdot (v\times w)\\
&=\left|\begin{array}{ccc}
u_1 & u_2 & u_3\\
v_1 & v_2 & v_3\\
w_1 & w_2 & w_3
\end{array}\right|,
\end{align*}
where $\theta$ is the angle between $u$ and $v\times w$, so that $||u||\cos\theta$ is the height of the parallelepiped over the base spanned by $v$ and $w$. In general, the resulting determinant is not necessarily positive. If it is negative, we need to take the absolute value of the determinant for the volume.
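Again a short numerical check, computing the scalar triple product and the determinant for sample vectors of mine:

import numpy as np

u = np.array([1.0, 0.0, 1.0])
v = np.array([2.0, 1.0, 0.0])
w = np.array([0.0, 1.0, 3.0])

# the scalar triple product u . (v x w) equals the 3x3 determinant
triple = np.dot(u, np.cross(v, w))
det = np.linalg.det(np.array([u, v, w]))
print(triple, det)  # both 5.0 (up to rounding)
print(abs(det))     # the volume of the parallelepiped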

Inverse of a Matrix

Let $A$ be an $n\times n$ matrix with $\det A\ne 0$. (A square matrix whose determinant is not equal to $0$ is called non-singular.) Let $X=(x_{ij})$ be an unknown $n\times n$ matrix such that
$AX=I$. Comparing the $j$-th columns of both sides,
$$x_{1j}A^1+\cdots+x_{nj}A^n=E^j,$$
where $A^1,\cdots,A^n$ are the columns of $A$ and $E^j$ is the $j$-th column of $I$, i.e. the $j$-th standard unit vector.
This is a system of linear equations and as we studied here, it can be solved by Cramer’s Rule as
\begin{align*}
x_{ij}&=\frac{\det(A^1,\cdots,E^j,\cdots,A^n)}{\det A}\\
&=\frac{1}{\det A}\left|\begin{array}{ccccc}
a_{11} & \cdots & 0 & \cdots & a_{1n}\\
\vdots & & \vdots & & \vdots\\
a_{j1} & \cdots & 1 & \cdots & a_{jn}\\
\vdots & & \vdots & & \vdots\\
a_{n1} & \cdots & 0 & \cdots & a_{nn}
\end{array}\right|\\
&=\frac{1}{\det A}(-1)^{i+j}\det(A_{ji})
\end{align*}
for $i=1,\cdots,n$, where $E^j$ occupies the $i$-th column and $A_{ji}$ is the matrix obtained from $A$ by deleting the $j$-th row and the $i$-th column; the last equality comes from expanding the determinant along the $i$-th column. If we show that $XA=I$ as well, then $X$ would be the inverse $A^{-1}$ and
$$A^{-1}=\frac{1}{\det A}{}^t((-1)^{i+j}\det(A_{ij})).$$
Since $\det({}^tA)=\det A\ne 0$, the same argument yields an $n\times n$ matrix $Y$ such that ${}^tAY=I$. Taking transposes, we obtain ${}^tYA=I$. Now,
$$I={}^tYA={}^tYIA={}^tY(AX)A={}^tYA(XA)=XA.$$
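The formula translates directly into code. Below is a sketch (the function name cofactor_inverse is mine, and np.linalg.det is used for the minors); for real work one would call np.linalg.inv, which is far more efficient than cofactor expansion:

import numpy as np

def cofactor_inverse(A):
    """Invert A via A^{-1} = (1/det A) * transpose((-1)^(i+j) det(A_ij))."""
    n = A.shape[0]
    detA = np.linalg.det(A)
    if np.isclose(detA, 0.0):
        raise ValueError("matrix is singular")
    cof = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            # minor A_ij: delete the i-th row and the j-th column of A
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            cof[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return cof.T / detA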

Example. Let $A=\begin{pmatrix}
a & b\\
c & d
\end{pmatrix}$ with $\det A=ad-bc\ne 0$. Then $A^{-1}$ is given by
$$A^{-1}=\frac{1}{ad-bc}\begin{pmatrix}
d & -b\\
-c & a
\end{pmatrix}.$$

Example. Find the inverse of the matrix
$$A=\begin{pmatrix}
3 & 1 & -2\\
-1 & 1 & 2\\
1 & -2 & 1
\end{pmatrix}.$$

Solution. $\det A=16$. We find
\begin{align*}
A_{11}=\begin{pmatrix}
1 & 2\\
-2 & 1
\end{pmatrix},\ A_{12}=\begin{pmatrix}
-1 & 2\\
1 & 1
\end{pmatrix},\ A_{13}=\begin{pmatrix}
-1 & 1\\
1 & -2
\end{pmatrix},\\
A_{21}=\begin{pmatrix}
1 & -2\\
-2 & 1
\end{pmatrix},\ A_{22}=\begin{pmatrix}
3 & -2\\
1 & 1
\end{pmatrix},\ A_{23}=\begin{pmatrix}
3 & 1\\
1 & -2
\end{pmatrix},\\
A_{31}=\begin{pmatrix}
1 & -2\\
1 & 2
\end{pmatrix},\ A_{32}=\begin{pmatrix}
3 & -2\\
-1 & 2
\end{pmatrix},\ A_{33}=\begin{pmatrix}
3 & 1\\
-1 & 1
\end{pmatrix}.
\end{align*}
Hence by the formula we obtain
\begin{align*}
A^{-1}&=\frac{1}{16}{}^t\begin{pmatrix}
\det(A_{11}) & -\det(A_{12}) & \det(A_{13})\\
-\det(A_{21}) & \det(A_{22}) & -\det(A_{23})\\
\det(A_{31}) &-\det(A_{32}) & \det(A_{33})
\end{pmatrix}\\
&=\frac{1}{16}{}^t\begin{pmatrix}
5 & 3 & 1\\
3 & 5 & 7\\
4 & -4 & 4
\end{pmatrix}\\
&=\frac{1}{16}\begin{pmatrix}
5 & 3 & 4\\
3 & 5 &-4\\
1 & 7 & 4
\end{pmatrix}.
\end{align*}
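Checking the worked example numerically (using the cofactor_inverse sketch from above):

import numpy as np

A = np.array([[ 3.0,  1.0, -2.0],
              [-1.0,  1.0,  2.0],
              [ 1.0, -2.0,  1.0]])

print(np.linalg.det(A))       # 16.0 (up to rounding)
print(16 * np.linalg.inv(A))  # [[5, 3, 4], [3, 5, -4], [1, 7, 4]]
print(np.allclose(A @ cofactor_inverse(A), np.eye(3)))  # True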

We introduce the following theorem without proof:

Theorem. For any two $n\times n$ matrices $A$, $B$,
$$\det(AB)=\det A\det B.$$

As a special case, we obtain:

Corollary. For an invertible matrix $A$,
$$\det(A^{-1})=\frac{1}{\det A}.$$

Proof. $AA^{-1}=I$, so by the theorem
$$1=\det(AA^{-1})=\det A\det (A^{-1}).$$
Since $\det A\ne 0$, dividing both sides by $\det A$ proves the formula.
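Both the theorem and the corollary are easy to check numerically for random matrices (a sketch, not a proof):

import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

print(np.isclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B)))  # True
print(np.isclose(np.linalg.det(np.linalg.inv(A)), 1.0 / np.linalg.det(A)))    # True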

Cramer’s Rule

Consider a system of $n$ linear equations in $n$ unknowns
$$x_1A^1+\cdots+x_nA^n=B,$$
where $x_1,\cdots,x_n$ are variables and $A^1,\cdots,A^n,B$ are column vectors of dimension $n$. Suppose that $\det(A^1,\cdots,A^n)\ne 0$. Recall that this condition ensures that the linear system has a unique solution as seen here. Let us consider the determinant of the matrix obtained by replacing the $j$-th column $A^j$ by $B$. Then using the properties of determinants studied here,
\begin{align*}
\det (A^1,\cdots,B,\cdots,A^n)&=\det (A^1,\cdots,x_1A^1+\cdots+x_nA^n,\cdots,A^n)\\
&=\det (A^1,\cdots,x_1A^1,\cdots,A^n)+\cdots+\det (A^1,\cdots,x_jA^j,\cdots,A^n)\\
&\hspace{0.4cm}+\cdots+\det (A^1,\cdots,x_nA^n,\cdots,A^n)\\
&=x_1\det (A^1,\cdots,A^1,\cdots,A^n)+\cdots+x_j\det (A^1,\cdots,A^j,\cdots,A^n)\\
&\hspace{0.4cm}+\cdots+x_n\det (A^1,\cdots,A^n,\cdots,A^n).
\end{align*}
In every term except the $j$-th, two column vectors coincide, so every term except the $j$-th vanishes. Hence, we have
$$\det (A^1,\cdots,B,\cdots,A^n)=x_j\det (A^1,\cdots,A^j,\cdots,A^n),$$
i.e.
\begin{align*}
x_j&=\frac{\det (A^1,\cdots,B,\cdots,A^n)}{\det (A^1,\cdots,A^j,\cdots,A^n)}\\
&=\frac{\left|\begin{array}{ccccc}
a_{11} & \cdots & b_1 & \cdots & a_{1n}\\
a_{21} & \cdots & b_2 & \cdots & a_{2n}\\
\vdots & &\vdots& &\vdots\\
a_{n1} & \cdots & b_n & \cdots & a_{nn}
\end{array}\right|}{\left|\begin{array}{ccccc}
a_{11} & \cdots & a_{1j} & \cdots & a_{1n}\\
a_{21} & \cdots & a_{2j} & \cdots & a_{2n}\\
\vdots & &\vdots& &\vdots\\
a_{n1} & \cdots & a_{nj} & \cdots & a_{nn}
\end{array}\right|}
\end{align*}
for $j=1,\cdots,n$. This is called Cramer’s Rule.
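Cramer's Rule translates into a few lines of code. A sketch (the name cramer_solve is mine; in practice np.linalg.solve is preferable, since computing $n+1$ determinants is far more expensive than Gaussian elimination):

import numpy as np

def cramer_solve(A, b):
    """Solve Ax = b by Cramer's Rule; requires det A != 0."""
    detA = np.linalg.det(A)
    if np.isclose(detA, 0.0):
        raise ValueError("det A = 0: Cramer's Rule does not apply")
    n = A.shape[0]
    x = np.empty(n)
    for j in range(n):
        Aj = A.copy()
        Aj[:, j] = b  # replace the j-th column of A by b
        x[j] = np.linalg.det(Aj) / detA
    return x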

Example. Solve the system of linear equations:
\begin{align*}
3x+2y+4z&=1,\\
2x-y+z&=0,\\
x+2y+3z&=1.
\end{align*}

Solution. By Cramer’s Rule, we have
$$x=\frac{\left|\begin{array}{ccc}
1 & 2 & 4\\
0 & -1 & 1\\
1 & 2 & 3
\end{array}\right|}{\left|\begin{array}{ccc}
3 & 2 & 4\\
2 & -1 & 1\\
1 & 2 & 3
\end{array}\right|}=-\frac{1}{5},\ y=\frac{\left|\begin{array}{ccc}
3 & 1 & 4\\
2 & 0 & 1\\
1 & 1 & 3
\end{array}\right|}{\left|\begin{array}{ccc}
3 & 2 & 4\\
2 & -1 & 1\\
1 & 2 & 3
\end{array}\right|}=0,\ z=\frac{\left|\begin{array}{ccc}
3 & 2 & 1\\
2 & -1 & 0\\
1 & 2 & 1
\end{array}\right|}{\left|\begin{array}{ccc}
3 & 2 & 4\\
2 & -1 & 1\\
1 & 2 & 3
\end{array}\right|}=\frac{2}{5}.$$
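Running the cramer_solve sketch from above on this system reproduces the answer:

import numpy as np

A = np.array([[3.0,  2.0, 4.0],
              [2.0, -1.0, 1.0],
              [1.0,  2.0, 3.0]])
b = np.array([1.0, 0.0, 1.0])

print(cramer_solve(A, b))  # [-0.2  0.   0.4], i.e. x = -1/5, y = 0, z = 2/5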