
The Rank of a Matrix 2: The Rank of a Matrix and Subdeterminants

A test for the linear dependence of vectors may be given in terms of determinants.

Theorem. Let $A^1,\cdots,A^n$ be column vectors of dimension $n$. They are linearly dependent if and only if
$$\det(A^1,\cdots,A^n)=0.$$

Corollary. If a system of $n$ linear equations in $n$ unknowns has a matrix of coefficients whose determinant is not 0, then this system has a unique solution.

Proof. A system of $n$ linear equations in $n$ unknowns may be written as
$$x_1A^1+\cdots+x_nA^n=B,$$
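where $A^1,\cdots,A^n$ are the column vectors of the matrix of coefficients and $B$ is a column vector of dimension $n$. Since $\det(A^1,\cdots,A^n)\ne 0$, the vectors $A^1,\cdots,A^n$ are linearly independent by the theorem, and $n$ linearly independent vectors of dimension $n$ form a basis of $\mathbb{R}^n$. Hence $B$ can be written as a linear combination of $A^1,\cdots,A^n$ in exactly one way, that is, the system has a unique solution $x_1,\cdots,x_n$.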
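
As a small illustration (the system below is a made-up example, not one taken from the text), consider
\begin{align*}
2x_1+x_2&=3,\\
x_1+4x_2&=5.
\end{align*}
The matrix of coefficients has determinant $2\cdot 4-1\cdot 1=7\ne 0$, so the corollary guarantees a unique solution. Here it is $x_1=1$, $x_2=1$, as one can check by substitution.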

Since determinants can be used to test linear dependence, they can also be used to determine the rank of a matrix instead of using row operations.

Example. Let
$$A=\begin{pmatrix}
3 & 1 & 2 & 5\\
1 & 2 & -1 & 2\\
1 & 1 & 0 & 1
\end{pmatrix}.$$
Since $A$ is a $3\times 4$ matrix, its rank is at most 3. If we can find three linearly independent column vectors, the rank is 3. In fact,
$$\left|\begin{array}{ccc}
1 & 2 & 5\\
2 & -1 & 2\\
1 & 0 & 1
\end{array}\right|=4.$$
So, the rank is exactly 3.

Example. Let
$$B=\begin{pmatrix}
3 & 1 & 2 & 5\\
1 & 2 & -1 & 2\\
4 & 3 & 1 & 7
\end{pmatrix}.$$
Every $3\times 3$ subdeterminant has value 0 (indeed, the third row is the sum of the first two), so the rank of $B$ is at most 2. The first two rows of $B$ are linearly independent since the determinant
$$\left|\begin{array}{cc}
3 & 1\\
1 & 2
\end{array}\right|$$
is not 0. Hence, the rank is 2.
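
The two examples above can be checked mechanically. The following is only a minimal sketch, assuming small integer matrices stored as lists of rows; the helper names `det` and `rank_by_minors` are hypothetical and chosen for this illustration. It searches for the largest square submatrix whose determinant is not 0.

```python
from itertools import combinations


def det(m):
    """Determinant by expansion along the first row (fine for small matrices)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        # Delete the first row and column j to get the smaller matrix.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def rank_by_minors(m):
    """Largest k such that some k x k subdeterminant of m is nonzero."""
    rows, cols = len(m), len(m[0])
    for k in range(min(rows, cols), 0, -1):
        for r in combinations(range(rows), k):
            for c in combinations(range(cols), k):
                sub = [[m[i][j] for j in c] for i in r]
                if det(sub) != 0:
                    return k
    return 0


A = [[3, 1, 2, 5], [1, 2, -1, 2], [1, 1, 0, 1]]
B = [[3, 1, 2, 5], [1, 2, -1, 2], [4, 3, 1, 7]]
print(rank_by_minors(A), rank_by_minors(B))  # 3 2
```

Trying every submatrix is far slower than row operations for large matrices, so this is meant only to mirror the reasoning of the examples.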

Determinants II: Determinants of Order $n$

A determinant of order $n$ can be calculated by expanding it in terms of determinants of order $n-1$. Let $A=(a_{ij})$ be an $n\times n$ matrix and let us denote by $A_{ij}$ the $(n-1)\times (n-1)$ matrix obtained by deleting the $i$-th row and the $j$-th column from $A$.

Then $\det A$ is given by the Laplace expansion
\begin{align*}
\det A&=(-1)^{i+1}a_{i1}\det A_{i1}+\cdots+(-1)^{i+n}a_{in}\det A_{in}\\
&=(-1)^{1+j}a_{1j}\det A_{1j}+\cdots+(-1)^{n+j}a_{nj}\det A_{nj}.
\end{align*}

All the properties of determinants of order 2 studied in Determinants I still hold in general for determinants of order $n$. In particular,

Theorem. Let $A^1,\cdots,A^n$ be column vectors of dimension $n$. They are linearly dependent if and only if
$$\det(A^1,\cdots,A^n)=0.$$

Example. Let us calculate the determinant of the following $3\times 3$ matrix
$$A=\begin{pmatrix}
a_{11} & a_{12} & a_{13}\\
a_{21} & a_{22} & a_{23}\\
a_{31} & a_{32} & a_{33}
\end{pmatrix}.$$
You may use any column or row to calculate $\det A$ using the Laplace expansion. In this example, we use the first row to calculate $\det A$. By the Laplace expansion,
\begin{align*}
\det A&=a_{11}\det A_{11}-a_{12}\det A_{12}+a_{13}\det A_{13}\\
&=a_{11}\left|\begin{array}{cc}
a_{22} & a_{23}\\
a_{32} & a_{33}
\end{array}\right|-a_{12}\left|\begin{array}{cc}
a_{21} & a_{23}\\
a_{31} & a_{33}
\end{array}\right|+a_{13}\left|\begin{array}{cc}
a_{21} & a_{22}\\
a_{31} & a_{32}
\end{array}\right|.
\end{align*}
Repeat this computation with
$$A=\begin{pmatrix}
2 & 1 & 0\\
1 & 1 & 4\\
-3 & 2 & 5
\end{pmatrix}.$$
Since the first row or the third column contains 0, you may want to use the first row or the third column to do the Laplace expansion.
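
To see that the choice of row or column does not matter, here is a short sketch of the Laplace expansion applied to the matrix above. The function names `minor`, `det_along_row` and `det_along_column` are hypothetical, and indices are 0-based as usual in Python, which leaves the sign $(-1)^{i+j}$ unchanged.

```python
def minor(m, i, j):
    """The matrix A_ij: delete row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]


def det_along_row(m, i=0):
    """Laplace expansion of det(m) along row i (0-based)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** (i + j) * m[i][j] * det_along_row(minor(m, i, j))
               for j in range(len(m)))


def det_along_column(m, j=0):
    """Laplace expansion of det(m) along column j (0-based)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** (i + j) * m[i][j] * det_along_row(minor(m, i, j))
               for i in range(len(m)))


A = [[2, 1, 0], [1, 1, 4], [-3, 2, 5]]
values = ([det_along_row(A, i) for i in range(3)]
          + [det_along_column(A, j) for j in range(3)])
print(values)  # [-23, -23, -23, -23, -23, -23]
```

All six expansions agree. Expanding along the first row or the third column simply requires fewer nonzero terms.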

For $3\times 3$ matrices, there is a quicker way to calculate the determinant as shown in the following figure. You multiply three entries along each indicated arrow. When you multiply three entries along each red arrow, you also multiply by $-1$. This is called the Rule of Sarrus, named after the French mathematician Pierre Frédéric Sarrus. Please be warned that the rule of Sarrus works only for $3\times 3$ matrices.

The Rule of Sarrus
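
Written out as a formula (the standard statement of the rule, in the notation $a_{ij}$ of the $3\times 3$ example above), the rule of Sarrus reads
$$\det A=a_{11}a_{22}a_{33}+a_{12}a_{23}a_{31}+a_{13}a_{21}a_{32}-a_{13}a_{22}a_{31}-a_{11}a_{23}a_{32}-a_{12}a_{21}a_{33}.$$
For the matrix with rows $(2,1,0)$, $(1,1,4)$, $(-3,2,5)$ considered above, this gives
$$2\cdot 1\cdot 5+1\cdot 4\cdot(-3)+0\cdot 1\cdot 2-0\cdot 1\cdot(-3)-2\cdot 4\cdot 2-1\cdot 1\cdot 5=10-12-16-5=-23.$$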

Example. [Cross Product] Let $v=v_1E_1+v_2E_2+v_3E_3$ and $w=w_1E_1+w_2E_2+w_3E_3$ be two vectors in Euclidean 3-space $\mathbb{R}^3$. The cross product is defined by
$$v\times w=\left|\begin{array}{ccc}
E_1 & E_2 & E_3\\
v_1 & v_2 & v_3\\
w_1 & w_2 & w_3
\end{array}\right|.$$
Note that the cross product is perpendicular to both $v$ and $w$.
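
Expanding this formal determinant along the first row with the Laplace expansion gives
$$v\times w=(v_2w_3-v_3w_2)E_1-(v_1w_3-v_3w_1)E_2+(v_1w_2-v_2w_1)E_3.$$
One way to see the perpendicularity: the dot product $v\cdot(v\times w)$ is the determinant obtained by replacing the first row $E_1,E_2,E_3$ with $v_1,v_2,v_3$. That determinant has two equal rows, so it is 0. The same argument applies to $w\cdot(v\times w)$.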

Clearly, if a given determinant has many 0 entries, it is easier to calculate, since fewer terms in the Laplace expansion actually have to be computed. For any given determinant, we can indeed make this happen. Recall the following theorem from Determinants I:

Theorem. If one adds a scalar multiple of one column (row) to another column (row), then the value of the determinant does not change.

Using the particular column (row) operation in the Theorem, we can turn a given determinant into one with more 0 entries.

Example. Find
$$\left|\begin{array}{cccc}
1 & 3 & 1 & 1\\
2 & 1 & 5 & 2\\
1 & -1 & 2 & 3\\
4 & 1 & -3 & 7
\end{array}\right|.$$

Solution. By the above Theorem,
\begin{align*}
\left|\begin{array}{cccc}
1 & 3 & 1 & 1\\
2 & 1 & 5 & 2\\
1 & -1 & 2 & 3\\
4 & 1 & -3 & 7
\end{array}\right|&=\left|\begin{array}{cccc}
0 & 4 & -1 & -2\\
2 & 1 & 5 & 2\\
1 & -1 & 2 & 3\\
4 & 1 & -3 & 7
\end{array}\right|\ (\mbox{add $-1$ times row 3 to row 1})\\
&=\left|\begin{array}{cccc}
0 & 4 & -1 & -2\\
0 & 3 & 1 & -4\\
1 & -1 & 2 & 3\\
4 & 1 & -3 & 7
\end{array}\right|\ (\mbox{add $-2$ times row 3 to row 2})\\
&=\left|\begin{array}{cccc}
0 & 4 & -1 & -2\\
0 & 3 & 1 & -4\\
1 & -1 & 2 & 3\\
0 & 5 & -11 & -5
\end{array}\right|\ (\mbox{add $-4$ times row 3 to row 4})\\
&=\left|\begin{array}{ccc}
4 & -1 & -2\\
3 & 1 & -4\\
5 & -11 & -5
\end{array}\right|\ (\mbox{Laplace expansion along column 1})
\end{align*}
You may compute the resulting determinant of order 3 using the rule of Sarrus, or you may simplify it further. For instance, you may do:
\begin{align*}
\left|\begin{array}{ccc}
4 & -1 & -2\\
3 & 1 & -4\\
5 & -11 & -5
\end{array}\right|&=\left|\begin{array}{ccc}
7 & 0 & -6\\
3 & 1 & -4\\
5 & -11 & -5
\end{array}\right|\ (\mbox{add row 2 to row 1})\\
&=\left|\begin{array}{ccc}
7 & 0 & -6\\
3 & 1 & -4\\
38 & 0 & -49
\end{array}\right|\ (\mbox{add 11 times row 2 to row 3})\\
&=\left|\begin{array}{cc}
7 & -6\\
38 & -49
\end{array}\right|\ (\mbox{Laplace expansion along column 2})\\
&=-115.
\end{align*}
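
The strategy of this solution, creating zeros by adding multiples of one row to another, can be automated. The sketch below is one possible way to do it, not the only one: it reduces the matrix to triangular form with exact fractions and multiplies the diagonal entries. The name `det_by_row_ops` is hypothetical. It also relies on the fact that interchanging two rows changes the sign of the determinant.

```python
from fractions import Fraction


def det_by_row_ops(m):
    """Determinant via row operations, using exact rational arithmetic."""
    a = [[Fraction(x) for x in row] for row in m]
    n = len(a)
    sign = 1
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # no pivot available: the determinant is 0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign  # interchanging two rows flips the sign
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            # Adding a multiple of one row to another leaves the determinant unchanged.
            a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    result = Fraction(sign)
    for i in range(n):
        result *= a[i][i]  # determinant of a triangular matrix
    return result


M = [[1, 3, 1, 1], [2, 1, 5, 2], [1, -1, 2, 3], [4, 1, -3, 7]]
print(det_by_row_ops(M))  # -115
```

The result agrees with the hand computation above.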

Determinants I: Determinants of Order 2

Let $A=\begin{pmatrix}
a & b\\
c & d
\end{pmatrix}$. Then we define the determinant $\det A$ by
$$\det A=ad-bc.$$
$\det A$ is also denoted by $|A|$ or $\left|\begin{array}{cc}
a & b\\
c & d
\end{array}\right|$. In terms of the column vectors $A^1,A^2$, the determinant of $A$ may also be written as $\det(A^1,A^2)$.

Example. If $A=\begin{pmatrix}
2 & 1\\
1 & 4
\end{pmatrix}$, then $\det A=8-1=7$.

Property 1. The determinant $\det (A^1,A^2)$ may be considered as a bilinear map of the column vectors. As a bilinear map $\det (A^1,A^2)$ is linear in each slot. For example, if $A^1=C+C'$ then
$$\det(A^1,A^2)=\det(C,A^2)+\det(C',A^2).$$
If $x$ is a number,
$$\det(xA^1,A^2)=x\det(A^1,A^2).$$

Property 2. If the two columns are equal, the determinant is 0.

Property 3. $\det I=\det (E^1,E^2)=1$.

Combining Properties 1-3, we can show that:

Theorem. If one adds a scalar multiple of one column to another, then the value of the determinant does not change.

Proof. We prove the theorem for a particular case.
\begin{align*}
\det(A^1+xA^2,A^2)&=\det(A^1,A^2)+x\det(A^2,A^2)\\
&=\det(A^1,A^2).
\end{align*}

Theorem. If the two columns are interchanged, the determinant changes sign, i.e.,
$$\det(A^2,A^1)=-\det(A^1,A^2).$$

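Proof. By Property 2 and the linearity in each slot,
\begin{align*}
0&=\det(A^1+A^2,A^1+A^2)\\
&=\det(A^1,A^1)+\det(A^1,A^2)+\det(A^2,A^1)+\det(A^2,A^2)\\
&=\det(A^1,A^2)+\det(A^2,A^1).
\end{align*}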

Theorem. $\det A=\det {}^tA$.

Proof. This theorem can be proved directly from the definition of $\det A$.

Remark. Because of this theorem, we can also say that if one adds a scalar multiple of one row to another row, then the value of the determinant does not change.

Theorem. The column vectors $A^1,A^2$ are linearly dependent if and only if $\det(A^1,A^2)=0$.

Proof. Suppose that $A^1,A^2$ are linearly dependent. Then there exist numbers $x,y$, not both equal to 0, such that $xA^1+yA^2=0$. Let us say $x\ne 0$. Then $A^1=-\frac{y}{x}A^2$. So,
\begin{align*}
\det(A^1,A^2)&=\det\left(-\frac{y}{x}A^2,A^2\right)\\
&=-\frac{y}{x}\det(A^2,A^2)\\
&=0.
\end{align*}
To prove the converse, suppose that $A^1,A^2$ are linearly independent. Then they form a basis of $\mathbb{R}^2$, so $E^1,E^2$ can be written as linear combinations of $A^1,A^2$:
\begin{align*}
E^1&=xA^1+yA^2,\\
E^2&=zA^1+wA^2.
\end{align*}
Now,
\begin{align*}
1&=\det(E^1,E^2)\\
&=\det(xA^1+yA^2,zA^1+wA^2)\\
&=xw\det(A^1,A^2)+yz\det(A^2,A^1)\\
&=(xw-yz)\det(A^1,A^2).
\end{align*}
Hence, $\det(A^1,A^2)\ne 0$.

Orthogonal Bases

Let $V$ be a vector space with a positive definite scalar product $\langle\ ,\ \rangle$. A basis $\{v_1,\cdots,v_n\}$ of $V$ is said to be orthogonal if $\langle v_i,v_j\rangle=0$ whenever $i\ne j$. If, in addition, $||v_i||=1$ for all $i=1,\cdots,n$, then the basis is said to be orthonormal.

Example. $E_1,\cdots,E_n$ of $\mathbb{R}^n$ form an orthonormal basis of $\mathbb{R}^n$.

Why is having an orthonormal basis a big deal? To answer this question, let us suppose that $e_1,\cdots,e_n$ is an orthonormal basis of a vector space $V$. Let $v,w\in V$. Then
\begin{align*}
v&=v_1e_1+\cdots+v_ne_n,\\
w&=w_1e_1+\cdots+w_ne_n.
\end{align*}
Since $\langle e_i,e_j\rangle=\delta_{ij}$,
$$\langle v,w\rangle=v_1w_1+\cdots+v_nw_n=v\cdot w.$$
Hence, once an orthonormal basis is given, the scalar product $\langle\ ,\ \rangle$ is identified with the dot product. The next question is then: can we always come up with an orthonormal basis? The answer is affirmative. Given a basis, we can construct a new basis which is orthonormal through a process called the Gram-Schmidt orthogonalization process. Here is how it works. Let $w_1,\cdots,w_n$ be a basis of a vector space $V$. Let $v_1=w_1$ and
$$v_2=w_2-\frac{\langle w_2,v_1\rangle}{\langle v_1,v_1\rangle}v_1.$$
Then $v_2$ is perpendicular to $v_1$. Note that if $w_2$ is already perpendicular to $v_1=w_1$, then $v_2=w_2$.

Gram-Schmidt Orthogonalization Process

Let
$$v_3=w_3-\frac{\langle w_3,v_1\rangle}{\langle v_1,v_1\rangle}v_1-\frac{\langle w_3,v_2\rangle}{\langle v_2,v_2\rangle}v_2.$$
Then $v_3$ is perpendicular to both $v_1$ and $v_2$ as seen in the following figure.
Continuing this process, we have
$$v_n=w_n-\frac{\langle w_n,v_1\rangle}{\langle v_1,v_1\rangle}v_1-\cdots-\frac{\langle w_n,v_{n-1}\rangle}{\langle v_{n-1},v_{n-1}\rangle}v_{n-1}$$
and $v_1,\cdots,v_n$ are mutually perpendicular.

Gram-Schmidt Orthogonalization Process

Therefore, the following theorem holds.

Theorem. Let $V\ne\{O\}$ be a finite dimensional vector space with a positive definite scalar product. Then $V$ has an orthonormal basis.

Example. Find an orthonormal basis for the vector space generated by
$$A=(1,1,0,1),\ B=(1,-2,0,0),\ C=(1,0,-1,2).$$
Here the scalar product is the dot product.

Solution. Let
\begin{align*}
A'&=A,\\
B'&=B-\frac{B\cdot A'}{A'\cdot A'}A'\\
&=\frac{1}{3}(4,-5,0,1),\\
C'&=C-\frac{C\cdot A'}{A'\cdot A'}A'-\frac{C\cdot B'}{B'\cdot B'}B'\\
&=\frac{1}{7}(-4,-2,-7,6).
\end{align*}
Then $A',B',C'$ is an orthogonal basis. We obtain an orthonormal basis by normalizing each basis member:
\begin{align*}
\frac{A'}{||A'||}&=\frac{1}{\sqrt{3}}(1,1,0,1),\\
\frac{B'}{||B'||}&=\frac{1}{\sqrt{42}}(4,-5,0,1),\\
\frac{C'}{||C'||}&=\frac{1}{\sqrt{105}}(-4,-2,-7,6).
\end{align*}
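
The orthogonalization step of this example can be reproduced with a short program. This is a minimal sketch, assuming the scalar product is the dot product and the input vectors are linearly independent; the names `dot` and `gram_schmidt` are hypothetical. Exact fractions are used so that the output can be compared directly with the hand computation. Normalization is left out because it involves square roots.

```python
from fractions import Fraction


def dot(x, y):
    """Dot product of two vectors of the same length."""
    return sum(a * b for a, b in zip(x, y))


def gram_schmidt(vectors):
    """Return mutually perpendicular vectors spanning the same space.

    Assumes the input vectors are linearly independent.
    """
    orthogonal = []
    for w in vectors:
        v = [Fraction(x) for x in w]
        for u in orthogonal:
            c = dot(w, u) / dot(u, u)  # component of w along u
            v = [vi - c * ui for vi, ui in zip(v, u)]
        orthogonal.append(v)
    return orthogonal


A = (1, 1, 0, 1)
B = (1, -2, 0, 0)
C = (1, 0, -1, 2)
A1, B1, C1 = gram_schmidt([A, B, C])
print([str(x) for x in B1])  # ['4/3', '-5/3', '0', '1/3']
print([str(x) for x in C1])  # ['-4/7', '-2/7', '-1', '6/7']
```

The printed vectors are $\frac{1}{3}(4,-5,0,1)$ and $\frac{1}{7}(-4,-2,-7,6)$, in agreement with the solution above.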

Theorem. Let $V$ be a vector space of dimension $n$ with a positive definite scalar product $\langle\ ,\ \rangle$. Let $\{w_1,\cdots,w_r,u_1,\cdots,u_s\}$ with $r+s=n$ be an orthonormal basis of $V$. Let $W$ be the subspace generated by $w_1,\cdots,w_r$ and let $U$ be the subspace generated by $u_1,\cdots,u_s$. Then $U=W^{\perp}$ and $\dim V=\dim W+\dim W^{\perp}$.

Scalar Products

Let $V$ be a vector space. A scalar product is a map $\langle\ ,\ \rangle: V\times V\longrightarrow\mathbb{R}$ such that

SP 1. $\langle v,w\rangle=\langle w,v\rangle$

SP 2. $\langle u,v+w\rangle=\langle u,v\rangle+\langle u,w\rangle$

SP 3. If $x$ is a number, then
$$\langle xu,v\rangle=x\langle u,v\rangle=\langle u,xv\rangle$$

Additionally, we also assume the condition

SP 4. $\langle v,v\rangle>0$ if $v\ne O$

A scalar product with SP 4 is said to be positive definite.

Remark. If $v=O$, then $\langle v,w\rangle=0$ for any vector $w$ in $V$. This follows immediately from SP 3.

Example. Let $V=\mathbb{R}^n$ and define
$$\langle X,Y\rangle=X\cdot Y.$$
Then $\langle\ ,\ \rangle$ is a positive definite scalar product.

Example. Let $V$ be the function space of all continuous real-valued functions on $[-\pi,\pi]$. For $f,g\in V$, we define
$$\langle f,g\rangle=\int_{-\pi}^{\pi}f(t)g(t)dt.$$
Then $\langle\ ,\ \rangle$ is a positive definite scalar product.

Using a scalar product, we can introduce the notion of orthogonality of vectors. Two vectors $v,w$ are said to be orthogonal or perpendicular if $\langle v,w\rangle=0$.
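
For instance, in the space of continuous functions on $[-\pi,\pi]$ with the scalar product defined above, take $f(t)=\sin t$ and $g(t)=\cos t$ (chosen here only as an illustration). Then
$$\langle f,g\rangle=\int_{-\pi}^{\pi}\sin t\cos t\,dt=\frac{1}{2}\int_{-\pi}^{\pi}\sin 2t\,dt=0,$$
so $\sin t$ and $\cos t$ are orthogonal with respect to this scalar product.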

Let $S$ be a subspace of $V$. Let $S^{\perp}=\{w\in V: \langle v,w\rangle=0\ \mbox{for all}\ v\in S\}$. Then $S^{\perp}$ is also a subspace of $V$. (Check for yourself.) It is called the orthogonal space of $S$.

Define the length or the norm of $v\in V$ by
$$||v||=\sqrt{\langle v,v\rangle}.$$
It follows from SP 3 that
$$||cv||=|c|\,||v||$$
for any number $c$.

The Projection of a Vector onto Another Vector

For vectors in $\mathbb{R}^2$ or $\mathbb{R}^3$, the vector projection of a vector $v$ onto another vector $w$ is
$$||v||\cos\theta\frac{w}{||w||}=\langle v,w\rangle\frac{w}{||w||^2}=\frac{\langle v,w\rangle}{\langle w,w\rangle}w$$
as seen in the above Figure. The number $c=\frac{\langle v,w\rangle}{\langle w,w\rangle}$ is called the component of $v$ along $w$. Note that the vector projection of $v$ onto $w$
$$\frac{\langle v,w\rangle}{\langle w,w\rangle}w$$
can still be defined in any vector space with a scalar product.

Proposition. The vector $v-cw$ is perpendicular to $w$.

Proof. \begin{align*}
\langle v-cw,w\rangle&=\langle v,w\rangle-c\langle w,w\rangle\\
&=\langle v,w\rangle-\langle v,w\rangle\\
&=0.
\end{align*}
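
As a concrete check (with vectors chosen here purely for illustration), take $v=(1,2)$ and $w=(3,1)$ in $\mathbb{R}^2$ with the dot product. Then
$$c=\frac{\langle v,w\rangle}{\langle w,w\rangle}=\frac{5}{10}=\frac{1}{2},\qquad cw=\left(\frac{3}{2},\frac{1}{2}\right),\qquad v-cw=\left(-\frac{1}{2},\frac{3}{2}\right),$$
and indeed
$$\langle v-cw,w\rangle=-\frac{3}{2}+\frac{3}{2}=0.$$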

Example. Let $V=\mathbb{R}^n$. Then the component of $X=(x_1,\cdots,x_n)$ along $E_i$ is
$$X\cdot E_i=x_i.$$

Example. Let $V$ be the space of continuous functions on $[-\pi,\pi]$. Let $f(x)=\sin kx$, where $k$ is a positive integer. Then
$||f||=\sqrt{\pi}$. The component of $g(x)$ along $f(x)$ is
$$\frac{\langle g,f\rangle}{\langle f,f\rangle}=\frac{1}{\pi}\int_{-\pi}^{\pi}g(x)\sin kxdx.$$
It is called the Fourier coefficient of $g$ along $f$.
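
As an illustration, take $g(x)=x$ (a choice made here only for the sake of an example). Integrating by parts,
\begin{align*}
\int_{-\pi}^{\pi}x\sin kx\,dx&=\left[-\frac{x\cos kx}{k}\right]_{-\pi}^{\pi}+\frac{1}{k}\int_{-\pi}^{\pi}\cos kx\,dx\\
&=-\frac{2\pi\cos k\pi}{k}+0\\
&=\frac{2\pi(-1)^{k+1}}{k},
\end{align*}
so the Fourier coefficient of $g(x)=x$ along $\sin kx$ is $\frac{2(-1)^{k+1}}{k}$.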

The following two inequalities are well-known for vectors in $\mathbb{R}^n$. They still hold in any vector space with a positive definite scalar product.

Theorem [Schwarz Inequality] Let $V$ be a vector space with a positive definite scalar product. For any $v,w\in V$,
$$|\langle v,w\rangle|\leq ||v||\,||w||.$$

Theorem [Triangle Inequality] Let $V$ be a vector space with a positive definite scalar product. For any $v,w\in V$,
$$||v+w||\leq ||v||+||w||.$$
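
The triangle inequality can be derived from the Schwarz inequality. A sketch of the standard argument, using only SP 1 to SP 3 and the definition of the norm:
\begin{align*}
||v+w||^2&=\langle v+w,v+w\rangle\\
&=||v||^2+2\langle v,w\rangle+||w||^2\\
&\leq ||v||^2+2||v||\,||w||+||w||^2\\
&=(||v||+||w||)^2.
\end{align*}
Taking square roots of both sides gives $||v+w||\leq ||v||+||w||$.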