# Congruences 1

Definition. Let $m>0$. We say that $a$ is congruent to $b$ modulo $m$ and write $a\equiv b\mod m$ if $m|b-a$.

Example. Since $6|7-1$, $1\equiv 7\mod 6$. Since $6|13-1$, $1\equiv 13\mod 6$. Since $6\not|7-2$, $7\not\equiv 2\mod 6$.

Theorem 1. Let $a,a’,b,b’,c,d$, and $m$ be integers with $d>0$ and $m>0$. Then

1. $a\equiv a\mod m$.
2. If $a\equiv b\mod m$ then $b\equiv a\mod m$.
3. If $a\equiv b\mod m$ and $b\equiv c\mod m$ then $a\equiv c\mod m$.
4. If $a\equiv b\mod m$ and $a’\equiv b’\mod m$ then $a+a’\equiv b+b’\mod m$.
5. If $a\equiv b\mod m$ and $a’\equiv b’\mod m$ then $aa’\equiv bb’\mod m$.
6. If $a\equiv b\mod m$ and $d|m$ then $a\equiv b\mod d$.

Proof.

1. Since $m|0=a-a$, $a\equiv a\mod m$.
2. Let $a\equiv b\mod m$. Then $m|b-a$ and so $b-a=mk$ for some $k\in\mathbb{Z}$. Since $a-b=m(-k)$ and $-k\in\mathbb{Z}$, then $m|a-b$, i.e., $b\equiv a\mod m$.
3. Suppose that $a\equiv b\mod m$ and $b\equiv c\mod m$. Then $m|b-a$ and $m|c-b$, so $$m|(b-a)+(c-b)=c-a.$$ Hence, $a\equiv c\mod m$.
4. Suppose that $a\equiv b\mod m$ and $a’\equiv b’\mod m$. Then $m|b-a$ and $m|b’-a’$, so $$m|(b-a)+(b’-a’)=(b+b’)-(a+a’),$$ that is, $a+a’\equiv b+b’\mod m$.
5. Suppose that $a\equiv b\mod m$ and $a’\equiv b’\mod m$. Then $m|b-a$ and $m|b’-a’$. Now, $$bb’-aa’=(b-a)a’+(b’-a’)b.$$ Since $m|b-a$ and $m|b’-a’$, $m|bb’-aa’$, i.e., $aa’\equiv bb’\mod m$.
6. If $a\equiv b\mod m$ then $m|b-a$. If $d|m$ then $d|b-a$ and so $a\equiv b\mod d$.

Remark 1. Properties 1-3 tell us that $\equiv\mod m$ is an equivalence relation on $\mathbb{Z}$. Properties 1-5 tell us that $\equiv\mod m$ is a congruence relation, an equivalence relation that preserves operations, on $\mathbb{Z}$. One could easily guess that $\equiv\mod m$ is where the name congruence relation is originated from. For more details about congruence relation, see the reference [1] below.

Definition. If $m>0$ and $r$ is the remainder when the division algorithm is used to divide $b$ by $m$, then $r$ is called the least residue of $b$ modulo $m$.

Example.

1. The least residue of $12\mod 7$ is $5$.
2. The least residue of $20\mod 4$ is $0$.
3. The least residue of $-12\mod 7$ is $2$.
4. The least residue of $-3\mod 7$ is $4$.

Theorem. Let $m>0$. Then

1. If $r$ and $b$ are integers such that $r\equiv b\mod m$ and $0\leq r<m$, then $r$ is the least residue $\mod m$.
2. Two integers are congruent $\mod m$ if and only if they have the same least residue $\mod m$.

Proof.

1. Suppose that $r\equiv b\mod m$. Then $m|b-r$ and so $b=mq+r$ for some $q\in\mathbb{Z}$. If $0\leq r<m$ then by the uniqueness of the quotient and the remainder, $r$ must be the least residue $\mod m$.
2. Suppose that $b$ and $b’$ have the same remainder when divided by $m$, say $$b=mq+r\ {\rm and}\ b’=mq’+r$$ for some $q,q’\in\mathbb{Z}$. Then $b-b’=m(q-q’)$ and so $b\equiv b’\mod m$. Conversely, suppose that $b=b’\mod m$ and $b=mq+r$ with $0\leq r<m$. Then $b’=b\mod m$ and $b\equiv r\mod m$, thus $b’\equiv r\mod m$. Therefore by part 1, $r$ is the least residue of $b’\mod m$.

Example. Find the least residue of $33\cdot 26^2\mod 31$.

Solution: $33\equiv 2\mod 31$ and $26\equiv -5\mod 31$. So, \begin{align*} 33\cdot 26^2&\equiv 2\cdot (-5)^2\mod 31\\ &\equiv 50\mod 31\\ &\equiv 19\mod 31.\end{align*} Since $0\leq 19<31$, the least residue is $19$.

Theorem [The Cancellation Theorem]. If $a,b>0$, $x$ and $x’$ are integers such that $(a,b)=1$, then $ax\equiv ax’\mod b$ implies $x\equiv x’\mod b$.

The following example shows that the Cancellation Theorem does not necessarily hold unless the condition $(a,b)=1$ is satisfied.

Example. $(2,4)=2\ne 1$. $2\cdot 1\equiv 2\cdot 3\mod 4$ but $1\not\equiv 3\mod4$.

The Cancellation Theorem is a special case of the following more general theorem.

Theorem. If $a,b>0$, $x$ and $x’$ are integers such that $(a,b)=d$, then
$ax\equiv ax’\mod b$ if and only if $x\equiv x’\mod b/d$.

Proof. \begin{align*} ax\equiv ax’\mod b&\Longrightarrow b|a(x-x’)\\ &\Longrightarrow a(x-x’)=bk\ \mbox{for some}\ k\in\mathbb{Z}\\ &\Longrightarrow \frac{a}{d}(x-x’)=\frac{b}{d}k\\ &\Longrightarrow \frac{b}{d}|\frac{a}{d}(x-x’)\\ &\Longrightarrow \frac{b}{d}|x-x’\ \mbox{since}\ \left(\frac{a}{d},\frac{b}{d}\right)=1\\ &\Longrightarrow x\equiv x’\mod\frac{b}{d}. \end{align*} Conversely, if $x\equiv x’\mod\frac{b}{d}$, then
\begin{align*} x’-x=\frac{b}{d}k\ \mbox{for some}\ k&\Longrightarrow ax’-ax=b\left(\frac{a}{d}\right)k\\ &\Longrightarrow ax\equiv ax’\mod b. \end{align*}

Theorem. If $a>0, b$, and $b’$ are integers such that
$b\equiv b’\mod a$, then $(a,b)=(a,b’)$.

Proof. \begin{align*} b\equiv b’\mod a&\Longrightarrow b’-b=aq\ \mbox{for some}\ q\in\mathbb{Z}\\ &\Longrightarrow b’=b+aq\\ &\Longrightarrow (a,b)=(a,b’) \end{align*} by Lemma 3 here.

As mentioned in Remark 1, $\equiv\mod m$ is an equivalence relation on $\mathbb{Z}$. For fixed $m$, each equivalence class with respect to $\equiv\mod m$ has one and only one representative between $0$ and $m-1$. Denote by $\mathbb{Z}/m\mathbb{Z}$ or $\mathbb{Z}_m$ the set of all equivalence classes, called the \emph{residue classes}. Then
$$\mathbb{Z}/m\mathbb{Z}={[0],[1],\cdots,[m-1]}.$$
Often $\mathbb{Z}/m\mathbb{Z}$ is written simply as
$$\mathbb{Z}/m\mathbb{Z}={0,1,\cdots,m-1},$$
i.e, those residue classes are represented by their representatives (typically those least residues $\mod m$) unless there is a confusion.

Define the binary operations $+$ and $\cdot$ on $\mathbb{Z}/m\mathbb{Z}$: For any $[a],[b]\in\mathbb{Z}/m\mathbb{Z}$,
$$[a]+[b]:=[a+b],\ [a]\cdot[b]:=[a\cdot b].$$
Then $+$ and $\cdot$ are well-defined due to properties (4) and (5), respectively in Theorem 1.

Theorem. $(\mathbb{Z}/m\mathbb{Z},+,\cdot)$ is a commutative ring with unity.

Consider $\mathbb{Z}/9\mathbb{Z}$. Its multiplication table is given by
$$\begin{array}{|c|c|c|c|c|c|c|c|c|c|}\hline\cdot & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8\\\hline 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\\hline 1 & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8\\\hline 2 & 0 & 2 & 4 & 6 & 8 & 1 & 3 & 5 & 7\\\hline 3 & 0 & 3 & 6 & 0 & 3 & 6 & 0 & 3 & 6\\\hline 4 & 0 & 4 & 8 & 3 & 7 & 2 & 6 & 1 & 5\\\hline 5 & 0 & 5 & 1 & 6 & 2 & 7 & 3 & 8 & 4\\\hline 6 & 0 & 6 & 3 & 0 & 6 & 3 & 0 & 6 & 3\\\hline 7 & 0 & 7 & 5 & 3 & 1 & 8 & 6 & 4 & 2\\\hline 8 & 0 & 8 & 7 & 6 & 5 & 4 & 3 & 2 & 1\\\hline\end{array}$$
As one can see clearly, not every nonzero element of $\mathbb{Z}/9\mathbb{Z}$ has a multiplicative inverse. So, $\mathbb{Z}/9\mathbb{Z}$ cannot be a field. However, there are elements of $\mathbb{Z}/9\mathbb{Z}$ that have multiplicative inverses. They are $1,2,4,5,7,8$. As representatives of residue classes, they are all relatively prime to $9$. In fact, the following theorem holds in general.

Theorem. The elements of $\mathbb{Z}/m\mathbb{Z}$ which have multiplicative inverses are those which are relatively prime to $m$, i.e., the numbers $a$ for which there exists $b$ with $ab\equiv 1\mod m$ are precisely those $a$ for which $(a,m)=1$.

Proof. Let $d=(a,m)$ and suppose that there exists $b\in\mathbb{Z}$ such that $ab\equiv 1\mod m$. Then by property 6 in Theorem 1 $ab\equiv 1\mod d$. Since $d|a$, $d$ must divide $1$, i.e., $d=1$. Conversely, if $(a,m)=1$ then by Bézout’s Lemma (Theorem 5 here), there exist $x,y\in\mathbb{Z}$ such that $ax+my=1$. Choose $b=x$. Then $ab\equiv 1\mod m$.

Definition. If $(a,m)=1$ then by negative power $a^{-n}\mod m$ we mean the $n$-th powers of the inverse residue class, i.e., it is represented by the $n$-th power of any integer $b$ for which $ab\equiv 1\mod m$.

Example. Find $160^{-1}\mod 841$, i.e., the inverse of $160\mod 841$.

Solution: First check if $(160,841)=1$ by the Euclidean algorithm.
\begin{align*} 160&=841\cdot 0+160,\\ 841&=160\cdot 5+41,\\ 160&=41\cdot 3+37,\\ 41&=37\cdot 1+4,\\ 37&=4\cdot 9+1. \end{align*} So, $(160,841)=1$. Now by working backward, let us find $b$ such that $160\cdot b\equiv 1\mod 841$. \begin{align*} 1&=37-4\cdot 9\\ &=37-(41-37)9\\ &=10\cdot 37-9\cdot 41\\ &=10(160-3\cdot 41)-9\cdot 41\\ &=10\cdot 160-39\cdot 41\\ &=10\cdot 160-39(841-5\cdot 160)\\ &=205\cdot 160-39\cdot 841.\end{align*} Hence, $b=205=160^{-1}$.

Corollary. If $p$ is a prime number then every nonzero residue class has a multiplicative inverse, i.e, $\mathbb{Z}/p\mathbb{Z}$ is a field. We often denote this finite field of $p$ elements by $\mathbb{F}_p$.

Corollary. Suppose $0\leq a,b<m$. If $(a,m)=1$ then there exists $x_0\in\mathbb{Z}$ such that $ax_0\equiv b\mod m$ and all solutions of $ax\equiv b\mod m$ are of the form $x=x_0+mn$ for $n$ an integer. If $(a,m)=d$ then there exists $x\in\mathbb{Z}$ such that $ax\equiv b\mod m$ if and only if $d|b$ and in that case our congruence is equivalent to the congruence $a’x\equiv b’\mod m’$, where $a’=a/d,b’=b/d,m’=m/d$.

Proof. If $(a,m)=1$ then there exists $a^{-1}\in\mathbb{Z}$ such that $a\cdot a^{-1}\equiv 1\mod m$, so $a\cdot (a^{-1}b)\equiv b\mod m$. Choose $x_0=a^{-1}b$. Since $(a,m)=1$, by Theorem 6 here, all solutions of equation $ax\equiv b\mod m$ or equivalently $ax+mq=b$ for some $q\in\mathbb{Z}$ are given in the form $$x=x_0+mn$$ for some $n\in\mathbb{Z}$. Let $(a,m)=d$. If there exists $x\in\mathbb{Z}$ such that $ax\equiv b\mod m$, then clearly $d|b$. Conversely, suppose that $d|b$. Since $(a/d,m/d)=1$, there exists $x\in\mathbb{Z}$ such that $\frac{a}{d}x\equiv\frac{b}{d}\mod\frac{m}{d}$ and this is clearly equivalent to $ax\equiv b\mod m$.

References:

1. Stanley Burris and H. P. Sankappanavar, A Course in Universal Algebra, Springer-Verlag, 1981

# Linear Combination

Theorem 1. Given integers $a$, $b$, and $c$ with $a$ and $b$ not both $0$, there exist $x,y\in\mathbb{Z}$ such that $ax+by=c$ if and only if $(a,b)|c$.

Proof. Left as an exercise.

Corollary 2. Let $a$ and $b$ be integers. Then there exist $x,y\in\mathbb{Z}$ such that $ax+by=1$ if and only if $(a,b)=1$ i.e. $a$ and $b$ are relatively prime.

Corollary 3. Let $a,a’,b\in\mathbb{Z}$. If $(a,b)=1$ and $(a’,b)=1$, then $(aa’,b)=1$.

Proof. Since $(a,b)=1$ and $(a’,b)=1$, there exist $x,y,x’,y’\in\mathbb{Z}$ such that $ax+by=1$ and $a’x’+by’=1$. Now, \begin{align*}1&=(ax+by)(a’x’+by’)\\&=aa’xx’+b(axy’+a’x’y+byy’)\end{align*} Hence, $(aa’,b)=1$.

Theorem 4. If $a,b$ and $c$ are integers such that $(a,b)=1$ and $a|bc$, then $a|c$.

Proof. Since $(a,b)=1$, there exist $x,y\in\mathbb{Z}$ such that $ax+by=1$. So, we obtain $acx+bcy=c$. Since $a|ac$ and $a|bc$, $a|c$.

Remark. $a|bc$ does not necessarily imply that $a|b$ or $a|c$. For example, $6|36=4\cdot 9$ but $6\not|4$ and $6\not|9$.

Theorem 5. If $a$ and $b$ are integers and $(a,b)=d$, then $\frac{a}{d}$ and $\frac{b}{d}$ are relatively prime.

Proof. Since $(a,b)=d$, there exist $x,y\in\mathbb{Z}$ such that $ax+by=d$. Dividing the equation by $d$, we obtain

$\frac{a}{d}x+\frac{b}{d}y=1$. By theorem 1, this implies that $\left(\frac{a}{d},\frac{b}{d}\right)=1$.

Example 1. Consider the equation $9x+24y=15$. Since $(9,24)=3$ and $3|15$, from theorem 1, we know that a solution exist. First, we can find a solution to $9x+24y=3$ using the Euclidean algorithm as seen before. \begin{align*}24&=9\cdot 2+6\\9&=6\cdot 1+3\\6&=3\cdot 2+0\end{align*} Thus, \begin{align*}3&=9-6\cdot 1\\&=9-(24-9\cdot 2)\cdot 1\\&=9\cdot 3+24\cdot(-1)\end{align*} Hence, $x’=3$ and $y’=-1$ is a solution to $9x+24y=3$ and thereby $x=5x’=15$ and $y=5y’=-5$ is a solution to $9x+24y=15$. Finding a solution is not a big deal. But there are other solutions. For instance, $x=-1$ and $y=1$ is also a solution to $9x+24y=15$. How do we find other solutions? We now turn our attention to this question.

Suppose that $(x_0,y_0)$ is a solution to $$\label{eq:lineqn}ax+by=c$$ Then $$\label{eq:lineqn2}ax_0+by_0=c$$ Subtracting \eqref{eq:lineqn2} from \eqref{eq:lineqn}, we obtain $$\label{eq:lineqn3}a(x-x_0)=b(y_0-y)$$ Let $d=(a,b)$. Dividing \eqref{eq:lineqn3} by $d$, we obtain $$\label{eq:lineqn4}\frac{a}{d}(x-x_0)=\frac{b}{d}(y_0-y)$$ This means that $\frac{a}{d}|\frac{b}{d}(y_0-y)$. Since $\left(\frac{a}{d},\frac{b}{d}\right)=1$, by theorem 2, $\frac{a}{d}|y_0-y$ and so, $y_0-y=\frac{a}{d}t$ for some $t\in\mathbb{Z}$. From \eqref{eq:lineqn4} we also obtain $x-x_0=\frac{b}{d}t$. Therefore, $x$ and $y$ are written as $$\label{eq:lineqnsol}x=x_0+\frac{b}{d}t,\ y=y_0-\frac{a}{d}t$$ where $t\in\mathbb{Z}$. Conversely, any $(x,y)$ in the form \eqref{eq:lineqnsol} satisfies the equation \eqref{eq:lineqn}. $$a\left(x_0+\frac{b}{d}t\right)+b\left(y_0-\frac{a}{d}t\right)=ax_0+by_0=c$$

Theorem 6. Suppose that $a\ne 0$, $b\ne 0$, and $c$ are integers. Let $(x_0,y_0)$ be a particular solution to $ax+by=c$. Then all solutions to $ax+by=c$ are given by $$x=x_0+\frac{b}{d}t,\ y=y_0-\frac{a}{d}t$$ where $t\in\mathbb{Z}$ and $(a,b)=d$.

Example. In example 1, we found $(x_0,y_0)=(15,-5)$. So by theorem 6, all solutions to $9x+24y=15$ are given by $$x=15+7t,\ y=-5-3t$$ where $t\in\mathbb{Z}$.

Example. Find all positive integers $x,y$ such that $4x+6y=100$.

Solution. $(4,6)=2$ and $2|100$, so a solution exists.

$6=4\cdot 1+2$ i.e. $2=4\cdot (-1)+6\cdot 1$. $x_0=-50$ and $y_0=50$ is a particular solution to $4x+6y=100$. By theorem 3, all solutions are given by $$x=-50+3t,\ y=50-2t$$ where $t\in\mathbb{Z}$. Since $x$ and $y$ are required to be positive, we find that $17\leq t\leq 24$. The following table shows all those solutions. $$\begin{array}{|c||c|c|c|c|c|c|c|c|}\hline t & 17 & 18 & 19 & 20 & 21 & 22 & 23 & 24\\\hline x & 1 & 4 & 7 & 10 & 13 & 16 & 19 & 22\\\hline y & 16 & 14 & 12 & 10 & 8 & 6 & 4 & 2\\\hline\end{array}$$

Example. A man paid \$11.37 for some 39-cent pens and 69-cent pens. How many of each did he buy? Solution. The problem is equivalent to solving the equation $$\label{eq:pen}39x+69y=1137$$ where$x$and$y$are nonnegative integers.$(39,69)=3$and$3|1137$, so the equation has a solution. Solving the equation \eqref{eq:pen} is equivalent to solving $$\label{eq:pen2}13x+23y=379$$ where$x$and$y$are nonnegative integers. Since$(13,23)=1$, by the Euclidean algorithm, one can find a solution$x’=-7$and$y’=4$to$13x+23y=1$.$x_0=379x’=-2653$and$y_0=379\cdot y’=1516$is then a solution to \eqref{eq:pen2}. By theorem 3, all integer solutions to \eqref{eq:pen2} are given by $$x=-2653+23t,\ y=1516-13t,\ t\in\mathbb{Z}$$ From the conditions$x\geq 0, y\geq 0$, we obtain the inequality$115\frac{8}{23}\leq t\leq 116\frac{8}{23}$. There is only one integer$t=116$that satisfies the inequality.$x=-2653+23\cdot 116=15$and$y=1516-13\cdot 116=8$. So, the man bought 15 39-cent pens and 8 69-cent pens. # The Division Algorithm Theorem 1. Suppose that$a,b\in\mathbb{Z}$with$0<a<b$. Then there exists uniquely$q,r\in\mathbb{Z}$,$0\leq r<a$, such that $$b=aq+r$$ Theorem 2. If$a$and$b$are positive integers, then $$(a,b)[a,b]=ab$$ Suppose that$0<a<b$. Then by the division algorithm, there exist uniquely$q,r\in\mathbb{Z}$such that$b=aq+r$where$0\leq r<a$. If$d=(a,b)$then$d|r$. This means$d\leq(a,r)$since$d$is a common divisor of$a$and$r$. Since$(a,r)|a$and$(a,r)|b$,$(a,r)$is a common divisor of$a$and$b$. This implies$(a,r)\leq(a,b)=d$. Thus,$d=(a,r)=(a,b)$. Therefore, we have the following lemma. Lemma 3. For any integers$a>0$,$b$,$c$and$k$, if$a=bk+c$then$(a,b)=(b,c)$. Example. Let$a=123$and$b=504$. Then $$504=123\cdot 4+12$$ By Lemma, we have $$(123,504)=(12,123)$$ $$123=12\cdot 10+3$$ By Lemma again, we have $$(12,123)=(3,12)=3$$ Hence, we obtain$(123,504)=3$. Theorem 4. (The Euclidean Algorithm) Let$a$and$b$be integers,$0<a<b. Apply the division algorithm repeatedly as follows. \begin{align*}b&=aq_1+r_1,\ 0<r_1<a\\a&=r_1q_2+r_2,\ 0<r_2<r_1\\r_1&=r_2q_3+r_3,\ 0<r_3<r_2\\&\vdots\\r_{n-2}&=r_{n-1}q_n+r_n,\ 0<r_n<r_{n-1}\\r_{n-1}&=r_nq_{n+1}\end{align*}Letr_n$be the last nonzero remainder. Then$(a,b)=r_n$Example. Compute$(158,188)using the Euclidean algorithm. \begin{align*}188&=158\cdot 1+30\\158&=30\cdot 5+8\\30&=8\cdot 3+6\\8&=6\cdot 1+2\\6&=2\cdot 3+0 \end{align*} Hence,(158,188)=2$. It is possible to use the Euclidean algorithm to write$(a,b)$in the form$ax+by. In the above example, \begin{align*}2&=8-6\cdot 1\\&=8-(30-8\cdot 3)\cdot 1\\&=-30+8\cdot 4\\&=-30+(158-30\cdot 5)\cdot 4\\&=158\cdot 4-30\cdot 21\\&=158\cdot 4-(188-158\cdot 1)\cdot 21\\&=158\cdot 25+188\cdot(-21) \end{align*} In general, the following property holds. Theorem 5. (Bézout’s Lemma) Ifa$and$b$are integers such that$(a,b)$is defined, then there exist$x,y\in\mathbb{Z}$such that \label{eq:bezout} (a,b)=ax+by \eqref{eq:bezout} is called the Bézout’s identity. Corollary 6. If$d$is any common divisor of$a$and$b$, not both of which are$0$, then$d|(a,b)$. Definition. We say$k$is a linear combination of$a$and$b$if there exist$x,y\in\mathbb{Z}$such that$k=ax+by$. Theorem 7. Given integers$a\ne 0$and$b\ne 0$, and$m$, if$a|m$and$b|m$then$[a,b]|m$. Proof. Assume that$\frac{m}{[a,b]}$is not an integer. Then there exist$q,r\in\mathbb{Z}$such that$m=[a,b]q+r$,$0<r<[a,b]$. The remainder$r$is then written as$r=m-q[a,b]$. By assumption,$a|m$, and also$a|[a,b]$. So,$a|r$. By the same argument, we also obtain$b|r$. This means that$r$is a common multiple of$a$and$b$. But it is a contradiction to the fact that$0<r<[a,b]$. Hence,$r=0$. Remark. This theorem and the preceding corollary are dual to each other. # Divisibility Definition. We say$a$divides$b$and write$a|b$if there exists$d\in\mathbb{Z}$such that$b=ad$. We say that$a$is a divisor of$b$and that$b$is a multiple of$a$. If$a|b$is false, we write$a\not|b$. Definition. We say$d$is the greatest common divisor (gcd in short) of$a$and$b$if$d$is the largest of all integers dividing both$a$and$b$. We write$d=(a,b)$. Example. Let$a=4$and$b=6$. The divisors of$4$are$1$,$-1$,$2$,$-2$,$4$,$-4$. The divisors of$6$are$1$,$-1$,$2$,$-2$,$3$,$-3$,$6$,$-6$. So the common divisors of$4$and$6$are$1$,$-1$,$2$,$-2$and$2=(4,6)$. Definition. We say$m$is the least common multiple (lcm in short) of$a$and$b$if$m$is the smallest of all the positive integers that are multiples of both$a$and$b$. We write$m=[a,b]$. Example. Let$a=4$and$b=6$. The positive multiples of$4$are$4$,$8$,$12$,$16$,$20$,$24$,$28$,$\cdots$and the positive multiples of$6$are$6$,$12$,$18$,$24$,$30$,$\cdots$. Common positive multiples of$4$and$6$are$12$,$24$,$\cdots$and$[4,6]=12$. Theorem. Given integers$a$,$b$, and$c$, 1. if$a|b$then$a|bc$. 2. if$a|b$and$b|c$then$a|c$. 3. if$a|b$and$a|c$then$a|bx+cy$for any$x,y\in\mathbb{Z}$. Proof. 1. If$a|b$then there exists$d\in\mathbb{Z}$such that$b=ad$. Now$bc=(ad)c=a(dc)$and$dc\in\mathbb{Z}$and hence$a|bc$. 2. Let$a|b$and$b|c$. Then there exist$d_1,d_2\in\mathbb{Z}$such that$b=ad_1$and$c=bd_2$. Now we have $$c=bd_2=(ad_1)d_2=a(d_1d_2)$$ and$d_1d_2\in\mathbb{Z}$. Hence,$a|c$. 3. Let$a|b$and$a|c$. Then there exist$d_1,d_2\in\mathbb{Z}$such that$b=ad_1$and$c=ad_2$. For any$x,y\in\mathbb{Z}\begin{align*} bx+cy&=(ad_1)x+(ad_2)y\\&=a(d_1x+d_2y) \end{align*} andd_1x+d_2y\in\mathbb{Z}$. Hence,$a|bx+cy$. # Introductory Probability: Random Variables and Expectation Let us begin with an example. Example. Consider an experiment of tossing 3 fair coins. Let$X$denote the number of heads appearing. Then$Xtakes on one of the values 0, 1, 2, 3 with respective probabilities: \begin{align*}P_r\{X=0\}&=P_r\{(T,T,T)\}=\frac{1}{8}\\P_r\{X=1\}&=P_r\{(T,T,H),(T,H,T),(H,T,T)\}=\frac{3}{8}\\P_r\{X=2\}&=P_r\{(T,H,H),(H,T,H),(H,H,T)\}=\frac{3}{8}\\P_r\{X=3\}&=P_r\{(H,H,H)\}=\frac{1}{8}\end{align*} ThisX$here is an example of what is called a random variable. Random variables are real-valued functions defined on the sample space$S$i.e.$X: S\longrightarrow\mathbb{R}$. In this example,$X$is the function$X: S\longrightarrow\{0,1,2,3\}$with$S=\{(T,T,T),(T,T,H),(T,H,T),(H,T,T),(T,H,H),(H,T,H),(H,H,T),(H,H,H)\}$defined by the number of heads appearing for each outcome. In probability, often we are more interested in the values of a random variable rather than the outcomes of an experiment. Remark. In more traditional mathematical notation,$P_r\{X=i\}$is denoted by$P_r\{X^{-1}(i)\}$which means $$P_r\{X^{-1}(i)\}=P_r\{s\in S: X(s)=i\}$$ But in probability,$P_r\{X=i\}$is commonly used notation. Here is another example. Example. Three marbles are randomly selected from a jar containing 3 white, 3 red, and 5 black marbles. Suppose that we win \$1 for each white marble selected and lose \$1 for each red marble selected. Let$X$denote the total winnings from the experiment. Then$X$is a random variable taking on the values$0,\pm 1,\pm 2,\pm3with respective probabilities: \begin{align*}P_r\{X=0\}&=\frac{\begin{pmatrix}5\\3\end{pmatrix}+\begin{pmatrix}3\\1\end{pmatrix}\begin{pmatrix}3\\1\end{pmatrix}\begin{pmatrix}5\\1\end{pmatrix}}{\begin{pmatrix}11\\3\end{pmatrix}}=\frac{55}{165}\\P_r\{X=1\}=P_r\{X=-1\}&=\frac{\begin{pmatrix}3\\1\end{pmatrix}\begin{pmatrix}5\\2\end{pmatrix}+\begin{pmatrix}3\\2\end{pmatrix}\begin{pmatrix}3\\1\end{pmatrix}}{\begin{pmatrix}11\\3\end{pmatrix}}=\frac{39}{165}\\P_r\{X=2\}=P_r\{X=-2\}&=\frac{\begin{pmatrix}3\\2\end{pmatrix}\begin{pmatrix}5\\1\end{pmatrix}}{\begin{pmatrix}11\\3\end{pmatrix}}=\frac{15}{165}\\P_r\{X=3\}=P_r\{X=-3\}&=\frac{\begin{pmatrix}3\\3\end{pmatrix}}{\begin{pmatrix}11\\3\end{pmatrix}}=\frac{1}{165}\end{align*} The probability that we win money is $$\sum_{i=1}^3P_r\{X=i\}=\frac{55}{165}=\frac{1}{3}$$ Definition. Let\mathrm{PMF}_X(i):=P_r\{X=i\}$.$\mathrm{PMF}_X(i)$is called the probability mass function for random variable$X$. Definition. Let$\mathrm{CDF}_X(i):=P_r\{X\leq i\}$.$\mathrm{CDF}_X(i)$is called the cumulative distribution function and it describes the probability that the value of a random variable is below a specified number.$\mathrm{CDF}_X(i)$can be expressed as $$\mathrm{CDF}_X(i)=\sum_{j\leq i}\mathrm{PMF}_X(j)$$ i.e. it is the accumulation of distribution (probability) described by the probability mass function for values up to$i$, hence the name the cumulative distribution function. Example. How likely is it that it will take no more than 3 flips for a coin to land on heads? Solution. Let$X$denote the number of heads appearing. Then what the question is asking is$P_r\{X\leq 3\}$i.e.$\mathrm{CDF}_X(3). \begin{align*}\mathrm{CDF}_X(3)&=\sum_{j=1}^3\mathrm{PMF}_X(j)\\&=\sum_{j=1}^3 P_r\{X=j\}\\&=\frac{1}{2}+\frac{1}{4}+\frac{1}{8}=\frac{7}{8}\end{align*} Definition. The expected value of a random variableX$, denoted by$E(X)$, is the weighted average of its possible values weighted according to their probabilities. More specifically, $$E(X)=\sum_i\mathrm{PMF}_X(i)\cdot i=\sum_i P_r\{X=i\}\cdot i$$ The expected value is also called the expectation or the mean. Example. What is the expected value of a roll of a die? Solution. The random variable$X$takes on values 1, 2, 3, 4, 5, 6 and each of these values has equal proprobability of$\frac{1}{6}$. Hence, the expected value of$X$is $$E(X)=\sum_{i=1}^6\frac{1}{6}i=\frac{1}{6}\frac{6\cdot 7}{2}=3.5$$ Here I used the formula $$1+2+3+\cdots+n=\frac{n(n+1)}{2}$$ to calculate$1+2+3+4+5+6$. Example. If a die is rolled three times, how many distinct values are expected to appear? Solution. Let$X$denote the number of distinct values. There are$6^3$outcomes of this experiment. Calculating$P_r\{X=1\}$and$P_r\{X=3\}are straightforward. \begin{align*}P_r\{X=1\}&=\frac{\begin{pmatrix}6\\1\end{pmatrix}}{6^3}=\frac{1}{36}\\P_r\{X=3\}&=\frac{\begin{pmatrix}6\\3\end{pmatrix}}{6^3}=\frac{5}{54}\end{align*}P_r\{X=2\}can be found by $$P_r\{X=2\}=1-P_r\{X=1\}-P_r\{X=3\}=1-\frac{1}{36}-\frac{5}{54}=\frac{95}{108}$$ The expected value is then \begin{align*}E(X)&=P_r\{X=1\}\cdot 1+P_r\{X=2\}\cdot 2+P_r\{X=3\}\cdot 3\\&=\frac{1\cdot 1}{36}+\frac{95\cdot 2}{108}+\frac{5\cdot 3}{54}=\frac{223}{108}\approx 2.06\end{align*} Example. LetX$denote the number of heads appearing in a sequence of 10 flips of a coin. What is$E(X)$? Solution. For any$i=0,1,\cdots,10$, there are$\begin{pmatrix}10\\i\end{pmatrix}$sequences of 10 flips, each of which contains$i$heads. Each such sequence happens with probability$\left(\frac{1}{2}\right)^{10}. Hence, \begin{align*}E(X)&=\sum_{i=0}^{10}P_r\{X=i\}\cdot i\\&=\sum_{i=1}^{10}P_r\{X=i\}\cdot i\\&=\sum_{i=1}^{10}\begin{pmatrix}10\\i\end{pmatrix}\left(\frac{1}{2}\right)^{10}\cdot i\\&=\left(\frac{1}{2}\right)^{10}\cdot 10\cdot 2^9\\&=5\end{align*} For the second line to the last, I used the identity $$\label{eq:binom2}\sum_{k=1}^n\begin{pmatrix}n\\k\end{pmatrix}k=n2^{n-1}$$ The equation \eqref{eq:binom2} can be easily seen. \begin{align*}\sum_{k=1}^n\begin{pmatrix}n\\k\end{pmatrix}k&=\sum_{k=1}^n\frac{n!}{(k-1)!(n-k)!}\\&=n\sum_{k=1}^n\frac{(n-1)!}{(k-1)!(n-1-(k-1))!}\\&=n\sum_{k=1}^n\begin{pmatrix}n-1\\k-1\end{pmatrix}\\&=n2^{n-1}\end{align*} The last expression is obtain using the equation (4) here. While the expected valueE(X)$provides useful information as the weighted average of the possible values of$X$. But it does not provide information on the spread of these values. The variance provides such information. Definition. If$X$is a random variable with expected value$E(X)$, then the variance of$X$, denoted by$\mathrm{Var}(X)$, is defined by $$\mathrm{Var}(X)=E[(X-E(X))^2]$$ To simplify our calculation, let us denote$E(X)$by$\mu. Then \begin{align*}\mathrm{Var}(X)&=E[(x-\mu)^2]\\&=\sum_iP_r\{X=i\}(i-\mu)^2\end{align*} This last expression shows that\mathrm{Var}(X)$measures how far apart$Xwould be from its expected value on the average. Let us continue further from the last expression above. \begin{align*}\sum_iP_r\{X=i\}(i-\mu)^2&=\sum_iP_r\{X=i\}(i^2-2\mu i+\mu^2)\\&=\sum_iP_r\{X=i\}i^2-2\mu\sum_iP_r\{X=i\}i+\mu^2\sum_iP_r\{X=i\}\\&=E(X^2)-2\mu^2+\mu^2\\&=E(X^2)-\mu^2\\&=E(X^2)-[E(X)]^2\end{align*} So we have an alternative formula for the variance $$\label{eq:variance}\mathrm{Var}(X)=E(X^2)-[E(X)]^2$$ Example. Calculate\mathrm{Var}(X)$if$Xrepresents the outcome when a die is rolled. Solution. First \begin{align*}E(X)&=1\cdot\frac{1}{6}+2\cdot\frac{1}{6}+3\cdot\frac{1}{6}+4\cdot\frac{1}{6}+5\cdot\frac{1}{6}+6\cdot\frac{1}{6}\\&=\frac{1}{6}\frac{6\cdot 7}{2}=\frac{7}{2}\end{align*} Next, \begin{align*}E(X^2)&=1^2\cdot\frac{1}{6}+2^2\cdot\frac{1}{6}+3^2\cdot\frac{1}{6}+4^2\cdot\frac{1}{6}+5^2\cdot\frac{1}{6}+6^2\cdot\frac{1}{6}\\&=\frac{1}{6}\frac{6\cdot 7\cdot 13}{6}=\frac{91}{6}\end{align*} The value in the second line to the last is obtained by the formula $$1^2+2^2+3^2+\cdots+n^2=\frac{n(n+1)(2n+1)}{6}$$ Therefore, $$\mathrm{Var}(X)=\frac{91}{6}-\left(\frac{7}{2}\right)^2=\frac{35}{12}$$ Remarks. 1. In mechanics, the center of gravity of a system of particles is indeed the expected value of the position coordinates of particles. Also the moment of inertia of a body is the variance of the position coordinates of particles that constitute the body. 2.\mathrm{SD}(X)=\sqrt{\mathrm{Var}(X)}$is called the standard deviation of$X$. I will complete this note with the following useful identities. Theorem. Let$a$and$b$be constant. Then 1.$E(aX+b)=aE(X)+b$. As a special case, if$b=0$, we obatin$E(aX)=aE(X)$. 2.$\mathrm{Var}(aX+b)=a^2\mathrm{Var}(X). Proof. 1. \begin{align*}E(aX+b)&=\sum_iP_r\{X=i\}(ai+b)\\&=a\sum_iP_r\{X=i\}i+b\sum_iP_r\{X=i\}\\&=aE(X)+b\end{align*} 2. For simplicity, let\mu=E(X)$. Then by the theorem 1 above,$E(aX+b)=a\mu+b\$. Now, \begin{align*}\mathrm{Var}(aX+b)&=E[(aX+b-a\mu-b)^2]\\&=E[a^2(X-\mu)^2]\\&=a^2E[(X-\mu)^2]\\&=a^2\mathrm{Var}(X)\end{align*}

References.

[1] Essential Discrete Mathematics for Computer Science, Harry Lewis and Rachel Zax, Princeton University Press, 2019

[2] A First Course in Probability, Sheldon Ross, 5th Edition, Prentice-Hall, 1998