Introductory Combinatorics:The Basic Principle of Counting

What is combinatorics? It can be simply characterized as:

Combinatorics = The Art of counting

Combinatorics begins with

The Basic Principle of Counting: Suppose that two operations are to be performed. If operation 1 results in any one of $m$ possible outcomes and if for each outcome of operation 1 there are $n$ possible outcomes of operation 2, then together there are $mn$ possible outcomes of the two operations.

The Basic Principle of Counting is also called the Rule of Products. The principle assumes that the choice for operation 2 does not depend on the choice of operation 1.

Example. A lunch bar serves 5 different sandwiches and 3 different drinks. How many different lunches can a person order?

Solution. For each choice of a sandwich, there are three different choices of a drink. Hence there are $5\cdot 3=15$ possible orders.

Example. Let $A$ and $B$ be finite sets. Prove that $|A\times B|=|A|\cdot|B|$ where $|A|$ denotes the cardinality (i.e. the number of elements) of the set $A$.

Proof. Let $|A|=m$ and $|B|=n$. For each choice of elements in $A$ there are $n$ possible choices for the elements in $B$ to pair with. Since there are $m$ possible choices for the elements in $A$, the total number of all possible ordered pairs is $mn$, i.e. $|A\times B|=mn=|A|\cdot|B|$.

Example. A small community consists of 10 women, each of whom has 3 children. If one woman and one of her children are to be chosen as mother and child of the year, how many different choices are possible?

Solution. There are $10\times 3=30$ possible choices.

If more than two operations are to be performed, the basic principle can be generalized as follows.

The Generalized Basic Principle of Counting

If $r$ operations are to be performed in such a way that operation 1 results in $n_1$ possible outcomes, operation 2 results in $n_2$ possible outcomes, …, operation $r$ results in $n_r$ possible outcomes, then there is a total of $n_1\cdot n_2\cdots n_r$ possible outcomes of the $r$ operations.

Example. A person is to complete a true-false questionnaire. How many different ways are there to answer the questionnaire?

Solution. $\stackrel{10\ \mathrm{fold}}{\overbrace{2\cdot 2\cdots 2}}=2^{10}=1024$.

Example. A questionnaire contains 4 questions that have 2 possible answers and three questions with 5 possible answers. How many different answers are possible?

Solution. $2\cdot 2\cdot 2\cdot 2\cdot 5\cdot 5\cdot 5=2^4\cdot 5^3=2000$ different ways to answer the questionnaire.

Example. How many different 7-place license plates are possible if the first 3 places are to be occupied by letters and the final 4 by numbers?

Solution. $26\cdot 26\cdot 26\cdot 10\cdot 10\cdot 10\cdot 10=175,760,000$.

Example. In the above example, how many license plates would be possible if repetition among letters or numbers were prohibited?

Solution. $26\cdot 25\cdot 24\cdot 10\cdot 9\cdot 8\cdot 7=78,624,000$.

Theorem. (Power Set Cardinality Theorem) If $A$ is a finite set, then $|\wp(A)|=2^{|A|}$.

Proof. The proof boils down to counting how many different ways there are to determine subsets of $A$. Let $B$ be a subset of $A$. For each $x\in A$, there are two possibilities: either $x\in B$ or $x\not\in B$. Since $A$ has $n$ elements, there are $\stackrel{n\ \mathrm{fold}}{\overbrace{2\cdot 2\cdots 2}}=2^n$ different ways to determine subsets of $A$.

References: Not in particular order.

[1] Applied Discrete Structures, Al Doerr and Ken Levasseur. This book is available for free under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 United States License.

[2] A First Course in Probability, Sheldon Ross, 5th Edition, Prentice-Hall, 1998

Modular Arithmetic

Recall the equivalence relation $\equiv\ \mathrm{mod}\ n$ on $\mathbb{Z}$ (we discussed it here) $$p\equiv q\ \mathrm{mod}\ n\ \Longleftrightarrow\ n|(p-q)$$ Let us denote by $p\ \mathrm{mod}\ n$ the remainder when $p$ is divided by $n$. The definition of the equivalence relation implies that $$p\equiv q\ \mathrm{mod}\ n\ \Longleftrightarrow\ p\ \mathrm{mod}\ n=q\ \mathrm{mod}\ n$$ (The proof is left as an exercise.) When an integer $p$ is divided by $n$, the remainder $p\ \mathrm{mod}\ n$ is an integer satisfying the inequality $0\leq p\ \mathrm{mod}\ n\leq n-1$. Therefore, the equivalence class $[p]$ is one of $[0],[1],\cdots,[n-1]$, i.e. the quotient set $\mathbb{Z}_n=\mathbb{Z}/\equiv\mathrm{mod}\ n$ is $$\mathbb{Z}_n=\{[0],[1],[2],\cdots,[n-1]\}$$ Also recall that in here we defined $+$ and $\cdot$ on $\mathbb{Z}_n$ by \begin{align*}[a]+[b]&=[a+b]\\ [a]\cdot[b]&=[a\cdot b]\end{align*} These operations are well-defined due to the following properties (see Problem Set 12 #7): if $a\equiv b\ \mathrm{mod}\ n$ and $c\equiv d\ \mathrm{mod}\ n$, then \begin{equation}\begin{aligned}a+c&\equiv b+d\ \mathrm{mod}\ n\\a\cdot c&\equiv b\cdot d\ \mathrm{mod}\ n\end{aligned}\label{eq:congrel}\end{equation} The following is another useful property: if $a\equiv b\ \mathrm{mod}\ n$ and $c$ is an integer then \begin{equation}\label{eq:congrel2}c\cdot a\equiv c\cdot b\ \mathrm{mod}\ n\end{equation} Its proof is straightforward so it is left as an exercise. The properties \eqref{eq:congrel} and \eqref{eq:congrel2} can be used to calculate the remainder when a significantly large integer is divided by an integer $n$.

Example. What is $N\ \mathrm{mod}\ 21$ where $$N=113\times (167+484)+192\times 145?$$

Solution. $N=113\times (167+484)+192\times 145=101403$, which is not really a large number. One can simply divide 101403 by 21 and find the remainder 15. Let us now try a different way. Note that $113\equiv 8\ \mathrm{mod}\ 21$, $167\equiv 20\ \mathrm{mod}\ 21$, $484\equiv\ 1\mathrm{mod}\ 21$, $192\equiv 3\ \mathrm{mod}\ 21$, and $145\equiv 19\ \mathrm{mod}\ 21$. Hence by using \eqref{eq:congrel} we have $$N\equiv 8\times (20+1)+3\times 19=225\ \mathrm{mod}\ 21$$ 225 is a much smaller number and one can easily divide it by 21 and obtain the remainder 15. On the other hand, $21\equiv 0\ \mathrm{mod}\ 21$ and $19\equiv -2\ \mathrm{mod}\ 21$ since $19=21-2$. So by using the properties \eqref{eq:congrel} and \eqref{eq:congrel2}, we have $$N\equiv 8\times 0+3\times (-2)=-6\equiv 15\ \mathrm{mod}\ 21$$ since $-6=21(-1)+15$.

Example. What is $10^3\ \mathrm{mod}\ 7$?

Solution. $1000$ is not a large number and one can easily divide it by 7 to obtain $1000=142\cdot 7+6$ and so $10^3\equiv 6\ \mathrm{mod}\ 7$. There is a smarter way to do this though. Note that $10=7\cdot 1+3$, so $10\equiv 3\ \mathrm{mod}\ 7$. Using \eqref{eq:congrel}, we obtain $$10^3\equiv 3\cdot 3\cdot 3=27\equiv 6\ \mathrm{mod}\ 7$$

The remainder when a large number is divided by an integer sometimes can be calculated using repeated squaring as seen in the following example.

Example. Find $3^{25}\ \mathrm{mod}\ 7$.

Solution. \begin{align*}3^{25}&=(3^{12})^2\cdot 3\\&=((3^6)^2)^2\cdot 3\\&=(((3^3)^2)^2)^2\cdot 3\\&=(((3^2\cdot 3)^2)^2)^2\cdot 3\\&\equiv (((2\cdot 3)^2)^2)^2\cdot 3\\&=((36)^2)^2\cdot 3\\&\equiv 3\ \mathrm{mod}\ 7\end{align*} where we use the fact $36\equiv 1\ \mathrm{mod}\ 7$.

Definition. $[y]$ is said to be a multiplicative inverse of $[x]$ in $\mathbb{Z}_n$ if $x\cdot y\equiv 1\ \mathrm{mod}\ n$.

It is not necessarily that every non-zero element of $\mathbb{Z}_n$ has a multiplicative inverse. For example, let us consider the multiplication table of $\mathbb{Z}_4$. Here, we denoted $[a]$ by simply $a$. $$\begin{array}{|c|c|c|c|c|}\hline\cdot & 0 & 1 & 2 & 3\\\hline 0 & 0 & 0 & 0 &0\\\hline1 & 0 & 1 & 2 & 3\\\hline2 & 0 & 2 & 1 & 2\\\hline3 & 0 &3 & 0 &3\\\hline\end{array}$$ As clearly seen in the table, 3 does not have a multiplicative inverse. On the other hand, in $\mathbb{Z}_5$ all non-zero elements have multiplicative inverses as shown in the following table. $$\begin{array}{|c|c|c|c|c|c|}\hline\cdot & 0 & 1 & 2 & 3 & 4\\\hline 0 & 0 & 0 & 0 & 0 & 0\\\hline 1 & 0 & 1 & 2 & 3 & 4\\\hline 2 & 0 & 2 & 4 & 1 & 3\\\hline 3 & 0 &3 & 1 & 4 & 2\\\hline 4 & 0 & 4 & 3 & 2 & 1\\\hline\end{array}$$ Notice that 5 is a prime number. In fact, we have the following cool theorem.

Theorem. If $p$ is prime, then every non-zero element of $\mathbb{Z}_p$ has a multiplicative inverse. Therefore, $\mathbb{Z}_p$ is a finite field.

Proof. Let $a$ be an integer such that $1\leq a\leq p-1$. Then clearly $$\mathbb{Z}_p[a]=\{[0],[1\cdot a],\cdots,[(p-1)\cdot a]\}\subseteq\mathbb{Z}_p$$ If $[0],[1\cdot a],\cdots,[(p-1)\cdot a]$ are mutually distinct, $\mathbb{Z}_p[a]=\mathbb{Z}$ and therefore, $[i]\cdot[a]=[i\cdot a]=1$ for some $1\leq i\leq p-1$. That is, $[a]$ has a multiplicative inverse. So the proof of this theorem boils down to showing that $\mathbb{Z}_p[a]$ has $p$ distinct elements for any $1\leq a\leq p-1$. This is a consequence of the lemma below.

Lemma. If $p$ is a prime number such that $p\not| a$ and $i\cdot a\equiv j\cdot a\ \mathrm{mod}\ p$ then $i\equiv j\ \mathrm{mod}\ p$.

Proof. If $i\cdot a\equiv j\cdot a\ \mathrm{mod}\ p$ then $p|a(i-j)$. Since $p\not| a$, $p|i-j$ i.e. $i\equiv j\ \mathrm{mod}\ p$.


[1] Essential Discrete Mathematics for Computer Science, Harry Lewis and Rachel Zax, Princeton University Press, 2019

[2] A Short Course in Discrete Mathematics, Edward A. Bender and S. Gil Williamson, Dover Publications, 2012

Equivalence Relations

Definition. Let $A$ be a nonempty set. Then $R\subseteq A\times A$ is called a binary relation on $A$. A binary relation $R$ on $A$ is said to be an equivalence relation if

  1. $R$ is reflexive: $\forall x\in A$, $(x,x)\in R$
  2. $R$ is symmetric: $(x,y)\in R\Longrightarrow (y,x)\in R$
  3. $R$ is transitive: $(x,y)\in R\ \mathrm{and}\ (y,z)\in R\Longrightarrow (x,z)\in R$

Definition. Let $R$ be an equivalence relation on a set $A$. For each $x\in A$, let $$[x]=\{y\in A: (y,x)\in R\}$$ Then $[x]$ is called the ($R$-)equivalence class of $x\in A$.

Theorem 1. Let $R$ be an equivalence relation on a set $A$. Then

  1. $\forall x,y\in A$, $[x]\cap [y]=\emptyset$ or $[x]=[y]$.
  2. $A=\bigcup_{x\in A}[x]$.


  1. Let $x,y\in A$. If $[x]\cap [y]=\emptyset$, then we are done. Now suppose that $[x]\cap [y]\ne\emptyset$. Then $\exists z\in[x]\cap [y]$ so that \begin{align*}(z,x)\in R\ \mathrm{and}\ (z,y)\in R&\Longrightarrow (x,z)\in R\ \mathrm{and}\ (z,y)\in R\\&\Longrightarrow (x,y)\in R\end{align*} This implies that $[x]=[y]$.
  2. Left as an exercise.

Definition. Let $A$ be a set. By a partition of $A$ we mean a family $\{A_i\}_{i\in I}$of nonempty subsets of $A$ such that

  1. $\forall i,j\in I$, $A_i\cap A_j=\emptyset$ or $A_i=A_j$.
  2. $A=\bigcup_{i\in I}A_i$

Theorem 1 says that given an equivalence relation $R$ on $A$, the $R$-equivalence classes form a partition of $A$. Conversely, given a partition $\{A_i\}_{i\in I}$, one can define an equivalence relation $R$ whose equivalence classes coincide with the $A_i$´s.

Theorem 2. Let $\{A_i\}_{i\in I}$ be a partition of a set $A$. Define a relation $R$ on $A$ as follows: $$\forall x,y\in A,\ (x,y)\in R\Longleftrightarrow x,y\in A_i\ \mbox{for some}\ i\in I$$ Then $R$ is an equivalence relation on $A$ and the $R$-equivalence classes coincide with the $A_i$´s.

Proof. Left as an exercise.

Given an equivalence relation $R$ on a set $A$, the set of all equivalence classes is called the quotient set of $A$ modulo (or mod in short) $R$ and is denoted by $A/R$ i.e. $$A/R=\{[x]: x\in A\}$$

Figure 1. The Quotient Set A/R

There is a map $\gamma: A\longrightarrow A/R$ defined by $$\forall x\in A,\ \gamma(x)=[x]$$ $\gamma$ is called the canonical map from $A$ to $A/R$.

Example 1. (The Vector Space $\mathbb{R}^3$) Let $V$ be the set of all directed arrows in 3-space $\mathbb{R}^3$. Define a relation $\equiv$ on $V$ as follows: $$\forall \overrightarrow{AB},\overrightarrow{CD}\in V,\ \overrightarrow{AB}\equiv\overrightarrow{CD}\Longleftrightarrow\ \overrightarrow{AB}\ \mathrm{and}\ \overrightarrow{CD}\ \mbox{have the same direction and magnitude}$$ Then clearly $\equiv$ is an equivalence relation on $V$. Denote by $\mathcal{V}$ the quotient set $V/\equiv$ and each equivalence class $[\overrightarrow{AB}]\in\mathcal{V}$ is called a vector. Each directed arrow $\overrightarrow{AB}$ whose starting point is $A$ and terminal point is $B$ is $\equiv$-related to a directed arrow $\vec{a}$ whose starting point is the origin $O=(0,0,0)$ and terminal point is $B-A$. $\vec{a}$ can be identified with its terminal point $B-A$ which is an ordered triple $(a_1,a_2,a_3)$. Hence from here on, without loss of generality, we may assume that each equivalence class is represented by such a vector. Define the vector addition $+:\mathcal{V}\times\mathcal{V}\longrightarrow\mathcal{V}$ by $$\forall [\vec{a}],[\vec{b}]\in\mathcal{V},\ [\vec{a}]+[\vec{b}]:=[\vec{a}+\vec{b}]$$ Here, $$\vec{a}+\vec{b}=(a_1,a_2,a_3)+(b_1,b_2,b_3)=(a_1+b_1,a_2+b_2,a_3+b_3)$$ Also define the scalar multiplication $\cdot :\mathbb{R}\times\mathcal{V}\longrightarrow\mathcal{V}$ by $$\forall c\in\mathbb{R},\ \forall [\vec{a}]\in\mathbb{V},\ c[\vec{a}]:=[c\vec{a}]$$ Here, $$c\vec{a}=(ca_1,ca_2,ca_3)$$ Let $[\vec{a}]=[\vec{a}’]$ and $[\vec{b}]=[\vec{b}’]$. Then $\vec{a}=\vec{a}’$ and $\vec{b}=\vec{b}’$ i.e. $a_i=a’_i$ and $b_i=b’_i$, $i=1,2,3$. Thus $$\vec{a}+\vec{b}=\vec{a}’+\vec{b}’\Longrightarrow [\vec{a}]+[\vec{b}]=[\vec{a}+\vec{b}]=[\vec{a}’+\vec{b}’]=[\vec{a}’]+[\vec{b}’]$$ Hence, the vector addition is well-defined. It can be shown similarly that the scalar multiplication is also well-defined. We will see in the next example that there is a bijective from $\mathcal{V}$ to $\mathbb{R}^3$.

Example 2. (The Kernel of a Function) Let $f: A\longrightarrow B$ be a function. The kernel of $f$, $\ker f$ is defined by $$\ker f=\{(x,y)\in A\times A: f(x)=f(y)\}$$ It is easy to see that $\ker f$ is an equivalence relation on $A$. It turns out that the quotient set $A/\ker f$ coincides with the partition of $A$ by the pre-images $f^{-1}(z)$, $\forall z\in A$. (See Problem Set 12 #5 (b).) Now we suppose that $f: A\longrightarrow B$ is surjective (onto). Define a map $\psi: A/\ker f\longrightarrow B$ by $$\forall [x]\in A/R,\ \psi([x])=f(x)$$ Then $\psi$ is a bijective and $\gamma=\psi\circ f$, where $\gamma: A\longrightarrow A/\ker f$ is the canonical map. (Its proof is left as an exercise.)

In Example 1, we have seen that given a directed arrow $\overrightarrow{AB}$ there is a $\equiv$-related directed arrow whose starting point the origin $O$ and terminal point $B-A$ and that such a directed arrow can be identified with its terminal point $\vec{a}:=B-A=(a_1,a_2,a_3)\in\mathbb{R}^3$. This defines a surjective map $f: V\longrightarrow\mathbb{R}^3$ and $\ker f=\equiv$. Therefore, we see that there is a bijective $\psi$ from $V/\ker f=V/\equiv$ to $\mathbb{R}^3$ defined by $$\forall [\vec{a}]\in V/\equiv,\ \psi([\vec{a}])=(a_1,a_2,a_3)$$ Furthermore, this bijective is a vector space homomorphism i.e. it preserves the vector addition and the scalar multiplication: \begin{align*}\psi([\vec{a}]+[\vec{b}])&=\psi([\vec{a}])+\psi([\vec{b}])\\\psi(c[\vec{a}])&=c\psi([\vec{a}])\end{align*} Therefore $V/\equiv$ is isomorphic to $\mathbb{R}^3$ as a vector space.

Figure 2. The Fundamental Homomorphism Theorem

Example 3. (The Construction of Rational Numbers From Integers) Let $\mathbb{Z}$ be the set of all integers. Define a relation $\sim$ on $\mathbb{Z}\times(\mathbb{Z}\setminus\{0\})$ by $$\forall (a,b),(c,d)\in \mathbb{Z}\times(\mathbb{Z}\setminus\{0\}),\ (a,b)\sim (c,d)\Longleftrightarrow ad=bc$$ Then $\sim$ is an equivalence relation. (Its proof is left as an exercise.) The equivalence class of $(a,b)$ is denoted by $\frac{a}{b}$. For example, $(1,2)\sim (2,4)$ thus $\frac{1}{2}=\frac{2}{4}$. We see that the quotient set $\mathbb{Z}\times(\mathbb{Z}\setminus\{0\})/\sim$ can be identified with $\mathbb{Q}$, the set of all rational numbers.

Example 4. (Residue Classes Modulo $n$) Let $\mathbb{Z}$ be the set of all integers. Define a relation $\equiv\ \mathrm{mod}\ n$ on $\mathbb{Z}$ by $$\forall p,q\in\mathbb{Z},\ p\equiv q\ \mathrm{mod}\ n\Longleftrightarrow n|(p-q)$$ Then $\equiv\ \mathrm{mod}\ n$ is an equivalence relation. (Its proof is left as an exercise.) An integer $m$ can be written as $m=nk+r$ where $k$ is the quotient and $0\leq r<n$ is the remainder when $m$ is divided by $n$. Thus $m\equiv\ \mathrm{mod}\ n r$. This means that there are exactly $n-1$ equivalence classes $$[0],[1],\cdots,[n-1]$$ These equivalence classes are called the residue classes modulo $n$. The quotient set $\mathbb{Z}/\equiv\ \mathrm{mod}\ n$ is usually denoted by $\mathbb{Z}_n$. We can define addition and multiplication on $\mathbb{Z}_n$ as follows: $\forall [a],[b]\in\mathbb{Z}_n$, \begin{align*}[a]+[b]&=[a+b]\\ [a]\cdot [b]&=[ab]\end{align*} These operations are well-defined because the equivalence relation $\equiv\ \mathrm{mod}\ n$ preserves the operations, namely if $a\equiv b\ \mathrm{mod}\ n$ and $c\equiv d \mathrm{mod}\ n$ then \begin{align*}a+c&\equiv b+d\ \mathrm{mod}\ n\\ ac&\equiv bd\ \mathrm{mod}\ n\end{align*} An equivalence which preserves operations are called a congruence relation. Congruence relations play a crucial role in constructing quotient algebras. For more details about congruence relations, see for example the reference [3]. Note that the equivalence relations in Examples 1 and 3 are also congruence relations. $(\mathbb{Z}_n,+)$ is an abelian group and $(\mathbb{Z}_n,+,\cdot)$ is a commutative ring with unity. $(\mathbb{Z},+,\cdot)$ is a field if $n$ is a prime.

Example. (Digraphs) Let $G=(V,A)$ be a digraph. Let $\sim$ be mutual reachability relation on $V$, i.e. $\forall x,y\in V$, $x\sim y$ if and only if $x$ and $y$ are mutually reachable. It can be easily seen that mutual reachability relation is an equivalence relation. The equivalence classes are called the strongly connected components or simply strong components of the digraph $G$ and they are subgraphs of $G$.

Figure 3. A digraph G and its strong components

Theorem. Let $G=(V,A)$ be a digraph and let $T$ be the partition of $V$ into the strong components of $G$. Construct a new digraph $G’=(T,A’)$, where $\forall X,Y\in T$, $X\to Y\in A’$ if and only if $X\ne Y$ and $x\to y\in A$ for some $x\in X$ and $y\in Y$. Then $G’$ is a DAG.

Proof. Suppose that there is a cycle in $G’$, $x_0\to x_1\to\cdots x_k\to x_0$ for some $x_0,\cdots,x_k$ belonging to different strong components of $G$. But all theses vertices are mutually reachable in $G$ so this is a contradiction. Therefore $G’$ is acyclic.

Figure 4. The DAG of the strong components of G in Figure 3.


[1] Essential Discrete Mathematics for Computer Science, Harry Lewis and Rachel Zax, Princeton University Press, 2019

[2] Set Theory, Charles C. Pinter, Addison Wesley Publishing Company, 1971

[3] A Course in Universal Algebra, Stanley N. Burris and H. P. Sankappanavar, Graduate Texts in Mathematics, Springer-Verlag, 1981. The book is available online for free at the first named author’s web page here.

Directed Graphs

Directed graphs represent binary relations. They can be visualized as diagrams made up of points (called vertices or nodes) and arrows (called arcs or edges). Draw an arc from a vertex $v$ to a vertex $w$ to represent that $v$ is related to $w$ i.e. the ordered pair $(v,w)$ is in the relation.

Example 1. An example of a directed graph

Figure 1. A directed graph

Definition. A directed graph or digraph in short is an ordered pair $(V,A)$ where $V$ is an nonempty set and $A\subseteq V\times V$ (i.e. $A$ is a binary relation on $V$). The members of $V$ are called vertices or nodes and the members of $A$ are called arcs or edges. We write arcs as $v\to w$ rather than $(v,w)$.

In Example 1, $V=\{a,b,c,d,e\}$ and \begin{align*}A&=\{(a,b), (b,c), (a,c), (c,c), (c,d), (b,d),(d,b)\}\\&=\{a\to b, b\to c, a\to c, c\to c, c\to d, b\to d, d\to b\}\end{align*}

Transportation and computer networks have natural representations of digraphs.

A walk in a digraph is a way of proceeding through a sequence of vertices by following arcs i.e. a walk in a digraph $(V,A)$ is a sequence of vertices $v_0,v_1,\cdots,v_n\in V$ for some $n\geq 0$ such that $v_i\to v_{i+1}\in A$ for each $i<n$. The length of this walk is $n$ which is the number of arcs.

Example 2. In Example 1, $b\to d$, $b\to c\to d$, $b\to c\to c\to d$, $b\to d\to b\to d$ are examples of walks from vertex $b$ to vertex $d$. The length of $b\to d$ is 1, the length of $b\to c\to d$ is 2, the length of $b\to c\to c\to d$ is 3 and the length of $b\to d\to b\to d$ is also 3. There is no walk from vertex $b$ to vertex $a$.

A path is a walk that doesn’t repeat any vertex.

Example 3. Among the walks in Example 2, $b\to d$ and $b\to c\to d$ are paths from vertex $b$ to $d$.

A walk in which the first and the last vertex are the same is called a circuit. A circuit is called a cycle if the first and the last vertices are the only repeated vertex. For example, $b\to c\to c\to b\to b$ in the digraph in Example 1 is a circuit but is not a cycle. On the other hand, $b\to c\to d\to b$ is a cycle of length 3, $b\to d\to b$ is a cycle of length 2, and $c\to c$ is a cycle of length 1.

A digraph without any cycles is said to be acyclic. The digraph in Exam 1 is not acyclic as it contains cycles.

A walk can be reduced to a path by removing nontrivial cycles. Suppose that a walk from $v$ to $w$ $$v=v_0\to\cdots\to v_n=w$$ includes a cycle $v_i\to v_{i+1}\to\cdots\to v_j$ where $i<j$ and $v_i=v_j$. Then $$v=v_0\to\cdots\to v_i\to v_{j+1}\to\cdots\to v_n=w$$ is a shorter walk from $v$ to $w$.

Figure 2. Shortening of a walk by removing a cycle

For example, the walk $b\to d\to b\to d$ in Example 1 can be reduced to the path $b\to d$ by removing the cycle $b\to d\to d$.

A vertex $w$ is said to be reachable from vertex $v$ if there is a walk or a path from $v$ to $w$. The distance from vertex $v$ to vertex $w$ in a digraph $G$, denoted by $d_G(v,w)$, is the length of the shortest path from $v$ to $w$, or is defined to be $\infty$ if there is no path from $v$ to $w$. For example, the distance from $a$ to $d$ in the digraph in Example 1 is 2 because the shortest path is $a\to c\to d$.

Lemma. The distance from one vertex of a graph to another vertex is at most the length of any walk from the first to the second.

Proof. It follows from the fact that any walk from $v$ to $w$ includes among its arcs a path from $v$ to $w$.

A digraph in which every vertex is reachable from every other vertex is said to be strongly connected.

Let $G=(V,A)$ be a digraph. Let $V’\subseteq V$ and $A’\subseteq A$. Then $(V’,A’)$ is called a subgraph of $G$. $(V,\emptyset)$ is a subgraph of $G$.

Let $G=(V,A)$ be a digraph and $V’\subset V$. Then $$(V’,\{v\to w\in A: v,w\in V’\})$$ is called the subgraph induced by $V’$.

Example 4. The subgraph induced by $V\setminus\{e\}$ in Example 1 is not strongly connected. But the subgraph induced by $\{b,c,d\}$ is strongly connected.

Figure 3. The subgraph of the digraph in Example 1, that is induced by V-{e}.
Figure 4. The subgraph of the digraph in Example 1, that is induced by {b,c,d}.

An acyclic digraph is generally called a directed acyclic group or DAG in short.

The out-degree of a vertex is the number of arcs leaving it i.e. $|\{w\in V: v\to w\ \mbox{in}A\}|$. Similarly, the in-degree of a vertex is the number of arcs entering it.

Theorem. A finite DAG has at least one vertex of out-degree 0 and at least one vertex of in-degree 0.

Proof. Let $G=(V,A)$ be a finite DAG. Suppose that $G$ has no vertex of out-degree 0.

Figure 5. Example of vertices whose in-degree is 0 or whose out-degree is 0.

Pick a vertex $v_0$. Since $v_0$ has a positive out-degree, there exists an arc $v_0\to v_1$ for some vertex $v_1$. Since $v_1$ has a positive out-degree, there exists an arc $v_1\to v_2$ for some vertex $v_2$. One can continue doing this. But since $V$ is finite, some vertex will be repeated, creating a cycle. This is a contradiction to the graph $G$ being acyclic. A very similar argument can be made to show that $G$ has a vertex of in-degree 0.

In a DAG, a vertex of in-degree 0 is called a source and a vertex of out-degree 0 is called a sink.

A tournament graph

A tournament graph is a digraph in which every pair of distinct vertices are connected by an arc in one direction or the other, but not both. It is a natural representation of a round-robin tournament in which each players competes with all other plays in turn.

Example 5. The tournament graph in Figure 6 shows that $H$ beats both $P$ and $Y$, $Q$ beats $H$, $Y$ beats $P$, and $P$ beats $Q$. Hence we have a cycle $H\to P\to Q\to H$. Awkwardly, there is no champion!

Figure 6. A tournament graph

Example 6. The tournament graph in Figure 7 shows that $H$ beats $P$, $Y$, $D$, $Y$ beats $P$ and $D$, and $P$ beats $D$. Hence, $H$ is the first, $Y$ is the second, $P$ the third, and $D$ is the fourth places.

Figure 7. A tournament graph

The total number of arcs in a tournament graph with $n$ vertices is $\frac{n(n-1)}{2}$, since there is exactly one arc between each pair of distinct vertices.

A linear order, denoted by $\preceq$, is a binary relation on a finite set $S=\{s_0,s_1,\cdots,s_n\}$ such that $s_i\preceq s_j$ if and only if $i\leq j$.

Example 7. Let $S=\{\mbox{all English words}\}$. Define $s_i\preceq s_j$ to mean that $s_i$ appears before $s_j$ alphabetically or they are equal. Then $\preceq$ is a linear order. This particular linear order is called the lexicographic order or the dictionary order.

A strict order of a finite set, denoted by $\prec$, is an ordering such that $s_i\prec s_j$ if and only if $i<j$.

Theorem. A tournament graph represents a strict linear order if and only if it is a DAG.

Proof. Let $G=(V,A)$ be a tournament graph. Suppose that $G$ represents a strict order. This means that the path $$v_0\to v_1\to\cdots\to v_n$$ represents $$v_0\prec v_1\prec\cdots\prec v_n$$ so every arc goes from $v_i$ to $v_j$ if $i<j$. The graph is then acyclic because any cycle will have to include at least one arc $v_j\to v_i$ where $i<j$. Conversely suppose $G$ is a DAG. We show by induction that $G$ represents a strict order. If $G$ has only one vertex, clearly $G$ represents a strict order. If $G$ has more than one vertex, at least one of the vertices, say $v_0$, has in-degree 0. Since $G$ is a tournament graph, $G$ must contain each of the arc $v_0\to v_i$ ($i\ne 0$). Consider the subgraph induced by $V\setminus\{v_0\}$. This induced subgraph is also a DAG, since all its arcs are arcs of $G$. It is also a tournament graph. The subgraph includes all the arcs of $G$ except those entering from $v_0$. By induction hypothesis, the induced subgraph represents a strict linear order $$v_1\prec v_2\prec\cdots\prec v_n$$ Since $v_0\prec v_i$ for all $i\ne 0$, we have $$v_0\prec v_1\prec\cdots\prec v_n$$ This completes the proof.

Example 8. The tournament graph in Example 7 is a DAG and it represents a strict order $H\prec Y\prec P\prec D$, while the tournament graph in Example 6 doesn’t. (It is not a DAG.)


[1] Essential Discrete Mathematics for Computer Science, Harry Lewis and Rachel Zax, Princeton University Press, 2019

Quantificational Logic

Quantificational logic is an extension of propositional logic. It is logic of expressions such as “for any”, “for all”, “there is some”, and “there is exactly one”. Quantificational logic is also called first-order logic, predicate logic, or predicate calculus. The universal quantifier $\forall$ means “for all”, “for each”, “for any”, or “for every”. The existential quantifier $\exists$ means “there exists”.

Example. $\forall x\exists y P(x,y)$

Quantificational formulae such as the one in the above example are merely strings of symbols which do not mean anything (i.e. being true or false) as logical statements until they are accompanied by proper interpretations. An interpretation of a quantificational formula has to specify the following.

  1. The universe $U$, the nonempty set from which the values of the variables ($x$, $y$, $z$, etc.) are drawn.
  2. For each, say $k$-ary, predicate symbol $P$, which $k$-tuples of members of $U$ the predicate is true of, and
  3. What elements of the universe correspond to any constant symbols, and what functions from the universe to itself correspond to function symbols mentioned in the formula.

Example. Let $U=\{0,1\}$ and $P$ be the less-than relation i.e. $P(x,y)$ is “$x$ is less than $y$.” Then

  • $P(0,0)$ is false
  • $P(0,1)$ is true
  • $P(1,0)$ is false
  • $P(1,1)$ is false

The quantificational formula $\forall x\exists y P(x,y)$ is false because there is no value of $y$ in the universe for which $P(x,y)$ is true when $x=1$.

Example. Let $U=\{0,1\}$ and $P$ be the not-equal relation i.e. $P(x,y)$ is “$x$ is not equal to $y$.” Then

  • $P(0,0)$ is false
  • $P(0,1)$ is true
  • $P(1,0)$ is true
  • $P(1,1)$ is false

Hence $\forall x\exists y P(x,y)$ is true.

In general, the universe of an interpretation is an infinite set, so it is impossible to specify the values of the predicate for every combination of elements. For a remedy, we restate the definition in terms of relations by saying an interpretation of a quantificational formula consists of

  1. a nonempty set called the universe
  2. for each $k$-place predicate symbol, a $k$-ary relation on the universe
  3. for each $k$-place function symbol, a $k$-ary function from the universe to itself.

Example. Let $U=\mathbb{N}$, the set of natural numbers and $P$ be the less-than relation. Then the formula $\forall x\exists y P(x,y)$ is true.

Example. Let $U=\mathbb{N}$ and $P$ be the greater-than relation. Then the formula $\forall x\exists y P(x,y)$ is false.

Example. An example of a formula involving a function. Let $U=\mathbb{N}$ and $P(x,y)$ be $\forall x\exists y(x+y=0)$. The constant symbol 0 is interpreted as zero and the binary function symbol $+$ represents addition. The formula is false. For example, when $x=1$, there is no value of $y$ in the universe such that $x+y=0$. If $U=\mathbb{Z}$, the set of integers, however, the formula is true.

Two formulae are equivalent if they are true under the same interpretation.

Example. $\forall x\exists y P(x,y)$ and $\forall y\exists x P(y,x)$ are equivalent.

If two formulae $F$ and $G$ are equivalent, we write $F\equiv G$.

A model of a formula is an interpretation in which it is true. A satisfiable formula of quantificational logic is one that has a model. A valid formula, also called theorem, is a formula that is true under every interpretation. Valid formulae are the quantificational analogs of tautologies in propositional logic.

Example. $\forall x(P(x)\wedge Q(x))\Longrightarrow\forall y P(y)$ is a valid formula. $\forall x P(x)\wedge\exists y\neg P(y)$ is unsatisfiable.

Note. $\exists x (H(x)\wedge B(x))\not\equiv\exists xH(x)\wedge\exists x B(x)$. $\exists x (H(x)\wedge B(x))\not\equiv\exists x H(x)\wedge B(x)$. $x$ in $B(x)$ in the second formula is a free variable i.e. $\exists x H(x)\wedge B(x)\equiv(\exists x H(x))\wedge B(x)$$

The laws we learned in propositional logic are carried over. For example, Distributive Laws \begin{align}\forall x(P(x)\wedge(Q(x)\vee R(x)))&\equiv\forall x((P(x)\wedge Q(x))\vee(P(x)\wedge R(x)))\label{eq:distlaw1}\\\forall x(P(x)\vee(Q(x)\wedge R(x)))&\equiv\forall x((P(x)\vee Q(x))\wedge(P(x)\vee R(x)))\label{eq:distlaw2}\end{align}

Quantificational Equivalence Rule 1 (Proposition Substitutions)

Suppose $F$ and $G$ are quantificational formulae and $F’$ and $G’$ are propositional formulae that result from $F$ and $G$, respectively, by replacing each subformula by a corresponding propositional variable at all of its occurrences in both $F$ and $G$. Suppose $F’\equiv G’$ as formulae of propositional logic. Then replacing $F$ by $G$ in any formula results n an equivalent formula.

Example. $\forall x\neg\neg P(x)\equiv\forall x P(x)$ since $p\equiv\neg\neg p$ and so $\neg\neg P(x)$ can be replaced by $P(x)$.

Example. Replacing $P(x)$, $Q(x)$ and $R(x)$ by $p$, $q$ and $r$, respectively, turns $P(x)\wedge(Q(x)\vee R(x))$ into $p\wedge(q\vee r)$ and $(P(x)\wedge Q(x))\vee(P(x)\wedge R(x))$ into $(p\wedge q)\vee(p\wedge r)$. Since $p\wedge(q\vee r)\equiv (p\wedge q)\vee(p\wedge r)$, $P(x)\wedge(Q(x)\vee R(x))$ can be replaced by $(P(x)\wedge Q(x))\vee(P(x)\wedge R(x))$. Hence we have the equivalence in \eqref{eq:distlaw1}.

Quantificational Equivalence Rule 2 (Change of Variables)

Let $F$ be a formula containing a subformula $\Box x G$, where $\Box$ is either $\forall$ or $\exists$. Assume $G$ has no bound occurrence of $x$ and let $G’$ be the result of replacing $x$ by $y$ everywhere in $G$. Then replacing $\Box x G$ by $\Box y G’$ within the formula results in an equivalent formula.

Example. $\exists x (H(x)\wedge B(x))\equiv\exists y (H(y)\wedge B(y))$

Quantificational Equivalence Rule 3 (Quantifier negation)

\begin{align*}\neg\forall x F&\equiv\exists x\neg F\\\neg\exists x F&\equiv\forall x\neg F\end{align*}

Quantificational Equivalence Rule 4 (Scope change)

Suppose the variable $x$ does not appear in $G$. Let $\Box$ denote either $\forall$ or $\exists$. Let $\diamond$ denote either $\vee$ or $\wedge$. Then \begin{align*}(\Box x F\diamond G)&\equiv\Box x(F\diamond G)\\(G\diamond\Box x F)&\equiv\Box x(G\diamond F)\end{align*}

Example. Let $r$ be “it rains”, $\mathrm{outside(x)}$ “$x$ is outside” and $\mathrm{wet}(x)$ “$x$ gets wet”. Then “If it is raining, then anything that is outside will get wet” is written as the quantificational formula $$r\Longrightarrow\forall x(\mathrm{ouside}(x)\Longrightarrow\mathrm{wet}(x))$$ This is equivalent to $$\forall x(r\Longrightarrow(\mathrm{ouside}(x)\Longrightarrow\mathrm{wet}(x)))$$ as a consequence of scope change. The transformed formula says, in plain English, “Any object, if it is raining, will get wet if it is outside” which sounds less natural than the original statement.

Example. As a consequence of scope change we obtain \begin{align*}(\forall x P(x)\vee\exists y Q(y))&\equiv\forall x\exists y (P(x)\vee Q(x))\\&\equiv\exists y\forall x(P(x)\vee Q(y))\end{align*}

Example. For $(\forall x P(x)\vee\exists x Q(x))$, neither quantifiers can be moved out because the quantified variable $x$ appears in both subformulae. However, instead of scope change one can use change of variable rule to turn it into $(\forall x P(x)\vee\exists y Q(y))$ which is shown in the previous example.

Through repeated application of the quantificational equivalence rules, all quantifiers can be pulled out to the beginning of the formula. Such a formula is said to be in prenex normal form.

Example. Let $L(x,y)$ be “$x$ loves $y$”. Then the statement “everyone has a unique beloved” can be written as the quantificational formula $$\forall x\exists y (L(x,y)\wedge\forall z (L(x,z)\Longrightarrow y=z)$$ It can be transformed to the formula in prenex form $$\forall x\exists y\forall z(L(x,y)\wedge(L(x,z)\Longrightarrow y=z))$$

Example. Translate into quantificational logic and put into prenex form: “If there are any ants, then one of them is the queen.”

Solution. Let $A(x)$ be “$x$ is an ant and $Q(x)$ “$x$ is a queen.” Then the statement can be written as the quantificational formula $$\exists x A(x)\Longrightarrow\exists y (A(y)\wedge Q(y)\wedge\forall z (Q(z)\Longrightarrow z=y))$$ It’s direct translation is “If there exists an $x$ such that $x$ is an ant, then there exists a $y$ such that $y$ is an ant and $y$ is a queen and any $z$ that is a queen is equal to $y$.” The formula can be transformed to \begin{align*}\neg\exists x A(x)\vee\exists y(A(y)\wedge Q(y)\wedge\forall z (Q(z)\Longrightarrow z=y))&\equiv\forall x\neg A(x)\wedge\exists y(A(y)\wedge Q(y)\wedge\forall z (Q(z)\Longrightarrow z=y))\\&\equiv\forall x\exists y\forall z(\neg A(x)\vee(A(y)\wedge Q(y)\wedge\forall z (Q(z)\Longrightarrow z=y))\end{align*}


[1] Essential Discrete Mathematics for Computer Science, Harry Lewis and Rachel Zax, Princeton University Press, 2019