Probability answers the following question. How frequently should we expect one outcome or another to occur if we repeat some procedure many times? First let’s introduce some terminologies that are frequently used in studying probability:
- Trial: a repeatable procedure such as tossing a coin or drawing a marble from a jar
- Experiment: a sequence of individual trials
- Sample space (of a trial): the set of all possible outcomes
- An event: any set of outcomes, so it is a subset of sample space
Example. Rolling a die is a trial. It’s sample space is the set $\{1,2,3,4,5,6\}$. The subset $\{2,4,6\}$ is an event.
A sample space in general can be an infinite set but we consider only finite sample space here. Let $S$ be a sample space. A probability function $P_r: \wp(S)\longrightarrow\mathbb{R}$ assigns probabilities to events. Any probability function must satisfy the axioms of finite probability:
- $0\leq P_r(A)\leq 1$ for any event $A$.
- $P_r(\emptyset)=0$ and $P_r(S)=1$.
- For any events $A$ and $B$ with $A\cap B=\emptyset$, $$P_r(A\cup B)=P_r(A)+P_r(B)$$
Remark. Because of axiom 1, the codomain of a probability function $P_r$ can be restricted to the unit interval $[0,1]$, i.e. $P_r:\wp(S)\longrightarrow[0,1]$.
Theorem. The probability of an event $A$ not happening is $$P_r(A^c)=1-P_r(A)$$
Proof. It follows from $S=A\dot\cup A^c$ and axiom 3. Here, $\dot\cup$ means disjoint union.
Theorem. Let $A$ and $B$ be events such that $A\subseteq B$. Then $P_r(A)\leq P_r(B)$.
Proof. The set $B$ cam be written as $B=A\dot\cup(B-A)$, thus by axiom 3 $$P_r(B)=P_r(A)+P_r(B-A)\geq P_r(A)$$
This theorem is extended to
Theorem. Let $A_1,A_2,\cdots,A_n$ be events such that $A_i\cap A_j=\emptyset$ for $i\ne j$. Then $$P_r\left(\bigcup A_i\right)=\sum P_r(A_i)$$
Let $A$ and $B$ be two events and assume that $A$ and $B$ are not necessarily disjoint. How do we calculate $P_r(A\cup B)$, i.e. the probability of the event $A$ or $B$ happening? The following theorem answers that.
Theorem. (Inclusion-Exclusion Principle) Let $A$ and $B$ be two events (not necessarily disjoint). Then $$P_r(A\cup B)=P_r(A)+P_r(B)-P_r(A\cap B)$$
Proof. \begin{align*}A&=(A-B)\dot\cup(A\cap B)\\B&=(B-A)\dot\cup(A\cap B)\end{align*} Hence by axiom 3, we have \begin{align*}P_r(A-B)&=P_r(A)-P_r(A\cap B)\\P_r(B-A)&=P_r(B)-P_r(A\cap B)\end{align*} Also $A-B$, $B-A$ and $A\cap B$ are mutually disjoint, so by the previous theorem we have \begin{align*}P_r(A\cup B)&=P_r(A-B)+P_r(B-A)+P(A\cap B)\\&=P_r(A)+P_r(B)-P_r(A\cap B)\end{align*}
Consider a trial with $n$ possible outcomes. Assume that the outcomes are equally likely. Examples of trials with equally likely outcomes include coin tossing, rolling a die, drawing marbles from a jar, etc. Let $S$ be the sample space of a trial with equally likely outcomes. Define a function $P_r: \wp(S)\longrightarrow[0,1]$ by $$P_r(E)=\frac{|E|}{|S|}$$ for event $E$. Then it is straightforward to show that $P_r$ satiesfies axioms 1-3, i.e. it is a probability function.
Example. If a fair coin is flipped 4 times, what is the probability that exactly 2 of the flip’s land heads up?
Solution. $|S|=2^4=16$. Let $E$ be the set of outcomes that contain exactly 2 heads. Then $|E|=\begin{pmatrix}4\\2\end{pmatrix}=6$. Hence, $P_r(E)=\frac{6}{16}=\frac{3}{8}$.
Example. The Gambler’s Fallacy. Let $n$ be a large number. If a coin is flipped $n$ times and comes up heads the first $n-1$ times, what is the probability that the $n$th flip comes up head?
Solution. Although your intuition might tell you that the coin is due to come up tail, the coin doesn’t remember the history of previous outcomes. The probability is still $\frac{1}{2}$.
When the knowledge of whether an event $A$ occurred gives no information about whether another event $B$ occurred, $A$ and $B$ are said to be independent. Mathematically, the events $A$ and $B$ are independent if and only if $$P_r(A\cap B)=P_r(A)P_r(B)$$
Example. Let us consider $n=4$ case in the previous example. Let $A$ be the event that a coin is flipped 4 times and comes up heads first three times. Let $B$ be the event that a coin is flipped 4 times and comes up head last time. We show that $A$ and $B$ are independent events. Then \begin{align*}A&=\{HHHH,HHHT\}\\B&=\{HHHH,HHTH,HTHH,THHH,THTH,TTHH,TTTH\}\end{align*} Hence, \begin{align*}P_r(A\cap B)&=\frac{|A\cap B|}{|S|}=\frac{1}{16}\\P_r(A)P_r(B)&=\frac{|A|}{|S|}\frac{|B|}{|S|}=\frac{2}{16}\frac{8}{16}=\frac{1}{16}\end{align*} Since $P_r(A\cap B)=P_r(A)P_r(B)$, $A$ and $B$ are independent.
Example. Suppose three marbles are drawn from a jar containing six red marbles and four blue marbles. Let us consider the following events.
- $A$: The three marbles are not all the same color.
- $B$: The first marble drawn is red.
Are $A$ and $B$ independent?
Solution. The number of ways of drawing 3 marbles of the same color is $$\begin{pmatrix}6\\3\end{pmatrix}+\begin{pmatrix}4\\3\end{pmatrix}=24$$ So the number of ways of choosing 3 marbles that are not all the same color, i.e. $|A|=\begin{pmatrix}10\\3\end{pmatrix}-24=120-24=96$. Hence, $P_r(A)=\frac{96}{120}=\frac{4}{5}$. The probability that the first marble drawn is red is just $P_r(B)=\frac{6}{10}=\frac{3}{5}$. To calculate $P_r(A\cap B)$, since first marble is drawn red, the remaining two marbles drawn must contain at least one blue marble. There are $\begin{pmatrix}5\\2\end{pmatrix}=10$ ways of getting 2 more red marbles and so there are $\begin{pmatrix}9\\2\end{pmatrix}-10=36-10=26$ ways of drawing at least one blue marble. Therefore, $P_r(A\cap B)=\frac{6}{10}\frac{26}{36}=\frac{13}{30}$. On the other hand, $P_r(A)P_r(B)=\frac{4}{5}\frac{3}{5}=\frac{12}{25}$. Since $P_r(A\cap B)\ne P_r(A)P_r(B)$, the events $A$ and $B$ are not independent.
References.
[1] Essential Discrete Mathematics for Computer Science, Harry Lewis and Rachel Zax, Princeton University Press, 2019