Let $S$ be a sample space and $E, F$ events. The event $E$ can be written as \begin{align*}E&=E\cap S\\&=E\cap(F\dot\cup F^c)\\&=(E\cap F)\dot\cup(E\cap F^c)\end{align*} Since the two sets on the right are disjoint, axiom 3 of finite probability together with the definition of conditional probability (assuming $0<P_r(F)<1$) gives \begin{equation}\begin{aligned}P_r(E)&=P_r(E\cap F)+P_r(E\cap F^c)\\&=P_r(E|F)P_r(F)+P_r(E|F^c)P_r(F^c)\\&=P_r(E|F)P_r(F)+P_r(E|F^c)(1-P_r(F))\end{aligned}\label{eq:baye}\end{equation} This equation states that the probability of the event $E$ is a weighted average of the conditional probability of $E$ given that $F$ has occurred and the conditional probability of $E$ given that $F$ has not occurred, with weights $P_r(F)$ and $1-P_r(F)$. The equation \eqref{eq:baye} is useful because it is often difficult to calculate the probability of the event $E$ directly, whereas knowing whether the other event $F$ has occurred makes the probability of $E$ easy to determine.

*Example*. An insurance company divides people into two categories: those who are accident prone and those who are not. Statistics show that an accident-prone person will have an accident at some time within a fixed 1-year period with probability 0.4. This probability decreases to 0.2 for a non-accident-prone person. If 30% of the population is accident prone, what is the probability that a new policyholder will have an accident within a year of purchasing a policy?

*Solution*. Let $E$ denote the event that the policyholder will have an accident within a year of purchase. Let $F$ denote the event that the policyholder is accident prone. Using the equation \eqref{eq:baye}, \begin{align*}P_r(E)&=P_r(E|F)P_r(F)+P_r(E|F^c)P_r(F^c)\\&=0.4\times 0.3+0.2\times 0.7\\&=0.26\end{align*}
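
As a sanity check on the value 0.26, the weighted average can also be estimated by simulation. The following is only a minimal sketch: the function name `simulate_accident_rate`, the number of trials, and the seed are illustrative choices, not part of the original example.

```python
import random

def simulate_accident_rate(trials=200_000, seed=0):
    """Estimate P(E) by simulating policyholders one at a time."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    accidents = 0
    for _ in range(trials):
        # F: the policyholder is accident prone, with probability 0.3
        accident_prone = rng.random() < 0.3
        # Use P(E|F) = 0.4 or P(E|F^c) = 0.2 depending on the category
        p_accident = 0.4 if accident_prone else 0.2
        if rng.random() < p_accident:
            accidents += 1
    return accidents / trials

estimate = simulate_accident_rate()
exact = 0.4 * 0.3 + 0.2 * 0.7
print(f"simulated: {estimate:.4f}, exact: {exact:.2f}")
```

The simulated frequency should land close to the exact value, and the gap shrinks as `trials` grows.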

Suppose that $P_r(E)$ and $P_r(F)$ are both nonzero. Since $E\cap F=F\cap E$, it follows from the definitions of the conditional probabilities $$P_r(E|F)=\frac{P_r(E\cap F)}{P_r(F)},\ P_r(F|E)=\frac{P_r(F\cap E)}{P_r(E)}$$ that \begin{equation}\label{eq:baye2}P_r(E|F)=\frac{P_r(F|E)P_r(E)}{P_r(F)}\end{equation} The equation \eqref{eq:baye2} is usually called *Bayes’ Theorem*, named after the Reverend Thomas Bayes (pronounced ‘beiz’), an English statistician and philosopher. If we regard the event $E$ as a hypothesis and $F$ as evidence, the probabilities $P_r(E)$ and $P_r(E|F)$ can be interpreted, respectively, as the initial degree of belief in $E$ and the degree of belief in $E$ after having accounted for the evidence $F$. The factor $\frac{P_r(F|E)}{P_r(F)}$ can then be interpreted as the support $F$ provides for $E$.

*Example*. This is the second part of the previous example. Suppose that a new policyholder has an accident within a year of purchasing a policy. What is the probability that he or she is accident prone?

*Solution*. The question asks for $P_r(F|E)$. By Bayes’ theorem \eqref{eq:baye2}, \begin{align*}P_r(F|E)&=\frac{P_r(E|F)P_r(F)}{P_r(E)}\\&=\frac{0.4\times 0.3}{0.26}=\frac{6}{13}\end{align*} i.e. on average, 6 out of every 13 policyholders who have an accident within a year of purchasing a policy are accident prone.
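
Both parts of the insurance example can be reproduced with exact rational arithmetic. This is a small sketch using Python's standard `fractions` module; the helper names `total_probability` and `posterior` are made up for illustration.

```python
from fractions import Fraction

def total_probability(p_e_given_f, p_e_given_not_f, p_f):
    """P(E) = P(E|F) P(F) + P(E|F^c) (1 - P(F))."""
    return p_e_given_f * p_f + p_e_given_not_f * (1 - p_f)

def posterior(p_e_given_f, p_e_given_not_f, p_f):
    """P(F|E) = P(E|F) P(F) / P(E), with P(E) from total_probability."""
    p_e = total_probability(p_e_given_f, p_e_given_not_f, p_f)
    return p_e_given_f * p_f / p_e

# Data from the insurance example
p_e_given_f = Fraction(4, 10)      # accident-prone person has an accident
p_e_given_not_f = Fraction(2, 10)  # non-accident-prone person has an accident
p_f = Fraction(3, 10)              # share of accident-prone people

p_e = total_probability(p_e_given_f, p_e_given_not_f, p_f)
p_f_given_e = posterior(p_e_given_f, p_e_given_not_f, p_f)

print(p_e)          # 13/50, i.e. 0.26
print(p_f_given_e)  # 6/13
```

Working with `Fraction` instead of floats avoids rounding and returns the answer in the same form as the hand computation.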

*Example*. A lab blood test is 95% effective in detecting a certain disease when it is present. The test also yields a false positive result for 1% of the healthy people tested. If 0.5% of the population actually has the disease, what is the probability that a person has the disease given that the test result is positive?

*Solution*. Let $D$ be the event that the tested person has the disease and $E$ the event that the test result is positive. We are asked to find $P_r(D|E)$. The available information is $P_r(E|D)=0.95$, $P_r(D)=0.005$, and $P_r(E|D^c)=0.01$. Using Bayes’ theorem \eqref{eq:baye2} along with \eqref{eq:baye}, \begin{align*}P_r(D|E)&=\frac{P_r(E|D)P_r(D)}{P_r(E|D)P_r(D)+P_r(E|D^c)P_r(D^c)}\\&=\frac{0.95\times 0.005}{0.95\times 0.005+0.01\times 0.995}\\&=\frac{95}{294}\approx 0.323\end{align*} i.e. only about 32% of those who test positive actually have the disease.
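
The low figure is easier to believe when the same computation is done with head counts. The sketch below assumes a hypothetical population of 100,000 people (a size chosen only so that every count is a whole number) and applies the three given rates to it.

```python
from fractions import Fraction

population = 100_000                      # hypothetical population size

diseased = population * 5 // 1000         # 0.5% have the disease: 500
healthy = population - diseased           # the remaining 99,500

true_positives = diseased * 95 // 100     # 95% of the diseased test positive: 475
false_positives = healthy * 1 // 100      # 1% of the healthy test positive: 995

# Among all positive results, the share that comes from diseased people
share = Fraction(true_positives, true_positives + false_positives)

print(true_positives, false_positives)    # 475 995
print(share)                              # 95/294
print(round(float(share), 3))             # 0.323
```

Because healthy people vastly outnumber diseased ones, even a 1% false positive rate produces about twice as many false positives as true positives.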

*Example*. During a criminal investigation, the detective in charge is 60% convinced that a suspect is guilty. Now a new piece of evidence comes to light, and it shows that the criminal has a certain characteristic (such as left-handedness, baldness, or brown hair). Suppose that 20% of the population possesses this characteristic. If it turns out that the suspect does have this characteristic, how certain is the detective now that the suspect is guilty of the crime?

*Solution*. Let $G$ be the event that the suspect is guilty and $C$ the event that he possesses the characteristic of the criminal. We are asked to find $P_r(G|C)$. The available information is then $P_r(G)=0.6$, $P_r(C|G^c)=0.2$ (an innocent suspect is assumed to have the characteristic with the same probability as a member of the general population), and $P_r(C|G)=1$ (the real criminal does have the characteristic). Using Bayes’ theorem \eqref{eq:baye2} along with \eqref{eq:baye}, \begin{align*}P_r(G|C)&=\frac{P_r(C|G)P_r(G)}{P_r(C|G)P_r(G)+P_r(C|G^c)P_r(G^c)}\\&=\frac{1\times 0.6}{1\times 0.6+0.2\times 0.4}\\&\approx 0.882\end{align*} i.e. the detective should now be about 88% certain that the suspect is guilty.
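
The same answer can be read as "prior belief times support", following the interpretation of the factor $\frac{P_r(F|E)}{P_r(F)}$ given earlier. The following sketch uses exact fractions; the variable names are illustrative only.

```python
from fractions import Fraction

p_g = Fraction(6, 10)              # prior belief that the suspect is guilty
p_c_given_g = Fraction(1)          # the real criminal has the characteristic
p_c_given_not_g = Fraction(2, 10)  # an innocent suspect has it with probability 0.2

# P(C) by the weighted-average formula
p_c = p_c_given_g * p_g + p_c_given_not_g * (1 - p_g)

# Support that the evidence C provides for the hypothesis G
support = p_c_given_g / p_c

# Updated belief: prior times support
p_g_given_c = support * p_g

print(p_c)                           # 17/25
print(support)                       # 25/17
print(p_g_given_c)                   # 15/17
print(round(float(p_g_given_c), 3))  # 0.882
```

A support factor greater than 1 (here 25/17, roughly 1.47) means the evidence raises the degree of belief in guilt, from 0.6 to about 0.882.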

*References*.

[1] Essential Discrete Mathematics for Computer Science, Harry Lewis and Rachel Zax, Princeton University Press, 2019

[2] A First Course in Probability, Sheldon Ross, 5th Edition, Prentice-Hall, 1998