One-to-One Functions and Inverse Functions

A function $y=f(x)$ is said to be one-to-one if it satisfies the property
$$f(x_1)=f(x_2) \Longrightarrow x_1=x_2$$
or equivalently
$$x_1\ne x_2 \Longrightarrow f(x_1)\ne f(x_2)$$
for all $x_1,x_2$ in the domain. In plain English, no two distinct numbers in the domain are sent to the same number in the range. Figure 1 is the graph of $f(x)=x^2$. It is not one-to-one.

Figure 1. The graph of y=x^2

For example, $-1\ne 1$ but $f(-1)=1=f(1)$.

Figure 2. The graph of y=x^3

Figure 2 is the graph of $f(x)=x^3$. It is one-to-one, as is clear from the graph. But let us pretend that we don’t know the graph and want to prove that it is one-to-one from the definition. Here we go. Suppose that $f(x_1)=f(x_2)$. Then $x_1^3=x_2^3$, or $x_1^3-x_2^3=(x_1-x_2)(x_1^2+x_1x_2+x_2^2)=0$. The second factor can be written as $\left(x_1+\frac{x_2}{2}\right)^2+\frac{3}{4}x_2^2$, which vanishes only when $x_1=x_2=0$. Either way $x_1=x_2$, which completes the proof.
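The definition can also be probed numerically. The following is a minimal sketch, not a proof: the helper name `is_one_to_one_on` and the sample of integers are my own choices for illustration, and a finite sample can only expose a failure of the one-to-one property, never establish it.

```python
from itertools import combinations

def is_one_to_one_on(f, sample):
    """Return True if f takes distinct values at every pair of distinct sample points."""
    return all(f(a) != f(b) for a, b in combinations(sample, 2))

sample = range(-5, 6)  # the integers -5, ..., 5

square_ok = is_one_to_one_on(lambda x: x**2, sample)  # fails, e.g. f(-1) = f(1)
cube_ok = is_one_to_one_on(lambda x: x**3, sample)    # no collision on this sample

print(square_ok, cube_ok)
```

For $x^2$ the check fails because of pairs such as $-1$ and $1$, exactly the counterexample given above.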

Why do we care about one-to-one functions? The reason is that if $y=f(x)$ is one-to-one, it has an inverse function $y=f^{-1}(x)$.
\begin{align*}
x&\stackrel{f}{\longrightarrow} y\\
x&\stackrel{f^{-1}}{\longleftarrow} y
\end{align*}
Given a one-to-one function $y=f(x)$, here is how to find its inverse function $y=f^{-1}(x)$.

STEP 1. Swap $x$ and $y$ in $y=f(x)$. The reason we are doing this is that $\mathrm{Dom}(f)=\mathrm{Range}(f^{-1})$ and $\mathrm{Dom}(f^{-1})=\mathrm{Range}(f)$.

STEP 2. Solve the resulting equation $x=f(y)$ for $y$. The result is the inverse function $y=f^{-1}(x)$.

Example. Find the inverse function of $f(x)=\frac{2x+3}{x-1}$. (It is a one-to-one function.)

Solution. STEP 1. Let $y=\frac{2x+3}{x-1}$ and swap $x$ and $y$. Then we have
$$x=\frac{2y+3}{y-1}$$

STEP 2. Let us solve $x=\frac{2y+3}{y-1}$ for $y$. First multiply both sides by $y-1$. Then we have $x(y-1)=2y+3$ or $xy-x=2y+3$. Collecting the terms that contain $y$ on the LHS, we get $xy-2y=x+3$ or $(x-2)y=x+3$. Finally we find $y=\frac{x+3}{x-2}$. This is the inverse function.

$y=f(x)$ and its inverse $y=f^{-1}(x)$ satisfy the following properties.
$$(f\circ f^{-1})(x)=x,\ (f^{-1}\circ f)(x)=x$$
The reason for these properties to hold is clear from the definition of an inverse function. We can check the properties using the above example. I will do $(f\circ f^{-1})(x)=x$ and leave the other for an exercise.
\begin{align*}
(f\circ f^{-1})(x)&=f(f^{-1}(x))\\
&=f\left(\frac{x+3}{x-2}\right)\\
&=\frac{2\left(\frac{x+3}{x-2}\right)+3}{\left(\frac{x+3}{x-2}\right)-1}\\
&=\frac{2(x+3)+3(x-2)}{(x+3)-(x-2)}\\
&=\frac{5x}{5}\\
&=x
\end{align*}
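Both identities can be spot-checked with exact rational arithmetic. This is only a sketch under my own choices: the sample points are arbitrary, and $x=1$ and $x=2$ are skipped because $f$ and $f^{-1}$ are undefined there.

```python
from fractions import Fraction

def f(x):
    """The function from the example, f(x) = (2x + 3) / (x - 1)."""
    return (2 * x + 3) / (x - 1)

def f_inv(x):
    """The inverse found in STEP 2, (x + 3) / (x - 2)."""
    return (x + 3) / (x - 2)

# Thirds between -4 and 4, excluding x = 1 (n = 3) and x = 2 (n = 6).
samples = [Fraction(n, 3) for n in range(-12, 13) if n not in (3, 6)]

both_hold = all(f(f_inv(x)) == x and f_inv(f(x)) == x for x in samples)
print(both_hold)
```

Using `Fraction` instead of floats keeps the comparison exact, so no tolerance is needed.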
The graph of $y=f(x)$ and the graph of its inverse $y=f^{-1}(x)$ have a nice symmetry: they are symmetric about the line $y=x$. This symmetry helps us obtain the graph of $y=f^{-1}(x)$ when an explicit expression for $f^{-1}(x)$ is not available. You will see such a case later when you study the logarithmic functions. Figure 3 shows the symmetry with $y=x^2$ ($x\geq 0$) and its inverse $y=\sqrt{x}$.

Figure 3. The symmetry between the graphs of y=x^2 (red) and y=sqrt(x) (blue) about y=x (green)

Combining Functions

It’s quite interesting that functions can be treated like numbers, namely you can define $+$, $-$, $\times$, and $\div$ on a collection of functions. How do we do this? For instance given two functions $f$ and $g$, we can define a new function $f+g$ by
$$(f+g)(x)=f(x)+g(x)$$
for all $x$ in the domain. (For the sake of simplicity we assume that $f$ and $g$ have the same domain. If not, one can take the intersection of the domains of $f$ and $g$, no big deal.) In a similar manner, we can also define $f-g$, $fg$, and $\frac{f}{g}$ respectively as
\begin{align*}
(f-g)(x)&=f(x)-g(x)\\
(fg)(x)&=f(x)g(x)\\
\left(\frac{f}{g}\right)(x)&=\frac{f(x)}{g(x)}\ \mbox{provided}\ g(x)\ne 0
\end{align*}

Example. Let $f(x)=\frac{1}{x-2}$ and $g(x)=\sqrt{x}$.

(a) Find the functions $f+g$, $f-g$, $fg$ and $\frac{f}{g}$ and their domains.

(b) Find $(f+g)(4)$, $(f-g)(4)$, $(fg)(4)$, $\left(\frac{f}{g}\right)(4)$

Solution. (a) $\mathrm{Dom}(f)=\{x|x\ne 2\}$ and $\mathrm{Dom}(g)=\{x|x\geq 0\}$. So the intersection is $\{x|0\leq x<2\}\cup\{x|x>2\}=[0,2)\cup(2,\infty)$ and this is the domain of $f+g$, $f-g$ and $fg$. For $\frac{f}{g}$, since $g(0)=0$ the quotient is not defined at $x=0$, so its domain is $(0,2)\cup(2,\infty)$.
\begin{align*}
(f+g)(x)&=\frac{1}{x-2}+\sqrt{x}\\
(f-g)(x)&=\frac{1}{x-2}-\sqrt{x}\\
(fg)(x)&=\frac{\sqrt{x}}{x-2}\\
\left(\frac{f}{g}\right)(x)&=\frac{1}{(x-2)\sqrt{x}}
\end{align*}

(b) I will do only $(f+g)(4)$. One way to evaluate $(f+g)(4)$ is to use $(f+g)(x)$ we obtained in part (a) i.e. $(f+g)(4)=\frac{1}{4-2}+\sqrt{4}=\frac{5}{2}$. Another way is evaluating $f(4)$ and $g(4)$ first which are $f(4)=\frac{1}{2}$ and $g(4)=2$. Then $(f+g)(4)=f(4)+g(4)=\frac{5}{2}$.
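The remaining values in part (b) follow the same pattern. Here is a small sketch that builds the combined functions as ordinary Python closures; the helper names `add`, `sub`, `mul` and `div` are hypothetical, chosen only for this illustration.

```python
import math

def f(x):
    return 1 / (x - 2)

def g(x):
    return math.sqrt(x)

# Each helper returns a new function defined pointwise, as in the definitions above.
def add(f, g):
    return lambda x: f(x) + g(x)

def sub(f, g):
    return lambda x: f(x) - g(x)

def mul(f, g):
    return lambda x: f(x) * g(x)

def div(f, g):
    return lambda x: f(x) / g(x)  # only valid where g(x) != 0

values_at_4 = [h(f, g)(4) for h in (add, sub, mul, div)]
print(values_at_4)  # [2.5, -1.5, 1.0, 0.25]
```

Evaluating `div(f, g)(0)` raises `ZeroDivisionError`, which matches the fact that $0$ is excluded from the domain of $\frac{f}{g}$.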

Composite Functions

Given two functions $f$ and $g$, if the range of $f$ is a subset of the domain of $g$, then we can combine the two functions to create a new function which we will denote by $g\circ f$.
$$x\stackrel{f}{\longmapsto} f(x)\stackrel{g}{\longmapsto} g(f(x))$$
The above diagram suggests that we can define a new function $g\circ f$ by
$$(g\circ f)(x)=g(f(x))$$
We call $g\circ f$ “$f$ followed by $g$.”

Example. Let $f(x)=x^2$ and $g(x)=x-3$.

(a) Find composite functions $f\circ g$ and $g\circ f$ and their domains.

(b) Find $(f\circ g)(5)$ and $(g\circ f)(5)$.

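Solution. (a) By definition $(f\circ g)(x)=f(g(x))=f(x-3)=(x-3)^2$. Also by definition $(g\circ f)(x)=g(f(x))=g(x^2)=x^2-3$. From these we can clearly see both their domains are $(-\infty,\infty)$. In general, $\mathrm{Dom}(f\circ g)$ consists of the $x$ in $\mathrm{Dom}(g)$ for which $g(x)$ lies in $\mathrm{Dom}(f)$. When the range of $g$ is a subset of $\mathrm{Dom}(f)$, as assumed in the definition above, this is simply $\mathrm{Dom}(f\circ g)=\mathrm{Dom}(g)$, and likewise $\mathrm{Dom}(g\circ f)=\mathrm{Dom}(f)$.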

(b) $(f\circ g)(5)$ can be evaluated using $(f\circ g)(x)$ we obtained in part (a).
$$(f\circ g)(5)=(5-3)^2=4$$
There is another way to do this. If you don’t have to find $(f\circ g)(x)$ but only need to calculate $(f\circ g)(5)$, this may be simpler. First note $(f\circ g)(5)=f(g(5))$. Since $g(5)=5-3=2$, we get $f(g(5))=f(2)=2^2=4$. Similarly we find $(g\circ f)(5)=22$. As this example shows, in general $f\circ g\ne g\circ f$.
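Composition translates directly into code. The sketch below uses a hypothetical helper `compose(outer, inner)`, where the argument order mirrors the notation $\mathrm{outer}\circ\mathrm{inner}$.

```python
def compose(outer, inner):
    """Return the function x -> outer(inner(x)), i.e. 'inner followed by outer'."""
    return lambda x: outer(inner(x))

def f(x):
    return x**2

def g(x):
    return x - 3

f_after_g = compose(f, g)  # (f o g)(x) = (x - 3)^2
g_after_f = compose(g, f)  # (g o f)(x) = x^2 - 3

print(f_after_g(5), g_after_f(5))  # 4 22
```

The two outputs differ, which is the non-commutativity noted above.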

Time Dilation and Time Travel

In this note, we discuss one of the relativistic effects called time dilation: a clock that is moving relative to an observer will be measured to tick slower than a clock that is at rest in the observer’s reference frame. This is pretty intriguing for those who are familiar with the Newtonian notion of time as a universal parameter for motion. Let us do a thought experiment. Consider a frame $K$ at rest and suppose that a light ray is emitted by the light source $Q$ and, after reflection by the mirror $S$, is received at $E$. See Figure 1.

Figure 1. Time Dilation

The measured time interval in the frame $K$ is $\Delta t=t_2-t_1=\frac{2l}{c}$. Now consider a frame $K'$ moving at a constant speed $v$ to the right. An observer at rest in $K'$ sees the light ray emerging from $Q$, hitting the mirror (at rest in $K$) at $M$ and reaching the $x'$-axis again at $E$. The observer measures a longer time interval, as the light has to travel a longer path to reach the receiver while the speed of light remains the same according to Einstein’s postulate. How much longer? The time $\Delta t'$ measured by an observer at rest in the frame $K'$ can be easily calculated using the Pythagorean theorem applied to the isosceles triangle seen in Figure 1. We find
$$\left(\frac{c\Delta t'}{2}\right)^2=l^2+\left(\frac{v\Delta t'}{2}\right)^2$$
Solving this for $\Delta t’$ we find
\begin{equation}
\label{eq:timedilation}
\Delta t'=\frac{\Delta t}{\sqrt{1-\frac{v^2}{c^2}}}
\end{equation}
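As a quick sanity check, one can plug numbers into the light-clock geometry and confirm that the dilated interval satisfies the Pythagorean relation with the path term $\frac{v\Delta t'}{2}$ squared. The mirror distance of 1 km and the speed $v=0.6c$ below are arbitrary values assumed for illustration, and `dilated_interval` is a hypothetical helper name.

```python
import math

C = 299_792.458  # speed of light in km/s

def dilated_interval(dt, v, c=C):
    """Interval measured in the frame where the clock moves at speed v."""
    return dt / math.sqrt(1 - (v / c) ** 2)

l = 1.0        # mirror distance in km (assumed value)
v = 0.6 * C    # relative speed (assumed value)

dt = 2 * l / C                     # interval in the clock's rest frame K
dt_prime = dilated_interval(dt, v)

# Both sides of the Pythagorean relation for the light path.
lhs = (C * dt_prime / 2) ** 2
rhs = l**2 + (v * dt_prime / 2) ** 2

print(dt_prime / dt, math.isclose(lhs, rhs))
```

At $v=0.6c$ the ratio $\Delta t'/\Delta t$ comes out as $1/\sqrt{1-0.36}=1.25$.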
Note that \eqref{eq:timedilation} amounts to the Lorentz transformation into the system $K'$
$$\Delta t'=t_2'-t_1',$$
where
$$t_i'=\frac{t_i-\frac{v}{c^2}x_i}{\sqrt{1-\frac{v^2}{c^2}}},\ i=1,2$$
Since $x_1=x_2$, we obtain \eqref{eq:timedilation}. In case this whole frame thing is confusing, imagine that you are sitting in a train that is running at a constant speed. Since there is no acceleration, you do not feel that you are moving. So inside the train you are at rest (frame $K$). For an observer outside you are moving (frame $K'$), and the observer measures a longer interval ($\Delta t'$) between two ticks of your clock than you do ($\Delta t$); that is, the observer sees your clock ticking slower. In physics $\Delta t$ is called proper time. Simply speaking, proper time is the time measured by a clock that moves along with the object, i.e. a clock at rest in the object's own frame. Mathematically, proper time can be calculated from the line element $ds^2$ along a worldline, the trajectory of a moving particle or object in spacetime. Denote by $\tau$ the proper time. Then since the worldline is timelike (meaning leaning more toward time), $ds^2=-c^2d\tau^2$. So the proper time interval is given by
\begin{equation}
\begin{aligned}
\Delta\tau&=\frac{1}{c}\int\sqrt{-ds^2}\\
&=\frac{1}{c}\int\sqrt{c^2dt^2-dx^2-dy^2-dz^2}\\
&=\int\sqrt{1-\frac{v(t)^2}{c^2}}dt
\end{aligned}\label{eq:propertime}
\end{equation}
If $v(t)$ is a constant speed $v$, \eqref{eq:propertime} becomes \eqref{eq:timedilation}, with $\Delta\tau$ playing the role of $\Delta t$ and the elapsed coordinate time that of $\Delta t'$.
The time dilation effect in \eqref{eq:timedilation} suggests that time travel to the future may be possible. Here is how. The exoplanet Proxima b is interesting because it orbits within the habitable zone of the red dwarf star Proxima Centauri, which is part of the triple star system Alpha Centauri in the constellation of Centaurus, and also because it is relatively close to our world. It is located about 4.2 light-years or 40 trillion km from Earth. In fact, it is the closest known exoplanet to the Solar System.

Artist’s depiction of Proxima b

Let us say we are sending a manned spaceship to Proxima b. Also let us assume that the spaceship can travel at 90% of the speed of light. (It is actually impossible to achieve this due to a physical limitation. I will discuss this in my other note at a later time. In reality, the best we can achieve using nuclear propulsion is about 0.067% of the speed of light.) For people on Earth it would take $\Delta t'=\frac{4\times 10^{13}\mbox{km}}{2.7\times 10^5\mbox{km/sec}}=1.\overline{481}\times 10^8\mbox{sec}$ for the spaceship to get to Proxima b. Since $1\mbox{sec}=3.17\times 10^{-8}\mbox{years}$, this is about 4.7 years. Since it would take the same time from Proxima b to Earth, the overall travel time for people on Earth is 9.4 years. In reality, we will have to take some factors into consideration: it takes time for the spaceship to accelerate to 90% of the speed of light, once the spaceship is near Proxima b it will have to slow down for stopping or U-turning, etc. But for the sake of simplicity we will disregard those factors. For the crew members it takes only
\begin{align*}
\Delta t&=\sqrt{1-\frac{v^2}{c^2}}\Delta t'\\
&=\sqrt{1-(0.9)^2}\cdot 1.\overline{481}\times 10^8\mbox{sec}\\
&\approx 0.65\times 10^8\mbox{sec}\\
&\approx 2\mbox{yrs}
\end{align*}
to get to Proxima b. So when they come back home after a round trip of about 4 years by their own clocks, while 9.4 years have passed on Earth, it’s like they traveled more than 5 years forward in time. I know it is probably not what you had in mind, and yes, I admit that this is a kind of boring time travel. Can one travel backward in time? This is one of the most intriguing questions. I will come back to this question at another time.
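The arithmetic of the trip is easy to reproduce. The sketch below uses the rounded figures from the text ($4\times 10^{13}$ km, $c=3\times 10^5$ km/sec, $1\mbox{ sec}=3.17\times 10^{-8}$ years) and, like the text, ignores acceleration and deceleration. The function name `one_way_times` is a hypothetical helper.

```python
import math

DISTANCE_KM = 4e13          # Earth to Proxima b, as rounded in the text
C_KM_PER_S = 3e5            # speed of light, as rounded in the text
YEARS_PER_SECOND = 3.17e-8  # conversion factor used in the text

def one_way_times(beta):
    """Return (Earth time, crew time) in seconds for a one-way trip at speed beta * c."""
    earth_s = DISTANCE_KM / (beta * C_KM_PER_S)
    crew_s = math.sqrt(1 - beta**2) * earth_s  # time dilation formula
    return earth_s, crew_s

earth_s, crew_s = one_way_times(0.9)
earth_years = earth_s * YEARS_PER_SECOND
crew_years = crew_s * YEARS_PER_SECOND
round_trip_gain = 2 * (earth_years - crew_years)

print(round(earth_years, 1), round(crew_years, 1), round(round_trip_gain, 1))
```

The last number is the round-trip difference between the time elapsed on Earth and the time elapsed on board.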

I will finish this note with an example as an application of \eqref{eq:timedilation}. This example was taken from [1].

Example. Muon Decay

The Earth is surrounded by an atmosphere of about 30 km thickness screening us off from cosmic radiation. If a proton from the cosmic radiation hits the atmosphere, $\pi$-mesons are produced and several of them decay further into a muon and a neutrino. The muon has a mean lifetime of $\Delta t=2\times 10^{-6}\mbox{sec}$ in its rest system. Classically, even if it moved at the speed of light (which it cannot, since only massless particles travel at the speed of light), it would cover only
\begin{align*}
s&=c\Delta t\\
&=3\times 10^5\mbox{km/sec}\cdot 2\times 10^{-6}\mbox{sec}\\
&=0.6\mbox{km}
\end{align*}
or 600m. If this were true, muon particles would never reach the surface, but they are detected on the surface. In the relativistic approach,
$$s'=v\Delta t'=\frac{v\Delta t}{\sqrt{1-\frac{v^2}{c^2}}}$$
Muons at rest have a mass of $m_0c^2=10^8$eV (I know it is actually energy but due to mass-energy equivalence physicists customarily call it mass). The cosmic muons are created at an altitude of about 10km with a total energy of $E=5\times 10^9$eV. In order to apply this information we rewrite $s'$ as
\begin{align*}
s'&=\frac{vm_0c^2}{m_0c^2\sqrt{1-\frac{v^2}{c^2}}}\Delta t\\
&=\frac{v}{m_0c^2}E\Delta t\\
&\leq\frac{c}{m_0c^2}E\Delta t\\
&=\frac{3\times 10^5\mbox{km/sec}}{10^8\mbox{eV}}\cdot 5\times 10^9\mbox{eV}\cdot 2\times 10^{-6}\mbox{sec}\\
&=30\mbox{km}
\end{align*}
Here we used $E=mc^2=\frac{m_0c^2}{\sqrt{1-\frac{v^2}{c^2}}}$. We will discuss this later in another post. The actual measurement gives a value of 38km.
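The numbers in this example can be checked in a few lines. This is a sketch using the rounded values from the text. The Lorentz factor is read off as $E/(m_0c^2)$, and the muon's actual speed is recovered from it to show how close the travel distance is to the 30 km upper bound.

```python
import math

C_KM_PER_S = 3e5          # speed of light, as rounded in the text
LIFETIME_S = 2e-6         # mean lifetime in the muon's rest system
REST_ENERGY_EV = 1e8      # m_0 c^2
TOTAL_ENERGY_EV = 5e9     # E

classical_km = C_KM_PER_S * LIFETIME_S   # distance without time dilation
gamma = TOTAL_ENERGY_EV / REST_ENERGY_EV # E = gamma * m_0 c^2
bound_km = gamma * classical_km          # upper bound obtained by replacing v with c

beta = math.sqrt(1 - 1 / gamma**2)       # v / c recovered from gamma
range_km = beta * bound_km               # s' = v * gamma * (rest-frame lifetime)

print(classical_km, gamma, bound_km, range_km)
```

Since $\gamma=50$ gives $v/c\approx 0.9998$, replacing $v$ by $c$ in the bound changes the result only in the fourth significant digit.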

References:

[1] Walter Greiner, Classical Mechanics, Point Particles and Relativity, Springer, 2004

[2] Paul A. Tipler and Ralph A. Llewellyn, Modern Physics, 5th Edition, W. H. Freeman and Company, 2008

Heisenberg Relation

This morning I saw a seemingly random tweet from Sam Walters @SamuelGWalters, a mathematician at the University of Northern British Columbia.

I have no clue as to any motivation behind the tweet but the mathematical statement in it is interesting. Its proof is pretty easy though. Before we prove the statement, note that what he referred to as the Heisenberg relation (also called the Heisenberg commutator) originates from quantum mechanics, where the relation expresses the noncommutativity of the position and the momentum operators $\hat x$ and $\hat p$ as
$$[\hat x,\hat p]=\hat x\hat p-\hat p\hat x=i\hbar$$
in contrast to the classical case ($\hbar\to 0$) where the position and the momentum commute.

Let $\alpha$ and $\beta$ be scalars and $x,y,z$ be vectors (as members of a module or of a Lie algebra depending on the context). Then it is straightforward to show that
$$[\alpha x+\beta y,z]=\alpha[x,z]+\beta[y,z]$$
i.e. the commutator is linear in the first slot. It is also linear in the second slot. Hence the commutator is bilinear. Therefore, it suffices to show that, for $x$ and $y$ satisfying $[x,y]=xy-yx=1$,
$$[x^n,y]=nx^{n-1}$$
for all integers $n\geq 0$. We prove this by induction. It is trivial for $n=0$ and $n=1$. Let $n=2$. Then
\begin{align*}
[x^2,y]&=x^2y-yx^2\\
&=x^2y-(yx)x\\
&=x^2y-(xy-1)x\\
&=x^2y-xyx+x\\
&=x(xy-yx)+x\\
&=2x
\end{align*}
Now we assume that the statement is true for $n=k$ i.e.
$$[x^k,y]=kx^{k-1}$$
For $n=k+1$,
\begin{align*}
[x^{k+1},y]&=x^{k+1}y-yx^{k+1}\\
&=x^{k+1}y-(yx)x^k\\
&=x^{k+1}y-(xy-1)x^k\\
&=x^{k+1}y-xyx^k+x^k\\
&=x(x^ky-yx^k)+x^k\\
&=x(kx^{k-1})+x^k\\
&=(k+1)x^k
\end{align*}
This completes the proof.
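The identity $x^ny-yx^n=nx^{n-1}$ can be tested on a concrete pair of operators. The representation below is my own choice for illustration and is not taken from the tweet: on polynomials in a variable $t$, let $x$ act as $\frac{d}{dt}$ and $y$ as multiplication by $t$, so that $xy-yx=1$ by the product rule. Polynomials are stored as integer coefficient lists, lowest degree first.

```python
def trim(p):
    """Drop trailing zero coefficients so equal polynomials compare equal."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def x_op(p):
    """x acts as differentiation d/dt."""
    return trim([i * c for i, c in enumerate(p)][1:])

def y_op(p):
    """y acts as multiplication by t."""
    return trim([0] + list(p))

def power(op, n, p):
    """Apply the operator op to p exactly n times."""
    for _ in range(n):
        p = op(p)
    return trim(p)

def subtract(p, q):
    """Coefficientwise difference p - q."""
    m = max(len(p), len(q))
    p = list(p) + [0] * (m - len(p))
    q = list(q) + [0] * (m - len(q))
    return trim([a - b for a, b in zip(p, q)])

def commutator_xn_y(n, p):
    """(x^n y - y x^n) applied to the polynomial p."""
    return subtract(power(x_op, n, y_op(p)), y_op(power(x_op, n, p)))

p = [7, -2, 0, 5, 1]  # arbitrary test polynomial 7 - 2t + 5t^3 + t^4
checks = [
    commutator_xn_y(n, p) == trim([n * c for c in power(x_op, n - 1, p)])
    for n in range(1, 6)
]
print(checks)
```

Each entry compares $(x^ny-yx^n)p$ with $n\,x^{n-1}p$ for $n=1,\dots,5$. A check on one polynomial is of course weaker than the induction proof above.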

Nonlinear Inequalities

Nonlinear inequalities may seem more complicated and difficult to solve than linear inequalities. However, that is not really the case. There is one simple way to solve a nonlinear inequality. It’s called the test point method. I will explain this with an example.

Example. Solve the inequality $x^2\leq 5x-6$.

Solution. The inequality can be rewritten as $x^2-5x+6\leq 0$. First we find the points at which $x^2-5x+6=0$. Since $x^2-5x+6=(x-2)(x-3)$, they are $x=2,3$. These two points divide the real line into 3 regions, where $x<2$, where $2<x<3$, and where $x>3$, as shown in Figure 1.

Figure 1. Quadratic Inequality

In each region we pick a test point and check whether it satisfies the given inequality. If it does, every other number in the same region satisfies the inequality. If not, no number in that region does. While this is pretty cool, you may wonder why it works: how can one number speak for all the numbers in its region? The reason is the continuity of the function $f(x)=x^2-5x+6$: a continuous function cannot change sign without passing through zero, and there are no zeros inside a region. I will leave it at that; you will understand this better when you learn calculus.

In the region $x<2$, I pick $x=0$ as a test point. But $x=0$ does not satisfy the inequality, as the LHS is $6>0$. Move on to the next region $2<x<3$. I pick $x=2.5=\frac{5}{2}$. Since $\left(\frac{5}{2}\right)^2-5\cdot\frac{5}{2}+6=\frac{25-50+24}{4}=-\frac{1}{4}<0$, every $x$ with $2<x<3$ is a solution of the inequality. In the final region $x>3$ I pick $x=4$. Since $(4)^2-5(4)+6=2>0$, no number in this region satisfies the inequality. Since $x=2$ and $x=3$ also satisfy the inequality, the overall solution is $2\leq x\leq 3$, or $[2,3]$ in interval notation.

Inequalities like the one we just solved are called quadratic inequalities. For quadratic inequalities we can actually classify the solutions by the type of inequality, without going through the test point method every time. Let us assume that the quadratic equation $ax^2+bx+c=0$ with $a>0$ has two distinct real solutions $\alpha$ and $\beta$ ($\alpha<\beta$). Then the graph of $f(x)=ax^2+bx+c$ looks like the one in Figure 2. You can find the solution of each of the following quadratic inequalities easily from the graph in Figure 2.

Figure 2. Quadratic Inequality

  1. $ax^2+bx+c>0$: $x<\alpha$ or $x>\beta$. In interval notation, $(-\infty,\alpha)\cup(\beta,\infty)$.
  2. $ax^2+bx+c\geq 0$: $x\leq\alpha$ or $x\geq\beta$. In interval notation, $(-\infty,\alpha]\cup[\beta,\infty)$.
  3. $ax^2+bx+c<0$: $\alpha<x<\beta$. In interval notation, $(\alpha,\beta)$.
  4. $ax^2+bx+c\leq 0$: $\alpha\leq x\leq\beta$. In interval notation, $[\alpha,\beta]$.

Let us go over a couple more examples of nonlinear inequalities that are not quadratic inequalities.

Example. Solve $x(x-1)^2(x-3)<0$.

Solution. The method is the same as in the first example: we use the test point method. First find the $x$ at which $x(x-1)^2(x-3)=0$. They are $x=0, 1, 3$. So there are 4 regions under consideration: $x<0$, $0<x<1$, $1<x<3$, and $x>3$. In the region $x<0$, the test point $x=-1$ makes the LHS positive, so $x<0$ is not a solution. In the region $0<x<1$, the test point $x=\frac{1}{2}$ makes the LHS negative, so $0<x<1$ is a solution. In the region $1<x<3$, the test point $x=2$ makes the LHS negative again. This is due to the factor $(x-1)^2$. In general, if a factor $x-a$ appears an even number of times in a polynomial inequality, as it does here, the sign of the polynomial does not change at $x=a$. A little goody to know so you can save time. Let us move on to the last region. For $x>3$ the test point $x=4$ makes the LHS positive, so $x>3$ is not a solution. Therefore, the overall solution is $0<x<1$ or $1<x<3$. In interval notation, it is $(0,1)\cup(1,3)$.

Example. Solve $\frac{1+x}{1-x}\geq 1$.

Solution. First rewrite the inequality as $\frac{1+x}{1-x}-1\geq 0$, which simplifies to $\frac{2x}{1-x}\geq 0$. For an inequality like this we consider the points at which the numerator is 0 and also the points at which the denominator is 0. In our case they are $x=0, 1$ and these two points divide the real line into three regions: $x<0$, $0<x<1$, $x>1$. In the region $x<0$, the test point $x=-1$ makes the LHS negative, so $x<0$ is not a solution. In the region $0<x<1$, the test point $x=\frac{1}{2}$ makes the LHS positive, so $0<x<1$ is a solution. Finally in the region $x>1$ the test point $x=2$ makes the LHS negative, so $x>1$ is not a solution. Since $x=0$ also satisfies the inequality while $x=1$ is not in the domain, the overall solution is $0\leq x<1$, or $[0,1)$ in interval notation.
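A claimed solution set can be double-checked by brute force: evaluate the original inequality at many points and compare with membership in $[0,1)$. The sketch below does this with exact fractions on a grid of quarters, which is an arbitrary choice of mine. The point $x=1$ is skipped because the left-hand side is undefined there.

```python
from fractions import Fraction

def satisfies(x):
    """The original inequality (1 + x) / (1 - x) >= 1."""
    return (1 + x) / (1 - x) >= 1

def in_claimed_solution(x):
    """Membership in the interval [0, 1)."""
    return 0 <= x < 1

# Quarters between -3 and 3, excluding x = 1 (n = 4).
samples = [Fraction(n, 4) for n in range(-12, 13) if n != 4]

agree = all(satisfies(x) == in_claimed_solution(x) for x in samples)
print(agree)
```

Agreement on a sample is only a consistency check, but a single disagreement would be enough to show that an endpoint or a region was handled incorrectly.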