Sequences

Definition. A succession of real numbers $$a_1,a_2,\cdots,a_n,\cdots$$ in a definite order is called a sequence. $a_n$ is called the $n$-th term or the general term. The sequence $\{a_1,a_2,\dots,a_n,\cdots\}$ is denoted by $\{a_n\}$ or $\{a_n\}_{n=1}^\infty$.

Example.

  1. The sequence of natural numbers $1,2,3,4,\cdots,n,\cdots$
  2. $1,-2,3,-4,\cdots,(-1)^{n-1}n,\cdots$
  3. $\frac{1}{2},-\frac{1}{4},\frac{1}{8},-\frac{1}{16},\cdots,(-1)^{n-1}\frac{1}{2^n},\cdots$
  4. $0,1,0,1,\cdots,\frac{1}{2}[1+(-1)^n],\cdots$
  5. $2,3,5,7,11,\cdots,p_n,\cdots$

The general term of a sequence need not be given by a simple formula, as it is in the first four examples above. The sequence in 5 is the succession of prime numbers, where $p_n$ stands for the $n$-th prime number. No simple closed-form formula for $p_n$ is known.

The following is the quantifying definition of the limit of a sequence due to Augustin-Louis Cauchy.

Definition. A sequence $\{a_n\}$ has a limit $L$ and we write $\lim_{n\to\infty}a_n=L$ or $a_n\to L$ as $n\to\infty$ if for any $\epsilon>0$ there exists a positive integer $N$ such that $$|a_n-L|<\epsilon\ \mbox{for all}\ n\geq N.$$

Example.

  1. Show that $\lim_{n\to\infty}\frac{1}{n}=0$
  2. Show that $\lim_{n\to\infty}\frac{1}{10^n}=0$
  3. Let $\{a_n\}$ be a sequence defined by $$a_1=0.3, a_2=0.33, a_3=0.333,\cdots,$$ show that $\lim_{n\to\infty}a_n=\frac{1}{3}$

Proof. I will prove 1; 2 and 3 are left as exercises. Let $\epsilon>0$ be given. Here $a_n=\frac{1}{n}$ and $L=0$, so $|a_n-L|=\frac{1}{n}<\epsilon$ if and only if $n>\frac{1}{\epsilon}$. Choose a positive integer $N>\frac{1}{\epsilon}$. Then for all $n\geq N$, $|a_n-L|=\frac{1}{n}\leq\frac{1}{N}<\epsilon$.
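The choice of $N$ in the proof can be tried out numerically. The following is a minimal Python sketch, not part of the proof: the function name `choose_N` is hypothetical, and a finite spot check can only illustrate the definition, never establish it.

```python
import math

def choose_N(eps: float) -> int:
    """Return a positive integer N > 1/eps, as in the proof above."""
    return math.floor(1 / eps) + 1

for eps in (0.5, 0.25, 0.003):
    N = choose_N(eps)
    # Spot-check |1/n - 0| < eps on a finite range of n >= N.
    assert all(abs(1 / n - 0) < eps for n in range(N, N + 1000))
    print(eps, N)
```

Any larger $N$ works equally well; the definition only asks that some $N$ exists for each $\epsilon$.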

The following limit laws allow us to break a complicated limit to simpler ones.

Theorem. Let $\lim_{n\to\infty}a_n=L$ and $\lim_{n\to\infty}b_n=M$. Then

  1. $\lim_{n\to\infty}(a_n\pm b_n)=L\pm M$
  2. $\lim_{n\to\infty}ca_n=cL$ where $c$ is a constant.
  3. $\lim_{n\to\infty}a_nb_n=LM$
  4. $\lim_{n\to\infty}\frac{a_n}{b_n}=\frac{L}{M}$ provided $M\ne 0$.

Example. Find $\lim_{n\to\infty}\frac{n}{n+1}$.

Solution. \begin{align*}\lim_{n\to\infty}\frac{n}{n+1}&=\lim_{n\to\infty}\frac{1}{1+\frac{1}{n}}=1\end{align*} since $\lim_{n\to\infty}\frac{1}{n}=0$.

The following theorem is also an important tool for calculating limits of certain sequences.

Theorem (Squeeze Theorem). If $a_n\leq b_n\leq c_n$ for $n\geq n_0$ and $\lim_{n\to\infty}a_n=\lim_{n\to\infty}c_n=L$ then $\lim_{n\to\infty}b_n=L$.

Corollary. If $\lim_{n\to\infty}|a_n|=0$ then $\lim_{n\to\infty}a_n=0$.

Proof. It follows from the inequality $-|a_n|\leq a_n\leq |a_n|$ for all $n$ and the Squeeze Theorem.

Example. Use the Squeeze Theorem to show $$\lim_{n\to\infty}\frac{n!}{n^n}=0$$

Solution. It follows from $$0\leq\frac{n!}{n^n}=\frac{1\cdot 2\cdot 3\cdots n}{n\cdot n\cdot n\cdots n}\leq\frac{1}{n}$$ for all $n$.
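The squeeze in this solution can be checked with exact rational arithmetic. This is a small Python sketch for illustration only; the helper name `ratio` is an assumption, and checking finitely many $n$ does not replace the inequality above.

```python
from fractions import Fraction
from math import factorial

def ratio(n: int) -> Fraction:
    """Exact value of n!/n^n."""
    return Fraction(factorial(n), n ** n)

for n in (1, 2, 5, 10, 20):
    r = ratio(n)
    # The term is squeezed between 0 and 1/n.
    assert 0 <= r <= Fraction(1, n)
    print(n, float(r), 1 / n)
```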

The following theorem enables you to use a cool formula you learned in Calculus I, L’Hôpital’s rule!

Theorem. If $\lim_{x\to\infty}f(x)=L$ and $f(n)=a_n$, then $\lim_{n\to\infty}a_n=L$.

Example. Calculate $\lim_{n\to\infty}\frac{\ln n}{n}$.

Solution. Let $f(x)=\frac{\ln x}{x}$. Then $\lim_{x\to\infty}f(x)$ is an indeterminate form of type $\frac{\infty}{\infty}$. So by L’Hôpital’s rule \begin{align*}\lim_{x\to\infty}\frac{\ln x}{x}&=\lim_{x\to\infty}\frac{(\ln x)'}{x'}\\&=\lim_{x\to\infty}\frac{1}{x}\\&=0\end{align*} Hence by the Theorem above $\lim_{n\to\infty}\frac{\ln n}{n}=0$.

Example. Calculate $\lim_{n\to\infty}\root n\of{n}$.

Solution. Let $f(x)=x^{\frac{1}{x}}$. Then $\lim_{x\to\infty}f(x)$ is an indeterminate form of type $\infty^0$. As you learned in Calculus I, you will have to convert the limit into an indeterminate form of type $\frac{\infty}{\infty}$ or type $\frac{0}{0}$ so that you can apply L’Hôpital’s rule to evaluate the limit. Let $y=x^{\frac{1}{x}}$. Then $\ln y=\frac{\ln x}{x}$. As we calculated in the previous example, $\lim_{x\to\infty}\ln y=0$. Since $\ln$ is continuous on $(0,\infty)$, $$\lim_{x\to\infty}\ln y=\ln(\lim_{x\to\infty} y)$$ Hence, $$\lim_{x\to\infty}x^{\frac{1}{x}}=e^0=1$$ i.e. $\lim_{n\to\infty}\root n\of{n}=1$.
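As a quick numerical illustration of the last two examples, the sketch below tabulates $\frac{\ln n}{n}$ and $\root n\of{n}$ for a few large $n$. The function name `nth_root_of_n` is hypothetical, and floating-point values only suggest the limits computed above.

```python
import math

def nth_root_of_n(n: int) -> float:
    """n**(1/n), computed as exp(ln(n)/n) to mirror the solution above."""
    return math.exp(math.log(n) / n)

for n in (10, 1_000, 100_000, 10_000_000):
    # First column tends to 0, second column tends to 1.
    print(n, math.log(n) / n, nth_root_of_n(n))
```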

Theorem. $\lim_{n\to\infty}\root n\of{a}=1$ for $a>0$.

Example. $\lim_{n\to\infty}\frac{1}{\root n\of{2}}=1$.

Definition. A sequence $\{a_n\}$ is said to diverge if it fails to converge. Divergent sequences include sequences that tend to infinity or negative infinity, for example $1,2,3,\cdots,n,\cdots$, and sequences that oscillate, such as $1,-1,1,-1,\cdots$.

Definition. A sequence $\{a_n\}$ is said to be bounded if there exists $M>0$ such that $|a_n|<M$ for every $n$.

Theorem. A convergent sequence is bounded but the converse need not be true.

Definition. A sequence $\{a_n\}$ is said to be monotone if it satisfies either $$a_n\leq a_{n+1}\ \mbox{for all}\ n$$ or $$a_n\geq a_{n+1}\ \mbox{for all}\ n$$

Equivalently, one can show that a sequence $\{a_n\}$ is monotone increasing by checking that it satisfies $$a_{n+1}-a_n\geq 0\ \mbox{for all}\ n$$ or, when every term is positive, $$\frac{a_{n+1}}{a_n}\geq 1\ \mbox{for all}\ n$$

The following theorem is called the Monotone Sequence Theorem.

Theorem. A monotone sequence which is bounded is convergent.

Example.

  1. Show that the sequence $$\frac{1}{2},\frac{1}{3}+\frac{1}{4},\frac{1}{4}+\frac{1}{5}+\frac{1}{6},\cdots,\frac{1}{n+1}+\frac{1}{n+2}+\cdots+\frac{1}{2n},\cdots$$ is convergent.
  2. Show that the sequence $$1,1+\frac{1}{2},1+\frac{1}{2}+\frac{1}{4},\cdots,1+\frac{1}{2}+\frac{1}{4}+\cdots+\frac{1}{2^n},\cdots$$ is convergent.

Solution. 2 is left as an exercise. For 1, $a_{n+1}-a_n=\frac{1}{2n+1}+\frac{1}{2n+2}-\frac{1}{n+1}=\frac{1}{(2n+1)(2n+2)}>0$ for all $n$. So it is monotone increasing. $$0<a_n=\frac{1}{n+1}+\cdots+\frac{1}{n+n}\leq\frac{1}{n}+\cdots+\frac{1}{n}=\frac{n}{n}=1$$ for all $n$. So it is bounded. Therefore, it is convergent by the Monotone Sequence Theorem.
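The two properties used in this solution can be spot-checked with exact fractions. The sketch below is only an illustration with a hypothetical helper `a`. Note that the Monotone Sequence Theorem guarantees convergence but does not name the limit; the comparison with $\ln 2$ in the last line relies on a known result that is not proved here.

```python
from fractions import Fraction
import math

def a(n: int) -> Fraction:
    """a_n = 1/(n+1) + 1/(n+2) + ... + 1/(2n), computed exactly."""
    return sum((Fraction(1, k) for k in range(n + 1, 2 * n + 1)), Fraction(0))

terms = [a(n) for n in range(1, 101)]
assert all(x < y for x, y in zip(terms, terms[1:]))  # monotone increasing
assert all(0 < x <= 1 for x in terms)                # bounded
print(float(terms[-1]), math.log(2))
```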

Why Can’t Speeds Exceed $c$?

This is a guest post by Dr. Lawrence R. Mead. Dr. Mead is a Professor Emeritus of Physics at the University of Southern Mississippi.

It is often stated in elementary books, and repeated by many, that the reason an object with mass cannot reach or exceed the speed of light in vacuum is that “the mass becomes infinite”, or “time stops”, or even “the object has zero size”. There are correct viewpoints for why matter or energy or any signal of any kind cannot exceed light speed, and these reasons have little to do directly with mass changes, time dilations or Lorentz contractions.

Reason Number One

Consider a signal of any kind (mass, energy or just information) which travels at speed $u=\alpha c$ beginning at point $x_A$ at time $t=0$ and arriving at position $x_B$ at later time $t>0$. From elementary kinematics,
$$t=(x_B -x_A)/u = {\Delta x \over \alpha c}.$$
Now suppose the signal travels at a speed exceeding $c$, that is $\alpha > 1$. Let us calculate the elapsed time as measured by a frame going by at speed $V<c$. According to the Lorentz transformation,
\begin{equation}\label{eqno1}t' = \gamma (t-{Vx \over c^2}),\end{equation} where $\gamma=\frac{1}{\sqrt{1-\frac{V^2}{c^2}}}$.
There are two events: the signal leaves $x=x_A$ at $t=0$, and the signal arrives at $x=x_B$ at time $t=\Delta t$. According to \eqref{eqno1}, these events in the moving frame happen at times,
$$t'_A=\gamma ( 0 -Vx_A/c^2)$$ and
$$t'_B=\gamma (\Delta t - Vx_B/c^2).$$
Thus, the interval between events in the moving frame is, \begin{equation}\begin{aligned}\Delta t' &= t'_B-t'_A\\
&=\gamma \Delta t -\gamma \frac{V}{c^2}(x_B-x_A)\\
&=\gamma \Delta t ( 1-\alpha V/ c).\end{aligned}\label{eqno2}\end{equation}
Now suppose that $\alpha V/c > 1$, which implies that,
$$ c > V > c/\alpha .$$ Then for moving frames within that range of speeds it follows from \eqref{eqno2} that,
$$\Delta t'= t'_B-t'_A <0,$$ meaning physically that the signal arrived before it was sent! This is a logical paradox which is impossible on physical grounds; no one will argue that in any frame a person can be shot before the gun is fired, or that you can learn the outcome of a horse race before the race has begun.
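The sign change in the moving-frame interval $\gamma\Delta t(1-\alpha V/c)$ is easy to see numerically. The following Python sketch assumes units in which $\Delta t=1$; the function name `delta_t_prime` and the sample value $\alpha=2$ are hypothetical choices made for illustration.

```python
import math

def delta_t_prime(alpha: float, beta: float, delta_t: float = 1.0) -> float:
    """Moving-frame interval gamma * delta_t * (1 - alpha * beta), with beta = V/c."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("the frame speed must satisfy 0 <= V/c < 1")
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return gamma * delta_t * (1.0 - alpha * beta)

alpha = 2.0  # hypothetical signal at twice light speed
for beta in (0.2, 0.4, 0.6, 0.8):
    # The interval turns negative once beta exceeds 1/alpha = 0.5.
    print(beta, delta_t_prime(alpha, beta))
```

For a signal with $\alpha\leq 1$ the factor $1-\alpha V/c$ stays positive for every $V<c$, so the order of the two events is the same in all frames.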

Well now what if the two events at $x_A$ and $x_B$ are not causally connected but one (say at $x_B$ for definiteness) simply happened after the other? How does the above argument change? How does the above math “know” that there is or is not a causal connection between the events? Everything goes the same up to the second line of equation \eqref{eqno2}:
\begin{equation}\label{eqno3} \Delta t' = \gamma \{ \Delta t - V(x_B-x_A)/c^2 \}. \end{equation}
Can there be a moving frame of speed $V<c$ for which the event at $x_B$ (the later one in S) happens before the event at $x_A$ (the earlier one in S)? If so, $\Delta t' < 0$; from \eqref{eqno3} then we find,
$$\Delta t - V(x_B-x_A)/c^2 < 0, $$ or solving for $V$,
$$ c > V > c{c\over \Delta x/ \Delta t}.$$ In order for $V$ to be less
than $c$, it must therefore be that, ${c\over \Delta x /\Delta t} < 1$, or
$${\Delta x \over \Delta t} > c.$$ This is possible for sufficiently large $\Delta x$ and/or sufficiently small $\Delta t$ because the ratio ${\Delta x \over \Delta t}$ is not the velocity of any signal, though it has the units of speed.

What is the Speed of “Light” Anyway?

Note that the Lorentz transformation contains the speed $c$ in it. What is this speed? Without referencing Maxwell’s equations of Electromagnetism, one does not know that $c$ is in fact the speed of light itself. But the above analysis shows – without reference to Maxwell – that the speed $c$ cannot be exceeded. And what is the speed talked about in the previous discussion? Well, it is the maximum speed at which one event can influence another with given (fixed) separation – thus, $c$ above isn’t really the speed of light at all; rather it is the speed of causality!

Reason Number Two

Imagine, for example, a constant force $F$ acting on a particle of (rest) mass $m$. Newton’s second Law in its relativistic form gives,
\begin{equation}\begin{aligned} F &= {dp \over dt} \\
&= {d \over dt} \, mv\gamma \\
&= m \gamma^3 \, \dot v, \end{aligned}\label{eqno4}\end{equation}
where we have assumed straight line motion. This is an autonomous differential equation whose solution, assuming the object is initially at rest, is,
$$ v(t)=at/(1+a^2t^2/c^2)^{1/2}, $$
where $a=F/m$. It is clear that as $t \to \infty$, $v(t)$
approaches $c$ and not infinity. Moreover, the differential impulse at arbitrary time $t$ on the particle can be found from taking the derivative of $v(t)$ given in the last equation,
\begin{equation}\label{eqno5}m\, dv = { F \, dt \over (1+a^2t^2/c^2)^{3/2}}. \end{equation}
From equation \eqref{eqno5}, it is clear that the incremental speed increase $dv$ over time $dt$ approaches zero as $t \to \infty$. Thus, from this point of view we see that while the force still does work, the increase in speed for a given interval of time and incremental amount of work, is less and less as time goes on which is why the speed never reaches $c$ over any finite time interval.
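The closed-form $v(t)$ can be checked against the equation of motion numerically. The sketch below assumes units with $c=1$ and $a=F/m=1$ (both arbitrary choices), and the names `v`, `gamma` and `force_per_mass` are hypothetical. It estimates $\dot v$ by a central difference and confirms that $\gamma^3\dot v$ stays close to $a$ while $v$ stays below $c$.

```python
import math

C = 1.0  # speed of light, in units where c = 1 (assumption)
A = 1.0  # a = F/m in the same units (assumption)

def v(t: float) -> float:
    """Speed under a constant force, starting from rest."""
    return A * t / math.sqrt(1.0 + (A * t / C) ** 2)

def gamma(speed: float) -> float:
    return 1.0 / math.sqrt(1.0 - (speed / C) ** 2)

def force_per_mass(t: float, h: float = 1e-5) -> float:
    """gamma^3 * dv/dt, with dv/dt from a central difference; should be close to A."""
    dvdt = (v(t + h) - v(t - h)) / (2.0 * h)
    return gamma(v(t)) ** 3 * dvdt

for t in (0.1, 1.0, 10.0, 100.0):
    print(t, v(t) / C, force_per_mass(t))
```

The second column creeps toward 1 without reaching it, which is the behaviour described above. For very large $t$ the floating-point value of $v/c$ eventually rounds to 1, so the check is only meaningful for moderate $t$.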

Reason Number Three

In the interval of time $dt$ as measured in some inertial frame observing a moving body, the clock attached to the body ticks off proper time
\begin{equation}\label{eqno6}d\tau = \sqrt{1-v^2/c^2}\, dt. \end{equation}
However, for light $v\equiv c$, and therefore $d\tau\equiv 0$. Light takes no proper time to go between two points however distantly separated in space. Thus, no object could travel faster than taking no time. This is the oft-repeated mantra of textbooks, and, while the mathematics verifies it, there are far more fundamental reasons, the best being causality as outlined above.

Is FTL (Faster-Than-Light) Possible?

Often you hear that Einstein’s relativity prevents FTL (Faster-Than-Light). Is that true? The answer is yes and no. It is not possible for a spaceship to travel faster than the speed of light. But there may be a particle that travels FTL, and the existence of such a particle would not violate the principles of relativity if its speed already exceeds the speed of light when it is created. The hypothetical particle that travels FTL is called a tachyon (tachys means fast in Greek). The name was coined by the Columbia University physicist Gerald Feinberg in 1967. When he was asked why he thought about such a particle, Feinberg reportedly quoted a Jewish proverb: “Everything which is not forbidden is allowed.” (Author’s note: This is from something I read more than 3 decades ago when I was a high school student, so I cannot cite its source. Also, I could not find any such Jewish proverb. It is, however, a constitutional principle of English law.)

For a tachyon, the Lorentz transformation is given by the complex coordinates \begin{align*}t'&=-i\frac{t-\frac{v}{c^2}x}{\sqrt{\frac{v^2}{c^2}-1}}\\x'&=-i\frac{x-vt}{\sqrt{\frac{v^2}{c^2}-1}}\end{align*} where $i=\sqrt{-1}$. Although this is a complex transformation, it is still an isometry, i.e. it preserves the Minkowski metric. In order for its energy $E$ to be real, one has to assume that its rest mass is purely imaginary, $im_0$ where $m_0>0$ is real, and hence from the relativistic energy $$E=\frac{im_0c^2}{\sqrt{1-\frac{v^2}{c^2}}}=\frac{im_0c^2}{i\sqrt{\frac{v^2}{c^2}-1}}=\frac{m_0c^2}{\sqrt{\frac{v^2}{c^2}-1}}$$ Imaginary rest mass may sound weird, but the rest mass is not an observable here because a tachyon is never at rest. What’s important is that the energy is real, as it is an observable. The following figure shows energies of a subluminal particle (in red) and a superluminal particle (in blue) with $m_0=c=1$.

Properties of Tachyons:

  1. The speed of light $c$ is the greatest lower bound of a tachyon’s speed. There is no upper limit on a tachyon’s speed.
  2. Tachyons have imaginary rest mass (as we discussed above).
  3. In order for a tachyon to slow down to the speed of light, it requires an infinite amount of energy and momentum.
  4. Another peculiar property of tachyons: if a tachyon loses energy, its speed increases. At $E=0$, $v=\infty$.
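Properties 3 and 4 can be read off from the energy formula. Here is a small Python sketch in the units $m_0=c=1$ used for the figure; the function name `energy` is hypothetical, and the code simply evaluates the two formulas for $E$ given earlier.

```python
import math

def energy(v: float) -> float:
    """Energy with m_0 = c = 1: 1/sqrt(1 - v^2) below light speed, 1/sqrt(v^2 - 1) above."""
    if abs(v) == 1.0:
        raise ValueError("the energy diverges at |v| = c")
    return 1.0 / math.sqrt(abs(1.0 - v * v))

for v in (1.001, 1.1, 2.0, 10.0, 1000.0):
    # The energy blows up as v approaches c from above and tends to 0 as v grows.
    print(v, energy(v))
```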

How do we detect tachyons if they exist? Since tachyons travel FTL, we can’t see them coming. However, if they are charged particles, they will emit electromagnetic Mach shock waves called Tscherenkov (also spelled Cherenkov) radiation. This always happens when charged particles pass through a medium at a speed higher than the phase speed of light in the medium. By detecting such Tscherenkov radiation we may be able to confirm the existence of tachyons.

An interesting question is “can we use tachyons for FTL communications?” It was answered in the negative by Richard C. Tolman in his book [2] (pp. 54-55). In [2], Tolman considered the following thought experiment. Suppose a signal is being sent from a point $A$ (cause) to another point $B$ (effect) with speed $u$. In an inertial frame $S$ where $A$ and $B$ are at rest, the elapsed time until arrival at $B$ is given by $$\Delta t=t_B-t_A=\frac{x_B-x_A}{u}$$ In another inertial frame $S'$ moving with speed $v$ relative to $S$ the elapsed time until arrival at $B$ is given, according to the Lorentz transformation, by \begin{align*}\Delta t'&=t'_{B}-t'_{A}\\&=\frac{t_B-\frac{v}{c^2}x_B}{\sqrt{1-\frac{v^2}{c^2}}}-\frac{t_A-\frac{v}{c^2}x_A}{\sqrt{1-\frac{v^2}{c^2}}}\\&=\frac{1-u\frac{v}{c^2}}{\sqrt{1-\frac{v^2}{c^2}}}\Delta t\end{align*} If $u>c$ then certain values of $v$ can make $\Delta t'$ negative. In other words, the effect occurs before the cause in this frame, a violation of causality!

Food for Thought: Can one possibly use tachyons to send a message (signal) to the past?

References:

  1. Walter Greiner, Classical Mechanics, Point Particles and Relativity, Springer, 2004
  2. Richard C. Tolman, The Theory of the Relativity of Motion, University of California Press, 1917. A scanned copy is available for viewing online here.

Weyl Algebra

Sam Walters posted another inspiring math tweet. This time it is about the Weyl algebra. Roughly speaking, the Weyl algebra is the free algebra generated by two objects $a$ and $b$ subject to the Heisenberg commutation relation $$ab-ba=1$$ This commutation relation originates from the canonical commutation relation $$qp-pq=i\hbar 1$$ in quantum mechanics. Here $p$ and $q$ represent the momentum and the position operators. Although it is commonly called the Weyl algebra, its idea appears to have originated with P.A.M. Dirac [1], and for that reason it is also called Dirac’s quantum algebra. In his tweet, Sam stated three properties of the Weyl algebra to prove, as seen in the screenshot below.

So far I have been able to prove the properties (1) and (2). Their proofs follow. If/when I prove the property (3), I will include its proof here as an update.

(1) Show that $ab,a^2b^2,\cdots,a^nb^n,\cdots$ all commute with one another.

Proof.  First we show by induction on $n$ that \begin{equation}\label{eq:commut}(ab)(a^nb^n)=(a^nb^n)(ab)\end{equation} for all $n=1,2,\cdots$. The case $n=1$ is trivial. Suppose that \eqref{eq:commut} is true for $n=1,\cdots,k$. Since $ba=ab-1$, \begin{align*}(ab)(a^{k+1}b^{k+1})&=(ab)a(a^kb^k)b\\&=a(ba)(a^kb^k)b\\&=a(ab-1)(a^kb^k)b\\&=a(ab)(a^kb^k)b-a(a^kb^k)b\end{align*} Similarly, we show $$(a^{k+1}b^{k+1})(ab)=a(a^kb^k)(ab)b-a(a^kb^k)b$$ By the induction hypothesis, we have $$(ab)(a^{k+1}b^{k+1})=(a^{k+1}b^{k+1})(ab)$$ This completes the proof of \eqref{eq:commut}.

Now we show that \begin{equation}\label{eq:commut2}(a^kb^k)(a^mb^m)=(a^mb^m)(a^kb^k)\end{equation} for all $k,m=1,2,\cdots$. Fix $k$ and do induction on $m$. Suppose \eqref{eq:commut2} is true for all $m=1,\cdots,l$. \begin{align*}(a^{l+1}b^{l+1})(a^kb^k)&=a^l(ab)b^l(a^kb^k)\\&=a^l(1+ba)b^l(a^kb^k)\\&=(a^lb^l)(a^kb^k)+a^l(ba)b^l(a^kb^k)\end{align*} Similarly we show that $$(a^kb^k)(a^{l+1}b^{l+1})=(a^kb^k)(a^lb^l)+(a^kb^k)a^l(ba)b^l$$ Thus we are done if we can show that $$a^l(ba)b^l(a^kb^k)=(a^kb^k)a^l(ba)b^l$$ For $l=1$, this is clear by \eqref{eq:commut}. Assume that $l>1$. \begin{align*}a^l(ba)b^l(a^kb^k)&=a^lb(ab)b^{l-1}(a^kb^k)\\&=a^lb(1+ba)b^{l-1}(a^kb^k)\\&=(a^lb^l)(a^kb^k)+a^lb^2(ab)b^{l-2}(a^kb^k)\\&=(a^lb^l)(a^kb^k)+a^lb^2(1+ba)b^{l-2}(a^kb^k)\\&=2(a^lb^l)(a^kb^k)+a^lb^3ab^{l-2}(a^kb^k)\\&\ \ \vdots\\&=(l-1)(a^lb^l)(a^kb^k)+(a^lb^l)(ab)(a^kb^k)\end{align*} Similarly we show that $$(a^kb^k)a^l(ba)b^l=(l-1)(a^kb^k)(a^lb^l)+(a^kb^k)(a^lb^l)(ab)$$ The induction hypothesis and \eqref{eq:commut} then conclude the proof of \eqref{eq:commut2}.

Lemma. \begin{align}\label{eq:commut3}[a^k,b]&=a^kb-ba^k=ka^{k-1}\\\label{eq:commut4}[a,b^k]&=kb^{k-1}\end{align}

Proof. We prove only \eqref{eq:commut3} as \eqref{eq:commut4} can be proved similarly. The case $k=1$ is the commutation relation. Assume that $k>1$. \begin{align*}a^kb-ba^k&=a^{k-1}(ab)-ba^k\\&=a^{k-1}(1+ba)-ba^k\\&=a^{k-1}+a^{k-1}ba-ba^k\\&=a^{k-1}+a^{k-2}(ab)a-ba^k\\&=a^{k-1}+a^{k-2}(1+ba)a-ba^k\\&=2a^{k-1}+a^{k-2}ba^2-ba^k\end{align*} Continuing this process we arrive at $$a^kb-ba^k=ka^{k-1}$$

Short Proof. One can prove \eqref{eq:commut3} and \eqref{eq:commut4} straightforwardly using the formula $$p(a)b-bp(a)=p'(a)$$ for any polynomial $p$, which is discussed here.
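The identity $[a^k,b]=ka^{k-1}$ can be sanity-checked in a concrete representation. The Python sketch below assumes the familiar realization $a\mapsto\frac{d}{dx}$, $b\mapsto$ multiplication by $x$ acting on polynomials, which satisfies $ab-ba=1$. All function names are hypothetical, polynomials are stored as coefficient lists $[c_0,c_1,\dots]$, and checking sample polynomials is of course not a proof.

```python
def trim(p):
    """Drop trailing zero coefficients (keep at least one entry)."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def a(p):
    """Apply a = d/dx to a coefficient list [c0, c1, c2, ...]."""
    return trim([i * c for i, c in enumerate(p)][1:] or [0])

def b(p):
    """Apply b = multiplication by x."""
    return trim([0] + list(p))

def power(op, k, p):
    """Apply the operator op to p, k times."""
    for _ in range(k):
        p = op(p)
    return p

def sub(p, q):
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim([x - y for x, y in zip(p, q)])

def scale(c, p):
    return trim([c * x for x in p])

def commutator(k, p):
    """(a^k b - b a^k) applied to p; operators act right to left."""
    return sub(power(a, k, b(p)), b(power(a, k, p)))

f = [1, 2, 3, 0, 0, 5]  # 1 + 2x + 3x^2 + 5x^5, an arbitrary test polynomial
for k in range(1, 7):
    assert commutator(k, f) == scale(k, power(a, k - 1, f))
print(commutator(3, f))
```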

(2) Show that $a^m\ne\lambda 1$ for any integer $m\geq 1$ and scalar $\lambda$.

Proof. We prove this by contradiction. Suppose that $a^m=\lambda 1$ for some $m\geq 1$ and a scalar $\lambda\ne 0$. If $m=1$, then $ab-ba=0$, a contradiction. Thus it must be that $m>1$. Then $[a^m,b]=[\lambda 1,b]=0$. But by \eqref{eq:commut3} $[a^m,b]=ma^{m-1}$. This means that $a^{m-1}=0$, which is a contradiction because then $\lambda 1=a^m=a^{m-1}a=0$.

(3) If $a^mb^n=a^pb^q$ ($m,n,p,q\geq 0$) then $m=p$ and $n=q$.

Proof. Suppose that $m\ne p$. Without loss of generality one may assume that $m>p$, i.e. $m=p+k$ for some $k\geq 1$. Then $a^mb^n=a^pb^q\Longrightarrow a^p(a^kb^n-b^q)=0$. The Weyl algebra has no zero divisors (see for example [2], where it is proved using a degree argument) and $a^p\ne 0$ by the property (2). So $a^kb^n-b^q=0$. Since $b^q$ commutes with $b$, so does $a^kb^n$, i.e. $a^kb^{n+1}-ba^kb^n=0$. This says that $[a^k,b]b^n=ka^{k-1}b^n=0$ and hence, again because there are no zero divisors, $ka^{k-1}=0$. A contradiction! Therefore, $m=p$. One can show that $n=q$ in a similar manner.

Update: A twitter user named Long offered a brilliant proof for (1). I reproduce his/her proof here. Let $x=ab$. Then one can easily show by induction on $n$ that $a^nx=(x+n)a^n$ and $xb^n=b^n(x+n)$ for $n=1,2,\cdots$. Now \begin{align*}a^{n+1}b^{n+1}&=a^nxb^n\\&=(x+n)a^nb^n\\&=(x+n)(x+n-1)\cdots (x+1)x\end{align*} This means $a^nb^n$ for all $n=1,2,\cdots$ belong to the commutative subalgebra of Weyl algebra generated by $x=ab$.
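The product formula in this proof can be illustrated on monomials. The sketch below again assumes the realization $a\mapsto\frac{d}{dx}$, $b\mapsto$ multiplication by $x$ (an assumption made for illustration, with hypothetical function names). In that realization $x=ab$ multiplies the monomial $x^j$ by $j+1$, so $a^nb^n=(x+n-1)\cdots(x+1)x$ should multiply it by $(j+1)(j+2)\cdots(j+n)$.

```python
from math import prod

def apply_a(c: int, j: int):
    """d/dx applied to c * x^j, returned as (coefficient, exponent)."""
    return (c * j, j - 1) if j > 0 else (0, 0)

def apply_b(c: int, j: int):
    """Multiplication by x applied to c * x^j."""
    return (c, j + 1)

def an_bn(n: int, j: int) -> int:
    """Coefficient of x^j in a^n b^n applied to x^j (b^n acts first)."""
    c, e = 1, j
    for _ in range(n):
        c, e = apply_b(c, e)
    for _ in range(n):
        c, e = apply_a(c, e)
    assert e == j  # a^n b^n maps x^j to a multiple of x^j
    return c

def product_formula(n: int, j: int) -> int:
    """Eigenvalue of (x+n-1)...(x+1)x on x^j, where x = ab acts as j + 1."""
    return prod(j + 1 + i for i in range(n))

for n in range(1, 6):
    for j in range(0, 6):
        assert an_bn(n, j) == product_formula(n, j)
print(an_bn(2, 3), product_formula(2, 3))
```

Since every $a^nb^n$ acts diagonally on the monomials in this realization, the commutativity in (1) is visible directly.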

References:

  1.  P.A.M. Dirac, The fundamental equations of quantum mechanics, Proc. Roy. Soc. A, v.109, pp.642-653, 1925
  2. S. C. Coutinho, A Primer of Algebraic D-modules, London Mathematical Society Student Texts 33, Cambridge University Press, 1995

Solving a Functional Equation $x^y=y^x$

Here is another math problem proposed by Sam Walters, one of my favorite mathematicians on Twitter.

Sam used a trigger phrase for me: “isn’t hard.” I took it as being easy enough for any undergraduate math student to solve (it turns out it actually is), which means I should be able to solve it in no time. I spent some hours trying to solve the functional equation $x^y=y^x$, but I was still stuck with my wounded ego until I saw a hint from another mathematician, Rob Corless, in his reply to the above tweet. The answer lies in the Lambert W function! It is a shame, but I didn’t know the Lambert W function, though I had seen it. The function $f(x)=xe^x$ is injective (one-to-one) so it is invertible, but one cannot explicitly write its inverse function, so we denote it by $W(xe^x)$, i.e. $x=f^{-1}(xe^x)=W(xe^x)$. This $W$ is called the Lambert W function. First let us take the natural logarithm of the equation $x^y=y^x$. With some rearrangements, we arrive at \begin{equation}\label{eq:funeq}\frac{\ln y}{y}=\frac{\ln x}{x}\end{equation} Equation \eqref{eq:funeq} is well defined by the conditions $x,y>1$. Clearly $y=x$ is a solution. Now we want to find a less trivial solution. Let us introduce a new variable $u$ which satisfies $y=\frac{1}{u}$. Then \eqref{eq:funeq} is written in terms of $u$ as \begin{equation}\label{eq:funeq2}u\ln u=-\frac{\ln x}{x}\end{equation} Next we introduce another variable $v$ which satisfies $u=e^v$. In terms of $v$, \eqref{eq:funeq2} is written as $f(v)=ve^v=-\frac{\ln x}{x}$ and hence $v=W\left(-\frac{\ln x}{x}\right)$, i.e. \begin{equation}\label{eq:funeq3}y=-\frac{x}{\ln x}W\left(-\frac{\ln x}{x}\right)\end{equation} Applied to the trivial solution $y=x$, for which $v=-\ln x$, the equation $v=W\left(-\frac{\ln x}{x}\right)$ above gives a useful identity for the Lambert W function \begin{equation}\label{eq:funeq4}W\left(-\frac{\ln x}{x}\right)=-\ln x\end{equation} From \eqref{eq:funeq4} we can get some special values of $W$, for example $W(0)=0$ and $W\left(-\frac{1}{e}\right)=-1$. The graphs of $y=x$ and $y=-\frac{x}{\ln x}W\left(-\frac{\ln x}{x}\right)$ can be seen in Figure 1.

Figure 1. The graphs of y=x (in red) and y=-xW(-ln(x)/x)/ln(x) (in blue)

It appears that $y=x$ and $y=-\frac{x}{\ln x}W\left(-\frac{\ln x}{x}\right)$ coincide on $(1,e)$. $y=-\frac{x}{\ln x}W\left(-\frac{\ln x}{x}\right)$ has a kink at $x=e$ and is decreasing on $(e,\infty)$. Differentiating \eqref{eq:funeq} with respect to $x$ results in \begin{equation}\label{eq:funeq5}\frac{1-\ln y}{y^2}\frac{dy}{dx}=\frac{1-\ln x}{x^2}\end{equation} Let $y=f(x)$. Recall $\frac{df^{-1}(x)}{dx}=\frac{1}{\frac{df(x)}{dx}}$. But $f(x)$ is an involution i.e. $f^{-1}=f$ and so $\frac{df(x)}{dx}=\pm 1$. By \eqref{eq:funeq5} with $f(x)$ being an involution, we see that $y=f(x)$ is an increasing function on $(1,e)$ thus $\frac{df(x)}{dx}=1$ and with $f(e)=e$, $f(x)=x$ on $(1,e)$ as we have speculated from Figure 1. $y=-\frac{x}{\ln x}W\left(-\frac{\ln x}{x}\right)$ parts ways with $y=x$ at $x=e$ and it becomes decreasing on $(e,\infty)$. As for Sam’s last question, first rewrite \eqref{eq:funeq} as $\ln y=y\frac{\ln x}{x}$. Since $y$ is decreasing on $(e,\infty)$ and is bounded below by 1, $\lim_{x\to\infty}y$ must exist, so $\lim_{x\to\infty}\ln y=0$. (This limit is obtained with $\lim_{x\to\infty}\frac{\ln x}{x}=0$.) Since $\lim_{x\to\infty}\ln y=\ln(\lim_{x\to\infty}y)$, $\lim_{x\to\infty}y=1$.

Update:

  1. I realized that I was sloppy in the definition of $W$. In general $W$ is defined as the collection of the branches of $f(z)=ze^z$ where $z$ is the complex variable $z=x+iy$. $f(z)=ze^z$ is not injective so $W$ is multivalued. I considered the real version which is the inverse of $f(x)=xe^x$ here and I said it is injective. The reason is that I was restricting its domain to $x\geq 0$ though I didn’t mention it. But $f(x)=xe^x$ is in general not injective as seen in  Figure 2.

    Figure 2. The graph of f(x)=xe^x.

    One can also easily show that it is not injective without a graph. $f'(x)=e^x(x+1)$ so $x=-1$ is a critical point. $f'(x)<0$ on $(-\infty,-1)$ i.e. $f(x)$ is decreasing on $(-\infty,-1)$ and $f'(x)>0$ on $(-1,\infty)$ i.e. $f(x)$ is increasing on $(-1,\infty)$. Hence $W$ is still multivalued. The upper branch $W\geq -1$ is denoted by $W_0$ and is defined to be the principal branch of $W$, and the lower branch $W\leq -1$ is denoted by $W_{-1}$. Now one can easily see why there is a kink for $y=-\frac{x}{\ln x}W\left(-\frac{\ln x}{x}\right)$ at $x=e$: that is where $W$ fails to be single-valued. In fact, $y=-\frac{x}{\ln x}W_{-1}\left(-\frac{\ln x}{x}\right)$ has no kink as seen in the nice graphics made by Greg Egan here. Also see Figure 3 for the graph of $y=-\frac{x}{\ln x}W_{-1}\left(-\frac{\ln x}{x}\right)$ alone.

    Figure 3. The graphs of y=x (in red) and y=-xW_{-1}(-ln(x)/x)/ln(x) (in blue)

    By the way, in case you don’t know, Greg is a well-known science fiction writer and his novels have lots of interesting math and physics stuff. For information on his novels visit his website. There is so much interesting stuff to learn about the Lambert W function. To learn more about it, for starters, see its Wikipedia entry and also a survey paper on the Lambert W function here. Note that one of the authors is Rob Corless. I thought he was just a knowledgeable passerby who threw a hint at others, but it turns out he is an expert on the Lambert W function.

  2.  By substituting $z_0=ze^z$ in $z=W(ze^z)$ we obtain $z_0=W(z_0)e^{W(z_0)}$ for any complex number $z_0$. This can be used as the defining equation for Lambert $W$ function.
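The two real branches can be evaluated with nothing more than bisection on $we^w=z$, which is enough to try the formula $y=-\frac{x}{\ln x}W\left(-\frac{\ln x}{x}\right)$ on a known pair. This is a minimal Python sketch, not a production implementation: the function names are hypothetical, bisection is just one simple choice of root finder, and accuracy degrades near the branch point $z=-\frac{1}{e}$.

```python
import math

def _g(w: float) -> float:
    return w * math.exp(w)

def lambert_w0(z: float) -> float:
    """Principal real branch: the w >= -1 with w*exp(w) = z, for z >= -1/e."""
    if z < -math.exp(-1):
        raise ValueError("no real solution for z < -1/e")
    lo, hi = -1.0, 1.0
    while _g(hi) < z:          # g is increasing on [-1, inf)
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if _g(mid) < z:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def lambert_wm1(z: float) -> float:
    """Lower real branch: the w <= -1 with w*exp(w) = z, for -1/e <= z < 0."""
    if not -math.exp(-1) <= z < 0:
        raise ValueError("W_{-1} is real only for -1/e <= z < 0")
    lo, hi = -2.0, -1.0
    while _g(lo) < z:          # g is decreasing on (-inf, -1]
        lo *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if _g(mid) > z:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def partner(x: float, branch) -> float:
    """y = -(x / ln x) * W(-ln x / x) for x > 1, using the given branch of W."""
    t = -math.log(x) / x
    return -(x / math.log(x)) * branch(t)

print(partner(2.0, lambert_w0), partner(2.0, lambert_wm1))
print(partner(4.0, lambert_w0), partner(4.0, lambert_wm1))
```

At $x=2$ the principal branch returns the trivial solution $y=2$ and $W_{-1}$ returns $y=4$; at $x=4$ the roles are swapped, consistent with $2^4=4^2$.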

Update:

  1. After reading Maple information on the command LambertW, I realized that Figure 1 is actually the graph of the principal branch $W_0$.
  2. The substitutions $u$ and $v$ I used to find the solution of $x^y=y^x$ can be combined into one, as pointed out by Sam. Let $v$ be a variable satisfying $y=e^{-v}$. Then \eqref{eq:funeq} turns into $ve^v=-\frac{\ln x}{x}$ as before. In order for $ve^v$ to be injective we require that $-1<v<\infty$ or equivalently $0<y<e$.

Update: Sam tweeted that the equation $m^n=n^m$ where $m, n$ are integers with $1<m<n$ has only one solution $m=2,n=4$. This can be easily seen from Figure 1. Can we see that without a graph? Yes we can. The function $\frac{\ln x}{x}$ has derivative $\frac{1-\ln x}{x^2}$, so it is strictly increasing on $(1,e)$ and strictly decreasing on $(e,\infty)$. Since $m\ne n$ and $\frac{\ln m}{m}=\frac{\ln n}{n}$, it must be that $1<m<e<n$, and the only integer in $(1,e)$ is $m=2$. Then $\frac{\ln n}{n}=\frac{\ln 2}{2}=\frac{\ln 4}{4}$, and since $\frac{\ln x}{x}$ is injective on $(e,\infty)$, $n=4$. Hence $(m,n)=(2,4)$ is the only solution to $m^n=n^m$.
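A brute-force search over a finite range agrees with this claim. The sketch below uses Python's exact integer arithmetic; the function name `integer_solutions` and the search bound are arbitrary choices, and a finite search only supports the statement rather than proving it.

```python
def integer_solutions(limit: int):
    """All pairs 1 < m < n <= limit with m**n == n**m, using exact integers."""
    return [
        (m, n)
        for m in range(2, limit)
        for n in range(m + 1, limit + 1)
        if m ** n == n ** m
    ]

print(integer_solutions(200))  # → [(2, 4)]
```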