Category Archives: Math on Twitter

Weyl Algebra

Sam Walters posted another inspiring math tweet. This time it is about the Weyl algebra. Roughly speaking, the Weyl algebra is the algebra generated by two elements $a$ and $b$ subject to the Heisenberg commutation relation $$ab-ba=1$$ This commutation relation originates from the canonical commutation relation $$qp-pq=i\hbar 1$$ in quantum mechanics. Here $p$ and $q$ represent the momentum and the position operators. Although it is commonly called the Weyl algebra, its idea appears to have originated from P.A.M. Dirac [1] and for that reason it is also called Dirac’s quantum algebra. In his tweet, Sam stated three properties regarding the Weyl algebra to prove as seen in the screenshot below.

So far I have been able to prove the properties (1) and (2). Their proofs follow. If/when I prove the property (3), I will include its proof here as an update.

(1) Show that $ab,a^2b^2,\cdots,a^nb^n,\cdots$ all commute with one another.

Proof.  First we show by induction on $n$ that \begin{equation}\label{eq:commut}(ab)(a^nb^n)=(a^nb^n)(ab)\end{equation} for all $n=1,2,\cdots$. The case $n=1$ is trivial. Suppose that \eqref{eq:commut} is true for $n=1,\cdots,k$. Using $ba=ab-1$, \begin{align*}(ab)(a^{k+1}b^{k+1})&=(ab)a(a^kb^k)b\\&=a(ba)(a^kb^k)b\\&=a(ab-1)(a^kb^k)b\\&=a(ab)(a^kb^k)b-a(a^kb^k)b\end{align*} Similarly, we show $$(a^{k+1}b^{k+1})(ab)=a(a^kb^k)(ab)b-a(a^kb^k)b$$ By the induction hypothesis, we have $$(ab)(a^{k+1}b^{k+1})=(a^{k+1}b^{k+1})(ab)$$ This completes the proof of \eqref{eq:commut}.

Now we show that \begin{equation}\label{eq:commut2}(a^kb^k)(a^mb^m)=(a^mb^m)(a^kb^k)\end{equation} for all $k,m=1,2,\cdots$. Fix $k$ and do induction on $m$. Suppose \eqref{eq:commut2} is true for all $m=1,\cdots,l$. \begin{align*}(a^{l+1}b^{l+1})(a^kb^k)&=a^l(ab)b^l(a^kb^k)\\&=a^l(1+ba)b^l(a^kb^k)\\&=(a^lb^l)(a^kb^k)+a^l(ba)b^l(a^kb^k)\end{align*} Similarly we show that $$(a^kb^k)(a^{l+1}b^{l+1})=(a^kb^k)(a^lb^l)+(a^kb^k)a^l(ba)b^l$$ Thus we are done if we can show that $$a^l(ba)b^l(a^kb^k)=(a^kb^k)a^l(ba)b^l$$ For $l=1$, this is clear by \eqref{eq:commut}. Assume that $l>1$. \begin{align*}a^l(ba)b^l(a^kb^k)&=a^lb(ab)b^{l-1}(a^kb^k)\\&=a^lb(1+ba)b^{l-1}(a^kb^k)\\&=(a^lb^l)(a^kb^k)+a^lb^2(ab)b^{l-2}(a^kb^k)\\&=(a^lb^l)(a^kb^k)+a^lb^2(1+ba)b^{l-2}(a^kb^k)\\&=2(a^lb^l)(a^kb^k)+a^lb^3ab^{l-2}(a^kb^k)\\&\;\;\vdots\\&=(l-1)(a^lb^l)(a^kb^k)+(a^lb^l)(ab)(a^kb^k)\end{align*} Similarly we show that $$(a^kb^k)a^l(ba)b^l=(l-1)(a^kb^k)(a^lb^l)+(a^kb^k)(a^lb^l)(ab)$$ The induction hypothesis and \eqref{eq:commut} then conclude the proof of \eqref{eq:commut2}.

Lemma. \begin{align}\label{eq:commut3}[a^k,b]&=a^kb-ba^k=ka^{k-1}\\\label{eq:commut4}[a,b^k]&=kb^{k-1}\end{align}

Proof. We prove only \eqref{eq:commut3} as \eqref{eq:commut4} can be proved similarly. The case $k=1$ is the commutation relation. Assume that $k>1$. \begin{align*}a^kb-ba^k&=a^{k-1}(ab)-ba^k\\&=a^{k-1}(1+ba)-ba^k\\&=a^{k-1}+a^{k-1}ba-ba^k\\&=a^{k-1}+a^{k-2}(ab)a-ba^k\\&=a^{k-1}+a^{k-2}(1+ba)a-ba^k\\&=2a^{k-1}+a^{k-2}ba^2-ba^k\end{align*} Continuing this process we arrive at $$a^kb-ba^k=ka^{k-1}$$

Short Proof. One can prove \eqref{eq:commut3} and \eqref{eq:commut4} straightforwardly using the formula $$p(a)b-bp(a)=p'(a)$$ for any polynomial $p$, which is discussed here.

(2) Show that $a^m\ne\lambda 1$ for any integer $m\geq 1$ and scalar $\lambda$.

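Proof. We prove by contradiction. Suppose first that $a^m=\lambda 1$ for some $m\geq 1$ and a scalar $\lambda\ne 0$. If $m=1$, then $a$ is a scalar multiple of $1$, so $ab-ba=0$, a contradiction. Thus it must be that $m>1$. Then $[a^m,b]=[\lambda 1,b]=0$. But by \eqref{eq:commut3} $[a^m,b]=ma^{m-1}$. This means that $a^{m-1}=0$, which is a contradiction because $\lambda 1=a^m=a^{m-1}a=0$. The case $\lambda=0$ is similar: if $a^m=0$ then $ma^{m-1}=[a^m,b]=0$, so $a^{m-1}=0$, and repeating this step eventually gives $a=0$, which contradicts $ab-ba=1$.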

(3) If $a^mb^n=a^pb^q$ ($m,n,p,q\geq 0$) then $m=p$ and $n=q$.

Proof. Suppose that $m\ne p$. Without loss of generality one may assume that $m>p$ i.e. $m=p+k$ for some $k\geq 1$. Then $a^mb^n=a^pb^q\Longrightarrow a^p(a^kb^n-b^q)=0$. The Weyl algebra has no zero divisors (see for example [2] where it is proved using a degree argument) and $a^p\ne 0$ by the property (2). So $a^kb^n-b^q=0$. Since $b^q$ commutes with $b$, so does $a^kb^n$ i.e. $a^kb^{n+1}-ba^kb^n=0$, that is, $[a^k,b]b^n=0$. By \eqref{eq:commut3} this reads $ka^{k-1}b^n=0$. Since $b^n\ne 0$ (by the same argument as in (2) with the roles of $a$ and $b$ exchanged) and there are no zero divisors, we get $a^{k-1}=0$. A contradiction! Therefore, $m=p$. One can show that $n=q$ in a similar manner.

Update: A Twitter user named Long offered a brilliant proof for (1). I reproduce his/her proof here. Let $x=ab$. Then one can easily show by induction on $n$ that $a^nx=(x+n)a^n$ and $xb^n=b^n(x+n)$ for $n=1,2,\cdots$. Now \begin{align*}a^{n+1}b^{n+1}&=a^nxb^n\\&=(x+n)a^nb^n\\&=(x+n)(x+n-1)\cdots (x+1)x\end{align*} This means that $a^nb^n$ for all $n=1,2,\cdots$ belong to the commutative subalgebra of the Weyl algebra generated by $x=ab$.
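As a sanity check on property (1) and on Long's formula, here is a minimal sketch in Python. It assumes the standard polynomial representation of the Weyl algebra, which is not used in the proofs above: $a$ acts as $d/dx$ and $b$ as multiplication by $x$ on polynomials stored as coefficient lists. The helper names (`trim`, `add_scaled`, `apply_word`, `x_op`) are made up for this illustration, and checking a few cases on one polynomial is of course not a proof.

```python
def trim(p):
    """Drop trailing zero coefficients (keep at least one entry)."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def a(p):
    """a = d/dx on a polynomial given as [c0, c1, c2, ...]."""
    return trim([i * c for i, c in enumerate(p)][1:] or [0])

def b(p):
    """b = multiplication by x."""
    return trim([0] + list(p))

def add_scaled(p, q, s):
    """Return p + s*q as a coefficient list."""
    size = max(len(p), len(q))
    p = list(p) + [0] * (size - len(p))
    q = list(q) + [0] * (size - len(q))
    return trim([u + s * v for u, v in zip(p, q)])

def apply_word(n, p):
    """Apply a^n b^n to p (b^n acts first, then a^n)."""
    for _ in range(n):
        p = b(p)
    for _ in range(n):
        p = a(p)
    return p

def x_op(p):
    """x = ab as an operator."""
    return a(b(p))

p = [1, 2, 0, 3]  # the polynomial 1 + 2x + 3x^3

# Heisenberg relation: (ab - ba) p = p
heisenberg_ok = add_scaled(a(b(p)), b(a(p)), -1) == p

# Property (1): a^k b^k and a^m b^m commute on p for small k, m
commute_ok = all(
    apply_word(k, apply_word(m, p)) == apply_word(m, apply_word(k, p))
    for k in range(1, 5)
    for m in range(1, 5)
)

# Long's formula for n = 1: a^2 b^2 = (x + 1) x
xp = x_op(p)
long_ok = apply_word(2, p) == add_scaled(x_op(xp), xp, 1)

print(heisenberg_ok, commute_ok, long_ok)
```

In this representation $a^nb^n$ multiplies the monomial $x^j$ by $(j+n)(j+n-1)\cdots(j+1)$, so all of these operators are diagonal on monomials, which is another way to see why they commute.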


  1.  P.A.M. Dirac, The fundamental equations of quantum mechanics, Proc. Roy. Soc. A, v.109, pp.642-653, 1925
  2. S. C. Coutinho, A Primer of Algebraic D-modules, London Mathematical Society Student Texts 33, Cambridge University Press, 1995

Solving a Functional Equation $x^y=y^x$

Here is another math problem proposed by Sam Walters, one of my favorite mathematicians on Twitter.

Sam used a trigger word for me: “isn’t hard.” I took it as being easy enough that any undergraduate math student can solve it (turns out it actually is), which means I should be able to solve it in no time. I spent some hours trying to solve the functional equation $x^y=y^x$ but I was still stuck with my wounded ego until I saw a hint from another mathematician, Rob Corless, in his reply to the above tweet. The answer lies in the Lambert W function! It is a shame, but I didn’t know the Lambert W function though I have seen it. The function $f(x)=xe^x$ is injective (one-to-one) so it is invertible, but one cannot explicitly write its inverse function, so we denote it by $W$, i.e. $x=f^{-1}(xe^x)=W(xe^x)$. This $W$ is called the Lambert W function. First let us take the natural logarithm of the equation $x^y=y^x$. With some rearrangements, we arrive at \begin{equation}\label{eq:funeq}\frac{\ln y}{y}=\frac{\ln x}{x}\end{equation} We consider \eqref{eq:funeq} for $x,y>1$. Clearly $y=x$ is a solution. Now we want to find a less trivial solution. Let us introduce a new variable $u$ which satisfies $y=\frac{1}{u}$. Then \eqref{eq:funeq} is written in terms of $u$ as \begin{equation}\label{eq:funeq2}u\ln u=-\frac{\ln x}{x}\end{equation} Next we introduce another variable $v$ which satisfies $u=e^v$. In terms of $v$, \eqref{eq:funeq2} is written as $f(v)=ve^v=-\frac{\ln x}{x}$ and hence $v=W\left(-\frac{\ln x}{x}\right)$. Since $y=e^{-v}=\frac{v}{ve^v}$, we obtain \begin{equation}\label{eq:funeq3}y=-\frac{x}{\ln x}W\left(-\frac{\ln x}{x}\right)\end{equation} For the trivial solution $y=x$ we have $v=-\ln x$, so the equation $v=W\left(-\frac{\ln x}{x}\right)$ above yields a useful identity for the Lambert W function \begin{equation}\label{eq:funeq4}W\left(-\frac{\ln x}{x}\right)=-\ln x\end{equation} From \eqref{eq:funeq4} we can get some special values of $W$, for example $W(0)=0$ (take $x=1$) and $W\left(-\frac{1}{e}\right)=-1$ (take $x=e$). The graphs of $y=x$ and $y=-\frac{x}{\ln x}W\left(-\frac{\ln x}{x}\right)$ can be seen in Figure 1.

Figure 1. The graphs of y=x (in red) and y=-xW(-ln(x)/x)/ln(x) (in blue)

It appears that $y=x$ and $y=-\frac{x}{\ln x}W\left(-\frac{\ln x}{x}\right)$ coincide on $(1,e)$. $y=-\frac{x}{\ln x}W\left(-\frac{\ln x}{x}\right)$ has a kink at $x=e$ and is decreasing on $(e,\infty)$. Differentiating \eqref{eq:funeq} with respect to $x$ results in \begin{equation}\label{eq:funeq5}\frac{1-\ln y}{y^2}\frac{dy}{dx}=\frac{1-\ln x}{x^2}\end{equation} Let $y=f(x)$. Since \eqref{eq:funeq} is symmetric in $x$ and $y$, $f(x)$ is an involution i.e. $f^{-1}=f$. By \eqref{eq:funeq5}, $\frac{dy}{dx}>0$ whenever both $x$ and $y$ lie in $(1,e)$, so on the branch plotted in Figure 1 $y=f(x)$ is an increasing function on $(1,e)$. But an increasing involution must be the identity: if $f(x)>x$ for some $x$, then $x=f(f(x))>f(x)$, a contradiction, and $f(x)<x$ is ruled out in the same way. Hence $f(x)=x$ on $(1,e)$ as we have speculated from Figure 1. $y=-\frac{x}{\ln x}W\left(-\frac{\ln x}{x}\right)$ parts ways with $y=x$ at $x=e$ and it becomes decreasing on $(e,\infty)$. As for Sam’s last question, first rewrite \eqref{eq:funeq} as $\ln y=y\frac{\ln x}{x}$. Since $y$ is decreasing on $(e,\infty)$ and is bounded below by 1, $\lim_{x\to\infty}y$ must exist and is finite, so $\lim_{x\to\infty}\ln y=0$. (This limit is obtained with $\lim_{x\to\infty}\frac{\ln x}{x}=0$.) Since $\lim_{x\to\infty}\ln y=\ln(\lim_{x\to\infty}y)$, $\lim_{x\to\infty}y=1$.


  1. I realized that I was sloppy in the definition of $W$. In general $W$ is defined as the collection of the branches of the inverse of $f(z)=ze^z$ where $z$ is the complex variable $z=x+iy$. $f(z)=ze^z$ is not injective so $W$ is multivalued. Here I considered the real version, which is the inverse of $f(x)=xe^x$, and I said it is injective. The reason is that I was restricting its domain to $x\geq 0$ though I didn’t mention it. But $f(x)=xe^x$ is in general not injective as seen in Figure 2.

    Figure 2. The graph of f(x)=xe^x.

    One can also easily show that it is not injective without a graph. $f'(x)=e^x(x+1)$ so $x=-1$ is a critical point. $f'(x)<0$ on $(-\infty,-1)$ i.e. $f(x)$ is decreasing on $(-\infty,-1)$ and $f'(x)>0$ on $(-1,\infty)$ i.e. $f(x)$ is increasing on $(-1,\infty)$. Hence $W$ is still multivalued. The upper branch $W\geq -1$ is denoted by $W_0$ and is defined to be the principal branch of $W$, and the lower branch $W\leq -1$ is denoted by $W_{-1}$. Now one can easily see why there is a kink for $y=-\frac{x}{\ln x}W\left(-\frac{\ln x}{x}\right)$ at $x=e$: that is where $W$ fails to be single-valued. In fact, $y=-\frac{x}{\ln x}W_{-1}\left(-\frac{\ln x}{x}\right)$ has no kink as seen in the nice graphics made by Greg Egan here. Also see Figure 3 for the graph of $y=-\frac{x}{\ln x}W_{-1}\left(-\frac{\ln x}{x}\right)$ alone.

    Figure 3. The graphs of y=x (in red) and y=-xW_{-1}(-ln(x)/x)/ln(x) (in blue)

    By the way, in case you don’t know, Greg is a well-known science fiction writer and his novels have lots of interesting math and physics stuff. For information on his novels visit his website. There is so much interesting stuff to learn about the Lambert W function. To learn more about it, for starters, see its Wikipedia entry and also a survey paper on the Lambert W function here. Note that one of the authors is Rob Corless. I thought he was just a knowledgeable passerby who threw a hint at others, but it turns out he is an expert on the Lambert W function.

  2.  By substituting $z_0=ze^z$ in $z=W(ze^z)$ we obtain $z_0=W(z_0)e^{W(z_0)}$ for any complex number $z_0$. This can be used as the defining equation for Lambert $W$ function.


  1. After reading Maple information on the command LambertW, I realized that Figure 1 is actually the graph of the principal branch $W_0$.
  2. The substitutions $u$ and $v$ I used to find the solution of $x^y=y^x$ can be combined into one as pointed out by Sam. Let $v$ be a variable satisfying $y=e^{-v}$. Then \eqref{eq:funeq} turns into $ve^v=-\frac{\ln x}{x}$ as before. In order for $ve^v$ to be injective we require that $-1<v<\infty$ or equivalently $0<y<e$.
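The branch behaviour described in these notes can be checked numerically. The following is a minimal sketch, not Maple's `LambertW`: it assumes a simple bisection solver for the two real branches, and the names `lambert_w` and `partner` are made up for this illustration. It is only meant for $x>1$, $x\ne e$ aside.

```python
import math

def lambert_w(z, branch=0):
    """Real Lambert W by bisection: solve w * exp(w) = z on the chosen branch."""
    if z < -1.0 / math.e:
        raise ValueError("no real solution for z < -1/e")

    def g(w):
        return w * math.exp(w)

    if branch == 0:
        # g is increasing on [-1, inf); g(-1) <= z <= g(max(1, z))
        lo, hi = -1.0, max(1.0, z)
        increasing = True
    elif branch == -1:
        if z >= 0:
            raise ValueError("W_{-1} is real only for -1/e <= z < 0")
        # g is decreasing on (-inf, -1]; push lo left until g(lo) >= z
        lo, hi = -2.0, -1.0
        while g(lo) < z:
            lo *= 2.0
        increasing = False
    else:
        raise ValueError("branch must be 0 or -1")

    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if (g(mid) < z) == increasing:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def partner(x, branch=0):
    """y = -(x / ln x) * W(-ln x / x) for x > 1, on the chosen branch."""
    z = -math.log(x) / x
    return -x / math.log(x) * lambert_w(z, branch)

print(partner(2.0, 0), partner(2.0, -1))  # principal branch vs. lower branch at x = 2
print(partner(4.0, 0), partner(4.0, -1))  # principal branch vs. lower branch at x = 4
```

With this solver the principal branch returns $y=x$ for $1<x<e$ and the nontrivial solution for $x>e$, while $W_{-1}$ does the opposite, which matches the description of Figure 1 as the graph obtained from $W_0$.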

Update: Sam tweeted that the equation $m^n=n^m$ where $m, n$ are integers with $1<m<n$ has only one solution $m=2,n=4$. This can be easily seen from Figure 1. Can we see that without a graph? Yes we can. The function $\frac{\ln x}{x}$ is strictly increasing on $(1,e)$ and strictly decreasing on $(e,\infty)$. Since $m\ne n$ and $\frac{\ln m}{m}=\frac{\ln n}{n}$, it must be that $1<m<e<n$, and the only integer in $(1,e)$ is $m=2$. Now $\frac{\ln 4}{4}=\frac{\ln 2}{2}$, so $n=4$ works, and since $\frac{\ln x}{x}$ is strictly decreasing on $(e,\infty)$ no other $n$ does. Hence we see that $(m,n)=(2,4)$ is the only solution to $m^n=n^m$.
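For the skeptical reader, a brute-force search with exact integer arithmetic agrees with this on a finite range. This is only a sketch: the function name `integer_solutions` and the bound 100 are arbitrary choices for illustration, and a finite search is no substitute for the argument above.

```python
def integer_solutions(bound):
    """All integer pairs 1 < m < n <= bound with m**n == n**m (exact arithmetic)."""
    return [
        (m, n)
        for m in range(2, bound + 1)
        for n in range(m + 1, bound + 1)
        if m ** n == n ** m
    ]

print(integer_solutions(100))
```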

Topologizing a Set by Continuous Functions

Let $f: X\longrightarrow Y$ be a function. The continuity of $f$ is then determined by the topologies on $X$ and $Y$. To be precise, $f: X\longrightarrow Y$ is continuous on $X$ if for every open set $U\subset Y$, $f^{-1}(U)$ is open in $X$. But if you have a clear idea of what class of functions, say $\{f_\alpha:X\longrightarrow Y|\alpha\in\mathscr{A}\}$, should be continuous, you can also define a topology on $X$ that makes $f_\alpha :X\longrightarrow Y$ continuous for all $\alpha\in\mathscr{A}$, as Sam Walters mentioned in his tweet. Here is how. Let $$\mathscr{S}=\{f_{\alpha}^{-1}(U)| \mbox{$U$ is open in $Y$}, \alpha\in\mathscr{A}\}$$
Since $\bigcup\{G|G\in\mathscr{S}\}=X$, $\mathscr{S}$ is a subbase for a topology on $X$. (Note: $\mathscr{S}$ is in general not a base for a topology on $X$ unless $\mathscr{A}$ is a singleton set.) Denote by $\tau(\mathscr{S})$ the topology generated by the subbase $\mathscr{S}$. Then $\tau(\mathscr{S})$ is the smallest topology on $X$ that makes $f_\alpha :X\longrightarrow Y$ continuous for all $\alpha\in\mathscr{A}$. The reason is that any topology on $X$ that makes $f_\alpha :X\longrightarrow Y$ continuous for all $\alpha\in\mathscr{A}$ must include $\mathscr{S}$. We have in fact seen a similar idea, namely the Tychonoff product topology. Consider the cartesian product $\prod_{\alpha\in\mathscr{A}}X_\alpha$ of topological spaces $(X_\alpha,\tau_\alpha)$, $\alpha\in\mathscr{A}$. The topology we want is one that makes the projection maps $\pi_\alpha :\prod_{\alpha\in\mathscr{A}}X_\alpha\longrightarrow X_\alpha$ continuous for all $\alpha\in\mathscr{A}$, in particular the smallest one. Let $$\mathscr{S}=\{\pi_\alpha^{-1}(U_\alpha)|U_\alpha\in\tau_\alpha,\alpha\in\mathscr{A}\}$$
Then $\mathscr{S}$ is a subbase for a topology called the Tychonoff product topology. This is the smallest topology that makes the projection maps continuous. The projection maps are also open.
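To make the construction of $\tau(\mathscr{S})$ concrete, here is a small finite sketch in Python. Everything in it is an illustrative assumption: $X=\{0,1,2,3\}$, $Y=\{0,1\}$ with the open sets $\emptyset$, $\{1\}$, $\{0,1\}$, and two hypothetical maps $f_1(x)=x \bmod 2$ and $f_2(x)=\lfloor x/2\rfloor$. The function names are made up as well.

```python
def preimage(f, X, U):
    """f^{-1}(U) as a subset of X."""
    return frozenset(x for x in X if f(x) in U)

def generated_topology(X, subbase):
    """Topology on the finite set X generated by the given subbase."""
    X = frozenset(X)
    # Base: all finite intersections of subbase members (empty intersection = X).
    base = {X}
    for S in subbase:
        base |= {B & S for B in base}
    # Topology: all unions of base members (empty union = empty set).
    opens = {frozenset()}
    for B in base:
        opens |= {O | B for O in opens}
    return opens

X = range(4)
open_sets_of_Y = [frozenset(), frozenset({1}), frozenset({0, 1})]
maps = [lambda x: x % 2, lambda x: x // 2]

subbase = {preimage(f, X, U) for f in maps for U in open_sets_of_Y}
topology = generated_topology(X, subbase)

for O in sorted(sorted(O) for O in topology):
    print(O)
```

In this toy example $\{3\}=\{1,3\}\cap\{2,3\}$ is open in the generated topology but is not a union of members of $\mathscr{S}$, so $\mathscr{S}$ is a subbase without being a base.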

Given a surjective function $p: X\longrightarrow Y$ from a topological space $X$ onto a set $Y$, we can also define a topology on $Y$ that makes $p$ continuous: $U\subset Y$ is open if and only if $p^{-1}(U)$ is open in $X$. This topology on $Y$ automatically makes $p$ continuous. Moreover it is the largest topology on $Y$ that makes $p$ continuous. This topology is called the identification topology. The identification topology can be used to get a new topological space (identification space) from an old one. More specifically, it can be defined on a partition of $X$ or equivalently a quotient set of $X$ modulo an equivalence relation and it makes the canonical projection $\pi$ continuous. For example, let $X$ be the unit square $[0,1]\times[0,1]$ in $\mathbb{E}^2$ with the subspace topology. Partition $X$ into the following subsets:

  1. $\{(0,0),(1,0),(0,1),(1,1)\}$ i.e. the set of four corner points.
  2. $\{(x,0),(x,1)\}$ where $0<x<1$.
  3. $\{(0,y),(1,y)\}$ where $0<y<1$.
  4. $\{(x,y)\}$ where $0<x<1$, $0<y<1$.

The resulting identification space is the torus which is homeomorphic to $S^1\times S^1$ as shown in Figure 1.

Figure 1. Clifford Torus S^1 x S^1


  1. M. A. Armstrong, Basic Topology, Springer-Verlag, 1983
  2. Benjamin T. Sims, Fundamentals of Topology, Collier Macmillan

Kepler’s Law

In his recent tweet, Sam obtained Kepler’s (second) law simply by using polar coordinates, integrals and the conservation law of angular momentum. In this note I discuss the basic physics of the conservation law of angular momentum and Kepler’s second law as its consequence.

What is angular momentum?

Let $r$ be the position vector of a particle of mass $m$, measured from a fixed point (called the pivot).

Then the angular momentum is given by $$L=r\times p$$ where $p=mv$ is the linear momentum of the mass $m$. \begin{align*}\frac{dL}{dt}&=\frac{d}{dt}(r\times mv)\\&=\frac{dr}{dt}\times mv+r\times\frac{d(mv)}{dt}\\&=v\times mv+r\times\frac{dp}{dt}\\&=r\times\frac{dp}{dt}\end{align*} since $v\times mv=0$. That is, $\frac{dL}{dt}=r\times F$ where $F=\frac{dp}{dt}$ is the force, and $r\times F$ is called the torque. If the torque $r\times F=0$ then $L$ is constant. This is the conservation law of angular momentum. $r\times F=0$ if and only if $r$ and $F$ are parallel or antiparallel, except for the trivial cases $r=0$ or $F=0$. A force that acts exclusively parallel or antiparallel to the position vector is called a central force. That is to say, angular momentum is conserved under a central force.

Conservation law of angular momentum implies Kepler’s second law

The area $dA$ spanned by $r$ and $dr$ is $$dA=\frac{1}{2}|r\times dr|$$

Figure 2. The area dA spanned by r and dr.

\begin{align*}\frac{dA}{dt}&=\frac{1}{2}|r\times v|\\&=\frac{1}{2m}|r\times mv|\\&=\frac{1}{2m}|L|\end{align*} $\frac{dA}{dt}$ is the areal velocity of the radial vector $r$. It measures how much area is swept out per unit time. For planetary motion the gravitational force is a central force, so $L$ is constant, which means $\frac{dA}{dt}$ is constant. Hence the conservation law of angular momentum implies Kepler's second law: The radial vector $r$ of a planet sweeps out equal areas in equal times.
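The equal-areas statement can be illustrated with a small simulation. The sketch below is my own assumption-laden toy model, not Sam's derivation: a planar orbit with $GM=1$ and $m=1$, initial position $(1,0)$ and velocity $(0,0.8)$, integrated with the kick-drift-kick (velocity Verlet) scheme. That scheme is chosen because each kick changes $v$ by a multiple of $r$ and each drift changes $r$ by a multiple of $v$, so $r\times v$ is preserved up to rounding. The function name `swept_areas` is made up.

```python
def swept_areas(steps=1000, dt=0.01):
    """Triangle areas swept by r in each time step of an inverse-square orbit."""
    x, y = 1.0, 0.0      # initial position (assumed)
    vx, vy = 0.0, 0.8    # initial velocity (assumed), gives a bound elliptical orbit
    areas = []
    for _ in range(steps):
        # Half kick: acceleration is -r / |r|^3, parallel to r (central force).
        r3 = (x * x + y * y) ** 1.5
        vx -= 0.5 * dt * x / r3
        vy -= 0.5 * dt * y / r3
        # Drift: move along the current velocity.
        nx, ny = x + dt * vx, y + dt * vy
        # dA = (1/2) |r x dr| for this step.
        areas.append(0.5 * abs(x * ny - y * nx))
        x, y = nx, ny
        # Second half kick at the new position.
        r3 = (x * x + y * y) ** 1.5
        vx -= 0.5 * dt * x / r3
        vy -= 0.5 * dt * y / r3
    return areas

areas = swept_areas()
print(min(areas), max(areas))
```

Each entry is the area of the thin triangle spanned by $r$ and $dr$ during one time step, and the smallest and largest entries agree to rounding error, as $\frac{dA}{dt}=\frac{1}{2m}|L|$ predicts.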


Walter Greiner, Classical Mechanics, Point Particles and Relativity, Springer-Verlag, 2004

A Linear Algebra Problem on Twitter

Problem: Let $A$ and $B$ be $n\times n$ matrices such that their sum $A+B$ is invertible. Then show that $$A(A+B)^{-1}B=B(A+B)^{-1}A$$ (Hat tip: Sam Walters)

Solution. \begin{equation}\begin{aligned}I&=(A+B)(A+B)^{-1}\\&=A(A+B)^{-1}+B(A+B)^{-1}\end{aligned}\label{eq:matrix}\end{equation} Multiply \eqref{eq:matrix} by $B$ from the right \begin{equation}\label{eq:matrix2}B=A(A+B)^{-1}B+B(A+B)^{-1}B\end{equation} Also multiply \eqref{eq:matrix} by $A$ from the right \begin{equation}\label{eq:matrix3}A=A(A+B)^{-1}A+B(A+B)^{-1}A\end{equation} Subtract \eqref{eq:matrix3} from \eqref{eq:matrix2}. \begin{equation}\label{eq:matrix4}B-A=A(A+B)^{-1}B-B(A+B)^{-1}A+B(A+B)^{-1}B-A(A+B)^{-1}A\end{equation} In a similar manner, multiplying $I=(A+B)^{-1}(A+B)$ by $A$ and by $B$ from the left and subtracting, we obtain \begin{equation}\label{eq:matrix5}A-B=A(A+B)^{-1}B-B(A+B)^{-1}A+A(A+B)^{-1}A-B(A+B)^{-1}B\end{equation} Adding \eqref{eq:matrix4} and \eqref{eq:matrix5} gives $0=2\left[A(A+B)^{-1}B-B(A+B)^{-1}A\right]$, which results in $$A(A+B)^{-1}B=B(A+B)^{-1}A$$

A mathematician whose Twitter username is Manifoldless beat me to it by a few minutes :). But not just that. His solution is shorter (so better) than mine: \begin{align*}A(A+B)^{-1}B&=(A+B-B)(A+B)^{-1}(A+B-A)\\&=[I-B(A+B)^{-1}](A+B-A)\\&=A+B-A-B+B(A+B)^{-1}A\\&=B(A+B)^{-1}A\end{align*}
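The identity is easy to spot-check with exact rational arithmetic. The sketch below uses two arbitrarily chosen $2\times 2$ integer matrices that do not commute, so the identity is not trivial for them. The helpers `matmul` and `inverse` are hand-rolled for the $2\times 2$ case only and are not meant as a general implementation.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(M):
    """Inverse of a 2x2 matrix via the adjugate formula, in exact fractions."""
    (p, q), (r, s) = M
    det = Fraction(p * s - q * r)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[s / det, -q / det], [-r / det, p / det]]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 5]]
S = [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]  # A + B
S_inv = inverse(S)

lhs = matmul(matmul(A, S_inv), B)  # A (A+B)^{-1} B
rhs = matmul(matmul(B, S_inv), A)  # B (A+B)^{-1} A

print(lhs == rhs, matmul(A, B) == matmul(B, A))
```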