The Schrödinger equation $$i\hbar\frac{\partial\psi}{\partial t}=\hat H\psi$$ is a non-relativistic approximation to what should be a more realistic relativistic equation. The natural starting point for a relativistic generalization is the relativistic energy $$E=\sqrt{c^2p^2+m^2c^4}$$ Replacing $E$ and $p$ by the operators $i\hbar\frac{\partial}{\partial t}$ and $-i\hbar\nabla$, respectively, we obtain the square-root Klein-Gordon equation $$i\hbar\frac{\partial\psi(t,x)}{\partial t}=\sqrt{-c^2\hbar^2\nabla^2+m^2c^4}\,\psi(t,x)$$ This equation is, however, not a desirable one. Because of the square root on the right-hand side, it is impossible to include external electromagnetic fields in a relativistically invariant way.
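
As a quick consistency check (a standard Taylor expansion, stated here under the assumption $p\ll mc$), the relativistic energy reduces to the rest energy plus the non-relativistic kinetic energy: $$\sqrt{c^2p^2+m^2c^4}=mc^2\sqrt{1+\frac{p^2}{m^2c^2}}=mc^2+\frac{p^2}{2m}-\frac{p^4}{8m^3c^2}+\cdots$$ Dropping the constant $mc^2$ and the higher-order terms leaves $p^2/2m$, the free-particle Hamiltonian of the Schrödinger equation.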

P.A.M. Dirac considered a linearization of the relativistic energy by writing \begin{equation}\label{eq:linenergy}E=c\sum_{i=1}^3\alpha_ip_i+\beta mc^2=c\alpha\cdot p+\beta mc^2\end{equation} where $\alpha=(\alpha_1,\alpha_2,\alpha_3)$ and $\beta$ have to be determined by comparing it with the relativistic energy.

Squaring \eqref{eq:linenergy}, we have \begin{equation}\begin{aligned}E^2=&c^2[\alpha_1^2p_1^2+\alpha_2^2p_2^2+\alpha_3^2p_3^2+(\alpha_1\alpha_2+\alpha_2\alpha_1)p_1p_2+\\&(\alpha_2\alpha_3+\alpha_3\alpha_2)p_2p_3+(\alpha_3\alpha_1+\alpha_1\alpha_3)p_3p_1]+\\&mc^3[(\alpha_1\beta+\beta\alpha_1)p_1+(\alpha_2\beta+\beta\alpha_2)p_2+(\alpha_3\beta+\beta\alpha_3)p_3]+\\&\beta^2m^2c^4\end{aligned}\label{eq:linenergy2}\end{equation} \eqref{eq:linenergy2} must coincide with $c^2p^2+m^2c^4$. For that to happen we must require that \begin{align*}\alpha_1^2p_1^2+\alpha_2^2p_2^2+\alpha_3^2p_3^2&=p^2\\\alpha_i\alpha_j+\alpha_j\alpha_i&=0\ \mbox{for $i\ne j$}\\\alpha_i\beta+\beta\alpha_i&=0\\\beta^2m^2c^4&=m^2c^4\end{align*}If the $\alpha_i$ and $\beta$ were numbers, the second and third conditions would force $\alpha_i\alpha_j=0$ for $i\ne j$ and $\alpha_i\beta=0$, contradicting $\alpha_i^2=1$ and $\beta^2=1$. Since the $\alpha_i$ and $\beta$ must anticommute, we instead take them to be $n\times n$ matrices. As $n\times n$ matrices, the $\alpha_i$ and $\beta$ are required to satisfy \begin{equation}\begin{aligned}\alpha_i\alpha_j+\alpha_j\alpha_i&=2\delta_{ij}{\bf 1},\ i,j=1,2,3\\\alpha_i\beta+\beta\alpha_i&=0,\ i=1,2,3\\\beta^2&={\bf 1}\end{aligned}\label{eq:linenergy3}\end{equation}where ${\bf 1}$ denotes the $n\times n$ identity matrix. In order for the Hamiltonian to be Hermitian, the $\alpha_i$ and $\beta$ are also required to be Hermitian. From \eqref{eq:linenergy3}, $$\mathrm{tr}\alpha_i=\mathrm{tr}\beta^2\alpha_i=\mathrm{tr}\beta(\beta\alpha_i)=-\mathrm{tr}\beta\alpha_i\beta=-\mathrm{tr}\alpha_i$$ where the last step uses the cyclic property of the trace. Thus, $\mathrm{tr}\alpha_i=0$. Since $\alpha_i^2={\bf 1}$, the eigenvalues of $\alpha_i$ are $\pm 1$. The trace is the sum of the eigenvalues, so $1$ and $-1$ must occur equally often, and $n$ has to be an even number. The smallest candidate is $n=2$, but this cannot work: we need four mutually anticommuting Hermitian matrices $\alpha_1,\alpha_2,\alpha_3,\beta$, and among $2\times 2$ Hermitian matrices there are at most three.
Indeed, the Pauli matrices $$\sigma_1=\begin{pmatrix}0 & 1\\1 & 0\end{pmatrix},\ \sigma_2=\begin{pmatrix}0 & -i\\i & 0\end{pmatrix},\ \sigma_3=\begin{pmatrix}1 & 0\\0 & -1\end{pmatrix}$$ together with ${\bf 1}$ form a basis for the space of $2\times 2$ Hermitian matrices, and no nonzero real linear combination of these four anticommutes with all three $\sigma_i$. For $n=4$, if we choose \begin{equation}\label{eq:diracmat}\beta=\begin{pmatrix}{\bf 1} & {\bf 0}\\{\bf 0} & -{\bf 1}\end{pmatrix},\ \alpha_i=\begin{pmatrix}{\bf 0} & \sigma_i\\\sigma_i & {\bf 0}\end{pmatrix},\ i=1,2,3\end{equation} then \eqref{eq:linenergy3} is satisfied.
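
The claim that the $4\times 4$ matrices above satisfy the required relations can be checked numerically. The following is a minimal sketch using NumPy; the helper names `anticommutator` and `satisfies_dirac_algebra` are introduced here for illustration only.

```python
import numpy as np

# Pauli matrices and 2x2 building blocks
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# 4x4 matrices in the block form chosen in the text
beta = np.block([[I2, Z2], [Z2, -I2]])
alpha = [np.block([[Z2, s], [s, Z2]]) for s in sigma]


def anticommutator(a, b):
    """Return ab + ba."""
    return a @ b + b @ a


def satisfies_dirac_algebra(alpha, beta):
    """Check the three relations: {a_i, a_j} = 2 delta_ij, {a_i, beta} = 0, beta^2 = 1."""
    one = np.eye(beta.shape[0])
    ok = np.allclose(beta @ beta, one)
    for i in range(3):
        ok &= np.allclose(anticommutator(alpha[i], beta), 0)
        for j in range(3):
            expected = 2 * (i == j) * one
            ok &= np.allclose(anticommutator(alpha[i], alpha[j]), expected)
    return bool(ok)


print(satisfies_dirac_algebra(alpha, beta))
```

The same helper can be used to confirm that a $2\times 2$ attempt fails, for instance with the Pauli matrices as the $\alpha_i$ and the identity as $\beta$.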

Now replacing $E$ and $p$ by the operators $i\hbar\frac{\partial}{\partial t}$ and $-i\hbar\nabla$, respectively, we obtain the *Dirac equation* $$i\hbar\frac{\partial\psi(t,x)}{\partial t}=H_0\psi(t,x)$$ where \begin{align*}H_0&=-i\hbar c\alpha\cdot\nabla+\beta mc^2\\&=\begin{pmatrix}mc^2{\bf 1} & -i\hbar c\sigma\cdot\nabla\\-i\hbar c\sigma\cdot\nabla & -mc^2{\bf 1}\end{pmatrix}\end{align*} Here, $\alpha=(\alpha_1,\alpha_2,\alpha_3)$ and $\sigma=(\sigma_1,\sigma_2,\sigma_3)$ are triplets of matrices. The Dirac equation acts on $\mathbb{C}^4$-valued wave functions $$\psi(t,x)=\begin{pmatrix}\psi_1(t,x)\\\psi_2(t,x)\\\psi_3(t,x)\\\psi_4(t,x)\end{pmatrix},\ \psi_i(t,x)\in\mathbb{C},\ i=1,2,3,4$$
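
On a plane wave with momentum $p$, the operator $-i\hbar\nabla$ acts as multiplication by $p$, so the free Dirac operator becomes the $4\times 4$ matrix $c\alpha\cdot p+\beta mc^2$. The sketch below checks that its square is $(c^2p^2+m^2c^4){\bf 1}$ and that its eigenvalues are $\pm E$. The function name `dirac_hamiltonian`, the numerical values, and the choice of units with $c=1$ are assumptions made for illustration.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = (
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
)
BETA = np.block([[I2, Z2], [Z2, -I2]])
ALPHA = tuple(np.block([[Z2, s], [s, Z2]]) for s in SIGMA)


def dirac_hamiltonian(p, m, c=1.0):
    """Return the 4x4 matrix c*alpha.p + beta*m*c^2 for a momentum vector p."""
    kinetic = sum(c * p_i * a_i for p_i, a_i in zip(p, ALPHA))
    return kinetic + m * c**2 * BETA


# Arbitrary illustrative values, in units where c = 1
p = np.array([0.3, -1.2, 0.5])
m = 2.0

H = dirac_hamiltonian(p, m)
E = np.sqrt(p @ p + m**2)            # relativistic energy with c = 1
eigenvalues = np.linalg.eigvalsh(H)  # real eigenvalues in ascending order

print(eigenvalues, E)
```

Both signs of the energy appear, each twice, which is a direct consequence of $\beta$ and the $\alpha_i$ being traceless with eigenvalues $\pm 1$.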

If $m=0$, the $\beta$ term drops out and only the three anticommuting $\alpha_i$ are needed, so it is sufficient to use $2\times 2$ matrices. For example, one may choose $\alpha_i=\sigma_i$, $i=1,2,3$. Then we obtain the equation $$i\hbar\frac{\partial\psi(t,x)}{\partial t}=-i\hbar c\,\sigma\cdot\nabla\psi(t,x)$$ This equation is called the *Weyl equation*. The Weyl equation is thought to describe neutrinos. We will discuss this further later.
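
As a short check on the massless case, write the Hamiltonian as $c\,\sigma\cdot p$ with $p=-i\hbar\nabla$. Since $\sigma_i\sigma_j+\sigma_j\sigma_i=2\delta_{ij}{\bf 1}$, we have $(\sigma\cdot p)^2=p^2{\bf 1}$, and applying the time derivative twice gives $$\left(i\hbar\frac{\partial}{\partial t}\right)^2\psi=c^2(\sigma\cdot p)^2\psi=c^2p^2\psi\quad\Longrightarrow\quad\frac{\partial^2\psi}{\partial t^2}=c^2\nabla^2\psi$$ so each component of $\psi$ satisfies the massless wave equation, consistent with $E^2=c^2p^2$.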

If the space dimension is two, only $\alpha_1$, $\alpha_2$ and $\beta$ are needed, so three mutually anticommuting matrices suffice and we can use the Pauli matrices instead of the Dirac matrices \eqref{eq:diracmat}. In this case, $H_0$ has the form $$H_0=-i\hbar c\left(\sigma_1\frac{\partial}{\partial x_1}+\sigma_2\frac{\partial}{\partial x_2}\right)+\sigma_3 mc^2$$
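
The two-dimensional case can be checked in the same way in momentum space, where the operator becomes $c(\sigma_1p_1+\sigma_2p_2)+\sigma_3mc^2$. This is a minimal sketch; the function name `dirac_hamiltonian_2d`, the sample values, and the units with $c=1$ are illustrative assumptions.

```python
import numpy as np

SIGMA1 = np.array([[0, 1], [1, 0]], dtype=complex)
SIGMA2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
SIGMA3 = np.array([[1, 0], [0, -1]], dtype=complex)


def dirac_hamiltonian_2d(p1, p2, m, c=1.0):
    """Return the 2x2 matrix c*(sigma1*p1 + sigma2*p2) + sigma3*m*c^2."""
    return c * (p1 * SIGMA1 + p2 * SIGMA2) + m * c**2 * SIGMA3


# Arbitrary illustrative values, in units where c = 1
p1, p2, m = 0.7, -0.4, 1.5

H = dirac_hamiltonian_2d(p1, p2, m)
E = np.sqrt(p1**2 + p2**2 + m**2)    # relativistic energy with c = 1
eigenvalues = np.linalg.eigvalsh(H)  # real eigenvalues in ascending order

print(eigenvalues, E)
```

Here $\sigma_3$ plays the role of $\beta$, and the square of the matrix is again a multiple of the identity because the three Pauli matrices anticommute pairwise.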

*References*:

[1] Walter Greiner, Relativistic Quantum Mechanics, 3rd Edition, Springer-Verlag, 2000

[2] Bernd Thaller, The Dirac Equation, Springer-Verlag, 1992