Legendre Functions III: Special Values, Parity, Orthogonality

Special Values

From the generating function
$$g(x,t)=\frac{1}{(1-2xt+t^2)^{1/2}},$$
when $x=1$ we obtain
\begin{align*}
g(1,t)&=\frac{1}{(1-2t+t^2)^{1/2}}\\
&=\frac{1}{1-t}\\
&=\sum_{n=0}^\infty t^n,
\end{align*}
since $|t|<1$. On the other hand,
$$g(1,t)=\sum_{n=0}^\infty P_n(1)t^n.$$
So by comparison we get
$$P_n(1)=1.$$ Similarly, if we let $x=-1$,
$$P_n(-1)=(-1)^n.$$
For $x=0$, the generating function yields
$$(1+t^2)^{-1/2}=1-\frac{1}{2}t^2+\frac{3}{8}t^4-\cdots+(-1)^n\frac{1\cdot 3\cdots (2n-1)}{2^nn!}t^{2n}+\cdots.$$
Thus we obtain
\begin{align*}
P_{2n}(0)&=(-1)^n\frac{1\cdot 3\cdots (2n-1)}{2^nn!}=(-1)^n\frac{(2n-1)!!}{(2n)!!},\\
P_{2n+1}(0)&=0,\ n=0,1,2,\cdots.
\end{align*}
Recall that the double factorial !! is defined by
\begin{align*}
(2n)!!&=2\cdot 4\cdot 6\cdots (2n),\\
(2n-1)!!&=1\cdot 3\cdot 5\cdots (2n-1).
\end{align*}
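
These special values provide a convenient sanity check. In Maxima, for instance, one can evaluate the polynomials directly using the legendre_p function from the orthopoly package (a quick sketch; the output formatting may differ slightly by version):

(%i1) load(orthopoly)$
(%i2) makelist(subst(1, x, ratsimp(legendre_p(n, x))), n, 0, 4);
(%o2) [1, 1, 1, 1, 1]
(%i3) makelist(subst(0, x, ratsimp(legendre_p(n, x))), n, 0, 4);
(%o3) [1, 0, -1/2, 0, 3/8]

in agreement with $P_n(1)=1$ and with $P_2(0)=-\frac{1}{2}$, $P_4(0)=\frac{3}{8}$.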

Parity

Observe that $g(x,t)=g(-x,-t)$; that is,
$$\sum_{n=0}^\infty P_n(x)t^n=\sum_{n=0}^\infty P_n(-x)(-t)^n$$
which yields the parity relation
$$P_n(-x)=(-1)^nP_n(x).\ \ \ \ \ (1)$$
Relation (1) tells us that $P_n(x)$ is an even function if $n$ is even and an odd function if $n$ is odd.
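
The parity relation (1) can also be verified symbolically; for example, for $n=5$ (continuing the Maxima session above):

(%i4) ratsimp(legendre_p(5, -x) + legendre_p(5, x));
(%o4) 0

which confirms $P_5(-x)=-P_5(x)$.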

Orthogonality

Multiply Legendre's differential equation
$$\frac{d}{dx}[(1-x^2)P_n'(x)]+n(n+1)P_n(x)=0\ \ \ \ \ (2)$$ by $P_m(x)$.
$$P_m(x)\frac{d}{dx}[(1-x^2)P_n'(x)]+n(n+1)P_m(x)P_n(x)=0.\ \ \ \ \ (3)$$
Replace $n$ by $m$ in (2) and then multiply the resulting equation by $P_n(x)$.
$$P_n(x)\frac{d}{dx}[(1-x^2)P_m'(x)]+m(m+1)P_m(x)P_n(x)=0.\ \ \ \ \ (4)$$
Subtract (4) from (3) and integrate the resulting equation with respect to $x$ from $-1$ to 1.
\begin{align*}
\int_{-1}^1&\left\{P_m(x)\frac{d}{dx}[(1-x^2)P_n'(x)]-P_n(x)\frac{d}{dx}[(1-x^2)P_m'(x)]\right\}dx\\
&=[m(m+1)-n(n+1)]\int_{-1}^1P_m(x)P_n(x)dx.\end{align*}
Using integration by parts,
\begin{align*}
\int_{-1}^1P_m(x)\frac{d}{dx}[(1-x^2)P_n'(x)]dx&=(1-x^2)P_m(x)P_n'(x)\Big|_{-1}^1-\int_{-1}^1(1-x^2)P_m'(x)P_n'(x)dx\\
&=-\int_{-1}^1(1-x^2)P_m'(x)P_n'(x)dx,
\end{align*}
since the boundary term vanishes at $x=\pm 1$.
The integral of the second term inside the braces has the same value, $-\int_{-1}^1(1-x^2)P_m'(x)P_n'(x)dx$, so the LHS vanishes. Since $m(m+1)\ne n(n+1)$ whenever $m\ne n$ (for nonnegative integers $m,n$), it follows that for $m\ne n$,
$$\int_{-1}^1P_m(x)P_n(x)dx=0.\ \ \ \ \ (5)$$
That is, $P_m(x)$ and $P_n(x)$ are orthogonal on the interval $[-1,1]$.
For $x=\cos\theta$, the orthogonality (5) is given by
$$\int_0^\pi P_n(\cos\theta)P_m(\cos\theta)\sin\theta d\theta=0.$$
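
As a quick numerical check of (5) in Maxima (continuing the session above):

(%i5) integrate(ratsimp(legendre_p(2, x)*legendre_p(5, x)), x, -1, 1);
(%o5) 0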

Integrate
$$(1-2xt+t^2)^{-1}=\left[\sum_{n=0}^\infty P_n(x)t^n\right]^2$$
with respect to $x$ from $-1$ to $1$. Due to the orthogonality (5), the integrals of all the cross terms on the RHS vanish, and so we obtain
$$\int_{-1}^1\frac{dx}{1-2xt+t^2}=\sum_{n=0}^\infty \left\{\int_{-1}^1[P_n(x)]^2dx\right\}t^{2n}.$$
\begin{align*}
\int_{-1}^1\frac{dx}{1-2xt+t^2}&=\frac{1}{2t}\int_{(1-t)^2}^{(1+t)^2}\frac{dy}{y}\\
&=\frac{1}{t}\ln\left(\frac{1+t}{1-t}\right)\\
&=\sum_{n=0}^\infty\frac{2}{2n+1}t^{2n}\ (\mbox{since $|t|<1$}).
\end{align*}
Therefore we obtain the normalization of the Legendre polynomial $P_n(x)$:
$$\int_{-1}^1[P_n(x)]^2dx=\frac{2}{2n+1}.$$
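
The same Maxima session confirms the normalization:

(%i6) makelist(integrate(ratsimp(legendre_p(n, x))^2, x, -1, 1), n, 0, 3);
(%o6) [2, 2/3, 2/5, 2/7]

in agreement with $\frac{2}{2n+1}$ for $n=0,1,2,3$.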

Expansion of Functions

Suppose that
$$\sum_{n=0}^\infty a_nP_n(x)=f(x).\ \ \ \ \ (6)$$
Multiply (6) by $P_m(x)$ and integrate with respect to $x$ from $-1$ to 1:
$$\sum_{n=0}^\infty a_n\int_{-1}^1 P_n(x)P_m(x)dx=\int_{-1}^1f(x)P_m(x)dx.$$
By the orthogonality (5), we obtain
$$\frac{2}{2m+1}a_m=\int_{-1}^1f(x)P_m(x)dx\ \ \ \ \ (7)$$
and hence $f(x)$ can be written as
$$f(x)=\sum_{n=0}^\infty\frac{2n+1}{2}\left(\int_{-1}^1 f(t)P_n(t)dt\right)P_n(x).\ \ \ \ \ (8)$$
This expansion in a series of Legendre polynomials is called a Legendre series. If $f(x)$ is continuous (or, more generally, square-integrable) on the interval $[-1,1]$, it can be expanded as a Legendre series.

(7) can be considered as an integral transform (a finite Legendre transform), and (8) can be considered as the inverse transform.

Let us consider the integral operator
$$\mathcal{P}_m:=P_m(x)\frac{2m+1}{2}\int_{-1}^1P_m(t)[\ \cdot\ ]dt.\ \ \ \ \ (9)$$
Then
$$(\mathcal{P}_mf)(x)=a_mP_m(x).$$
The operator (9) projects out the $m$th component of the function $f(x)$.
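
For illustration, here is how (7) can be computed in Maxima for the sample function $f(x)=x^3$ (a sketch using legendre_p from the orthopoly package, loaded as above; since $x^3=\frac{3}{5}P_1(x)+\frac{2}{5}P_3(x)$, the nonzero coefficients should be $a_1=\frac{3}{5}$ and $a_3=\frac{2}{5}$):

(%i7) a(m) := (2*m + 1)/2 * integrate(x^3 * ratsimp(legendre_p(m, x)), x, -1, 1)$
(%i8) makelist(a(m), m, 0, 3);
(%o8) [0, 3/5, 0, 2/5]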

Structural Equations

Definition. The dual 1-forms $\theta_1,\theta_2,\theta_3$ of a frame $E_1,E_2,E_3$ on $\mathbb{E}^3$ are defined by
$$\theta_i(v)=v\cdot E_i(p),\ v\in T_p\mathbb{E}^3.$$
Clearly $\theta_i$ is linear.

Example. The dual 1-forms of the natural frame $U_1,U_2,U_3$ are $dx_1$, $dx_2$, $dx_3$ since
$$dx_i(v)=v_i=v\cdot U_i(p)$$
for each $v\in T_p\mathbb{E}^3$.

For any vector field $V$ on $\mathbb{E}^3$,
$$V=\sum_i\theta_i(V)E_i.$$
To see this, let us calculate, for each $V(p)\in T_p\mathbb{E}^3$,
\begin{align*}
\sum_i\theta_i(V(p))E_i(p)&=\sum_i(V(p)\cdot E_i(p))E_i(p)\\
&=\sum_iV_i(p)E_i(p)\\
&=V(p).
\end{align*}

Lemma. Let $\theta_1,\theta_2,\theta_3$ be the dual 1-forms of a frame $E_1, E_2, E_3$. Then any 1-form $\phi$ on $\mathbb{E}^3$ has a unique expression
$$\phi=\sum_i\phi(E_i)\theta_i.$$

Proof. Let $V$ be any vector field on $\mathbb{E}^3$. Then
\begin{align*}
\sum_i\phi(E_i)\theta_i(V)&=\phi\left(\sum_i\theta_i(V)E_i\right)\ \mbox{by linearity of $\phi$}\\
&=\phi(V).
\end{align*}
Let $A=(a_{ij})$ be the attitude matrix of a frame field $E_1$, $E_2$, $E_3$, i.e.
\begin{equation}\label{eq:frame}E_i=\sum_ja_{ij}U_j,\ i=1,2,3.\end{equation}
Clearly $\theta_i=\sum_j\theta_i(U_j)dx_j$. On the other hand,
$$\theta_i(U_j)=E_i\cdot U_j=\left(\sum_ka_{ik}U_k\right)\cdot U_j=a_{ij}.$$ Hence the dual formulation of \eqref{eq:frame} is
\begin{equation}\label{eq:dualframe}\theta_i=\sum_ja_{ij}dx_j.\end{equation}

Theorem. [Cartan Structural Equations] Let $E_1$, $E_2$, $E_3$ be a frame field on $\mathbb{E}^3$ with dual 1-forms $\theta_1$, $\theta_2$, $\theta_3$ and connection forms $\omega_{ij}$, $i,j=1,2,3$. Then

  1. The First Structural Equations: $$d\theta_i=\sum_j\omega_{ij}\wedge\theta_j.$$
  2. The Second Structural Equations: $$d\omega_{ij}=\sum_k\omega_{ik}\wedge\omega_{kj}.$$

Proof. The exterior derivative of \eqref{eq:dualframe} is
$$d\theta_i=\sum_jda_{ij}\wedge dx_j.$$ Since $\omega=dA\cdot{}^tA$ and ${}^tA=A^{-1}$ (recall that $A$ is an orthogonal matrix), $dA=\omega\cdot A$, i.e.
$$da_{ij}=\sum_k\omega_{ik}a_{kj}.$$
So,
\begin{align*}
d\theta_i&=\sum_j\left\{\left(\sum_k\omega_{ik}a_{kj}\right)\wedge dx_j\right\}\\
&=\sum_k\left\{\omega_{ik}\wedge\sum_j a_{kj}dx_j\right\}\\
&=\sum_k\omega_{ik}\wedge\theta_k.
\end{align*}

From $\omega=dA\cdot{}^tA$,
\begin{equation}\label{eq:connectform}\omega_{ij}=\sum_kda_{ik}a_{jk}.\end{equation}
The exterior derivative of \eqref{eq:connectform} is
\begin{align*}
d\omega_{ij}&=\sum_k da_{jk}\wedge da_{ik}\\
&=-\sum_k da_{ik}\wedge da_{jk},
\end{align*}
i.e.
\begin{align*}
d\omega&=-dA\wedge{}^t(dA)\\
&=-(\omega\cdot A)\cdot({}^tA\cdot{}^t\omega)\\
&=-\omega\cdot (A\cdot{}^tA)\cdot{}^t\omega\\
&=-\omega\cdot{}^t\omega\ \ \ (A\cdot{}^tA=I)\\
&=\omega\cdot\omega.\ \ \ (\mbox{$\omega$ is skew-symmetric.})
\end{align*}
This is equivalent to the second structural equations.

Example. [Structural Equations for the Spherical Frame Field] Let us first calculate the dual forms and connection forms.

From the spherical coordinates
\begin{align*}
x_1&=\rho\cos\varphi\cos\theta,\\
x_2&=\rho\cos\varphi\sin\theta,\\
x_3&=\rho\sin\varphi,
\end{align*}
we obtain differentials
\begin{align*}
dx_1&=\cos\varphi\cos\theta d\rho-\rho\sin\varphi\cos\theta d\varphi-\rho\cos\varphi\sin\theta d\theta,\\
dx_2&=\cos\varphi\sin\theta d\rho-\rho\sin\varphi\sin\theta d\varphi+\rho\cos\varphi\cos\theta d\theta,\\
dx_3&=\sin\varphi d\rho+\rho\cos\varphi d\varphi.
\end{align*}
From the spherical frame field $F_1$, $F_2$, $F_3$ discussed earlier, we find its attitude matrix
$$A=\begin{pmatrix}
\cos\varphi\cos\theta & \cos\varphi\sin\theta & \sin\varphi\\
-\sin\theta & \cos\theta & 0\\
-\sin\varphi\cos\theta & -\sin\varphi\sin\theta & \cos\varphi
\end{pmatrix}.$$
Thus by \eqref{eq:dualframe} we find the dual 1-forms
\begin{align*}
\begin{pmatrix}
\theta_1\\
\theta_2\\
\theta_3
\end{pmatrix}&=\begin{pmatrix}
\cos\varphi\cos\theta & \cos\varphi\sin\theta & \sin\varphi\\
-\sin\theta & \cos\theta & 0\\
-\sin\varphi\cos\theta & -\sin\varphi\sin\theta & \cos\varphi
\end{pmatrix}\begin{pmatrix}
dx_1\\
dx_2\\
dx_3
\end{pmatrix}\\
&=\begin{pmatrix}
d\rho\\
\rho\cos\varphi d\theta\\
\rho d\varphi
\end{pmatrix}.
\end{align*}
\begin{align*}
&dA=\\
&\begin{pmatrix}
-\sin\varphi\cos\theta d\varphi-\cos\varphi\sin\theta d\theta & -\sin\varphi\sin\theta d\varphi+\cos\varphi\cos\theta d\theta & \cos\varphi d\varphi\\
-\cos\theta d\theta & -\sin\theta d\theta & 0\\
-\cos\varphi\cos\theta d\varphi+\sin\varphi\sin\theta d\theta & -\cos\varphi\sin\theta d\varphi-\sin\varphi\cos\theta d\theta & -\sin\varphi d\varphi
\end{pmatrix}\end{align*}
and so,
\begin{align*}
\omega&=\begin{pmatrix}
0 & \omega_{12} & \omega_{13}\\
-\omega_{12} & 0 & \omega_{23}\\
-\omega_{13} & -\omega_{23} & 0
\end{pmatrix}\\
&=dA\cdot{}^tA\\
&=\begin{pmatrix}
0 & \cos\varphi d\theta & d\varphi\\
-\cos\varphi d\theta & 0 & \sin\varphi d\theta\\
-d\varphi & -\sin\varphi d\theta & 0
\end{pmatrix}.
\end{align*}
From these dual 1-forms and connection forms one can immediately verify the first and the second structural equations.
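
For instance, for $i=2$ the first structural equation can be checked by hand:
$$d\theta_2=d(\rho\cos\varphi\,d\theta)=\cos\varphi\,d\rho\wedge d\theta-\rho\sin\varphi\,d\varphi\wedge d\theta,$$
while
\begin{align*}
\sum_k\omega_{2k}\wedge\theta_k&=\omega_{21}\wedge\theta_1+\omega_{23}\wedge\theta_3\\
&=(-\cos\varphi\,d\theta)\wedge d\rho+(\sin\varphi\,d\theta)\wedge(\rho\,d\varphi)\\
&=\cos\varphi\,d\rho\wedge d\theta-\rho\sin\varphi\,d\varphi\wedge d\theta,
\end{align*}
so the two sides agree.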

Polynomial Functions and Models

A polynomial is a function of the form
$$P(x)=a_nx^n+a_{n-1}x^{n-1}+\cdots+a_1x+a_0,$$ where $a_n\ne 0$. The number $n$ is called the degree of the polynomial $P(x)$. The term $a_nx^n$ is called the leading term and $a_n$ is called the leading coefficient. The number $a_0$ is called the constant term. $P(x)$ is called linear if $n=1$, quadratic if $n=2$, cubic if $n=3$, quartic if $n=4$, quintic if $n=5$, sextic if $n=6$, septic if $n=7$, and so on. You don't really have to worry about memorizing this jargon; the names are less important. However, you do need to remember at least what the degree is and what the leading coefficient is.

The Leading Term Test

There is a pattern for the long-term behavior of a polynomial, i.e. the behavior of a polynomial as $x\to\infty$ or $x\to -\infty$. This end behavior can be characterized as follows.

  • $n=\mbox{even}$ and $a_n>0$: the graph rises on both ends, i.e. $f(x)\to\infty$ as $x\to\pm\infty$.

Example. $f(x)=3x^4-2x^3+3$

  • $n=\mbox{even}$ and $a_n<0$: the graph falls on both ends, i.e. $f(x)\to -\infty$ as $x\to\pm\infty$.

Example. $f(x)=-x^6+x^5-4x^3$

  • $n=\mbox{odd}$ and $a_n>0$: the graph falls to the left and rises to the right, i.e. $f(x)\to -\infty$ as $x\to -\infty$ and $f(x)\to\infty$ as $x\to\infty$.

Example. $f(x)=x^5+\frac{1}{4}x+1$

  • $n=\mbox{odd}$ and $a_n<0$: the graph rises to the left and falls to the right, i.e. $f(x)\to\infty$ as $x\to -\infty$ and $f(x)\to -\infty$ as $x\to\infty$.

Example. $f(x)=-5x^3-x^2+4x+2$


Finding zeros of a polynomial $P(x)$

By factoring, solve the equation $P(x)=0$. The solutions are the zeros of $P(x)$.

Example. Find the zeros of $P(x)=x^3+2x^2-5x-6$.

Solution. \begin{align*}
P(x)&=x^3+2x^2-5x-6\\
&=(x^3+x^2)+(x^2-5x-6)\ \mbox{(grouping)}\\
&=x^2(x+1)+(x-6)(x+1)\\
&=(x+1)(x^2+x-6)\\
&=(x+1)(x+3)(x-2).
\end{align*}
Hence, $P(x)$ has zeros $x=-3,-1,2$.
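
This factoring can also be checked in Maxima (a quick sketch; the ordering of factors and of solutions in the output may differ):

(%i1) factor(x^3 + 2*x^2 - 5*x - 6);
(%o1) (x - 2) (x + 1) (x + 3)
(%i2) solve(x^3 + 2*x^2 - 5*x - 6 = 0, x);
(%o2) [x = 2, x = -3, x = -1]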

How do we determine whether $x=a$ is a zero of a polynomial $P(x)$?

To check whether $x=a$ is a zero of $P(x)$, you don't really have to factor $P(x)$. This is what you need to know: $x=a$ is a zero of the polynomial $P(x)$ if and only if $P(a)=0$.

Example. Consider $P(x)=x^3+x^2-17x+15$. Determine whether each of numbers 2 and $-5$ is a zero of $P(x)$.

Solution. $P(2)=(2)^3+(2)^2-17(2)+15=-7$, so $x=2$ is not a zero. $P(-5)=(-5)^3+(-5)^2-17(-5)+15=0$, so $x=-5$ is a zero of $P(x)$.
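
In Maxima this evaluation is a one-liner (continuing the session above):

(%i3) P(x) := x^3 + x^2 - 17*x + 15$
(%i4) [P(2), P(-5)];
(%o4) [-7, 0]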

Even and Odd Multiplicity

Even and odd multiplicity is an important property for sketching the graph of a polynomial function. Suppose that $k$ is the largest integer such that $(x-c)^k$ is a factor of $P(x)$. The number $k$ is called the multiplicity of the factor $x-c$.

  1. If $k$ is odd, the graph of $P(x)$ crosses the $x$-axis at $(c,0)$.
  2. If $k$ is even, then the graph of $P(x)$ is tangent to the $x$-axis, i.e. touches the $x$-axis without crossing at $(c,0)$.

Example. Consider $f(x)=x^2(x+3)^2(x-4)(x+1)^4$. The factors $x$ and $x+3$ have multiplicity 2 and the factor $x+1$ has multiplicity 4. Hence the graph of $f(x)$ touches the $x$-axis without crossing at $x=0$, $x=-3$, and $x=-1$. The factor $x-4$ has multiplicity 1, so the graph crosses the $x$-axis at $x=4$.
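
The touch-versus-cross behavior can be seen numerically by checking the sign of $f$ on either side of a zero; for instance, in Maxima (a small sketch with sample points chosen close to the zeros $x=0$ and $x=4$):

(%i5) f(x) := x^2*(x + 3)^2*(x - 4)*(x + 1)^4$
(%i6) map(sign, [f(-0.1), f(0.1), f(3.9), f(4.1)]);
(%o6) [neg, neg, neg, pos]

There is no sign change across $x=0$ (the graph touches), but there is one across $x=4$ (the graph crosses).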

Legendre Functions II: Recurrence Relations and Special Properties

In this lecture, we derive some important recurrence relations of Legendre functions and use them to show that Legendre functions are indeed solutions of a differential equation, called Legendre’s differential equation.

Differentiating the generating function
$$g(x,t)=(1-2xt+t^2)^{-1/2}=\sum_{n=0}^\infty P_n(x)t^n,\ |t|<1\ \ \ \ \ \mbox{(1)}$$
with respect to $t$, we get
\begin{align*}
\frac{\partial g(x,t)}{\partial t}&=\frac{x-t}{(1-2xt+t^2)^{3/2}}\ \ \ \ \ \mbox{(2)}\\&=\sum_{n=0}^\infty nP_n(x)t^{n-1}.\ \ \ \ \ \mbox{(3)}\end{align*}
(2) can be written as
$$\frac{x-t}{(1-2xt+t^2)(1-2xt+t^2)^{1/2}}=\frac{(x-t)(1-2xt+t^2)^{-1/2}}{1-2xt+t^2}.$$
By (1) and (3), we obtain
$$(x-t)\sum_{n=0}^\infty P_n(x)t^n=(1-2xt+t^2)\sum_{n=0}^\infty nP_n(x) t^{n-1}$$ or
$$(1-2xt+t^2)\sum_{n=0}^\infty nP_n(x) t^{n-1}+(t-x)\sum_{n=0}^\infty P_n(x)t^n=0$$
which can be written out as
\begin{align*}
\sum_{n=0}^\infty nP_n(x)t^{n-1}-\sum_{n=0}^\infty &2xnP_n(x)t^n+\sum_{n=0}^\infty nP_n(x)t^{n+1}\\&+\sum_{n=0}^\infty P_n(x)t^{n+1}-\sum_{n=0}^\infty xP_n(x)t^n=0.\ \ \ \ \ \mbox{(4)}\end{align*}
In (4) replace $n$ by $n+1$ in the first term, and replace $n$ by $n-1$ in the third and fourth term. Then (4) becomes
\begin{align*}
\sum_{n=0}^\infty (n+1)P_{n+1}(x)t^n-\sum_{n=0}^\infty &2xnP_n(x)t^n+\sum_{n=0}^\infty (n-1)P_{n-1}(x)t^n\\&+\sum_{n=0}^\infty P_{n-1}(x)t^n-\sum_{n=0}^\infty xP_n(x)t^n=0.
\end{align*}
This can be simplified to
$$\sum_{n=0}^\infty[(n+1)P_{n+1}(x)-(2n+1)xP_n(x)+nP_{n-1}(x)]t^n=0$$
which implies that
$$(2n+1)xP_n(x)=(n+1)P_{n+1}(x)+nP_{n-1}(x).\ \ \ \ \ \mbox{(5)}$$
The recurrence relation (5) can be used to calculate Legendre polynomials. For example, we found $P_0(x)=1$ and $P_1(x)=x$ earlier. For $n=1$, (5) is
$$3xP_1(x)=2P_2(x)+P_0(x)$$
i.e.
$$P_2(x)=\frac{1}{2}(3x^2-1).$$
Continuing this using the recurrence relation (5), we obtain
\begin{align*}
P_3(x)&=\frac{1}{2}(5x^3-3x),\\
P_4(x)&=\frac{1}{8}(35x^4-30x^2+3),\\
P_5(x)&=\frac{1}{8}(63x^5-70x^3+15x),\\
\cdots.
\end{align*}
A great advantage of having the recurrence relation (5) is that one can easily calculate Legendre polynomials with a simple computer program. This can be done easily, for instance, in Maxima.

Let us load the following simple program to run the recurrence relation (5).

(%i1) Legendre(n,x):=block([],
if n = 0 then 1
else
if n = 1 then x
else ((2*n - 1)*x*Legendre(n - 1, x) - (n - 1)*Legendre(n - 2, x))/n);

(%o1) Legendre(n, x) := block([], if n = 0 then 1
else (if n = 1 then x else ((2 n - 1) x Legendre(n - 1, x)
- (n - 1) Legendre(n - 2, x))/n))

Now we are ready to calculate Legendre polynomials. For example, let us calculate $P_3(x)$.

(%i2) Legendre(3,x);

The output is not exactly what we may like because it is not simplified.

In Maxima, simplification can be done by the command ratsimp.

(%i3) ratsimp(Legendre(3,x));

The output is
$$\frac{5x^3-3x}{2},$$
which is exactly $P_3(x)$. That looks better. Let us calculate one more, say $P_4(x)$:

(%i4) ratsimp(Legendre(4,x));

This returns $\frac{35x^4-30x^2+3}{8}$, i.e. $P_4(x)$.

Now we differentiate $g(x,t)$ with respect to $x$.
$$\frac{\partial g(x,t)}{\partial x}=\frac{t}{(1-2xt+t^2)^{3/2}}=\sum_{n=0}^\infty P_n'(x)t^n.$$
From this we obtain
$$(1-2xt+t^2)\sum_{n=0}^\infty P_n'(x)t^n-t\sum_{n=0}^\infty P_n(x)t^n=0$$
which, upon comparing the coefficients of $t^{n+1}$, leads to
$$P_{n+1}'(x)+P_{n-1}'(x)=2xP_n'(x)+P_n(x).\ \ \ \ \ \mbox{(6)}$$
Add 2 times $\frac{d}{dx}(5)$ to $2n+1$ times (6). Then we get
$$(2n+1)P_n(x)=P_{n+1}'(x)-P_{n-1}'(x).\ \ \ \ \ \mbox{(7)}$$
$\frac{1}{2}[(6)+(7)]$ yields
$$P_{n+1}'(x)=(n+1)P_n(x)+xP_n'(x).\ \ \ \ \ \mbox{(8)}$$
$\frac{1}{2}[(6)-(7)]$ yields
$$P_{n-1}'(x)=-nP_n(x)+xP_n'(x).\ \ \ \ \ \mbox{(9)}$$
Replace $n$ by $n-1$ in (8) and add the result to $x$ times (9):
$$(1-x^2)P_n'(x)=nP_{n-1}(x)-nxP_n(x).\ \ \ \ \ \mbox{(10)}$$
Differentiate (10) with respect to $x$ and add the result to $n$ times (9):
$$(1-x^2)P_n^{\prime\prime}(x)-2xP_n'(x)+n(n+1)P_n(x)=0.\ \ \ \ \ \mbox{(11)}$$
The linear second-order differential equation (11) is called Legendre's differential equation, and as we have seen, $P_n(x)$ satisfies (11). This is why $P_n(x)$ is called a Legendre polynomial.

In physics (11) is often expressed in terms of differentiation with respect to $\theta$. Let $x=\cos\theta$. Then by the chain rule,
\begin{align*}
\frac{dP_n(\cos\theta)}{d\theta}&=-\sin\theta\frac{dP_n(x)}{dx},\ \ \ \ \ \mbox{(12)}\\ \frac{d^2P_n(\cos\theta)}{d\theta^2}&=-x\frac{dP_n(x)}{dx}+(1-x^2)\frac{d^2P_n(x)}{dx^2}.\ \ \ \ \ \mbox{(13)}
\end{align*}
Using (12) and (13), Legendre’s differential equation (11) can be written as
$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left[\sin\theta\frac{dP_n(\cos\theta)}{d\theta}\right]+n(n+1)P_n(\cos\theta)=0.$$
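
As a check, Maxima confirms that the polynomials produced by the program above satisfy (11); for instance, for $n=4$ (continuing the session above):

(%i5) p4 : ratsimp(Legendre(4, x))$
(%i6) ratsimp((1 - x^2)*diff(p4, x, 2) - 2*x*diff(p4, x) + 4*5*p4);
(%o6) 0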

Tensors I

Tensors may be considered as a generalization of vectors and covectors. They are extremely important quantities for studying differential geometry and physics.

Let $M^n$ be an $n$-dimensional differentiable manifold. For each $x\in M^n$, let $E_x=T_xM^n$, i.e. the tangent space to $M^n$ at $x$. We denote the canonical basis of $E$ by $\partial=\left(\frac{\partial}{\partial x^1},\cdots,\frac{\partial}{\partial x^n}\right)$ and its dual basis by $\sigma=dx=(dx^1,\cdots,dx^n)$, where $x^1,\cdots,x^n$ are local coordinates. The canonical basis $\frac{\partial}{\partial x^1},\cdots,\frac{\partial}{\partial x^n}$ is also simply denoted by $\partial_1,\cdots,\partial_n$.

Covariant Tensors

Definition. A covariant tensor of rank $r$ is a multilinear real-valued function
$$Q:E\times E\times\cdots\times E\longrightarrow\mathbb{R}$$
of $r$-tuples of vectors. A covariant tensor of rank $r$ is also called a tensor of type $(0,r)$ or, shortly, a $(0,r)$-tensor. Note that the values of $Q$ must be independent of the basis in which the components of the vectors are expressed. A covariant vector (also called a covector or a 1-form) is a covariant tensor of rank 1. An important example of a covariant tensor of rank 2 is the metric tensor $G$:
$$G(v,w)=\langle v,w\rangle=\sum_{i,j}g_{ij}v^iw^j.$$

In components, by multilinearity
\begin{align*}
Q(v_1,\cdots,v_r)&=Q\left(\sum_{i_1}v_1^{i_1}\partial_{i_1},\cdots,\sum_{i_r}v_r^{i_r}\partial_{i_r}\right)\\
&=\sum_{i_1,\cdots,i_r}v_1^{i_1}\cdots v_r^{i_r}Q(\partial_{i_1},\cdots,\partial_{i_r}).
\end{align*}
Denote $Q(\partial_{i_1},\cdots,\partial_{i_r})$ by $Q_{i_1,\cdots,i_r}$. Then
$$Q(v_1,\cdots,v_r)=\sum_{i_1,\cdots,i_r}Q_{i_1,\cdots,i_r}v_1^{i_1}\cdots v_r^{i_r}.\ \ \ \ \ \mbox{(1)}$$
Using Einstein's summation convention, (1) can be written compactly as
$$Q(v_1,\cdots,v_r)=Q_{i_1,\cdots,i_r}v_1^{i_1}\cdots v_r^{i_r}.$$
The set of all covariant tensors of rank $r$ forms a vector space over $\mathbb{R}$. The number of components in such a tensor is $n^r$. The vector space of all covariant $r$-th rank tensors is denoted by
$$E^\ast\otimes E^\ast\otimes\cdots\otimes E^\ast=\otimes^r E^\ast.$$

If $\alpha,\beta\in E^\ast$, i.e. covectors, we can form the 2nd rank covariant tensor, the tensor product $\alpha\otimes\beta$ of $\alpha$ and $\beta$: Define $\alpha\otimes\beta: E\times E\longrightarrow\mathbb{R}$ by
$$\alpha\otimes\beta(v,w)=\alpha(v)\beta(w).$$
If we write $\alpha=a_idx^i$ and $\beta=b_jdx^j$, then
$$(\alpha\otimes\beta)_{ij}=\alpha\otimes\beta(\partial_i,\partial_j)=\alpha(\partial_i)\beta(\partial_j)=a_ib_j.$$
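
For a concrete illustration in Maxima, the components $a_ib_j$ of $\alpha\otimes\beta$ in two dimensions form the outer product of a column and a row (the symbols a1, a2, b1, b2 are made-up component names):

(%i1) transpose(matrix([a1, a2])) . matrix([b1, b2]);

which returns the $2\times 2$ matrix with rows $(a_1b_1,\ a_1b_2)$ and $(a_2b_1,\ a_2b_2)$.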

Contravariant Tensors

A contravariant vector, i.e. an element of $E$ can be considered as a linear functional $v: E^\ast\longrightarrow\mathbb{R}$ defined by
$$v(\alpha)=\alpha(v)=a_iv^i,\ \alpha=a_idx^i\in E^\ast.$$

Definition. A contravariant tensor of rank $s$ is a multilinear real-valued function $T$ on $s$-tuples of covectors
$$T:E^\ast\times E^\ast\times\cdots\times E^\ast\longrightarrow\mathbb{R}.$$ A contravariant tensor of rank $s$ is also called a tensor of type $(s,0)$ or shortly $(s,0)$-tensor.
For 1-forms $\alpha_1,\cdots,\alpha_s$
$$T(\alpha_1,\cdots,\alpha_s)=a_{1_{i_1}}\cdots a_{s_{i_s}}T^{i_1\cdots i_s}$$
where
$$T^{i_1\cdots i_s}:=T(dx^{i_1},\cdots,dx^{i_s}).$$
The space of all contravariant tensors of rank $s$ is denoted by
$$E\otimes E\otimes\cdots\otimes E:=\otimes^s E.$$
Contravariant vectors are contravariant tensors of rank 1. An example of a contravariant tensor of rank 2 is the inverse of the metric tensor $G^{-1}=(g^{ij})$:
$$G^{-1}(\alpha,\beta)=g^{ij}a_ib_j.$$

Given a pair $v,w$ of contravariant vectors, we can form the tensor product $v\otimes w$ in the same manner as we did for covariant vectors. It is the 2nd rank contravariant tensor with components $(v\otimes w)^{ij}=v^iw^j$. The metric tensor $G$ and its inverse $G^{-1}$ may be written as
$$G=g_{ij}dx^i\otimes dx^j\ \mbox{and}\ G^{-1}=g^{ij}\partial_i\otimes\partial_j.$$

Mixed Tensors

Definition. A mixed tensor, $r$ times covariant and $s$ times contravariant, is a real multilinear function $W$
$$W: E^\ast\times E^\ast\times\cdots\times E^\ast\times E\times E\times\cdots\times E\longrightarrow\mathbb{R}$$
on $s$-tuples of covectors and $r$-tuples of vectors. It is also called a tensor of type $(s,r)$ or simply $(s,r)$-tensor. By multilinearity
$$W(\alpha_1,\cdots,\alpha_s, v_1,\cdots, v_r)=a_{1_{i_1}}\cdots a_{s_{i_s}}W^{i_1\cdots i_s}{}_{j_1\cdots j_r}v_1^{j_1}\cdots v_r^{j_r}$$
where
$$W^{i_1\cdots i_s}{}_{j_1\cdots j_r}:=W(dx^{i_1},\cdots,dx^{i_s},\partial_{j_1},\cdots,\partial_{j_r}).$$

A 2nd rank mixed tensor may arise from a linear operator $A: E\longrightarrow E$. Define $W_A: E^\ast\times E\longrightarrow\mathbb{R}$ by $W_A(\alpha,v)=\alpha(Av)$. Let $A=(A^i{}_j)$ be the matrix associated with $A$, i.e. $A(\partial_j)=\partial_i A^i{}_j$. Let us calculate the components of $W_A$:
$$W_A^i{}_j=W_A(dx^i,\partial_j)=dx^i(A(\partial_j))=dx^i(\partial_kA^k{}_j)=\delta^i_kA^k{}_j=A^i{}_j.$$
So the matrix of the mixed tensor $W_A$ is just the matrix associated with $A$. Conversely, given a mixed tensor $W$, once covariant and once contravariant, we can define a linear transformation $A$ such that $W(\alpha,v)=\alpha(Av)$. We do not distinguish between a linear transformation $A$ and its associated mixed tensor $W_A$. In components, $W(\alpha,v)$ is written as
$$W(\alpha,v)=a_iA^i{}_jv^j=aAv.$$

The tensor product $w\otimes\beta$ of a vector and a covector is the mixed tensor defined by
$$(w\otimes\beta)(\alpha,v)=\alpha(w)\beta(v).$$ The associated linear transformation can be written as
$$A=A^i{}_j\partial_i\otimes dx^j=\partial_i\otimes A^i{}_jdx^j.$$

For math undergraduates, the different ways of writing indices (raised, lowered, and mixed) in tensor notation can be very confusing. The main reason is that in standard math courses such as linear algebra or elementary differential geometry (classical differential geometry of curves and surfaces in $\mathbb{E}^3$) the matrix of a linear transformation is usually written as $A_{ij}$. Physics undergraduates don't usually get a chance to learn tensors in undergraduate physics courses. In order to study more advanced differential geometry or physics, such as the theory of special and general relativity and field theory, one must be able to distinguish three different ways of writing matrices: $A_{ij}$, $A^{ij}$, and $A^i{}_j$. To summarize, $A_{ij}$ and $A^{ij}$ are bilinear forms on $E$ and $E^\ast$, respectively, defined by
$$A_{ij}v^iv^j\ \mbox{and}\ A^{ij}a_ib_j\ (\mbox{respectively}).$$ $A^i{}_j$ is the matrix of a linear transformation $A: E\longrightarrow E$.

Let $(E,\langle\ ,\ \rangle)$ be an inner product space. Given a linear transformation $A: E\longrightarrow E$ (i.e. a mixed tensor), one can associate a covariant bilinear form $A'$ by
$$A'(v,w):=\langle v,Aw\rangle=v^ig_{ij}A^j{}_k w^k.$$ So we see that the matrix of $A'$ is
$$A'_{ik}=g_{ij}A^j{}_k.$$ One describes this process by saying "we lower the index $j$, making it a $k$, by means of the metric tensor $g_{ij}$." In tensor analysis one uses the same letter, i.e. instead of $A'$, one writes
$$A_{ik}:=g_{ij}A^j{}_k.$$ This is clearly a covariant tensor. In general, the components of the associated covariant tensor $A_{ik}$ differ from those of the mixed tensor $A^i{}_j$. But if the basis is orthonormal, i.e. $g_{ij}=\delta_{ij}$, then they coincide. That is the reason why we simply write $A_{ij}$ without making any distinction in linear algebra or in elementary differential geometry.

Similarly, one may associate to the linear transformation $A$ a contravariant bilinear form
$$\bar A(\alpha,\beta)=a_iA^i{}_jg^{jk}b_k$$ whose matrix components can be written as
$$A^{ik}=A^i{}_jg^{jk}.$$
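
In matrix terms, lowering and raising an index is just multiplication by $(g_{ij})$ or by its inverse $(g^{jk})$. A minimal Maxima sketch, with a made-up diagonal metric and matrix entries:

(%i2) g : matrix([1, 0], [0, r^2])$
(%i3) A : matrix([p, q], [u, v])$
(%i4) g . A;         /* the lowered components A_{ik} = g_{ij} A^j_k */
(%i5) A . invert(g); /* the raised components A^{ik} = A^i_j g^{jk} */

With the identity metric both products reduce to $A$ itself, which is why the distinction disappears in the orthonormal case.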

Note that the metric tensor $g_{ij}$ represents a linear map from $E$ to $E^\ast$, sending the vector with components $v^j$ into the covector with components $g_{ij}v^j$. In quantum mechanics, the covector $g_{ij}v^j$ is denoted by $\langle v|$ and called a bra vector, while the vector $v^j$ is denoted by $|v\rangle$ and called a ket vector. Usually the inner product on $E$
$$\langle\ ,\ \rangle:E\times E\longrightarrow\mathbb{R};\ \langle v,w\rangle=g_{ij}v^iw^j$$ is considered as a covariant tensor of rank 2. But in quantum mechanics $\langle v,w\rangle$ is not considered as a covariant tensor $g_{ij}$ of rank 2 acting on a pair of vectors $(v,w)$, rather it is regarded as the braket $\langle v|w\rangle$, a bra vector $\langle v|$ acting on a ket vector $|w\rangle$.