The Curvature

In this note, we study different notions of curvature of a Riemannian or pseudo-Riemannian $n$-manifold $M$ with metric tensor $g_{ij}$. This note is intended mainly for students of physics. Hence, we discuss only local expressions of curvature, as those are the ones mostly used for doing physics in general relativity.

First we need to introduce the Christoffel symbols $\Gamma_{ij}^k$. The Christoffel symbols are associated with the differentiation of vector fields on a Riemannian or pseudo-Riemannian manifold $M$, called the Levi-Civita connection. The Levi-Civita connection $\nabla$ is a generalization of the covariant derivative of vector fields in Euclidean space. Locally the Levi-Civita connection is defined by $$\nabla_{\frac{\partial}{\partial x^i}}\frac{\partial}{\partial x^j}=\sum_{k}\Gamma_{ij}^k\frac{\partial}{\partial x^k}$$ and the Christoffel symbols are given by $$\Gamma_{ij}^k=\frac{1}{2}\sum_\ell g^{k\ell}\left\{\frac{\partial g_{j\ell}}{\partial x^i}+\frac{\partial g_{\ell i}}{\partial x^j}-\frac{\partial g_{ij}}{\partial x^\ell}\right\}$$ where $g^{k\ell}$ is the inverse of the metric tensor.

Locally the Riemann curvature tensor $R_{ijk}^\ell$ is given by $$R_{ijk}^\ell=\frac{\partial}{\partial x^j}\Gamma_{ik}^\ell-\frac{\partial}{\partial x^k}\Gamma_{ij}^\ell+\sum_p\left\{\Gamma_{jp}^\ell\Gamma_{ik}^p-\Gamma_{kp}^\ell\Gamma_{ij}^p\right\}$$

Locally the sectional curvature $K(X,Y)$ of $M$ with respect to the plane spanned by tangent vectors $X,Y\in T_pM$ is given by \begin{equation}\label{eq:sectcurv}K_p(X,Y)=g^{ii}R_{iji}^j\end{equation} (no summation over $i$ and $j$), assuming that $X,Y\in\mathrm{span}\left\{\frac{\partial}{\partial x^i},\frac{\partial}{\partial x^j}\right\}$. The sectional curvature is a generalization of the Gaußian curvature of a surface in 3-space. To see this, let $\varphi: M^2\longrightarrow M^3$ be a conformal parametric surface $M^2$ immersed in 3-space $M^3$ with metric $e^{u(x,y)}(dx^2+dy^2)$. The Gaußian curvature $K$ of $\varphi$ can be calculated using the formula (due to Carl Friedrich Gauß) $$K=\frac{\ell n-m^2}{EG-F^2}$$ where \begin{align*}E&=\langle\varphi_x,\varphi_x\rangle,\ F=\langle\varphi_x,\varphi_y\rangle,\ G=\langle\varphi_y,\varphi_y\rangle,\\\ell&=\langle\varphi_{xx},N\rangle,\ m=\langle\varphi_{xy},N\rangle,\ n=\langle\varphi_{yy},N\rangle\end{align*} Here, $\langle\ ,\ \rangle$ stands for the inner product induced by the conformal metric $e^{u(x,y)}(dx^2+dy^2)$ and $N$ is the unit normal vector field on $\varphi$. The Gaußian curvature then satisfies Liouville’s partial differential equation \begin{equation}\label{eq:liouville}\nabla^2 u=-2Ke^u\end{equation} On the other hand, using \eqref{eq:sectcurv} with $g^{11}=e^{-u(x,y)}$, we find the sectional curvature of $\varphi$ to be $$g^{11}R_{121}^2=-\frac{e^{-u(x,y)}}{2}\nabla^2u$$ which coincides with the Gaußian curvature $K$ from \eqref{eq:liouville}.

Example. Let us compute the sectional curvature of the hyperbolic plane $$\mathbb{H}^2=\{(x,y)\in\mathbb{R}^2: y>0\}$$ with metric $$ds^2=\frac{dx^2+dy^2}{y^2}$$

The metric tensor is $(g_{ij})=\begin{pmatrix}\frac{1}{y^2} & 0\\0 & \frac{1}{y^2}\end{pmatrix}$. The component $R_{121}^2$ of the Riemann curvature tensor is \begin{align*}R_{121}^2&=\frac{\partial}{\partial y}\Gamma_{11}^2-\frac{\partial}{\partial x}\Gamma_{12}^2+\sum_p\{\Gamma_{2p}^2\Gamma_{11}^p-\Gamma_{1p}^2\Gamma_{12}^p\}\\&=\frac{\partial}{\partial y}\Gamma_{11}^2-\frac{\partial}{\partial x}\Gamma_{12}^2+\Gamma_{21}^2\Gamma_{11}^1-\Gamma_{11}^2\Gamma_{12}^1+\Gamma_{22}^2\Gamma_{11}^2-\Gamma_{12}^2\Gamma_{12}^2\end{align*} We find the Christoffel symbols $$\Gamma_{11}^2=\frac{1}{y},\ \Gamma_{12}^1=-\frac{1}{y},\ \Gamma_{12}^2=0,\ \Gamma_{21}^2=0,\ \Gamma_{22}^2=-\frac{1}{y}$$ Thus we obtain $R_{121}^2=-\frac{1}{y^2}$ and hence $\mathbb{H}^2$ has the constant negative sectional curvature $$K=g^{11}R_{121}^2=y^2\left(-\frac{1}{y^2}\right)=-1$$ What is the shortest path connecting two points $(x_1,y_1)$ and $(x_2,y_2)$ in $\mathbb{H}^2$? Such shortest paths are called geodesics in differential geometry. To find out what a geodesic in $\mathbb{H}^2$ looks like, let $$J=\int_{(x_1,y_1)}^{(x_2,y_2)}ds=\int_{(x_1,y_1)}^{(x_2,y_2)}\frac{\sqrt{1+y_x^2}}{y}dx$$ where $y_x=\frac{dy}{dx}$. The shortest path satisfies the Euler-Lagrange equation \begin{equation}\label{eq:E-L}\frac{\partial f}{\partial x}-\frac{d}{dx}\left(f-y_x\frac{\partial f}{\partial y_x}\right)=0\end{equation}with $f(y,y_x,x)=\frac{\sqrt{1+y_x^2}}{y}$. Since $f$ does not depend on $x$, $\frac{\partial f}{\partial x}=0$ and the Euler-Lagrange equation \eqref{eq:E-L} becomes $$\frac{d}{dx}\left[\frac{1}{y\sqrt{1+y_x^2}}\right]=0$$ i.e. \begin{equation}\label{eq:E-L2}\frac{1}{y\sqrt{1+y_x^2}}=C\end{equation} where $C$ is a constant. Solving \eqref{eq:E-L2} for $y_x$ results in a separable differential equation $$\frac{dy}{dx}=\pm\frac{\sqrt{r^2-y^2}}{y}$$ where $r^2=\frac{1}{C^2}$. The solution of this equation is $$(x-a)^2+y^2=r^2$$ where $a$ is a constant. 
Since $y>0$, the solution is the equation of an upper semicircle centered at $(a,0)$ with radius $r$. That is, the shortest path (geodesic) between two points $(x_1,y_1)$ and $(x_2,y_2)$ in $\mathbb{H}^2$ is a part of an upper semicircle joining them. In particular, if $x_1=x_2$, the geodesic between $(x_1,y_1)$ and $(x_2,y_2)$ is the vertical line passing through the two points. Such a vertical line can still be considered as an upper semicircle with radius $\infty$.
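
For completeness, here is a sketch of the integration step, taking the positive square root for $\frac{dy}{dx}$ (the negative root gives the other half of the same semicircle). Separating variables gives $$\frac{y\,dy}{\sqrt{r^2-y^2}}=dx$$ and integrating both sides yields $$-\sqrt{r^2-y^2}=x-a$$ where $a$ is the constant of integration. Squaring both sides gives $(x-a)^2+y^2=r^2$.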

Geodesics in Hyperbolic Plane
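
The curvature computation for $\mathbb{H}^2$ can also be checked numerically. The following is a minimal sketch, not part of the original derivation: it approximates the Christoffel symbols and $R_{121}^2$ by central finite differences, using the local formulas given earlier in this note. The function names (`metric`, `christoffel`, `riemann`, `sectional_curvature`), the step sizes, and the sample point are all illustrative assumptions, and indices in the code are 0-based.

```python
def metric(x, y):
    """Metric tensor (g_ij) of the upper half-plane: diag(1/y^2, 1/y^2)."""
    return [[1.0 / y**2, 0.0], [0.0, 1.0 / y**2]]


def shifted(point, axis, h):
    """Return a copy of `point` with coordinate `axis` shifted by h."""
    p = list(point)
    p[axis] += h
    return p


def christoffel(point, h=1e-5):
    """G[k][i][j] approximates Gamma_{ij}^k at `point` (0-based indices)."""
    g = metric(*point)
    # Inverse of the metric; this shortcut is only valid for a diagonal metric.
    g_inv = [[1.0 / g[0][0], 0.0], [0.0, 1.0 / g[1][1]]]
    # dg[a][i][j] approximates the partial derivative of g_ij along x^a.
    dg = []
    for a in range(2):
        gp, gm = metric(*shifted(point, a, h)), metric(*shifted(point, a, -h))
        dg.append([[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(2)]
                   for i in range(2)])
    return [[[0.5 * sum(g_inv[k][l] * (dg[i][j][l] + dg[j][l][i] - dg[l][i][j])
                        for l in range(2))
              for j in range(2)] for i in range(2)] for k in range(2)]


def riemann(point, l, i, j, k, h=1e-4):
    """Approximate R_{ijk}^l from the local formula (0-based indices)."""
    G = christoffel(point)

    def dG(axis, up, a, b):
        # Central difference of Gamma_{ab}^{up} along coordinate `axis`.
        Gp = christoffel(shifted(point, axis, h))
        Gm = christoffel(shifted(point, axis, -h))
        return (Gp[up][a][b] - Gm[up][a][b]) / (2 * h)

    value = dG(j, l, i, k) - dG(k, l, i, j)
    value += sum(G[l][j][p] * G[p][i][k] - G[l][k][p] * G[p][i][j]
                 for p in range(2))
    return value


def sectional_curvature(point):
    """K = g^{11} R_{121}^2, written here with 0-based indices."""
    g = metric(*point)
    return (1.0 / g[0][0]) * riemann(point, l=1, i=0, j=1, k=0)


print(sectional_curvature((0.3, 2.0)))  # should be close to -1
```

Up to finite-difference error, the printed value should agree with $K=-1$ at any point with $y>0$.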

Two other notions of curvature are the Ricci and scalar curvatures. The Ricci curvature tensor is given by $$\mathrm{Ric}_p\left(\frac{\partial}{\partial x^i},\frac{\partial}{\partial x^j}\right)=\sum_kR_{ikj}^k$$ We usually denote $\mathrm{Ric}_p\left(\frac{\partial}{\partial x^i},\frac{\partial}{\partial x^j}\right)$ simply by $R_{ij}$. The scalar curvature $\mathrm{Scal}(p)$ is given by $$\mathrm{Scal}(p)=\sum_{i}g^{ii}R_{ii}$$ The scalar curvature can be given, in terms of the sectional curvature, by $$\mathrm{Scal}(p)=\sum_{i\ne j}K_p\left(\frac{\partial}{\partial x^i},\frac{\partial}{\partial x^j}\right)$$ The scalar curvature is usually denoted by $R$ in general relativity.

Definition. A Riemannian or a pseudo-Riemannian manifold $(M,g)$ is said to be maximally symmetric if $(M,g)$ has constant sectional curvature $\kappa$.

Theorem. If a Riemannian or a pseudo-Riemannian manifold $(M,g)$ is maximally symmetric, then $$R_{ii}=\kappa(n-1)g_{ii}$$ where $\kappa$ is the constant sectional curvature of $(M,g)$ and $n=\dim M$.

Corollary. If $(M,g)$ has the constant sectional curvature $\kappa$, then $$\mathrm{Scal}(p)=n(n-1)\kappa$$ where $n=\dim M$.
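
As a quick check of the corollary, here is a sketch that assumes the metric is in diagonal form, so that $g^{ii}g_{ii}=1$ for each $i$ (no summation). Substituting the theorem into the definition of the scalar curvature gives $$\mathrm{Scal}(p)=\sum_i g^{ii}R_{ii}=\sum_i g^{ii}\kappa(n-1)g_{ii}=\kappa(n-1)\sum_i 1=n(n-1)\kappa$$ For the hyperbolic plane $\mathbb{H}^2$, $n=2$ and $\kappa=-1$, so $\mathrm{Scal}(p)=-2$.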

Riemannian and Pseudo-Riemannian Manifolds

This note is intended particularly for students of physics who have never had any prior encounter with differential geometry. Hence, I try to keep mathematical rigor and technicalities to a minimum when I discuss differential geometric concepts, and instead mostly use hand-waving and rudimentary arguments with emphasis on physical ideas and intuition.

The notion of (pseudo-)Riemannian manifolds plays an important role in studying general relativity. But first, what is a manifold? A manifold is, very roughly speaking, a space which locally looks like our space (Euclidean space). In other words, for any point $p$ in a manifold $M$ there exists a neighborhood (called a coordinate neighborhood) $U$ of $p$ such that $U\cong\mathbb{R}^n$. Here $\cong$ means they are homeomorphic i.e. topologically indistinguishable. Such a property is said to be locally Euclidean and a space which is locally $\mathbb{R}^n$ is called an $n$-dimensional manifold. Actually being locally Euclidean is not the only condition for a space to be a manifold but that is the most important property of a manifold for physicists.

Figure 1. A manifold

Figure 1 shows a manifold $M$ and two coordinate neighborhoods $U$ and $V$ with homeomorphisms $\phi$ and $\psi$, respectively. Why do we need manifolds, by the way? In order to do physics, we need coordinates. Without coordinates we can’t write equations of motion. Unfortunately, even for a simple familiar space there is no guarantee that there will be a global coordinate system. Here is an example.

Example. The points $(x,y,z)$ on the 2-sphere $S^2$ are represented in terms of the spherical coordinates $(\theta,\phi)$ as $$x=\sin\phi\cos\theta,\ y=\sin\phi\sin\theta,\ z=\cos\phi,\ 0\leq\phi\leq\pi,\ 0\leq\theta\leq 2\pi$$ Using the chain rule, we can write the standard basis $\frac{\partial}{\partial\theta}$, $\frac{\partial}{\partial\phi}$ for the tangent space $T_\ast S^2$ in spherical coordinates in terms of the standard basis $\frac{\partial}{\partial x}$, $\frac{\partial}{\partial y}$, $\frac{\partial}{\partial z}$ in rectangular coordinates as \begin{align*}\frac{\partial}{\partial\theta}&=-\sin\phi\sin\theta\frac{\partial}{\partial x}+\sin\phi\cos\theta\frac{\partial}{\partial y}\\\frac{\partial}{\partial\phi}&=\cos\phi\cos\theta\frac{\partial}{\partial x}+\cos\phi\sin\theta\frac{\partial}{\partial y}-\sin\phi\frac{\partial}{\partial z}\end{align*} This frame field is not globally defined on $S^2$ because $\frac{\partial}{\partial\theta}=0$ at $\phi=0,\pi$, i.e. at the north pole $N=(0,0,1)$ and at the south pole $S=(0,0,-1)$, as also seen in Figure 2.

Figure 2. The 2-sphere with frame field
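
The degeneration of the frame field at the poles can be seen numerically. The following is a small sketch, assuming the rectangular components of $\frac{\partial}{\partial\theta}$ and $\frac{\partial}{\partial\phi}$ given in the example; the helper names `frame` and `norm` and the sample angle $\theta=0.7$ are illustrative choices, not part of the original note.

```python
import math


def frame(theta, phi):
    """Return (d/dtheta, d/dphi) as vectors in rectangular components."""
    d_theta = (-math.sin(phi) * math.sin(theta),
               math.sin(phi) * math.cos(theta),
               0.0)
    d_phi = (math.cos(phi) * math.cos(theta),
             math.cos(phi) * math.sin(theta),
             -math.sin(phi))
    return d_theta, d_phi


def norm(v):
    """Euclidean length of a vector in rectangular components."""
    return math.sqrt(sum(c * c for c in v))


# The length of d/dtheta is |sin(phi)|, which vanishes at the two poles,
# while d/dphi keeps unit length everywhere.
for phi in (0.0, math.pi / 2, math.pi):
    d_theta, d_phi = frame(0.7, phi)
    print(phi, norm(d_theta), norm(d_phi))
```

At $\phi=0$ and $\phi=\pi$ the first length is zero up to rounding, so the two vectors no longer span the tangent plane there.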

The 2-sphere $S^2$ is covered by two coordinate neighborhoods $U=S^2\setminus\{N\}$ and $V=S^2\setminus\{S\}$, each of which is identified with $\mathbb{R}^2$, the Euclidean plane, via the stereographic projection. Figure 3 shows the stereographic projection from the north pole $N$, which is a one-to-one correspondence from $U$ to $\mathbb{R}^2$.

Figure 3. The Stereographic Projection

A global coordinate system exists in the flat Euclidean space (or a flat pseudo-Euclidean space including Minkowski spacetime). However, general relativity has taught us that a physical space is not necessarily a flat space (vacuum spacetime). This is where a manifold comes in. A manifold guarantees the existence of a coordinate system at least locally, and in most cases that is good enough to do physics. In particular, we write physical equations in a coordinate-independent way, so that if a physical equation holds in one coordinate neighborhood, it also holds in another coordinate neighborhood in the same way.

We need more than topological manifolds to do physics. For an obvious reason we need differentiable manifolds. I am not going to delve into this beyond saying that a differentiable manifold is a manifold on which the differentiability of functions and vector fields can be defined and which has a tangent space at each point. (If we can’t differentiate fields, we cannot do physics.) In addition, we need Riemannian manifolds. A Riemannian manifold is a differentiable manifold with a Riemannian metric. So what is a Riemannian metric? A Riemannian metric $g$ is a positive definite symmetric bilinear form $g_p: T_pM\times T_pM\longrightarrow\mathbb{R}$, which induces a positive definite inner product on each tangent space $T_pM$. In a coordinate neighborhood, the metric $g$ can be locally given by \begin{equation}\label{eq:metric}g=g_{ij}dx^i\otimes dx^j\end{equation}Here we are using Einstein’s summation convention. The $n\times n$ matrix $(g_{ij})$ is called the metric tensor, and physicists often simply write $g_{ij}$ for the whole metric tensor, not just for a component. Since $g_{ij}$ is a symmetric tensor, it can be diagonalized. Since the metric is preserved under diagonalization (which amounts to a change of coordinates), without loss of generality we may assume that $g_{ij}=0$ if $i\ne j$ so that the metric tensor \eqref{eq:metric} is written as\begin{equation}\label{eq:metric2}g=g_{ii}dx^i\otimes dx^i\end{equation}Let the dimension of $M$ be $n$. Then each tangent space $T_pM$ is an $n$-dimensional vector space with the canonical basis $\left(\frac{\partial}{\partial x^1}\right)_p,\cdots,\left(\frac{\partial}{\partial x^n}\right)_p$. Thus any tangent vector $v\in T_pM$ can be written as $$v=v^j\left(\frac{\partial}{\partial x^j}\right)_p$$ The differential 1-forms $dx^i$ are the duals of $\frac{\partial}{\partial x^i}$, that is, 
$$dx^i\left(\frac{\partial}{\partial x^j}\right)=\delta_{ij}$$ and hence $$dx^i(v)=v^i$$ For any two tangent vectors $v,w\in T_pM$ using \eqref{eq:metric2} we obtain \begin{equation}\label{eq:metric3}g(v,w)=g_{ii}dx^i\otimes dx^i(v,w)=g_{ii}dx^i(v)dx^i(w)=g_{ii}v^iw^i\end{equation}Equation \eqref{eq:metric3} shows how the metric $g$ induces an inner product on each tangent space $T_pM$. In doing physics, in particular general relativity, the physical space is often a pseudo-Riemannian manifold rather than a Riemannian manifold. A pseudo-Riemannian manifold is equipped with a pseudo-Riemannian metric, which is an indefinite symmetric bilinear form. So the induced inner product is indefinite. A good example is the Minkowski spacetime $\mathbb{R}^{3+1}$, which is $\mathbb{R}^4$ with the Minkowski metric or the Lorentz-Minkowski metric \begin{equation}\label{eq:minkowski}g=-dt^2+dx^2+dy^2+dz^2\end{equation} The Minkowski metric \eqref{eq:minkowski} induces the inner product $$\langle v,w\rangle=-v^0w^0+v^1w^1+v^2w^2+v^3w^3$$ on $\mathbb{R}^{3+1}$, where $v=(v^0,v^1,v^2,v^3)$ and $w=(w^0,w^1,w^2,w^3)$ are four-vectors in $\mathbb{R}^{3+1}$. (The Minkowski spacetime has a single coordinate neighborhood, $\mathbb{R}^{3+1}$ itself, and every tangent space $T_p\mathbb{R}^{3+1}$ is isomorphic to $\mathbb{R}^{3+1}$. Hence $\mathbb{R}^{3+1}$ is a manifold and at the same time it is also a vector space.)
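
To see concretely what an indefinite inner product means, here is a minimal sketch of the Minkowski inner product on four-vectors. The function name `minkowski_inner` and the three sample vectors are illustrative assumptions; the labels timelike, lightlike, and spacelike are standard physics terminology that this note does not introduce.

```python
def minkowski_inner(v, w):
    """<v, w> = -v^0 w^0 + v^1 w^1 + v^2 w^2 + v^3 w^3."""
    signature = (-1, 1, 1, 1)  # diagonal entries g_ii of the Minkowski metric
    return sum(g * a * b for g, a, b in zip(signature, v, w))


# Unlike a positive definite inner product, <v, v> can be negative or zero
# for a nonzero vector v.
timelike = (1, 0, 0, 0)
lightlike = (1, 1, 0, 0)
spacelike = (0, 1, 0, 0)

for v in (timelike, lightlike, spacelike):
    print(v, minkowski_inner(v, v))
```

The nonzero vector $(1,1,0,0)$ has zero inner product with itself, which cannot happen for a Riemannian metric.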

In conclusion, I would like to emphasize that the metric tensor $g_{ij}$ is the most important ingredient of a Riemannian or a pseudo-Riemannian manifold. You can literally find out everything about the geometry of a Riemannian or a pseudo-Riemannian manifold with the metric tensor. With the metric tensor, you can also find out about what gravity does when there is matter (the source of gravity) present in the manifold.

Arc Length and Reparametrization

We have already discussed the length of a plane curve represented by the parametric equation ${\bf r}(t)=(x(t),y(t))$, $a\leq t\leq b$, in an earlier note. The same goes for a space curve. Namely, if ${\bf r}(t)=(x(t),y(t),z(t))$, $a\leq t\leq b$, then its arc length $L$ is given by \begin{equation}\begin{aligned}L&=\int_a^b|{\bf r}'(t)|dt\\&=\int_a^b\sqrt{\left(\frac{dx}{dt}\right)^2+\left(\frac{dy}{dt}\right)^2+\left(\frac{dz}{dt}\right)^2}dt\end{aligned}\label{eq:spacearclength}\end{equation}

Example. Find the length of the arc of the circular helix $${\bf r}(t)=\cos t{\bf i}+\sin t{\bf j}+t{\bf k}$$ from the point $(1,0,0)$ to the point $(1,0,2\pi)$.

Solution. ${\bf r}'(t)=-\sin t{\bf i}+\cos t{\bf j}+{\bf k}$ so we have $$|{\bf r}'(t)|=\sqrt{(-\sin t)^2+(\cos t)^2+1^2}=\sqrt{2}$$ The arc is going from $(1,0,0)$ to $(1,0,2\pi)$ and the $z$-component of ${\bf r}(t)$ is $t$, so $0\leq t\leq 2\pi$. Now, using \eqref{eq:spacearclength}, we obtain $$L=\int_0^{2\pi}|{\bf r}'(t)|dt=\int_0^{2\pi}\sqrt{2}dt=2\sqrt{2}\pi$$ Figure 1 shows the circular helix from $t=0$ to $t=2\pi$.

Figure 1. A circular helix
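
The value $L=2\sqrt{2}\pi$ can be cross-checked without using ${\bf r}'(t)$ at all. The following sketch approximates the helix by a polygon with many short chords and adds up their lengths; the function names `helix` and `polygonal_length` and the number of chords are illustrative assumptions.

```python
import math


def helix(t):
    """The circular helix r(t) = (cos t, sin t, t)."""
    return (math.cos(t), math.sin(t), t)


def polygonal_length(curve, a, b, n=10_000):
    """Sum of the lengths of n chords joining curve(a) to curve(b)."""
    total = 0.0
    prev = curve(a)
    for i in range(1, n + 1):
        point = curve(a + (b - a) * i / n)
        total += math.dist(prev, point)  # Euclidean distance between points
        prev = point
    return total


approx = polygonal_length(helix, 0.0, 2 * math.pi)
exact = 2 * math.sqrt(2) * math.pi
print(approx, exact)
```

The two printed numbers should agree to several decimal places, and the agreement improves as `n` grows.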

Given a curve ${\bf r}(t)$, $a\leq t\leq b$, sometimes we need to reparametrize it by another parameter $s$ for various reasons. Imagine that the curve represents the path of a particle moving in space. A reparametrization does not change the path of the particle (and hence not the distance it travels) but it changes the particle’s speed! To see this, let $t=t(s)$, $\alpha\leq s\leq\beta$, $a=t(\alpha)$, $b=t(\beta)$ be an increasing and differentiable function. Since $t=t(s)$ is one-to-one and onto, ${\bf r}(t)$ and ${\bf r}(t(s))$, its reparametrization by the parameter $s$, represent the same path. By the chain rule, \begin{equation}\label{eq:reparametrization}\frac{d}{ds}{\bf r}(t(s))=\frac{d}{dt}{\bf r}(t)\frac{dt}{ds}\end{equation} Thus we see that the speed of the reparametrization $\left|\frac{d}{ds}{\bf r}(t(s))\right|$ differs from that of ${\bf r}(t)$ by a factor of $\left|\frac{dt}{ds}\right|=\frac{dt}{ds}$ (since $\frac{dt}{ds}>0$). However, the arc length of the reparametrization is \begin{align*}\int_{\alpha}^{\beta}\left|\frac{d}{ds}{\bf r}(t(s))\right|ds&=\int_{\alpha}^{\beta}\left|\frac{d}{dt}{\bf r}(t)\right|\frac{dt}{ds}ds\\&=\int_a^b\left|\frac{d}{dt}{\bf r}(t)\right|dt=L\end{align*} That is, the distance does not change.

There is a particular reparametrization we are interested in. To discuss it, let ${\bf r}(t)$, $a\leq t\leq b$, be a differentiable curve in space such that ${\bf r}'(t)\ne 0$ for all $t$. Such a curve is said to be regular or smooth. Let us now define the arc length function \begin{equation}\label{eq:arclengthfunction}s(t)=\int_a^t|{\bf r}'(u)|du\end{equation} By the Fundamental Theorem of Calculus, we have \begin{equation}\label{eq:arclengthfunction2}\frac{ds}{dt}=|{\bf r}'(t)|>0\end{equation} and so the arc length function $s=s(t)$ is increasing. This means that $s(t)$ is one-to-one and onto its range, so it is invertible. Its inverse function can be written as $t=t(s)$, and ${\bf r}(t(s))$ is called the reparametrization by arc length. The reason we are interested in this particular reparametrization is that it results in unit speed: From \eqref{eq:reparametrization} and \eqref{eq:arclengthfunction2}, $$\left|\frac{d}{ds}{\bf r}(t(s))\right|=|{\bf r}'(t)|\left|\frac{dt}{ds}\right|=|{\bf r}'(t)|\frac{1}{|{\bf r}'(t)|}=1$$ So it is also called the unit-speed reparametrization. The reparametrization by arc length plays an important role in defining the curvature of a curve. This will be discussed elsewhere.

Example. Reparametrize the helix ${\bf r}(t)=\cos t{\bf i}+\sin t{\bf j}+t{\bf k}$ by arc length measured from $(1,0,0)$ in the direction of increasing $t$.

Solution. The initial point $(1,0,0)$ corresponds to $t=0$. From the previous example, we know that the helix has the constant speed $|{\bf r}'(t)|=\sqrt{2}$. Thus, $$s(t)=\int_0^t|{\bf r}'(u)|du=\sqrt{2}t$$ Hence, we obtain $t=\frac{s}{\sqrt{2}}$. The reparametrization is then given by $${\bf r}(t(s))=\cos\left(\frac{s}{\sqrt{2}}\right){\bf i}+\sin\left(\frac{s}{\sqrt{2}}\right){\bf j}+\frac{s}{\sqrt{2}}{\bf k}$$
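
That this reparametrization really has unit speed can be checked numerically. The sketch below estimates the speed by a central difference quotient; the names `helix_by_arc_length` and `speed`, the step size, and the sample values of $s$ are illustrative assumptions.

```python
import math


def helix_by_arc_length(s):
    """The helix reparametrized by arc length, using t = s / sqrt(2)."""
    t = s / math.sqrt(2)
    return (math.cos(t), math.sin(t), t)


def speed(curve, s, h=1e-6):
    """Approximate |d/ds curve(s)| by a central difference quotient."""
    return math.dist(curve(s - h), curve(s + h)) / (2 * h)


for s in (0.0, 1.0, 5.0):
    print(s, speed(helix_by_arc_length, s))  # each value should be close to 1
```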

Examples in this note have been taken from [1].

References.

[1] James Stewart, Calculus: Early Transcendentals, 6th Edition, Thomson Brooks/Cole

Derivatives and Integrals of Vector-Valued Functions

The derivative $\frac{d{\bf r}}{dt}={\bf r}'(t)$ of a vector-valued function ${\bf r}(t)=(x(t),y(t),z(t))$ is defined by \begin{equation}\label{eq:vectorderivative}\frac{d{\bf r}}{dt}=\lim_{\Delta t\to 0}\frac{{\bf r}(t+\Delta t)-{\bf r}(t)}{\Delta t}\end{equation} In the case of a scalar-valued (real-valued) function, the geometric meaning of the derivative is that it is the slope of the tangent line. In the case of a vector-valued function, the geometric meaning of the derivative is that it is the tangent vector. This can be easily seen from Figure 1. As $\Delta t$ gets smaller and smaller, $\frac{{\bf r}(t+\Delta t)-{\bf r}(t)}{\Delta t}$ gets closer to a vector tangent to the curve at ${\bf r}(t)$.

Figure 1. The derivative of a vector-valued function

From the definition \eqref{eq:vectorderivative}, it is straightforward to show \begin{equation}\label{eq:vectorderivative2}{\bf r}'(t)=(x'(t),y'(t),z'(t))\end{equation}

If ${\bf r}'(t)\ne 0$, the unit tangent vector ${\bf T}(t)$ of ${\bf r}(t)$ is given by \begin{equation}\label{eq:unittangent}{\bf T}(t)=\frac{{\bf r}'(t)}{|{\bf r}'(t)|}\end{equation}

Example.

  1. Find the derivative of ${\bf r}(t)=(1+t^3){\bf i}+te^{-t}{\bf j}+\sin 2t{\bf k}$.
  2. Find the unit tangent vector when $t=0$.

Solution.

  1. Using \eqref{eq:vectorderivative2}, we have $${\bf r}'(t)=3t^2{\bf i}+(1-t)e^{-t}{\bf j}+2\cos 2t{\bf k}$$
  2. ${\bf r}'(0)={\bf j}+2{\bf k}$ and $|{\bf r}'(0)|=\sqrt{5}$. So by \eqref{eq:unittangent}, we have $${\bf T}(0)=\frac{{\bf r}'(0)}{|{\bf r}'(0)|}=\frac{1}{\sqrt{5}}{\bf j}+\frac{2}{\sqrt{5}}{\bf k}$$
Figure 2. The vector-valued function r(t) (in blue) and r'(0) (in red)
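
The limit definition of the derivative suggests a simple numerical check of this example. The sketch below approximates ${\bf r}'(0)$ by a central difference quotient and normalizes it; the names `r`, `derivative`, and `unit_tangent` and the step size are illustrative assumptions.

```python
import math


def r(t):
    """r(t) = (1 + t^3) i + t e^{-t} j + sin(2t) k."""
    return (1 + t**3, t * math.exp(-t), math.sin(2 * t))


def derivative(curve, t, h=1e-6):
    """Component-wise central difference approximation of curve'(t)."""
    p, q = curve(t - h), curve(t + h)
    return tuple((b - a) / (2 * h) for a, b in zip(p, q))


def unit_tangent(curve, t):
    """T(t) = r'(t) / |r'(t)|, assuming r'(t) is not the zero vector."""
    d = derivative(curve, t)
    length = math.sqrt(sum(c * c for c in d))
    return tuple(c / length for c in d)


print(derivative(r, 0.0))    # close to (0, 1, 2)
print(unit_tangent(r, 0.0))  # close to (0, 1/sqrt(5), 2/sqrt(5))
```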

The following theorem is a summary of differentiation rules for vector-valued functions. We omit the proofs of these rules. They can be proved straightforwardly from differentiation rules for real-valued functions. Note that there are three different types of the product rule or the Leibniz rule for vector-valued functions (rules 3, 4, and 5).

Theorem. Let ${\bf u}(t)$ and ${\bf v}(t)$ be differentiable vector-valued functions, $f(t)$ a scalar function, and $c$ a scalar. Then

  1. $\frac{d}{dt}[{\bf u}(t)+{\bf v}(t)]={\bf u}'(t)+{\bf v}'(t)$
  2. $\frac{d}{dt}[c{\bf u}(t)]=c{\bf u}'(t)$
  3. $\frac{d}{dt}[f(t){\bf u}(t)]=f'(t){\bf u}(t)+f(t){\bf u}'(t)$
  4. $\frac{d}{dt}[{\bf u}(t)\cdot{\bf v}(t)]={\bf u}'(t)\cdot{\bf v}(t)+{\bf u}(t)\cdot{\bf v}'(t)$
  5. $\frac{d}{dt}[{\bf u}(t)\times{\bf v}(t)]={\bf u}'(t)\times{\bf v}(t)+{\bf u}(t)\times{\bf v}'(t)$
  6. $\frac{d}{dt}[{\bf u}(f(t))]=f'(t){\bf u}'(f(t))$ (Chain Rule)

Example. Show that if $|{\bf r}(t)|=c$ (a constant), then ${\bf r}'(t)$ is orthogonal to ${\bf r}(t)$ for all $t$.

Proof. Differentiating ${\bf r}(t)\cdot{\bf r}(t)=|{\bf r}(t)|^2=c^2$ using the Leibniz rule 4, we obtain $$0=\frac{d}{dt}[{\bf r}(t)\cdot{\bf r}(t)]={\bf r}'(t)\cdot{\bf r}(t)+{\bf r}(t)\cdot{\bf r}'(t)=2{\bf r}'(t)\cdot{\bf r}(t)$$ This implies that ${\bf r}'(t)$ is orthogonal to ${\bf r}(t)$.
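
The statement can be illustrated numerically with a curve of constant length. The sketch below uses a curve on the unit sphere, chosen purely for illustration and not taken from the original note, and checks that ${\bf r}'(t)\cdot{\bf r}(t)$ is zero up to finite-difference error. The names `r`, `derivative`, and `dot` are assumptions of this sketch.

```python
import math


def r(t):
    """A curve on the unit sphere, so |r(t)| = 1 for every t."""
    return (math.sin(t) * math.cos(3 * t),
            math.sin(t) * math.sin(3 * t),
            math.cos(t))


def derivative(curve, t, h=1e-6):
    """Component-wise central difference approximation of curve'(t)."""
    p, q = curve(t - h), curve(t + h)
    return tuple((b - a) / (2 * h) for a, b in zip(p, q))


def dot(u, v):
    """Dot product of two vectors given as tuples."""
    return sum(a * b for a, b in zip(u, v))


for t in (0.3, 1.0, 2.5):
    print(t, dot(derivative(r, t), r(t)))  # each value should be close to 0
```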

As seen in \eqref{eq:vectorderivative2}, the derivative of a vector-valued function is obtained by differentiating component-wise. The indefinite and definite integrals of a vector-valued function are computed similarly by integrating component-wise, namely\begin{equation}\label{eq:vectorintegral}\int{\bf r}(t)dt=\left(\int x(t)dt\right){\bf i}+\left(\int y(t)dt\right){\bf j}+\left(\int z(t)dt\right){\bf k}\end{equation} and \begin{equation}\label{eq:vectorintegral2}\int_a^b{\bf r}(t)dt=\left(\int_a^b x(t)dt\right){\bf i}+\left(\int_a^b y(t)dt\right){\bf j}+\left(\int_a^b z(t)dt\right){\bf k}\end{equation} respectively. When evaluating the definite integral of a vector-valued function, one can use \eqref{eq:vectorintegral2} directly, but it is often easier to first find the indefinite integral using \eqref{eq:vectorintegral} and then evaluate the definite integral using the Fundamental Theorem of Calculus (and yes, the Fundamental Theorem of Calculus still works for vector-valued functions).

Example. Find $\int_0^{\frac{\pi}{2}}{\bf r}(t)dt$ if ${\bf r}(t)=2\cos t{\bf i}+\sin t{\bf j}+2t{\bf k}$.

Solution. \begin{align*}\int{\bf r}(t)dt&=\left(\int 2\cos tdt\right){\bf i}+\left(\int \sin tdt\right){\bf j}+\left(\int 2tdt\right){\bf k}\\&=2\sin t{\bf i}-\cos t{\bf j}+t^2{\bf k}+{\bf C}\end{align*} where ${\bf C}$ is a vector-valued constant of integration. Now, $$\int_0^{\frac{\pi}{2}}{\bf r}(t)dt=[2\sin t{\bf i}-\cos t{\bf j}+t^2{\bf k}]_0^{\frac{\pi}{2}}=2{\bf i}+{\bf j}+\frac{\pi^2}{4}{\bf k}$$
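
Component-wise integration also works numerically. The following sketch applies the composite Simpson rule to each component of ${\bf r}(t)$ separately; Simpson's rule is a swapped-in numerical technique, not the method used in the solution, and the names `simpson` and `integrate_vector` are illustrative assumptions.

```python
import math


def r(t):
    """r(t) = 2 cos t i + sin t j + 2t k."""
    return (2 * math.cos(t), math.sin(t), 2 * t)


def simpson(f, a, b, n=200):
    """Composite Simpson rule for a scalar function; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def integrate_vector(curve, a, b):
    """Integrate each of the three components of `curve` over [a, b]."""
    return tuple(simpson(lambda t, k=k: curve(t)[k], a, b) for k in range(3))


print(integrate_vector(r, 0.0, math.pi / 2))  # close to (2, 1, pi^2 / 4)
```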

Examples in this note have been taken from [1].

References.

[1] James Stewart, Calculus: Early Transcendentals, 6th Edition, Thomson Brooks/Cole

Lines and Planes in Space

You remember from algebra that a line in the plane can be determined by its slope and a point on the line, or by two points on the line (in which case those two points determine the slope). For a space line, slope is not a suitable ingredient. Its alternative is a vector parallel to the line. As shown in Figure 1, with a point ${\bf r}_0=(x_0,y_0,z_0)$ on the line $L$ and a vector ${\bf v}=(a,b,c)$ parallel to $L$, we can determine any point ${\bf r}=(x,y,z)$ on $L$ by vector addition of ${\bf r}_0$ and $t{\bf v}$, a dilation of ${\bf v}$: \begin{equation}\label{eq:spaceline}{\bf r}={\bf r}_0+t{\bf v}\end{equation} where $-\infty<t<\infty$. The equation \eqref{eq:spaceline} is called a vector equation of $L$.

Figure 1. A space line

In terms of the components, \eqref{eq:spaceline} can be written as \begin{equation}\begin{aligned}x&=x_0+at\\y&=y_0+bt\\z&=z_0+ct\end{aligned}\label{eq:spaceline2}\end{equation} where $-\infty<t<\infty$. The equations in \eqref{eq:spaceline2} are called parametric equations of $L$. Solving the parametric equations in \eqref{eq:spaceline2} for $t$, we also obtain \begin{equation}\label{eq:spaceline3}\frac{x-x_0}{a}=\frac{y-y_0}{b}=\frac{z-z_0}{c}\end{equation} The equations in \eqref{eq:spaceline3} are called symmetric equations of $L$. Often we need to work with a line segment. The vector equation \eqref{eq:spaceline} can be used to figure out an equation of a line segment. Consider the line segment from ${\bf r}_0$ to ${\bf r}_1$. Then the difference ${\bf r}_1-{\bf r}_0$ is a vector parallel to the line segment and by \eqref{eq:spaceline}, we have \begin{align*}{\bf r}(t)&={\bf r}_0+t({\bf r}_1-{\bf r}_0)\\&=(1-t){\bf r}_0+t{\bf r}_1\end{align*} This still represents an infinite line through ${\bf r}_0$ and ${\bf r}_1$. By limiting the range of $t$ to represent only the line segment from ${\bf r}_0$ to ${\bf r}_1$, we obtain an equation of the line segment \begin{equation}\label{eq:linesegment}{\bf r}(t)=(1-t){\bf r}_0+t{\bf r}_1\end{equation} where $0\leq t\leq 1$.

Example.

  1. Find a vector equation and parametric equations for the line that passes through the point $(5,1,3)$ and is parallel to the vector ${\bf i}+4{\bf j}-2{\bf k}$.
  2. Find two other points on the line.

Solution.

  1. ${\bf r}_0=(5,1,3)$ and ${\bf v}=(1,4,-2)$. Hence by \eqref{eq:spaceline}, we have $${\bf r}(t)=(5,1,3)+t(1,4,-2)=(5+t,1+4t,3-2t)$$ Parametric equations are $$x(t)=5+t,\ y(t)=1+4t,\ z(t)=3-2t$$
  2. From the parametric equations, for example, $t=1$ gives $(6,5,1)$ and $t=-1$ gives $(4,-3,5)$.

Example.

  1. Find parametric equations and symmetric equations of the line that passes through the points $A(2,4,-3)$ and $B(3,-1,1)$.
  2. At what point does this line intersect the $xy$-plane?

Solution.

  1. Note that the vector ${\bf v}=\overrightarrow{AB}=(1, -5, 4)$ is parallel to the line. So $a=1$, $b=-5$, and $c=4$. By taking ${\bf r}_0=(x_0,y_0,z_0)=(2,4,-3)$, we have the parametric equations $$x=2+t,\ y=4-5t,\ z=-3+4t$$ Symmetric equations are then $$\frac{x-2}{1}=\frac{y-4}{-5}=\frac{z+3}{4}$$
  2. The line intersects the $xy$-plane when $z=0$. By setting $z=0$ in the symmetric equations from part 1, we get $$\frac{x-2}{1}=\frac{y-4}{-5}=\frac{3}{4}$$ Solving these equations for $x$ and $y$ respectively, we find $x=\frac{11}{4}$ and $y=\frac{1}{4}$.
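
The same intersection point can be found from the parametric form. The sketch below builds ${\bf r}(t)=(1-t){\bf r}_0+t{\bf r}_1$ through $A$ and $B$ and solves $z(t)=0$ in exact rational arithmetic; the names `line_through`, `t_hit`, and `point` are illustrative assumptions.

```python
from fractions import Fraction


def line_through(p0, p1):
    """Return r(t) = (1 - t) p0 + t p1 as a function of t."""
    return lambda t: tuple((1 - t) * a + t * b for a, b in zip(p0, p1))


A = (2, 4, -3)
B = (3, -1, 1)
line = line_through(A, B)

# z(t) = z_A + (z_B - z_A) t vanishes when t = -z_A / (z_B - z_A).
t_hit = Fraction(-A[2], B[2] - A[2])
point = line(t_hit)
print(t_hit, point)
```

Here $t=\frac{3}{4}$, and the resulting point has $x=\frac{11}{4}$, $y=\frac{1}{4}$, $z=0$.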

As we have seen, a line is determined by a point on the line and a vector parallel to the line. A plane, on the other hand, can be determined by a point ${\bf r}_0$ on the plane and a vector ${\bf n}$ perpendicular to the plane (such a vector is called a normal vector to the plane). See Figure 2.

Figure 2. A plane

From Figure 2, we see that \begin{equation}\label{eq:plane}{\bf n}\cdot({\bf r}-{\bf r}_0)=0\end{equation} The equation \eqref{eq:plane} is called a vector equation of the plane. If ${\bf n}=(a,b,c)$, ${\bf r}=(x,y,z)$, and ${\bf r}_0=(x_0,y_0,z_0)$, then the equation \eqref{eq:plane} can be written as \begin{equation}\label{eq:plane2}a(x-x_0)+b(y-y_0)+c(z-z_0)=0\end{equation}

Example. Find an equation of the plane through the point $(2,4,-1)$ with normal vector ${\bf n}=(2,3,4)$. Find the intercepts and sketch the plane.

Solution. $a=2$, $b=3$, $c=4$, $x_0=2$, $y_0=4$, and $z_0=-1$. So by the equation \eqref{eq:plane2}, we have $$2(x-2)+3(y-4)+4(z+1)=0$$ or $$2x+3y+4z=12$$ To find the $x$-intercept, set $y=z=0$ in the equation and we get $x=6$. Similarly, we find the $y$- and $z$-intercepts $y=4$ and $z=3$, respectively. Figure 3 shows the plane.

Figure 3. Plane 2x+3y+4z=12

Example. Find an equation of the plane that passes through the points $P(1,3,2)$, $Q(3,-1,6)$, and $R(5,2,0)$.

Solution. The vectors $\overrightarrow{PQ}=(2,-4,4)$ and $\overrightarrow{PR}=(4,-1,-2)$ are on the plane, so the cross product $$\overrightarrow{PQ}\times\overrightarrow{PR}=\begin{vmatrix}{\bf i} & {\bf j} & {\bf k}\\2 & -4 & 4\\4 & -1 & -2\end{vmatrix}=12{\bf i}+20{\bf j}+14{\bf k}$$ is a normal vector to the plane. With $(x_0,y_0,z_0)=(1,3,2)$ and $(a,b,c)=(12,20,14)$, we find an equation of the plane $$12(x-1)+20(y-3)+14(z-2)=0$$ or $$6x+10y+7z=50$$
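
The cross-product construction translates directly into code. The sketch below computes a normal vector from three points and returns the coefficients of the plane equation; the names `cross` and `plane_through` are illustrative assumptions.

```python
def cross(u, v):
    """Cross product of two vectors in 3-space."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def plane_through(p, q, r):
    """Return (a, b, c, rhs) such that a x + b y + c z = rhs contains p, q, r."""
    pq = tuple(b - a for a, b in zip(p, q))
    pr = tuple(b - a for a, b in zip(p, r))
    n = cross(pq, pr)  # normal vector to the plane
    rhs = sum(ni * pi for ni, pi in zip(n, p))
    return (*n, rhs)


P, Q, R = (1, 3, 2), (3, -1, 6), (5, 2, 0)
print(plane_through(P, Q, R))
```

The result corresponds to $12x+20y+14z=100$, which reduces to $6x+10y+7z=50$ after dividing by 2.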

Using basic geometry, we see that the angle between two planes $P_1$ and $P_2$ is the same as the angle between their respective normal vectors ${\bf n}_1$ and ${\bf n}_2$. See Figure 4 where cross sections of two planes $P_1$ and $P_2$ are shown.

Figure 4. The angle between two planes P1 and P2

Example.

  1. Find the angle between the planes $x+y+z=1$ and $x-2y+3z=1$.
  2. Find the symmetric equations for the line of intersection $L$ of these two planes.

Solution.

  1. The normal vectors of these planes are ${\bf n}_1=(1,1,1)$ and ${\bf n}_2=(1,-2,3)$. Let $\theta$ be the angle between ${\bf n}_1$ and ${\bf n}_2$ . Then $$\cos\theta=\frac{{\bf n}_1\cdot{\bf n }_2}{|{\bf n}_1||{\bf n}_2|}=\frac{1(1)+1(-2)+1(3)}{\sqrt{1^2+1^2+1^2}\sqrt{1^2+(-2)^2+3^2}}=\frac{2}{\sqrt{42}}$$ Thus, $$\theta=\cos^{-1}\left(\frac{2}{\sqrt{42}}\right)\approx 72^\circ$$
  2. First, let us find a point on $L$. Set $z=0$. Then we have $x+y=1$ and $x-2y=1$. Solving these two equations simultaneously we find $x=1$ and $y=0$. So $(1,0,0)$ is on the line $L$. Now we need a vector parallel to the line $L$. The cross product $${\bf n}_1\times{\bf n}_2=\begin{vmatrix}{\bf i} & {\bf j} & {\bf k}\\1 & 1 & 1\\1 & -2 & 3\end{vmatrix}=5{\bf i}-2{\bf j}-3{\bf k}$$ is perpendicular to both ${\bf n}_1$ and ${\bf n}_2$, hence it is parallel to $L$. Therefore, the symmetric equations of $L$ are $$\frac{x-1}{5}=\frac{y}{-2}=\frac{z}{-3}$$
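
Both parts of this example can be checked with a few lines of code. The sketch below computes the angle between the normal vectors, the direction ${\bf n}_1\times{\bf n}_2$ of the line of intersection, and verifies that points of the form $(1,0,0)+t\,({\bf n}_1\times{\bf n}_2)$ lie on both planes. The helper names (`dot`, `cross`, `on_both_planes`) are illustrative assumptions.

```python
import math

n1 = (1, 1, 1)    # normal vector of x + y + z = 1
n2 = (1, -2, 3)   # normal vector of x - 2y + 3z = 1


def dot(u, v):
    """Dot product of two vectors given as tuples."""
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    """Cross product of two vectors in 3-space."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def on_both_planes(p):
    """True if p satisfies both plane equations exactly."""
    return dot(n1, p) == 1 and dot(n2, p) == 1


cos_theta = dot(n1, n2) / (math.sqrt(dot(n1, n1)) * math.sqrt(dot(n2, n2)))
angle_deg = math.degrees(math.acos(cos_theta))
direction = cross(n1, n2)

print(angle_deg)   # roughly 72 degrees
print(direction)
print(all(on_both_planes(tuple(p + t * d for p, d in zip((1, 0, 0), direction)))
          for t in (-2, 0, 1, 3)))
```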

Let us find the distance $D$ from a point $Q(x_1,y_1,z_1)$ to the plane $ax+by+cz+d=0$. See Figure 5.

Figure 5. The distance from a point to a plane

Let $P(x_0,y_0,z_0)$ be a point in the plane and let ${\bf b}=\overrightarrow{PQ}=(x_1-x_0,y_1-y_0,z_1-z_0)$. Then from Figure 5, we see that the distance from $Q$ to the plane is given by the scalar projection of ${\bf b}$ onto the normal vector ${\bf n}$: \begin{align*}D&=|\mathrm{comp}_{\bf n}{\bf b}|=\frac{|{\bf b}\cdot{\bf n}|}{|{\bf n}|}\\&=\frac{|a(x_1-x_0)+b(y_1-y_0)+c(z_1-z_0)|}{\sqrt{a^2+b^2+c^2}}\\&=\frac{|ax_1+by_1+cz_1+d|}{\sqrt{a^2+b^2+c^2}}\end{align*} The last expression is obtained by the fact that $ax_0+by_0+cz_0=-d$. Therefore, the distance $D$ from a point $Q(x_1,y_1,z_1)$ to the plane $ax+by+cz+d=0$ is \begin{equation}\label{eq:distance2plane}D=\frac{|ax_1+by_1+cz_1+d|}{\sqrt{a^2+b^2+c^2}}\end{equation}

Example. Find the distance between the parallel planes $10x+2y-2z=5$ and $5x+y-z=1$.

Solution. One can easily see that the two planes are parallel because their respective normal vectors $(10,2,-2)$ and $(5,1,-1)$ are parallel. To find the distance between the planes, first find a point in one plane and then use the formula \eqref{eq:distance2plane} to find its distance to the other plane. Let us find a point in the plane $10x+2y-2z=5$. One can, for instance, use the $x$-intercept, so let $y=z=0$. Then $10x=5$, i.e. $x=\frac{1}{2}$. Writing the second plane as $5x+y-z-1=0$, so that $d=-1$, the distance from $\left(\frac{1}{2},0,0\right)$ to the plane $5x+y-z=1$ is $$D=\frac{\left|5\left(\frac{1}{2}\right)+1(0)-1(0)-1\right|}{\sqrt{5^2+1^2+(-1)^2}}=\frac{\frac{3}{2}}{3\sqrt{3}}=\frac{\sqrt{3}}{6}$$
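
The distance formula \eqref{eq:distance2plane} is easy to turn into a small function. In the sketch below, the plane $5x+y-z=1$ is written as $5x+y-z-1=0$, so $d=-1$; the function name `distance_to_plane` is an illustrative assumption.

```python
import math


def distance_to_plane(point, a, b, c, d):
    """Distance from `point` to the plane a x + b y + c z + d = 0."""
    x1, y1, z1 = point
    return abs(a * x1 + b * y1 + c * z1 + d) / math.sqrt(a * a + b * b + c * c)


# Point (1/2, 0, 0) on 10x + 2y - 2z = 5, measured against 5x + y - z - 1 = 0.
D = distance_to_plane((0.5, 0.0, 0.0), 5, 1, -1, -1)
print(D, math.sqrt(3) / 6)
```

The two printed values should agree up to rounding.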

Examples in this note have been taken from [1].

References.

[1] James Stewart, Calculus: Early Transcendentals, 6th Edition, Thomson Brooks/Cole