Mercator: A Connection with Torsion


In most presentations of Riemannian geometry, e.g. O’Neill (1983) and Wikipedia, the fundamental theorem of Riemannian geometry (“the miracle of Riemannian geometry”) is given: that for any semi-Riemannian manifold there is a unique torsion-free metric connection. I assume partly because of this and partly because the major application of Riemannian geometry is General Relativity, connections with torsion are given little if any attention.

It turns out we are all very familiar with a connection with torsion: the Mercator projection. Some mathematical physics texts, e.g. Nakahara (2003), allude to this but leave the details to the reader. Moreover, this connection respects the metric induced from Euclidean space.

We use SageManifolds to assist with the calculations. We hint at how this might be done more slickly in Haskell.

A Cartographic Aside

%matplotlib inline
/Applications/SageMath/local/lib/python2.7/site-packages/traitlets/ DeprecationWarning: A parent of InlineBackend._config_changed has adopted the new @observe(change) API
  clsname, change_or_name), DeprecationWarning)
import matplotlib
import numpy as np
import matplotlib.pyplot as plt
import cartopy
import as ccrs
from cartopy.mpl.ticker import LongitudeFormatter, LatitudeFormatter
plt.figure(figsize=(8, 8))

ax = plt.axes(




We can see Greenland looks much broader at the North than in the middle. But if we use a polar projection (below) then we see this is not the case. Why then is the Mercator projection used in preference to e.g. the polar projection or the once controversial Gall-Peters – see here for more on map projections.

plt.figure(figsize=(8, 8))

bx = plt.axes(

bx.set_extent([-180, 180, 90, 50], ccrs.PlateCarree())





This is written as an Jupyter notebook. In theory, it should be possible to run it assuming you have installed at least sage and Haskell. To publish it, I used

jupyter-nbconvert --to markdown Mercator.ipynb
pandoc -s -t markdown+lhs -o Mercator.lhs \
       --filter pandoc-citeproc --bibliography DiffGeom.bib
BlogLiteratelyD --wplatex Mercator.lhs > Mercator.html

Not brilliant but good enough.

Some commands to jupyter to display things nicely.

%display latex
viewer3D = 'tachyon'

Warming Up With SageManifolds

Let us try a simple exercise: finding the connection coefficients of the Levi-Civita connection for the Euclidean metric on \mathbb{R}^2 in polar co-ordinates.

Define the manifold.

N = Manifold(2, 'N',r'\mathcal{N}', start_index=1)

Define a chart and frame with Cartesian co-ordinates.

ChartCartesianN.<x,y> = N.chart()
FrameCartesianN = ChartCartesianN.frame()

Define a chart and frame with polar co-ordinates.

ChartPolarN.<r,theta> = N.chart()
FramePolarN = ChartPolarN.frame()

The standard transformation from Cartesian to polar co-ordinates.

cartesianToPolar = ChartCartesianN.transition_map(ChartPolarN, (sqrt(x^2 + y^2), arctan(y/x)))
Change of coordinates from Chart (N, (x, y)) to Chart (N, (r, theta))

\displaystyle       \left\{\begin{array}{lcl} r & = & \sqrt{x^{2} + y^{2}} \\ \theta & = & \arctan\left(\frac{y}{x}\right) \end{array}\right.

cartesianToPolar.set_inverse(r * cos(theta), r * sin(theta))
Check of the inverse coordinate transformation:
   x == x
   y == y
   r == abs(r)
   theta == arctan(sin(theta)/cos(theta))

Now we define the metric to make the manifold Euclidean.

g_e = N.metric('g_e')
g_e[1,1], g_e[2,2] = 1, 1

We can display this in Cartesian co-ordinates.


\displaystyle       g_e = \mathrm{d} x\otimes \mathrm{d} x+\mathrm{d} y\otimes \mathrm{d} y

And we can display it in polar co-ordinates


\displaystyle       g_e = \mathrm{d} r\otimes \mathrm{d} r + \left( x^{2} + y^{2} \right) \mathrm{d} \theta\otimes \mathrm{d} \theta

Next let us compute the Levi-Civita connection from this metric.

nab_e = g_e.connection()

\displaystyle       \nabla_{g_e}

If we use Cartesian co-ordinates, we expect that \Gamma^k_{ij} = 0, \forall i,j,k. Only non-zero entries get printed.


Just to be sure, we can print out all the entries.


\displaystyle       \left[\left[\left[0, 0\right], \left[0, 0\right]\right], \left[\left[0, 0\right], \left[0, 0\right]\right]\right]

In polar co-ordinates, we get


\displaystyle       \begin{array}{lcl} \Gamma_{ \phantom{\, r } \, \theta \, \theta }^{ \, r \phantom{\, \theta } \phantom{\, \theta } } & = & -\sqrt{x^{2} + y^{2}} \\ \Gamma_{ \phantom{\, \theta } \, r \, \theta }^{ \, \theta \phantom{\, r } \phantom{\, \theta } } & = & \frac{1}{\sqrt{x^{2} + y^{2}}} \\ \Gamma_{ \phantom{\, \theta } \, \theta \, r }^{ \, \theta \phantom{\, \theta } \phantom{\, r } } & = & \frac{1}{\sqrt{x^{2} + y^{2}}} \end{array}

Which we can rew-rewrite as

\displaystyle   \begin{aligned}  \Gamma^r_{\theta,\theta} &= -r \\  \Gamma^\theta_{r,\theta} &= 1/r \\  \Gamma^\theta_{\theta,r} &= 1/r  \end{aligned}

with all other entries being 0.

The Sphere

We define a 2 dimensional manifold. We call it the 2-dimensional (unit) sphere but it we are going to remove a meridian to allow us to define the desired connection with torsion on it.

S2 = Manifold(2, 'S^2', latex_name=r'\mathbb{S}^2', start_index=1)

\displaystyle       \mathbb{S}^2

To start off with we cover the manifold with two charts.

polar.<th,ph> = S2.chart(r'th:(0,pi):\theta ph:(0,2*pi):\phi'); print(latex(polar))

\displaystyle       \left(\mathbb{S}^2,({\theta}, {\phi})\right)

mercator.<xi,ze> = S2.chart(r'xi:(-oo,oo):\xi ze:(0,2*pi):\zeta'); print(latex(mercator))

\displaystyle       \left(\mathbb{S}^2,({\xi}, {\zeta})\right)

We can now check that we have two charts.


\displaystyle       \left[\left(\mathbb{S}^2,({\theta}, {\phi})\right), \left(\mathbb{S}^2,({\xi}, {\zeta})\right)\right]

We can then define co-ordinate frames.

epolar = polar.frame(); print(latex(epolar))

\displaystyle       \left(\mathbb{S}^2 ,\left(\frac{\partial}{\partial {\theta} },\frac{\partial}{\partial {\phi} }\right)\right)

emercator = mercator.frame(); print(latex(emercator))

\displaystyle       \left(\mathbb{S}^2 ,\left(\frac{\partial}{\partial {\xi} },\frac{\partial}{\partial {\zeta} }\right)\right)

And define a transition map and its inverse from one frame to the other checking that they really are inverses.

xy_to_uv = polar.transition_map(mercator, (log(tan(th/2)), ph))
xy_to_uv.set_inverse(2*arctan(exp(xi)), ze)
Check of the inverse coordinate transformation:
   th == 2*arctan(sin(1/2*th)/cos(1/2*th))
   ph == ph
   xi == xi
   ze == ze

We can define the metric which is the pullback of the Euclidean metric on \mathbb{R}^3.

g = S2.metric('g')
g[1,1], g[2,2] = 1, (sin(th))^2

And then calculate the Levi-Civita connection defined by it.

nab_g = g.connection()

\displaystyle       \begin{array}{lcl} \Gamma_{ \phantom{\, {\theta} } \, {\phi} \, {\phi} }^{ \, {\theta} \phantom{\, {\phi} } \phantom{\, {\phi} } } & = & -\cos\left({\theta}\right) \sin\left({\theta}\right) \\ \Gamma_{ \phantom{\, {\phi} } \, {\theta} \, {\phi} }^{ \, {\phi} \phantom{\, {\theta} } \phantom{\, {\phi} } } & = & \frac{\cos\left({\theta}\right)}{\sin\left({\theta}\right)} \\ \Gamma_{ \phantom{\, {\phi} } \, {\phi} \, {\theta} }^{ \, {\phi} \phantom{\, {\phi} } \phantom{\, {\theta} } } & = & \frac{\cos\left({\theta}\right)}{\sin\left({\theta}\right)} \end{array}

We know the geodesics defined by this connection are the great circles.

We can check that this connection respects the metric.


\displaystyle       \nabla_{g} g = 0

And that it has no torsion.


A New Connection

Let us now define an orthonormal frame.

ch_basis = S2.automorphism_field()
ch_basis[1,1], ch_basis[2,2] = 1, 1/sin(th)
e = S2.default_frame().new_frame(ch_basis, 'e')

\displaystyle       \left(\mathbb{S}^2, \left(e_1,e_2\right)\right)

We can calculate the dual 1-forms.

dX = S2.coframes()[2] ; print(latex(dX))

\displaystyle       \left(\mathbb{S}^2, \left(e^1,e^2\right)\right)

print(latex((dX[1], dX[2])))

\displaystyle       \left(e^1, e^2\right)

print(latex((dX[1][:], dX[2][:])))

\displaystyle       \left(\left[1, 0\right], \left[0, \sin\left({\theta}\right)\right]\right)

In this case it is trivial to check that the frame and coframe really are orthonormal but we let sage do it anyway.

print(latex(((dX[1](e[1]).expr(), dX[1](e[2]).expr()), (dX[2](e[1]).expr(), dX[2](e[2]).expr()))))

\displaystyle       \left(\left(1, 0\right), \left(0, 1\right)\right)

Let us define two vectors to be parallel if their angles to a given meridian are the same. For this to be true we must have a connection \nabla with \nabla e_1 = \nabla e_2 = 0.

nab = S2.affine_connection('nabla', latex_name=r'\nabla')

Displaying the connection only gives the non-zero components.


For safety, let us check all the components explicitly.


\displaystyle       \left[\left[\left[0, 0\right], \left[0, 0\right]\right], \left[\left[0, 0\right], \left[0, 0\right]\right]\right]

Of course the components are not non-zero in other frames.


\displaystyle       \begin{array}{lcl} \Gamma_{ \phantom{\, {\phi} } \, {\phi} \, {\theta} }^{ \, {\phi} \phantom{\, {\phi} } \phantom{\, {\theta} } } & = & \frac{\cos\left({\theta}\right)}{\sin\left({\theta}\right)} \end{array}


\displaystyle       \begin{array}{lcl} \Gamma_{ \phantom{\, {\xi} } \, {\xi} \, {\xi} }^{ \, {\xi} \phantom{\, {\xi} } \phantom{\, {\xi} } } & = & 2 \, \cos\left(\frac{1}{2} \, {\theta}\right)^{2} - 1 \\ \Gamma_{ \phantom{\, {\zeta} } \, {\zeta} \, {\xi} }^{ \, {\zeta} \phantom{\, {\zeta} } \phantom{\, {\xi} } } & = & \frac{2 \, \cos\left(\frac{1}{2} \, {\theta}\right) \cos\left({\theta}\right) \sin\left(\frac{1}{2} \, {\theta}\right)}{\sin\left({\theta}\right)} \end{array}

This connection also respects the metric g.


\displaystyle       \nabla g = 0

Thus, since the Levi-Civita connection is unique, it must have torsion.


\displaystyle       \frac{\cos\left({\theta}\right)}{\sin\left({\theta}\right)} e_2\otimes e^1\otimes e^2 -\frac{\cos\left({\theta}\right)}{\sin\left({\theta}\right)} e_2\otimes e^2\otimes e^1

The equations for geodesics are

\displaystyle   \ddot{\gamma}^k + \Gamma_{ \phantom{\, {k} } \, {i} \, {j} }^{ \, {k} \phantom{\, {i} } \phantom{\, {j} } }\dot{\gamma}^i\dot{\gamma}^j = 0

Explicitly for both variables in the polar co-ordinates chart.

\displaystyle   \begin{aligned}  \ddot{\gamma}^\phi & + \frac{\cos\theta}{\sin\theta}\dot{\gamma}^\phi\dot{\gamma}^\theta &= 0 \\  \ddot{\gamma}^\theta & &= 0  \end{aligned}

We can check that \gamma^\phi(t) = \alpha\log\tan t/2 and \gamma^\theta(t) = t are solutions although sage needs a bit of prompting to help it.

t = var('t'); a = var('a')
print(latex(diff(a * log(tan(t/2)),t).simplify_full()))

\displaystyle       \frac{a}{2 \, \cos\left(\frac{1}{2} \, t\right) \sin\left(\frac{1}{2} \, t\right)}

We can simplify this further by recalling the trignometric identity.

print(latex(sin(2 * t).trig_expand()))

\displaystyle       2 \, \cos\left(t\right) \sin\left(t\right)

print(latex(diff (a / sin(t), t)))

\displaystyle       -\frac{a \cos\left(t\right)}{\sin\left(t\right)^{2}}

In the mercator co-ordinates chart this is

\displaystyle   \begin{aligned}  \gamma^\xi(t) &= \alpha\log\tan t/2 \\   \gamma^\zeta(t) &= \log\tan t/2  \end{aligned}

In other words: straight lines.

Reparametersing with s = \alpha\log\tan t/2 we obtain

\displaystyle   \begin{aligned}  \gamma^\phi(s) &= s \\  \gamma^\theta(s) &= 2\arctan e^\frac{s}{\alpha}  \end{aligned}

Let us draw such a curve.

R.<t> = RealLine() ; print(R)
Real number line R
c = S2.curve({polar: [2*atan(exp(-t/10)), t]}, (t, -oo, +oo), name='c')

\displaystyle       \begin{array}{llcl} c:& \mathbb{R} & \longrightarrow & \mathbb{S}^2 \\ & t & \longmapsto & \left({\theta}, {\phi}\right) = \left(2 \, \arctan\left(e^{\left(-\frac{1}{10} \, t\right)}\right), t\right) \\ & t & \longmapsto & \left({\xi}, {\zeta}\right) = \left(-\frac{1}{10} \, t, t\right) \end{array}


\displaystyle       \mathrm{Hom}\left(\mathbb{R},\mathbb{S}^2\right)

c.plot(chart=polar, aspect_ratio=0.1)


It’s not totally clear this is curved so let’s try with another example.

d = S2.curve({polar: [2*atan(exp(-t)), t]}, (t, -oo, +oo), name='d')

\displaystyle       \begin{array}{llcl} d:& \mathbb{R} & \longrightarrow & \mathbb{S}^2 \\ & t & \longmapsto & \left({\theta}, {\phi}\right) = \left(2 \, \arctan\left(e^{\left(-t\right)}\right), t\right) \\ & t & \longmapsto & \left({\xi}, {\zeta}\right) = \left(-t, t\right) \end{array}

d.plot(chart=polar, aspect_ratio=0.2)


Now it’s clear that a straight line is curved in polar co-ordinates.

But of course in Mercator co-ordinates, it is a straight line. This explains its popularity with mariners: if you draw a straight line on your chart and follow that bearing or rhumb line using a compass you will arrive at the end of the straight line. Of course, it is not the shortest path; great circles are but is much easier to navigate.

c.plot(chart=mercator, aspect_ratio=0.1)


d.plot(chart=mercator, aspect_ratio=1.0)


We can draw these curves on the sphere itself not just on its charts.

R3 = Manifold(3, 'R^3', r'\mathbb{R}^3', start_index=1)
cart.<X,Y,Z> = R3.chart(); print(latex(cart))

\displaystyle       \left(\mathbb{R}^3,(X, Y, Z)\right)

Phi = S2.diff_map(R3, {
    (polar, cart): [sin(th) * cos(ph), sin(th) * sin(ph), cos(th)],
    (mercator, cart): [cos(ze) / cosh(xi), sin(ze) / cosh(xi),
                       sinh(xi) / cosh(xi)]
    name='Phi', latex_name=r'\Phi')

We can either plot using polar co-ordinates.

graph_polar = polar.plot(chart=cart, mapping=Phi, nb_values=25, color='blue')
show(graph_polar, viewer=viewer3D)


Or using Mercator co-ordinates. In either case we get the sphere (minus the prime meridian).

graph_mercator = mercator.plot(chart=cart, mapping=Phi, nb_values=25, color='red')
show(graph_mercator, viewer=viewer3D)


We can plot the curve with an angle to the meridian of \pi/2 - \arctan 1/10

graph_c = c.plot(mapping=Phi, max_range=40, plot_points=200, thickness=2)
show(graph_polar + graph_c, viewer=viewer3D)


And we can plot the curve at angle of \pi/4 to the meridian.

graph_d = d.plot(mapping=Phi, max_range=40, plot_points=200, thickness=2, color="green")
show(graph_polar + graph_c + graph_d, viewer=viewer3D)



With automatic differentiation and symbolic numbers, symbolic differentiation is straigtforward in Haskell.

> import Data.Number.Symbolic
> import Numeric.AD
> x = var "x"
> y = var "y"
> test xs = jacobian ((\x -> [x]) . f) xs
>   where
>     f [x, y] = sqrt $ x^2 + y^2
ghci> test [1, 1]

ghci> test [x, y]
  [[x/(2.0*sqrt (x*x+y*y))+x/(2.0*sqrt (x*x+y*y)),y/(2.0*sqrt (x*x+y*y))+y/(2.0*sqrt (x*x+y*y))]]

Anyone wishing to take on the task of producing a Haskell version of sagemanifolds is advised to look here before embarking on the task.

Appendix A: Conformal Equivalence

Agricola and Thier (2004) shows that the geodesics of the Levi-Civita connection of a conformally equivalent metric are the geodesics of a connection with vectorial torsion. Let’s put some but not all the flesh on the bones.

The Koszul formula (see e.g. (O’Neill 1983)) characterizes the Levi-Civita connection \nabla

\displaystyle   \begin{aligned}  2  \langle \nabla_X Y, Z\rangle & = X  \langle Y,Z\rangle + Y  \langle Z,X\rangle - Z  \langle X,Y\rangle \\  &-  \langle X,[Y,Z]\rangle +   \langle Y,[Z,X]\rangle +  \langle Z,[X,Y]\rangle  \end{aligned}

Being more explicit about the metric, this can be re-written as

\displaystyle   \begin{aligned}  2 g(\nabla^g_X Y, Z) & = X g(Y,Z) + Y g(Z,X) - Z g(X,Y) \\  &- g(X,[Y,Z]) +  g(Y,[Z,X]) + g(Z,[X,Y])  \end{aligned}

Let \nabla^h be the Levi-Civita connection for the metric h = e^{2\sigma}g where \sigma \in C^\infty M. Following [Gadea2010] and substituting into the Koszul formula and then applying the product rule

\displaystyle   \begin{aligned}  2 e^{2 \sigma} g(\nabla^h_X Y, Z) & = X  e^{2 \sigma} g(Y,Z) + Y e^{2 \sigma} g(Z,X) - Z e^{2 \sigma} g(X,Y) \\  & + e^{2 \sigma} g([X,Y],Z]) - e^{2 \sigma} g([Y,Z],X) + e^{2 \sigma} g([Z,X],Y) \\  & = 2 e^{2\sigma}[g(\nabla^{g}_X Y, Z) + X\sigma g(Y,Z) + Y\sigma g(Z,X) - Z\sigma g(X,Y)] \\  & = 2 e^{2\sigma}[g(\nabla^{g}_X Y + X\sigma Y + Y\sigma X - g(X,Y) \mathrm{grad}\sigma, Z)]  \end{aligned}

Where as usual the vector field, \mathrm{grad}\phi for \phi \in C^\infty M, is defined via g(\mathrm{grad}\phi, X) = \mathrm{d}\phi(X) = X\phi.

Let’s try an example.

nab_tilde = S2.affine_connection('nabla_t', r'\tilde_{\nabla}')
f = S2.scalar_field(-ln(sin(th)), name='f')
for i in S2.irange():
    for j in S2.irange():
        for k in S2.irange():
            nab_tilde.add_coef()[k,i,j] = \
                nab_g(polar.frame()[i])(polar.frame()[j])(polar.coframe()[k]) + \
                polar.frame()[i](f) * polar.frame()[j](polar.coframe()[k]) + \
                polar.frame()[j](f) * polar.frame()[i](polar.coframe()[k]) + \
                g(polar.frame()[i], polar.frame()[j]) * \
                polar.frame()[1](polar.coframe()[k]) * cos(th) / sin(th)

\displaystyle       \begin{array}{lcl} \Gamma_{ \phantom{\, {\theta} } \, {\theta} \, {\theta} }^{ \, {\theta} \phantom{\, {\theta} } \phantom{\, {\theta} } } & = & -\frac{\cos\left({\theta}\right)}{\sin\left({\theta}\right)} \end{array}

g_tilde = exp(2 * f) * g

\displaystyle       \mathcal{T}^{(0,2)}\left(\mathbb{S}^2\right)


\displaystyle       \left(\begin{array}{rr}      \frac{1}{\sin\left({\theta}\right)^{2}} & 0 \\      0 & 1      \end{array}\right)

nab_g_tilde = g_tilde.connection()

\displaystyle       \begin{array}{lcl} \Gamma_{ \phantom{\, {\theta} } \, {\theta} \, {\theta} }^{ \, {\theta} \phantom{\, {\theta} } \phantom{\, {\theta} } } & = & -\frac{\cos\left({\theta}\right)}{\sin\left({\theta}\right)} \end{array}

It’s not clear (to me at any rate) what the solutions are to the geodesic equations despite the guarantees of Agricola and Thier (2004). But let’s try a different chart.


\displaystyle       \left[\left[\left[0, 0\right], \left[0, 0\right]\right], \left[\left[0, 0\right], \left[0, 0\right]\right]\right]

In this chart, the geodesics are clearly straight lines as we would hope.


Agricola, Ilka, and Christian Thier. 2004. “The geodesics of metric connections with vectorial torsion.” Annals of Global Analysis and Geometry 26 (4): 321–32. doi:10.1023/B:AGAG.0000047509.63818.4f.

Nakahara, M. 2003. “Geometry, Topology and Physics.” Text 822: 173–204. doi:10.1007/978-3-642-14700-5.

O’Neill, B. 1983. Semi-Riemannian Geometry with Applications to Relativity, 103. Pure and Applied Mathematics. Elsevier Science.

Every Manifold is Paracompact


In their paper Betancourt et al. (2014), the authors give a corollary which starts with the phrase “Because the manifold is paracompact”. It wasn’t immediately clear why the manifold was paracompact or indeed what paracompactness meant although it was clearly something like compactness which means that every cover has a finite sub-cover.

It turns out that every manifold is paracompact and that this is intimately related to partitions of unity.

Most of what I have written below is taken from some hand-written anonymous lecture notes I found by chance in the DPMMS library in Cambridge University. To whomever wrote them: thank you very much.

Limbering Up

Let \{U_i : i \in {\mathcal{I}}\} be an open cover of a smooth manifold M. A partition of unity on M, subordinate to the cover \{U_i : i \in {\mathcal{I}}\} is a finite collection of smooth functions

\displaystyle   X_j : M^n \longrightarrow \mathbb{R}_+

where j = 1, 2, \ldots N for some N such that

\displaystyle   \sum_{j = 0}^N X_j(x) = 1 \,\mathrm{for all}\, x \in M

and for each j there exists i(j) \in {\mathcal{I}} such that

\displaystyle   {\mathrm{supp}}{X_j} \subset U_{i(j)}

We don’t yet know partitions of unity exist.

First define

\displaystyle   f(t) \triangleq  \begin{cases}  0            & \text{if } t \leq 0 \\  \exp{(-1/t)} & \text{if } t > 0 \\  \end{cases}

Techniques of classical analysis easily show that f is smooth (t=0 is the only point that might be in doubt and it can be checked from first principles that f^{(n)}(0) = 0 for all n).

Next define

\displaystyle   \begin{aligned}  g(t) &\triangleq \frac{f(t)}{f(t) + f(1 - t)} \\  h(t) &\triangleq g(t + 2)g(2 - t)  \end{aligned}

Finally we can define F: \mathbb{R}^n \rightarrow \mathbb{R} by F(x) = h(\|x\|). This has the properties

  • F(x) = 1 if \|x\| \leq 1
  • 0 \leq F(x) \leq 1
  • F(x) = 0 if \|x\| > 2

Now take a point p \in M centred in a chart (U_p, \phi_p) so that, without loss of generality, B(0,3) \subseteq \phi_p(U_p) (we can always choose r_p so that the open ball B(0,3r_p) \subseteq \phi'_p(U_p) and then define another chart (U_p, \phi_p) with \phi_p(x) = \phi'_p(x)/\|x\|).

Define the images of the open and closed balls of radius 1 and 2 respectively

\displaystyle   \begin{aligned}  V_p &= \phi_p^{-1}(B(0,1)) \\  W_p &= \phi_p^{-1}\big(\overline{B(0,2)}\big) \\  \end{aligned}

and further define bump functions

\displaystyle   \psi_p(y) \triangleq  \begin{cases}  F(\phi_p(y)) & \text{if } y \in U_p\\  0            & \text{otherwise} \\  \end{cases}

Then \psi_p is smooth and its support lies in W_p \subset U_p.

By compactness, the open cover \{V_p : p \in M\} has a finite subcover \{V_{p_1},\ldots,V_{p_K}\}. Now define

\displaystyle   X_j : M^n \longrightarrow \mathbb{R}_+


\displaystyle   X_j(y) = \frac{\psi_{p_j}(y)}{\sum_{i=1}^K \psi_{p_i}(y)}

Then X_j is smooth, {\mathrm{supp}}{X_j} = {\mathrm{supp}}{\psi_{p_j}} \subset U_{p_j} and \sum_{j=1}^K X_j(y) = 1. Thus \{X_j\} is the required partition of unity.


Because M is a manifold, it has a countable basis \{A_i\}_{i \in \mathbb{N}} and for any point p, there must exist A_i \subset V_p with p \in A_i. Choose one of these and call it V_{p_i}. This gives a countable cover of M by such sets.

Now define

\displaystyle   L_1 = W_{p_1} \subset V_{p_1} \cup V_{p_2} \cup \ldots \cup V_{p_{i(2)}}

where, since L_1 is compact, V_{p_2}, \ldots, V_{p_{i(2)}} is a finite subcover.

And further define

\displaystyle   L_n = W_{p_1} \cup W_{p_2} \cup \ldots \cup W_{p_{i(n)}}        \subset        V_{p_1} \cup V_{p_2} \cup \ldots \cup V_{p_{i(n+1)}}

where again, since L_n is compact, V_{p_{i(n)+1}}, \ldots, V_{p_{i(n+1)}} is a finite subcover.

Now define

\displaystyle   \begin{aligned}  K_n &= L_n \setminus {\mathrm{int}}{L_{n-1}} \\  U_n &= {\mathrm{int}}(L_{n+1}) \setminus L_n  \end{aligned}

Then K_n is compact, U_n is open and K_n \subset U_n. Furthermore, \bigcup_{n \in \mathbb{N}} K_n = M and U_n only intersects with U_{n-1} and U_{n+1}.

Given any open cover {\mathcal{O}} of M, each K_n can be covered by a finite number of open sets in U_n contained in some member of {\mathcal{O}}. Thus every point in K_n can be covered by at most a finite number of sets from U_{n-1}, U_n and U_{n+1} and which are contained in some member of {\mathcal{O}}. This is a locally finite refinement of {\mathcal{O}} and which is precisely the definition of paracompactness.

To produce a partition of unity we define bump functions \psi_j as above on this locally finite cover and note that locally finite implies that \sum_j \psi_j is well defined. Again, as above, define

\displaystyle   X_j(y) = \frac{\psi_{j}(y)}{\sum_{i=1} \psi_{i}(y)}

to get the required result.


Betancourt, M. J., Simon Byrne, Samuel Livingstone, and Mark Girolami. 2014. “The Geometric Foundations of Hamiltonian Monte Carlo,” October, 45.

The Lie Derivative


In proposition 58 Chapter 1 in the excellent book O’Neill (1983), the author demonstrates that the Lie derivative of one vector field with respect to another is the same as the Lie bracket (of the two vector fields) although he calls the Lie bracket just bracket and does not define the Lie derivative preferring just to use its definition with giving it a name. The proof relies on a prior result where he shows a co-ordinate system at a point p can be given to a vector field X for which X_p \neq 0 so that X = \frac{\partial}{\partial x_1}.

Here’s a proof seems clearer (to me at any rate) and avoids having to distinguish the case wehere the vector field is zero or non-zero. These notes give a similar proof but, strangely for undergraduate level, elide some of the details.

A Few Definitions

Let \phi: M \longrightarrow N be a smooth mapping and let A be a 0,s tensor with s \geq 0 then define the pullback of A by \phi to be

\displaystyle   \phi^*A(v_1,\ldots,v_s) = A(\mathrm{d}\phi v_1,\ldots,\mathrm{d}\phi v_s)

For a (0,0) tensor f \in {\mathscr{F}}(N) the pullback is defined to be \phi^*(f) = f \circ \phi \in {\mathscr{F}}(M).

Standard manipulations show that \phi^*A is a smooth (covariant) tensor field and that \phi^* is \mathbb{R}-linear and that \phi^*(A\otimes B) = \phi^*A \otimes \phi^*B.

Let F : M \longrightarrow N be a diffeomorphism and Y a vector field on N we define the pullback of this field to be

\displaystyle   (F^*{Y})_x = D(F^{-1})_{F(x)}(Y_{F(x)})

Note that the pullback of a vector field only exists in the case where F is a diffeomorphism; in contradistinction, in the case of pullbacks of purely covariant tensors, the pullback always exists.

For the proof below, we only need the pullback of functions and vector fields; the pullback for (0,s) tensors with s \geq 1 is purely to give a bit of context.

From O’Neill (1983) Chapter 1 Definition 20, let \phi: M \rightarrow N be a smooth mapping. Vector fields X on M and Y on N are Frelated written X \underset{F}{\sim} Y if and only if dF({X}_p) = Y_{Fp}.

The Alternative Proof

By Lemma 21 Chapter 1 of O’Neill (1983), X and Y are F-related if and only if X(f \circ F) = Yf \circ F.

Recalling that dF(X_p)(f) = X_p(F \circ f) and since

\displaystyle   dF_x d(F^{-1})_{Fx}(X_{Fx}) = X_{Fx}

we see that the fields F^*{Y} and Y are F-related: F^*{Y}_x \underset{F}{\sim} Y_{Fx}. Thus we can apply the Lemma.

\displaystyle   (F^*{Y})(f \circ F) = (F^*{Y})(F^*{f}) =  Yf \circ F = F^*(Yf)

Although we don’t need this, we can express the immediately above equivalence in a way similar to the rule for covariant tensors

\displaystyle   (F^*{Y})(f \otimes F) = (F^*{Y})\otimes(F^*{f})

First let’s calculate the Lie derivative of a function f with respect to a vector field X where \phi_t is its flow

\displaystyle   \begin{aligned}  L_X f &\triangleq \lim_{t \rightarrow 0} \frac{\phi_t^*(f) - f}{t} \\        &=          \lim_{t \rightarrow 0} \frac{f \circ \phi_t - f \circ \phi_0}{t} \\        &=          \lim_{t \rightarrow 0} \frac{f \circ \phi (t,x) - f \circ \phi (0, x)}{t} \\        &=          (\phi_x)'_0(f) \\        &=          X_x(f) \\        &=          (Xf)_x  \end{aligned}

Analogously defining the Lie derivative of Y with respect to X

\displaystyle   (L_X Y) \triangleq \frac{(\phi_t^*{Y}) - Y}{t}f

we have

\displaystyle   \begin{aligned}  L_X(Yf) &= \lim_{t \rightarrow 0} \frac{\phi_t^*(Yf) - Yf}{t} \\          &= \lim_{t \rightarrow 0} \frac{(\phi_t^*{Y})(\phi_t^*{f}) - Yf}{t} \\          &= \lim_{t \rightarrow 0}             \frac{(\phi_t^*{Y})(\phi_t^*{f}) - (\phi_t^*{Y})f + (\phi_t^*{Y})f - Yf}{t} \\          &= \lim_{t \rightarrow 0}             \Bigg(             (\phi_t^*{Y})\frac{\phi_t^*{f} - f}{t} +             \frac{(\phi_t^*{Y}) - Y}{t}f             \Bigg) \\          &= Y(L_X f) + (L_X Y)f  \end{aligned}

Since L_X f = Xf we have

\displaystyle   X(Yf) = Y(Xf) + (L_X Y)f


\displaystyle   (L_X Y)f  = Y(Xf) - X(Yf) = [X,Y]f

as required.


O’Neill, B. 1983. Semi-Riemannian Geometry with Applications to Relativity, 103. Pure and Applied Mathematics. Elsevier Science.


On page 19, O’Neill comments that the proof of Lemma 33 is a mild generalization of the proof of proposition 28. I think (2) \iff (3) requires spelling out.

Let \phi : M^m \longrightarrow N^n and let \xi = (y^1,\ldots,y^n) be a co-ordinate system at \phi (p). Let \zeta = (x^1,\ldots,x^m) be a co-ordinate system at p. Then by (2) \frac{\partial (y^i \circ \phi)}{\partial (x^j)} has rank m. Thus by exercise 7 and by re-arranging the co-ordinates if necessary, (y^1 \circ \phi,\ldots,y^m \circ \phi) forms a co-ordinate system for M^m on a neighbourhood \cal{W} of p.

For the reverse, note that since (y^1 \circ \phi,\ldots,y^m \circ \phi) is a co-ordinate system, by exercise 7, \frac{\partial (y^i \circ \phi)}{\partial x^j} has rank m.