Metric Spaces
Posted by John Baez
guest post by Tom Leinster
While the world goes wild for categorification (and the part that doesn’t, should!), I seem to be fixated on decategorification — or counting as it’s popularly known. And right now I’m obsessed with counting metric spaces.
We’ve talked about how you can count a category. There, counting went under other names still, ‘Euler characteristic’ and ‘cardinality’. With your mind on autopilot, you can extend this definition to enriched categories. And a metric space is a special kind of enriched category! So we obtain a definition of the cardinality of a metric space.
I’ve known for a while that this can be done, but it’s only in the last few weeks that I’ve started to see the point. Below, I explain where this idea leads: to the geometry of convex sets, to fractal dimension, and to how the world looks when you take off your glasses.
This is also a cry for help! It seems that there’s a really nice story to be told, but I’m frustratingly stuck on a couple of elementary lemmas. They’re probably the sort of thing that a bright schoolkid could settle — but they’ve defeated me.
Background: the cardinality of a category
Here I’ll whizz through a definition that some of you already know… but if you don’t know it, you don’t need to. It just provides some extra motivation for the definition of the cardinality of a metric space, which I’ll define directly anyway. If you prefer, skip ahead to the definition (‘Let $A$ be a finite metric space…’) a screen or so below.
Let $\mathbf{A}$ be a finite category (finitely many objects, finitely many arrows). Put the objects in some order: $a_1, \ldots, a_n$. Let $Z$ be the $n \times n$ matrix whose $(i, j)$-entry is $|Hom(a_i, a_j)|$, the cardinality of the set of arrows from $a_i$ to $a_j$. Assuming that $Z$ is invertible, the cardinality of the category $\mathbf{A}$ is the sum of all $n^2$ entries of $Z^{-1}$. (Some of you know that you can sometimes get away without assuming $Z$ to be invertible, but never mind.)
It’s easy to adapt this definition to enriched categories $\mathbf{A}$. For instance, suppose that $\mathbf{A}$ is a category enriched in finite-dimensional vector spaces (and with only finitely many objects): then simply change ‘$|Hom(a_i, a_j)|$’ to ‘$dim(Hom(a_i, a_j))$’.
In general, suppose that $\mathbf{A}$ is enriched in some monoidal category $V$, and that we have some notion of the cardinality of objects of $V$. Mindlessly parroting the definition above, we can write down the definition of the cardinality of a $V$-enriched category with only finitely many objects. For the definition to behave sensibly, we probably want $|X \otimes Y| = |X| \cdot |Y|$ and $|I| = 1$, where $X, Y \in V$, vertical bars denote cardinality, and $I$ is the unit object of $V$.
And… metric spaces are enriched categories! This was pointed out by Lawvere. Take $[0, \infty]$, the set of non-negative reals with $\infty$ adjoined. It’s an ordered set under $\geq$, and so becomes a category — but that’s with the reverse of the usual ordering, so that there’s one arrow $x \to y$ if $x \geq y$ and none otherwise. Indeed, it’s a monoidal category, with tensor product $+$ and unit $0$. A $V$-enriched category is a set $A$ of ‘points’ together with, for each $a, b \in A$, an element $Hom(a, b) = d(a, b) \in [0, \infty]$, satisfying some axioms. A few moments’ reflection should convince you that any metric space is a $V$-enriched category.
(In fact, you don’t need all the classical metric space axioms to show this, and Lawvere argues persuasively that the classical definition is wrong, but I won’t go into that.)
So it looks as if we can talk about the cardinality of a finite metric space. The only missing ingredient is a notion of the ‘cardinality’ of an element of $[0, \infty]$. In other words, we need a way to assign a number $|x|$ to every $x \in [0, \infty]$! That’s not quite as trivial as it looks — I said that cardinality should turn tensor product into multiplication, which here means that $|x + y| = |x| \cdot |y|$ and $|0| = 1$. Assuming that cardinality takes values in the real numbers and is continuous, the only possibility is $|x| = \alpha^x$ for some constant $\alpha \geq 0$. Later I’ll try to convince you of what value I think $\alpha$ should take. If you can guess already, I want to know how!
Mr Motivator has been and gone; here’s the definition. In it, $\alpha \geq 0$ is a constant.
Let $A$ be a finite metric space, with points $a_1, \ldots, a_n$ and metric $d$. Write $Z$ for the $n\times n$ matrix whose $(i, j)$-entry is $\alpha^{d(a_i, a_j)}$. The cardinality $|A|$ of $A$ is the sum of all $n^2$ entries of $Z^{-1}$.
I’ll assume from now until nearly the end that $Z$ is always invertible — though this is a question that has cost me much sleep and Simon Willerton much computing time.
Example Let’s run through the case of a two-point space $A$. It has points $a_1$ and $a_2$, distance $d(a_1, a_2) = d$ apart. So $Z = \left( \begin{matrix} 1 &\alpha^d \\ \alpha^d &1 \end{matrix} \right),$ and a short calculation shows that the sum $|A|$ of the entries of $Z^{-1}$ is $2/(1 + \alpha^d)$. Suppose that $0 \,<\, \alpha \,<\, 1$. Then $|A|$ is ‘how many points there appear to be to someone with bad eyesight’. As $d$ increases from $0$ to $\infty$, $|A|$ increases from $1$ to $2$. If $d$ is small then $|A|$ is close to $1$: it looks as if there’s only one point. If $d$ is large then $|A|$ is nearly $2$: the points are far enough apart that you can see that there really are two of them.
This is the first piece of evidence that we should take $\alpha$ to be between $0$ and $1$. Taking different values of $\alpha$ in this range amounts to rescaling the metric by a constant factor, so is not tremendously significant.
(Probe your psyche with the following question: what does the formula $2/(1 + \alpha^d)$ remind you of? For some people it’s Fermi–Dirac statistics; for some it’s Todd classes; for others still it’s the exponential generating function for the Bernoulli numbers.)
For many months I could make nothing of this. But now I can! Read on…
Background: geometric measure theory
There’s a subject variously known as geometric measure theory, geometric probability, integral geometry, and (as a Rotary joke) continuous combinatorics. It contains, for instance, a fantastic solution to the Buffon needle problem. And it also has good things to say about cardinality.
Let’s start with a simple observation. Steve Schanuel has a paper ‘What is the length of a potato?’ in which he points out that a closed interval of length 1 centimetre is not very good as a ruler: if you put two of them end-to-end then you don’t quite get an interval of 2cm — there’s a point in the middle left over. A half-open interval would be better. Thus, the measure of the closed interval isn’t really $1 cm$: it’s $1 cm + 1 point$, or $1 cm^1 + 1 cm^0$, or simply $1 cm + 1$. Similarly, a closed interval of length $a cm$ has length $a cm + 1$.
What about a closed rectangle, say $[0, a] \times [0, b]$? From a formal point of view, its measure ought to be $(a cm + 1) \times (b cm + 1) = a b cm^2 + (a + b) cm + 1.$ You can justify this formula by asking how much paint you’d need to cover the rectangle. Certainly you need to paint the $a b cm^2$ in the middle. You also need to paint the boundary, which you might think would require $(2a + 2b) cm$, but as you’re only painting the inside of the boundary, it’s half of that. Similarly, you have to paint $90^\circ/360^\circ = 1/4$ of each of the four corner points, which requires $4 \times (1/4) cm^0 = 1$ unit of paint.
In the same fashion, you can calculate the measure of a right-angled triangle, using the fact that two of them stuck together make a rectangle, and you can play the game in higher dimensions, calculating the measure of cubes and so on. Schanuel explains how to do it for a spherical potato.
A big result in this field is Hadwiger’s Theorem, nicely explained in Klain and Rota’s book Introduction to Geometric Probability. Fix a natural number $n$. A subset of $\mathbb{R}^n$ is polyconvex if it can be expressed as a finite union of compact convex sets. Let $V_n$ be the set of ‘measures’ on the polyconvex subsets of $\mathbb{R}^n$. Here I don’t mean measures in the usual sense. A ‘measure’ in this sense assigns a real number $|A|$ to every polyconvex $A \subset \mathbb{R}^n$, and is finitely additive: $|A \cup B| = |A| + |B| - |A \cap B|.$ The other axioms for a measure are simple enough, and you can read them here.
Now, $V_n$ is a real vector space in an obvious way. If we asked for infinite additivity, it would be one-dimensional, consisting of the multiples of Lebesgue measure. Hadwiger’s Theorem says that, in fact, $\dim(V_n) = n + 1$. Specifically, $V_n$ has a basis $|\,\,|_0, |\,\,|_1, ..., |\,\,|_n$ where the measure $|\,\,|_d$ is $d$-dimensional: writing $t A$ to mean a polyconvex set $A$ scaled up by a factor of $t$, it satisfies $|t A|_d = t^d |A|_d$. For instance, $Euler characteristic, perimeter, area$ is a basis for $V_2$. Give or take a constant factor, these are the coefficients in the formula above for the measure of a rectangle. So Hadwiger’s Theorem goes some way towards formalizing ‘potato theory’.
The point
Here’s where it all comes together. The idea is to define the cardinality of infinite metric spaces by approximating them by finite ones. This turns out to produce many of the invariants of geometric measure theory.
For example, take an interval $[0, a]$ of length $a$. Let $(A_k)_{k \in \mathbb{N}}$ be a sequence of finite subsets of $[0, a]$ that converges to $[0, a]$ in an obvious sense:
for all $\varepsilon \,>\, 0$ there exists $K \in \mathbb{N}$ such that for all $k \geq K$, the $\varepsilon$-balls centred on the elements of $A_k$ cover $[0, a]$.
Theorem The sequence $(|A_k|)_{k \in \mathbb{N}}$ of cardinalities converges to $a + 1$.
This is just like Schanuel’s observation that a closed interval of length $a cm$ should have measure $a cm + 1$!
Recall, however, that the definition of the cardinality of a metric space depends on the choice of a constant $\alpha \geq 0$. To make the theorem true, we have to take $\alpha = e^{-2}$; otherwise, the sequence of cardinalities converges to $C a + 1$ for some positive constant $C \neq 1$. So from now on, $\alpha = e^{-2}$.
Assuming that every metric space has cardinality, lots of good things follow. Using the fact that every compact metric space is a limit (in the Gromov–Hausdorff metric) of a sequence of finite metric spaces, it can be shown that every compact metric space has a well-defined cardinality. Products behave well with respect to cardinality, so the rectangle $[0, a] \times [0, b]$ has cardinality $a b + a + b + 1$, and so on.
The Ancient Greeks knew that ratios are more significant than lengths. Lengths depend on the unit of measurement, but ratios don’t. So when we have a metric space $A$, it’s often worth considering the whole family $(t A)_{0 < t < \infty}$ of spaces, where $t A$ means $A$ with the metric multiplied by a factor of $t$. With this in mind, let’s define for each space $A$ a function $\begin{matrix} \chi_A: &(0, \infty) &\to &\mathbb{R} \\ &t &\mapsto &|t A|. \end{matrix}$ I’ll call $\chi_A$ the cardinality function of $A$. It tells us the cardinality not only of $A$, but also of all its brothers and sisters $t A$.
Examples Let $A$ be a closed interval of length $a$. Then $\chi_A(t) = a t + 1.$ Let $A$ be a closed $a \times b$ rectangle. Then $\chi_A(t) = a b t^2 + (a + b) t + 1,$ exactly the formula from the section on geometric measure theory, but with ‘$t$’ instead of ‘cm’.
Conjecture Let $A$ be a convex subset of $\mathbb{R}^n$. Then $\chi_A(t) = |A|_n t^n + \, \cdots \, + |A|_1 t + |A|_0$ where the $|\,\,|_d$’s are the measures of Hadwiger’s Theorem, suitably normalized.
The conjecture says in particular that $\chi_A$ is a polynomial and that its degree is the dimension of $A$.
We can also attempt to calculate the cardinality functions of some self-similar spaces. (For this I’ll assume some plausible-seeming but unproved statements on how cardinality behaves with respect to colimits.) Taking the hoariest example, let $A$ be Sierpiński’s gasket. Since $t A$ is the union of $3$ copies of $(t/2)A$, overlapping at $3$ points, we’d expect $\chi_A(t) = 3\chi_A(t/2) - 3.$ Solving this gives $\chi_A(t) = b t^D + 3/2$ where $D = (\log 3)/(\log 2)$ and $b$ is some other constant. Although this isn’t a polynomial, its ‘degree’ in some analytic sense is $D$. And $D$ is the Hausdorff dimension of $A$! So just as in the conjecture, the degree of $\chi_A$ is the dimension of $A$.
In all these examples, the constant term is the Euler characteristic. The higher coefficients depend on the metric, but the constant term is purely topological.
In summary, the story seems to go like this. The notion of the cardinality of a category leads to the notion of the cardinality of a finite metric space, which leads to the notion of the cardinality of a compact metric space, which encompasses the most important invariants of geometric measure theory: all of the measures on real polyconvex sets, Hausdorff dimension, and maybe more.
Cry for help
All of the above depends on some elementary lemmas that I’ve been unable to prove. This has been driving me up the wall! Kind reader: now that you’ve seen where this story seems to lead, perhaps you will be moved to help.
The main unproved assumption is that every metric space has cardinality. It’s true for metric spaces with $\leq 3$ points, and Simon has tested millions of other spaces without finding a counterexample. But I can’t prove or disprove it!
I’ll state the problem in a completely elementary form, and I’ll try to give you the psychological advantage by calling it an exercise:
Exercise 1 Say that a square matrix $Z$ is metric if it is symmetric, its diagonal entries are all $1$, its off-diagonal entries are all in $[0, 1)$, and $Z_{i j} Z_{j k} \leq Z_{i k}$ for all $i, j, k$. Prove, or disprove, that every metric matrix is invertible.
Thinking of the cardinality of a finite space as how many points there appear to be to someone with bad eyesight, you’d expect spaces with points spaced more widely apart to have greater cardinality. This suggests the following.
Exercise 2 In this exercise, assume that the metric spaces involved have cardinality. Let $A$ be a finite metric space.
- Let $\tilde{A}$ be another metric space with the same set of points as $A$ and with $d_{\tilde{A}}(a, b) \geq d_A(a, b)$ for all $a, b$. Show that $|\tilde{A}| \geq |A|$.
- Show that $|t A| \geq |A|$ for all $t \geq 1$. (A special case of the previous part.)
- Show that if $A$ has $n$ points then $1 \leq |A| \leq n$.
These exercises are conjectures! It’s not obvious, but they’re closely related: the statements of Exercise 2 very nearly imply the statement of Exercise 1.
Simon’s tests suggest that the symmetry assumption might not be necessary to prove the conjecture in Exercise 1 or to prove that $|A| \leq n$ in the last part of Exercise 2. He has a potential non-symmetric counterexample to the conjecture that $|A| \geq 1$, but isn’t sure if it’s due to a rounding error or not. I’m most concerned to settle these conjectures in the symmetric case: metric spaces in the completely classical sense.
Re: Metric Spaces
Amazing post, Tom! I hope someone does those ‘exercises’!
Tom wrote:
For what it’s worth, Alain Connes makes some cryptic remarks about Bernoulli numbers in the book Triangle of Thoughts, which suggest that all three of these things are closely linked — except maybe Bose–Einstein statistics rather than Fermi–Dirac!
The first remark goes like this: “Planck had been interested in the black body problem since 1894. Living in Berlin, he was familiar with the experiments by Rubens and Kurlbaum. In the fall of 1900, he had the amazing intuition of a function of a single variable that fit beautifully with the two types of experimental results, and reconciled Lord Rayleigh’s law with Wien’s displacement law. This function is not at all simple, and even from a purely mathematical view it contains an enormous number of hidden treasures, such as Bernoulli numbers…”
In the second remark, Marcel Schutzenberger says, “The origin of Bernoulli numbers lies in the symmetric functions of integers. They play a role in the sum of the $k$th powers of integers from $1$ to $n$,” and Connes replies “These numbers appear in the asymptotics of $x/(1-exp(-x))$, almost by definition.”
There are two more such remarks, which you can find by going to the Amazon link above and searching under ‘Bernoulli’. (I don’t usually link to sites that sell books, but this time I’ll make an exception.)
I’ve been trying to understand this stuff more deeply for years. Here are some comments designed to explain Connes’ remarks to people who know little of the history of physics, or indeed physics.
Suppose you have a system which can be in states labelled $0, 1, 2, \dots$, and the probability of being in the $n$th state is proportional to $exp(-n x)$ for some parameter $x$. Then the actual probability of being in the $n$th state is $exp(-n x)/Z$, where the normalizing factor $Z$ is
$Z = \sum_{n = 0}^\infty exp(-n x) = \frac{1}{1 - exp(-x)}$
To physicists this is the ‘partition function of the harmonic oscillator’, deeply intertwined with ‘Bose–Einstein statistics’. The similar function for Fermi–Dirac statistics is $1 + exp(-x)$.
If you try to Taylor expand this function
$\frac{1}{1 - exp(-x)}$
you run into a minor problem: it blows up at $x = 0$. You can just do a Laurent expansion instead — but the easiest way to do that is do a Taylor expansion of
$\frac{x}{1 - exp(-x)}$
and then divide by $x$. This Taylor expansion turns out to be
$\frac{x}{1 - exp(-x)} = \sum_{n = 0}^\infty B_n \frac{x^n}{n!}$
where the coefficients $B_n$ are called the Bernoulli numbers.
The fun then lies in finding ways to compute and understand the Bernoulli numbers. I wrote a lengthy annotated problem set about this which contains my best insights so far about what’s ‘really going on’; you can also see answers by Jeff Morton. This problem set explains why Bernoulli numbers show up when you sum
$1^k + \cdots + n^k$
In a nutshell, they’re deeply linked to the discrete analogue of integration — obtained as the inverse of the discrete analogue of differentiation.
There’s still a lot that remains mysterious to me. I don’t really know why Bernoulli numbers show up in Kervaire and Milnor’s formula for the number of smooth structures on the $(4n-1)$-sphere.
More importantly, I don’t really know why the function $x/(1 - exp(-x))$ is so crucial in the theory of Todd classes, and whether this is related to the partition function of the harmonic oscillator somehow. I think it is…
Anyway, it’s quite possible that none of this will help you at all. Maybe you’re secretly dealing with Fermi–Dirac statistics and dividing something by the partition function $1 + exp(-x)$ — but if so, I sure don’t see why!