## July 24, 2021

### Entropy and Diversity Is Out!

#### Posted by Tom Leinster

My new book, Entropy and Diversity: The Axiomatic Approach, is in the shops!

If you live in a place where browsing the shelves of an academic bookshop is possible, maybe you’ll find it there. If not, you can order the paperback or hardback from CUP. And you can always find it on the arXiv. I posted here when the book went up on the arXiv. It actually appeared in the shops a couple of months ago, but at the time all the bookshops here were closed by law and my feelings of celebration were dampened.

But today someone asked a question on MathOverflow that prompted me to write some stuff about the book and feel good about it again, so I’m going to share a version of that answer here. It was long for MathOverflow, but it’s shortish for a blog post.

There are lots of aspects of the book that I’m not going to write about here. My previous post summarizes the contents, and the book itself is quite discursive. Today, I’m just going to stick to that MathOverflow question (by Aidan Rocke) and what I answered.

Might there be a natural geometric interpretation of the exponential of entropy?

I answered more or less as follows. The direct answer to your literal question is that I don’t know of a compelling geometric interpretation of the exponential of entropy. But the spirit of your question is more open, so I’ll explain (1) a non-geometric interpretation of the exponential of entropy, and (2) a geometric interpretation of the exponential of maximum entropy.

### Diversity as the exponential of entropy

The exponential of entropy (Shannon entropy, or more generally Rényi entropy) has long been used by ecologists to quantify biological diversity. One takes a community with $n$ species and writes $\mathbf{p} = (p_1, \ldots, p_n)$ for their relative abundances, so that $\sum p_i = 1$. Then $D_q(\mathbf{p})$, the exponential of the Rényi entropy of $\mathbf{p}$ of order $q \in [0, \infty]$, is a measure of the diversity of the community, or the “effective number of species” in the community.

Ecologists call $D_q$ the Hill number of order $q$, after the ecologist Mark Hill, who introduced them in 1973 (acknowledging the prior work of Rényi). There is a precise mathematical sense in which the Hill numbers are the only well-behaved measures of diversity, at least if one is modelling an ecological community in this crude way. That’s Theorem 7.4.3 of my book. I won’t talk about that here.

Explicitly, for $q \in [0, \infty]$,

$D_q(\mathbf{p}) = \biggl( \sum_{i:\,p_i \neq 0} p_i^q \biggr)^{1/(1 - q)}$

($q \neq 1, \infty$). The two exceptional cases are defined by taking limits in $q$, which gives

$D_1(\mathbf{p}) = \prod_{i:\, p_i \neq 0} p_i^{-p_i}$

(the exponential of Shannon entropy) and

$D_\infty(\mathbf{p}) = 1/\max_{i:\, p_i \neq 0} p_i.$
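For readers who like to compute, here is a minimal Python sketch of the three formulas above. The function name `hill_number` and the two test distributions are illustrative choices made here, not anything taken from the book.

```python
import math

def hill_number(p, q):
    """Hill number D_q(p) of a probability vector p, for q in [0, inf]."""
    # All sums, products and maxima run over the support {i : p_i != 0}.
    support = [x for x in p if x > 0]
    if q == 1:
        # Limiting case: the exponential of Shannon entropy.
        return math.exp(-sum(x * math.log(x) for x in support))
    if q == math.inf:
        # Limiting case: the reciprocal of the largest abundance.
        return 1 / max(support)
    return sum(x ** q for x in support) ** (1 / (1 - q))

# Illustrative inputs (assumptions, not data from the post).
uniform = [0.25] * 4                 # four equally common species
skewed = [0.7, 0.1, 0.1, 0.1, 0.0]   # one dominant species, one absent

for q in (0, 0.5, 1, 2, math.inf):
    print(q, hill_number(uniform, q), hill_number(skewed, q))
```

For the uniform distribution on four species, every order $q$ should report an effective number of species of $4$ (up to rounding), whereas the skewed distribution gives values that fall as $q$ grows.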

Rather than picking one $q$ to work with, it’s best to consider all of them. So, given an ecological community and its abundance distribution $\mathbf{p}$, we graph $D_q(\mathbf{p})$ against $q$.

This is called the diversity profile of the community, and is quite informative. Different values of the parameter $q$ tell you different things about the community. Specifically, low values of $q$ pay close attention to rare species, and high values of $q$ ignore them.

For example, here’s the diversity profile for the global community of great apes (from Figure 4.3 of my book):

[Figure: diversity profile of the great apes, plotting $D_q(\mathbf{p})$ against $q$]

What does it tell us? At least two things:

• The value at $q = 0$ is $8$, because there are $8$ species of great ape present on Earth. $D_0$ measures only presence or absence, so that a nearly extinct species contributes as much as a common one.

• The graph drops very quickly to $1$ — or rather, imperceptibly more than $1$. This is because 99.9% of ape individuals are of a single species (humans, of course: we “outcompeted” the rest, to put it diplomatically). It’s only the very smallest values of $q$ that are affected by extremely rare species. Non-small $q$s barely notice such rare species, so from their point of view, there is essentially only $1$ species. That’s why $D_q(\mathbf{p}) \approx 1$ for most $q$.
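The shape described in these two points can be reproduced with a small Python sketch. The abundances below are made up for illustration (one species at 99.9% and seven rare species sharing the rest equally). They are not the real great ape data behind Figure 4.3, and `hill_number` is a hypothetical helper name.

```python
import math

def hill_number(p, q):
    """Hill number D_q(p), with sums taken over the support of p."""
    support = [x for x in p if x > 0]
    if q == 1:
        return math.exp(-sum(x * math.log(x) for x in support))
    if q == math.inf:
        return 1 / max(support)
    return sum(x ** q for x in support) ** (1 / (1 - q))

# Made-up abundances (NOT the real great ape data): one species holds 99.9%
# of individuals and seven rare species share the remaining 0.1% equally.
p = [0.999] + [0.001 / 7] * 7

# Sample the diversity profile at a few orders q.
qs = [0, 0.1, 0.25, 0.5, 1, 2, math.inf]
profile = [(q, hill_number(p, q)) for q in qs]

for q, d in profile:
    print(f"q = {q:<5} D_q = {d:.4f}")
```

With these invented numbers the profile starts at $8$ when $q = 0$ and has already fallen to roughly $1.01$ by $q = 1$, which is the same qualitative behaviour as the ape profile.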

### Maximum diversity as a geometric invariant

A major drawback of the Hill numbers is that they pay no attention to how similar or dissimilar the species may be. “Diversity” should depend on the degree of variation between the species, not just their abundances. Christina Cobbold and I found a natural generalization of the Hill numbers that factors this in — similarity-sensitive diversity measures.

I won’t give the definition (see that last link or Chapter 6 of the book), but mathematically, this is basically a definition of the entropy or diversity of a probability distribution on a metric space. (As before, entropy is the log of diversity.) When all the distances are $\infty$, it reduces to the Rényi entropies/Hill numbers.
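For the curious, here is a rough Python sketch of how such a measure can be computed. It assumes the standard form of the similarity-sensitive measures, in which a similarity matrix $Z_{ij} = e^{-d(i, j)}$ is built from the distances and $p_i$ is replaced by $(Z\mathbf{p})_i$ inside the power. Chapter 6 of the book is the authoritative source for the definition, and the name `similarity_diversity` is hypothetical.

```python
import math

def similarity_diversity(p, d, q):
    """Similarity-sensitive diversity of order q (sketch).

    p is a probability vector and d[i][j] is the distance between species
    i and j. Assumes similarities Z[i][j] = exp(-d[i][j]).
    """
    n = len(p)
    # An infinite distance gives similarity exp(-inf) = 0.0.
    Z = [[math.exp(-d[i][j]) for j in range(n)] for i in range(n)]
    # (Zp)_i: how "ordinary" species i is, given everything similar to it.
    Zp = [sum(Z[i][j] * p[j] for j in range(n)) for i in range(n)]
    support = [i for i in range(n) if p[i] > 0]
    if q == 1:
        return math.exp(-sum(p[i] * math.log(Zp[i]) for i in support))
    if q == math.inf:
        return 1 / max(Zp[i] for i in support)
    return sum(p[i] * Zp[i] ** (q - 1) for i in support) ** (1 / (1 - q))

p = [0.5, 0.3, 0.2]
inf = math.inf
# All distinct species infinitely far apart: recovers the Hill numbers.
far_apart = [[0 if i == j else inf for j in range(3)] for i in range(3)]
# Degenerate case with all distances zero: the species are indistinguishable.
identical = [[0] * 3 for _ in range(3)]

print(similarity_diversity(p, far_apart, 2))   # same as the Hill number D_2(p)
print(similarity_diversity(p, identical, 2))   # effectively one species
```

The two extreme cases bracket the behaviour: infinitely distant species give the Hill numbers back, while species at distance zero from each other count as a single species whatever the abundances.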

And there’s some serious geometric content here.

Let’s think about maximum diversity. Given a list of species of known similarities to one another — or mathematically, given a metric space — one can ask what the maximum possible value of the diversity is, maximizing over all possible species distributions $\mathbf{p}$. In other words, what’s the value of

$\sup_{\mathbf{p}} D_q(\mathbf{p}),$

where $D_q$ now denotes the similarity-sensitive (or metric-sensitive) diversity? Diversity is not usually maximized by the uniform distribution (e.g. see Example 6.3.1 in the book), so the question is not trivial.

In principle, the answer depends on $q$. But magically, it doesn’t! Mark Meckes and I proved this. So the maximum diversity

$D_{\text{max}}(X) := \sup_{\mathbf{p}} D_q(\mathbf{p})$

is a well-defined real invariant of finite metric spaces $X$, independent of the choice of $q \in [0, \infty]$.

All this can be extended to compact metric spaces, as Emily Roff and I showed. So every compact metric space has a maximum diversity, which is a nonnegative real number.

What on earth is this invariant? There’s a lot we don’t yet know, but we do know that maximum diversity is closely related to some classical geometric invariants.

For instance, when $X \subseteq \mathbb{R}^n$ is compact,

$\text{Vol}(X) = n! \omega_n \lim_{t \to \infty} \frac{D_{\text{max}}(tX)}{t^n},$

where $\omega_n$ is the volume of the unit $n$-ball and $tX$ is $X$ scaled by a factor of $t$. This is Proposition 9.7 of my paper with Roff and follows from work of Juan Antonio Barceló and Tony Carbery. In short: maximum diversity determines volume.

Another example: Mark Meckes showed that the Minkowski dimension of a compact space $X \subseteq \mathbb{R}^n$ is given by

$\dim_{\text{Mink}}(X) = \lim_{t \to \infty} \frac{\log D_{\text{max}}(tX)}{\log t}$

(Theorem 7.1 here). So, maximum diversity determines Minkowski dimension too.
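As a quick consistency check between the volume and dimension results, suppose additionally that $\text{Vol}(X) > 0$. The volume formula says that $D_{\text{max}}(tX)$ grows like a constant multiple of $t^n$, so taking logarithms gives

$\frac{\log D_{\text{max}}(tX)}{\log t} = n + \frac{\log\bigl(\text{Vol}(X)/(n!\,\omega_n)\bigr) + o(1)}{\log t} \to n \quad (t \to \infty),$

which agrees with the fact that a compact subset of $\mathbb{R}^n$ of positive volume has Minkowski dimension $n$.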

There’s much more to say about the geometric aspects of maximum diversity. Maximum diversity is closely related to another recent invariant of metric spaces, magnitude. Mark and I wrote a survey paper on the more geometric and analytic aspects of magnitude, and you can find more on all this in Chapter 6 of my book.

#### Postscript

Although diversity is closely related to entropy, the diversity viewpoint really opens up new mathematical questions that you don’t see from a purely information-theoretic standpoint. The mathematics of diversity is a rich, fertile and underexplored area, beckoning us to come and investigate.

Posted at July 24, 2021 8:51 PM UTC


### Re: Entropy and Diversity Is Out!

The exponential of

$\sum_{i \in X} p_i \ln p_i$

is

$\prod_{i \in X} {p_i}^{p_i}$

and last week David Jaz-Myers told me a nice categorification of this quantity. Suppose $\pi \colon P \to X$ is any function between sets. Let $P_i$ be the fiber over $i \in X$:

$P_i = \{y \in P: \; \pi(y) = i \}$

We can think of $P$ as a ‘set over $X$’, and in the category of sets over $X$ we have

$End(P) = \prod_{i \in X} {P_i}^{P_i}$

In other words, this is the set of functions from $P$ to itself that preserve every fiber.

David told me that someone had just come out with a paper using this idea to think about entropy in new ways. I can’t remember who that was, but maybe someone can tell us.

Posted by: John Baez on July 25, 2021 5:24 AM

### Re: Entropy and Diversity Is Out!

You may be thinking of Spivak and Hosgood’s new paper.

Coincidentally, after seeing a different paper on information loss associated to a map, namely this one of Fullwood and Parzygnat, I was starting to think about how to interpret finite probability spaces with rational probabilities as given by combinatorial problems of counting points in fibres of a function, and how one should interpret entropy. When I shared my thoughts with my colleague David Butler, he told me he was trying to get such an approach into the intro probability bridging course for people coming to uni with insufficient mathematics background (and then replacing counting points with areas of the rectangles in a histogram, which is essentially what Spivak and Hosgood are doing in a much more highbrow way).

So these ideas are very attractive and natural from many points of view, and should be better known!

Posted by: David Roberts on July 25, 2021 7:41 AM

### Re: Entropy and Diversity Is Out!

Oh, and one tiny reason why I was exploring this is that it helps make sense of the convention for calculating entropy that $0\cdot\log(0)=0$, since one is counting the number of functions $\emptyset\to \emptyset$, of which there is one.

Posted by: David Roberts on July 25, 2021 7:46 AM

### Re: Entropy and Diversity Is Out!

John very nearly wrote:

We can think of $P$ as a ‘set over $X$’, and in the category of sets over $X$ we have

$End_X(P) = \prod_{i \in X} P_i^{P_i}.$

(I’ve added an $X$ subscript to your $End$, to make clear they’re endomorphisms over $X$.)

Right. And we can try to take this further and get our hands on the number

$D(\mathbf{p}) = \prod_{i \in X} p_i^{-p_i}$

where — assuming $P$ and $X$ are finite —

$p_i = \frac{|P_i|}{|P|}.$

An easy observation:

$D(\mathbf{p})^{-|P|} = \frac{|End_X(P)|}{|End(P)|},$

that is,

$D(\mathbf{p})^{-|P|} = \text{probability that a random endo of} \ P\ \text{is an endo over}\ X.$

For example, consider the probability distribution

$\mathbf{p} = (2/10, 5/10, 3/10).$

Think of a 10-element set partitioned into three blocks. There are $10^{10}$ endomorphisms of the set, and $2^2 5^5 3^3$ of them preserve the blocks. The probability that a random endomorphism preserves the blocks is

$\frac{2^2 5^5 3^3}{10^{10}} = D(\mathbf{p})^{-10}.$
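This identity is easy to check by machine. The sketch below computes the probability exactly for the block sizes $(2, 5, 3)$, compares it with $D(\mathbf{p})^{-10}$, and also counts block-preserving endomorphisms by brute force on much smaller sets. All function names are illustrative, invented for this sketch.

```python
from fractions import Fraction
from itertools import product
import math

def prob_endo_preserves_blocks(block_sizes):
    """|End_X(P)| / |End(P)| for a finite set P split into the given blocks."""
    total = sum(block_sizes)
    # A block-preserving endomorphism maps each block into itself.
    over_x = math.prod(k ** k for k in block_sizes)
    return Fraction(over_x, total ** total)

def diversity(block_sizes):
    """D(p) = prod p_i^(-p_i), where p_i = |P_i| / |P|."""
    total = sum(block_sizes)
    return math.prod((k / total) ** (-k / total) for k in block_sizes)

def brute_force_count(block_sizes):
    """Count block-preserving endomorphisms by enumerating all maps P -> P."""
    # label[y] is the index of the block containing element y.
    label = [i for i, k in enumerate(block_sizes) for _ in range(k)]
    n = len(label)
    return sum(
        all(label[f[y]] == label[y] for y in range(n))
        for f in product(range(n), repeat=n)
    )

sizes = [2, 5, 3]
prob = prob_endo_preserves_blocks(sizes)
print(prob, float(prob), diversity(sizes) ** -sum(sizes))
print(brute_force_count([1, 2]), brute_force_count([2, 2]))
```

The brute-force count is only feasible for tiny sets, since it enumerates all $n^n$ functions, but it gives an independent check on the product formula $\prod_i |P_i|^{|P_i|}$.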

Joachim Kock and I spent a little bit of time trying to squeeze something out of this, but to no avail.

Posted by: Tom Leinster on July 25, 2021 10:53 AM

### Re: Entropy and Diversity Is Out!

In our notation, this quantity $D(\mathbf{p})^{-|P|}$ would be $L(d)^{-A(d)}$, but this isn’t something that popped up when we were writing our paper, and I can’t think of a nice intuition for it off the top of my head…

Posted by: Tim Hosgood on July 25, 2021 5:49 PM

### Re: Entropy and Diversity Is Out!

Here’s a blog article explaining the new paper I was talking about:

Posted by: John Baez on July 25, 2021 4:12 PM

### Re: Entropy and Diversity Is Out!

I see that Tim and David use the term “length” to mean the diversity of order $1$, i.e. the exponential of Shannon entropy. They say in the abstract of their paper that “classically” it has also been called the “perplexity”, although they don’t seem to give a reference for that.

I’m not surprised to see this quantity going by many names, given (1) the extremely varied contexts in which entropy itself arises, and (2) the fact that the exponential of entropy is often a more natural quantity than entropy itself.

The name “perplexity” reminds me of expected surprise, although that’s a term for one kind of entropy rather than diversity (i.e. a logarithmic thing, not an exponential thing).

Posted by: Tom Leinster on July 25, 2021 5:06 PM

### Re: Entropy and Diversity Is Out!

David and I searched for a name for this quantity for a while. Since this is far from my usual area, I had no idea, and could only think of looking on the Wikipedia page for Shannon entropy, where it didn’t seem to be mentioned. I can’t remember exactly, but I think David mentioned to me that he’d asked somebody about it and they’d mentioned the term “perplexity”, but I do admit to being rather lax with finding any citation to back this up!

(As for the term “length”, this is simply to fit in with the rest of the analogy, since we already have “width” and “area”, and these terms end up corresponding to the usual definitions for a certain rectangle that you can draw.)

Posted by: Tim Hosgood on July 25, 2021 5:40 PM

### Re: Entropy and Diversity Is Out!

Just in case it sounds like I was telling you off, I wasn’t! It’s very hard to look up the name of something when you don’t already know its name (and if you did, you wouldn’t need to look it up). I experienced that myself a few years ago when I stumbled across Fermat quotients without knowing what they were called. Despite my best web-searching efforts, it was weeks or maybe months before I discovered their name.

And many of these entropic quantities have an absolute ton of names. For instance, the introduction to Chapter 3 of my book lists fourteen names for relative entropy.

I vaguely remember that the exponential of entropy has also been called “extropy”.

Posted by: Tom Leinster on July 25, 2021 6:05 PM

### What’s in a name? That which we call an entropy…

Several years ago, a colleague and I were looking into axiomatic characterizations of Shannon entropy and how they might be abstracted away to provide a notion of “information” in contexts where a probability distribution has not been defined. It was pure luck that we learned that the conditions we’d arrived at defined something called the rank function of a polymatroid.

Posted by: Blake Stacey on July 25, 2021 9:20 PM

### Re: Entropy and Diversity Is Out!

The difficulty of finding out the names for mathematical concepts is a peculiar phenomenon in an age where, in general, it’s so incredibly easy to find abundant information on even the most obscure topics.

In the phase when I was looking into Fermat quotients but didn’t know what they were called, I was pretty sure they had to have a well-known name. But that confidence didn’t help me find out what that name was. From the moment I did discover the name (I forget how), I had access to more information about them than I needed or will ever need. It’s a real step change.

Posted by: Tom Leinster on July 25, 2021 10:28 PM

### Re: Entropy and Diversity Is Out!

These sorts of problems are exactly the sorts of things that Valeria et al are hoping to solve! https://topos.site/blog/2021/07/introducing-the-mathfoldr-project/

Posted by: Tim Hosgood on July 26, 2021 12:37 AM

### Re: Entropy and Diversity Is Out!

This is a long shot: could

MR2400035 (2009b:11007) Jesse Elliott, Ring structures on groups of arithmetic functions, J. Number Theory 128 (2008) 709 – 730

be useful for this?

Posted by: jackjohnson on July 26, 2021 5:16 PM

I think this is a fancy way of interpreting Boltzmann’s entropy formula (in which apart from a constant, entropy is defined as the logarithm of the number of microstates that correspond to a macroscopic state).

Posted by: Steve Huntsman on July 26, 2021 2:32 PM

### Re:

Thanks for reminding me of that, Steve! That’s a great example to keep in mind.

Posted by: David Roberts on July 27, 2021 12:54 AM
