## October 23, 2011

### Measuring Diversity

#### Posted by Tom Leinster

Christina Cobbold and I wrote a paper on measuring biological diversity:

Tom Leinster and Christina A. Cobbold,
Measuring diversity: the importance of species similarity.
Ecology, in press (doi:10.1890/10-2402.1).

As the name of the journal suggests, our paper was written for ecologists — but mathematicians should find it pretty accessible too.

While I’m at it, I’ll mention that I’m coordinating a five-week research programme on The Mathematics of Biodiversity at the Centre de Recerca Matemàtica, Barcelona, next summer. It includes a one-week exploratory conference (2–6 July 2012), to which everyone interested is warmly welcome.

In a moment, I’ll start talking about organisms and species. But don’t be fooled: mathematically, none of this is intrinsically about biology. That’s why this post is called “Measuring diversity”, not “Measuring biological diversity”. You could apply it in many other ways, or not apply it at all, as you’ll see.

It’s an example of what Jordan Ellenberg has amusingly called applied pure math. I think that’s a joke in slightly poor taste, because I don’t want to surrender the term “applied math” to those who basically use it to mean “applied differential equations”. Nevertheless, I suspect we’re on the same side.

Long-time patrons of the Café may remember a pair of posts in 2008 on entropy, diversity and cardinality. But those were long posts, a long time ago, and there’s a lot about them that I’d change now. So I’ll start afresh.

Imagine a ‘community’ of organisms — the fish in a lake, the fungi in a forest, or the bacteria on your skin. We divide them into $S$ groups, conventionally called species, though they needn’t be species in the ordinary sense. (The division of organisms into species is somewhat arbitrary, which is a problem, though it’s less of a problem with the approach presented here than with many previous approaches.)

We then record two things about the community. First:

Relative abundances  The relative frequencies, or abundances, of the species form a probability distribution $p = (p_1, \ldots, p_S)$ on $\{1, \ldots, S\}$. Here $p_i$ is the proportion of the total population belonging to species $i$, where ‘proportion’ is measured in any way you think sensible (number of individuals, total mass, etc).

Note that we only record relative abundances, not absolute abundances. As it’s usually used, the word diversity denotes an intensive quantity. If nine-tenths of a forest is destroyed, it might be a terrible thing, but on the (unrealistic) assumption that all the flora and fauna in the forest are distributed homogeneously, it doesn’t actually cause a decrease in biodiversity.

The second thing we record is:

Similarities  The similarity between each pair of species is measured by a real number between $0$ and $1$, with $0$ denoting total dissimilarity and $1$ denoting identical species. Writing the similarity between the $i$th and $j$th species as $Z_{i j}$, this gives an $S \times S$ matrix $Z$ with entries in $[0, 1]$. Our only assumption on $Z$ is that its diagonal entries are all $1$: every species is identical to itself.

There are many approaches to measuring inter-species similarity, of which probably the most familiar is genetic, as in ‘you share 98% of your DNA with a chimpanzee’. Different measures of similarity will produce different measures of diversity.

Sometimes one has, instead of a measure of inter-species similarity (measured on a scale of 0 to 1), a measure of inter-species distance (measured on a scale of 0 to $\infty$). Distances $d_{i j}$ can be converted into similarities $Z_{i j}$ by the transformation $Z_{i j} = e^{-d_{i j}}$, or more generally by $Z_{i j} = e^{-t d_{i j}}$ for some positive scale factor $t$. That’s not the only transformation you can use, but it has some good mathematical properties.
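As a quick sketch, here is that transformation in code. The distance matrix below is invented purely for illustration:

```python
import numpy as np

# Hypothetical distance matrix for three species: symmetric, with
# zeros on the diagonal (every species is at distance 0 from itself).
d = np.array([[0.0, 0.5, 2.0],
              [0.5, 0.0, 1.5],
              [2.0, 1.5, 0.0]])

def similarity_from_distance(d, t=1.0):
    """Convert distances in [0, oo) to similarities in (0, 1]
    via Z_ij = exp(-t * d_ij), for a positive scale factor t."""
    return np.exp(-t * d)

Z = similarity_from_distance(d)
# The diagonal of Z is all 1s, as the definition of Z requires.
```

Larger values of the scale factor $t$ make distinct species look more dissimilar.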

What we have to do now is take this data and turn it into a single number, measuring the diversity of the community. Actually, it’s not going to be quite as simple as that… but let’s take it one step at a time.

The similarities form an $S \times S$ matrix $Z$, and the relative abundances can be regarded as forming an $S$-dimensional column vector $p$. So, we get an $S$-dimensional column vector $Z p$, whose $i$th entry is

$(Z p)_i = \sum_j Z_{i j} p_j.$

This is the expected similarity between an individual of the $i$th species and an individual chosen at random. It therefore measures the ‘ordinariness’, or lack of distinctiveness, of that individual.

The average ordinariness of an individual in the community is, then,

$\sum_i p_i (Z p)_i.$

This is greatest if the community is concentrated into a few very similar species. Economists have used the word concentration for quantities like this. Now, we’re after a measure of diversity, which should be inversely related to concentration. So we could define the diversity of the community as the reciprocal of the concentration:

$1/\sum_i p_i (Z p)_i.$
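Numerically, the computation is a few lines. The abundances and similarities below are invented for illustration:

```python
import numpy as np

# Invented community of three species.
p = np.array([0.5, 0.3, 0.2])         # relative abundances, summing to 1
Z = np.array([[1.0, 0.8, 0.1],        # similarity matrix: 1s on the
              [0.8, 1.0, 0.2],        # diagonal, entries in [0, 1]
              [0.1, 0.2, 1.0]])

ordinariness = Z @ p                   # (Zp)_i for each species i
concentration = p @ ordinariness       # sum_i p_i (Zp)_i
diversity = 1.0 / concentration        # the quadratic diversity

# With the naive similarity matrix Z = I, the same formula gives
# 1 / sum_i p_i^2, which is larger here: ignoring inter-species
# similarity inflates the measured diversity.
naive_diversity = 1.0 / np.sum(p ** 2)
```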

This turns out to be a good measure of diversity. But it’s not the only good one.

Why not? I’ll give two explanations: one mathematical, one ecological.

Mathematically, the point is that when I wrote down the formula for the ‘average’ ordinariness, I neglected the fact that there are many good notions of average. In particular, there are the power means. For $t \in \mathbb{R}$, the power mean of numbers $x_1, \ldots, x_S \geq 0$, weighted by a probability distribution $p_1, \ldots, p_S$, is got by transforming each $x_i$ into $x_i^t$, then forming their ordinary mean weighted by the $p_i$s, then applying the inverse transformation. In other words, it’s

$\Bigl( \sum_i p_i x_i^t \Bigr)^{1/t}.$
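In code, the weighted power mean (for $t \neq 0$) is a one-liner; the familiar arithmetic and harmonic means are the cases $t = 1$ and $t = -1$:

```python
import numpy as np

def power_mean(x, p, t):
    """Power mean of order t of nonnegative numbers x, weighted by a
    probability distribution p.  Valid for t != 0; the limit t -> 0
    gives the weighted geometric mean."""
    x, p = np.asarray(x, float), np.asarray(p, float)
    return np.sum(p * x ** t) ** (1.0 / t)

x, p = [1.0, 4.0], [0.5, 0.5]
# power_mean(x, p, 1)  -> 2.5  (ordinary weighted mean)
# power_mean(x, p, -1) -> 1.6  (harmonic mean)
```

The power mean is increasing in $t$, which is what will make the diversity measures below decreasing in $q$.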

We’ll apply this with $x_i = (Z p)_i$. For reasons I won’t explain, I’ll shift the indexing by putting $t = q - 1$, and I’ll restrict to $q \geq 0$. So, the average ordinariness ‘of order $q$’ is

$\Bigl( \sum_i p_i (Z p)_i^{q - 1} \Bigr)^{1/(q - 1)}.$

This is a measure of concentration. Its reciprocal is

$D_q^Z(p) = \Bigl( \sum_i p_i (Z p)_i^{q - 1} \Bigr)^{1/(1 - q)}.$

And that, by definition, is the diversity of order $q$ of the community. The diversity measure we arrived at above was the case $q = 2$, and is called the quadratic diversity, since it’s the reciprocal of a quadratic form.

The formula for $D_q^Z(p)$ doesn’t make sense for $q = 1$ or $q = \infty$, but you can easily make sense of it by taking limits. Doing this leads to the definitions

$D_1^Z(p) = \prod_i (Z p)_i^{-p_i}$

and

$D_\infty^Z(p) = 1/\max_i (Z p)_i.$

Technical note: in order for everything to be well-defined, you have to take the sums and max to be over only those values of $i$ for which $p_i \gt 0$ (that is, over only the species that are actually present).
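Putting the pieces together, here is a sketch of the whole family of measures, with the $q = 1$ and $q = \infty$ limits handled as special cases (a sketch following the formulas above, not a polished implementation):

```python
import numpy as np

def diversity(q, p, Z):
    """Diversity of order q for relative abundances p and
    similarity matrix Z."""
    p = np.asarray(p, float)
    Zp = np.asarray(Z, float) @ p
    present = p > 0                  # only species actually present
    p, Zp = p[present], Zp[present]
    if q == 1:
        return np.prod(Zp ** (-p))   # limit as q -> 1
    if np.isinf(q):
        return 1.0 / np.max(Zp)      # limit as q -> infinity
    return np.sum(p * Zp ** (q - 1)) ** (1.0 / (1.0 - q))
```

For instance, with $Z = I$ and four equally abundant species, `diversity(q, [0.25]*4, np.eye(4))` returns $4$ for every $q$.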

So, we’ve got not just one measure of diversity, but a one-parameter family of them:

$(D_q)_{q \geq 0}.$

Ecologically, this spectrum of diversity measures corresponds to a spectrum of viewpoints on what diversity is. Consider two bird communities. The first looks like this:

It contains four species, one of which makes up most of the population, and three of which are quite rare. The second community looks like this:

It has only three species, but they’re evenly balanced.

Now, which community is more diverse? It’s a matter of opinion. Or, if you like, it’s a matter of how you interpret the word ‘diverse’. Usually in the mainstream press, and often in scholarly articles too, ‘biodiversity’ is used as a synonym for ‘number of species present’. On this count, the first community is more diverse. But if you’re mostly concerned with the functioning of the whole community, the role of rare species might not be particularly important: maybe your primary concern is that no species is too dominant, and on that score, the second community wins.

Varying the parameter $q$ corresponds to varying your viewpoint. Specifically, $q$ controls how little emphasis you place on rare species. So the graphs of $D_q^Z(p)$ against $q$, for the two communities, might look like this:

The purple curve represents the first community, and the blue curve represents the second. (The exact shapes of the graphs will depend on the similarity matrix $Z$.) For low values of $q$ (emphasizing rare species), the first community looks more diverse than the second. For high values of $q$ (emphasizing common species), it’s the opposite.
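The crossover can be seen numerically. Here is a sketch with two invented communities like those described above, taking $Z$ to be the identity matrix for simplicity:

```python
import numpy as np

def naive_diversity(q, p):
    """Diversity of order q with Z taken to be the identity matrix."""
    p = np.asarray(p, float)
    p = p[p > 0]                     # species actually present
    if q == 1:
        return float(np.exp(-np.sum(p * np.log(p))))
    if np.isinf(q):
        return 1.0 / np.max(p)
    return np.sum(p ** q) ** (1.0 / (1.0 - q))

first  = [0.85, 0.05, 0.05, 0.05]   # four species, one dominant
second = [1/3, 1/3, 1/3]            # three species, evenly balanced

# At q = 0 the measure just counts species: 4 > 3, so the first
# community wins.  At q = infinity only the dominant species
# matters: 1/0.85 < 3, so the second community wins.
```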

It turns out that many diversity measures previously used in ecology are special cases of the ones given above. Also, these measures have excellent mathematical properties. Lots are listed in our paper. Here I’ll give just two.

Naive model  There’s a ‘naive’ model of an ecological community in which distinct species are always assumed to have nothing in common. This is a terribly crude assumption, and makes a community consisting of two species of slug as diverse as a community consisting of a slug and a giraffe.

Nevertheless, this is the model used by most diversity measures to date. It corresponds to taking $Z = I$. When you take $Z = I$ using our measures, the formula for diversity is:

$D_q^I(p) = \begin{cases} \Bigl( \sum_i p_i^q \Bigr)^{1/(1 - q)} & \text{if } q \neq 1, \infty \\ \prod_i p_i^{-p_i} & \text{if } q = 1 \\ 1/\max_i p_i & \text{if } q = \infty. \end{cases}$

These are known in ecology as the Hill numbers, and in mathematics as the exponentials of the Rényi entropies. A lot is known about them. Even more is known about the case $q = 1$, which is the exponential of Shannon entropy.

Effective number  Our measures are effective numbers, which means that a community of $S$ equally abundant, totally dissimilar species is assigned a diversity of $S$. In symbols,

$D_q^I(1/S, \ldots, 1/S) = S,$

for all $q \in [0, \infty]$.
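For $q \neq 1, \infty$, this is a one-line check: with $Z = I$ and $p_i = 1/S$, each $(Z p)_i = 1/S$, so

$D_q^I(1/S, \ldots, 1/S) = \Bigl( \sum_{i=1}^S \frac{1}{S} \Bigl( \frac{1}{S} \Bigr)^{q - 1} \Bigr)^{1/(1 - q)} = \bigl( S^{1 - q} \bigr)^{1/(1 - q)} = S.$

The $q = 1$ and $q = \infty$ cases follow just as quickly from their formulas.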

So if someone tells you ‘this community has diversity 26.2’, and they’re using an effective number measure, that means it’s slightly more diverse than a community of 26 equally abundant, totally dissimilar species. If they come to you a year later saying that its diversity has dropped to 13.1, that means, in a directly comprehensible way, that its diversity has halved. As Mark Hill (of the Hill numbers) put it, effective numbers ‘enable us to speak naturally’.

As far as I’m concerned, this work links together many of my interests involving measures of size. Apart from diversity being an important mathematical concept in itself, it’s related to entropy, power means, and magnitude of metric spaces.

As far as biologists are concerned, there seem to be two main points of interest.

One is the even-handed approach to the spectrum of possible viewpoints — treating all values of $q$ democratically, rather than choosing one and claiming that it’s the ‘best’. This leads to the graphical device of drawing graphs like the one above, in order to compare and contrast communities. I’m surprised that this has generated so much enthusiasm, because these graphs (‘diversity profiles’) have been advocated for a long time, by many different authors. But they also seem to be new to many people.

The other — and the reason behind the title of our paper — is that we’ve built inter-species similarity into the model. Ours isn’t the first diversity measure to do this, but it seems to be the most general. I’d like to explain the practical impact that this can have, but I’m running out of energy now, so I’ll leave that for another day.

Update (9 November 2011) John kindly let me write a version of this post for Azimuth. It’s actually quite different from what I’ve written here. The major thing that it has and this post doesn’t is an illustration of how taking species similarity into account can change your judgement on which of two communities is the more diverse.

Posted at October 23, 2011 11:10 PM UTC


### Re: Measuring Diversity

Thanks, that’s a very clear post.

When writing

As it’s usually used, the word diversity denotes an intensive quantity,

you link to the Wikipedia page on the intensive/extensive property distinction. We really should have such a page at nLab, especially as it is a favourite topic of Lawvere’s. I’ve been trying to grasp it through nForum discussions, e.g., here, and longer ago at the Café.

Does anyone here have a good category theoretic handle on it?

Posted by: David Corfield on October 25, 2011 9:24 AM | Permalink | Reply to this

### Re: Measuring Diversity

Actually, I assumed there’d be an nLab page, looked for one, and was surprised not to find one. (I realize now that the canonical response from a dedicated nLabber would be: “you should have made one!”) What I did find was the nForum discussion that you linked to.

However, the Wikipedia page was a revelation to me. Until I saw it, I’d been under the impression that “intensive” and “extensive” were words of Lawvere’s invention.

Posted by: Tom Leinster on October 25, 2011 11:27 AM | Permalink | Reply to this

### Re: Measuring Diversity

However, the Wikipedia page was a revelation to me. Until I saw it, I’d been under the impression that “intensive” and “extensive” were words of Lawvere’s invention.

Wow! Thanks for communicating that revelation! The Wikipedia page makes so much more sense than any explanation I’ve ever read of Lawvere’s usage. (Not that I have trouble distinguishing between covariant and contravariant functors, but I’ve never understood why both should be called “quantities”, or how I’m supposed to remember which is “extensive” and which is “intensive”, or what those words are supposed to suggest.) Do you understand the relationship between the two usages? Is there a “category of substances”?

Posted by: Mike Shulman on October 25, 2011 2:38 PM | Permalink | Reply to this

### Re: Measuring Diversity

What do we have then? Certainly there’s the thought that

To say that distributions are extensive quantities implies that they transform covariantly. To say that functions are intensive quantities implies that they transform contravariantly. (p. 321)

Can we square that with density as intensive and volume as extensive? Well density is a function from a space (a body) to the reals. I can pull back a density from one space to another, e.g., a subspace.

Volume is something I integrate a function against to provide me with a real number. A delta function at a point is something I can integrate a (smooth) function against to give me a value of the function at that point. I can certainly push forward ‘evaluation at a point’ along a map between spaces. And I can push forward ‘evaluation against a volume’.

There’s a product between an extensive and an intensive quantity which yields another extensive quantity, such as volume and density giving mass. So distributions may be multiplied by smooth functions.

Posted by: David Corfield on October 25, 2011 3:56 PM | Permalink | Reply to this

### Re: Measuring Diversity

density is a function from a space (a body) to the reals. I can pull back a density from one space to another, e.g., a subspace.

Yes, I guess I buy that.

Volume is something I integrate a function against to provide me with a real number.

I don’t think I buy that. It seems to me that “volume” in the sense used in “volume is an extensive quantity” refers to the total volume of a substance. Physicists are used to regarding the “volume form/distribution” assigned to space as a given, so what we are talking about here is the result of integrating that volume form over the spatial region occupied by the substance. I don’t immediately see any sense in which that is covariant or contravariant.

As for a volume form, if it is a volume-form-valued distribution which one integrates against smooth real-valued-functions, then yes, I guess it would be covariant. But I don’t see any particular reason to privilege that choice over considering the volume form to be a smooth volume-form-valued function and integrating it against a real-valued distribution, in which case the volume form would be contravariant and the other object covariant.

Posted by: Mike Shulman on October 26, 2011 4:51 AM | Permalink | Reply to this

### Re: Measuring Diversity

I’m with Mike here.

A typical “intensive” quantity such as a temperature or density is a function from our space $X$ to $\mathbb{R}$. (In other contexts, $\mathbb{R}$ might be replaced by some other canonical object such as $S^1$ or $2$ or $\mathbb{C}$ or Sierpiński space.) So, of course, the object $Hom(X, \mathbb{R})$ of “intensive quantities on $X$” is contravariant in $X$. You pull functions back.

But extensive quantities… hmm. What Mike says. You push measures or distributions forward, but measures and distributions don’t seem to fit the description of “extensive quantities”. The measure of the whole space is an extensive quantity, but a measure is a whole lot more than just that.

Formally, and in outline, a (real) distribution on a space $X$ is a map

$Hom(X, \mathbb{R}) \to \mathbb{R}$

satisfying suitable properties, where $Hom(X, \mathbb{R})$ is some space of “test functions”. So the space of distributions is very loosely “$Hom(Hom(X, \mathbb{R}), \mathbb{R})$”, which is why it’s covariant in $X$.

I suppose, though, that what I’ve said about measures and distributions is beside the point, because I’ve only shown that there are things that transform covariantly and aren’t extensive quantities.

Here’s a pair of questions. What is the difference between “intensive” and “local”? What is the difference between “extensive” and “global”?

Posted by: Tom Leinster on October 26, 2011 11:24 PM | Permalink | Reply to this

### Re: Measuring Diversity

Tom wrote:

What is the difference between “intensive” and “local”? What is the difference between “extensive” and “global”?

These pairs are closely related. In physics an extensive quantity is usually a global quantity computed by integrating a local one over some region of space. An intensive quantity is local. But there are some intensive quantities (like energy density) that you should want to integrate, and others (like temperature) that you should never try to integrate over a region. Basically this is because the latter ones transform contravariantly (like functions) rather than covariantly (like measures).

Posted by: John Baez on October 29, 2011 10:15 AM | Permalink | Reply to this

### Re: Measuring Diversity

That flies against Lawvere, for whom intensive and contravariant coincide.

Is the reason you give right for not wanting to integrate temperature? Isn’t it because, although temperature is intensive, its inverse is more natural, as you say here? So it’s more natural to integrate the inverse temperature, as here:

Entropy, in the thermodynamic sense in which the term was coined, is the integral of the inverse temperature of a body, with respect to heat.

Similarly, energy density is a natural thing to integrate, where inverse energy density is not, even though both are intensive.

Intensive quantities are like derivatives, $d f/d g$, so you want to integrate them with respect to $g$.

You don’t want to integrate temperature because it’s odd to integrate with respect to entropy.

Hmm, so why do people try to calculate average temperatures? Isn’t that integrating temperature against something spatial, then dividing by total size of space?

Posted by: David Corfield on October 29, 2011 2:27 PM | Permalink | Reply to this

### Re: Measuring Diversity

Hmm, so why do people try to calculate average temperatures?

Because they're usually working within a limited range, so that the difference between the average of the inverses of the temperatures and the inverse of the average of the temperatures is negligible?

Posted by: Toby Bartels on November 15, 2011 11:05 PM | Permalink | Reply to this

### Re: Measuring Diversity

Fair points. The idea that extensive quantities, such as mass, combine simply by addition when two objects are brought together, while an intensive quantity, such as density, combines messily, is only right insofar as we take the total mass and average density of the whole.

Let me check something though. The distribution itself is the map from (smooth) functions (with compact support) to the reals (or whatever), isn’t it? So a delta function, the way I used to think of it (as a very spiky function), isn’t the distribution. It’s integration against that, or much better evaluation at the point, which is the distribution.

We have a tendency to want to think of distributions as functions to integrate against, because functions can be used to generate distributions. I suppose if I have a sum of delta functions, i.e., a map which given a function adds its values at specified points, then the result for two functions with non-overlapping support is the sum of the results taking them individually. So we could say that that sum of deltas is extensive.

And an area, as perhaps we were once taught, is the limit of such a sum of delta functions.

Posted by: David Corfield on October 26, 2011 8:54 AM | Permalink | Reply to this

### Re: Measuring Diversity

It’s integration against that, or much better evaluation at the point, which is the distribution.

Technically, yes. But we still do think of a delta distribution as a “really spiky function” (at least, I do); that’s just a convenient way to define it formally. Like how we might define a scheme to be a certain type of functor $Ring \to Set$, and yet still think of it as a “space” rather than as a functor.

Posted by: Mike Shulman on October 26, 2011 4:47 PM | Permalink | Reply to this

### Re: Measuring Diversity

Mike wrote:

But we still do think of a delta distribution as a “really spiky function” (at least, I do)

That’s a terrible habit. It’s infinitely better to think of the Dirac delta as a measure. Physics students typically learn this through great suffering: we do calculations with Dirac deltas in different coordinate systems, get different answers, and pull our hair out until we realize they don’t transform as functions.

Posted by: John Baez on October 29, 2011 10:21 AM | Permalink | Reply to this

### Re: Measuring Diversity

That’s a terrible habit.

Perhaps I never got over it because I never had to do any calculations. (-: But I thought the whole point of the notion of “distribution” was to learn to generalize your intuitive picture of “function” to include these other things. Of course you have to simultaneously learn that not all your old intuitions about “functions” apply to these new objects, but that’s no different from any other generalization in mathematics.

Posted by: Mike Shulman on October 29, 2011 9:39 PM | Permalink | Reply to this

### Re: Measuring Diversity

Continuing further into Some calculus with extensive quantities: wave equation, we see the interval $[a, b]$ taken as denoting the distribution

$\phi \mapsto \int_{a}^{b} \phi(x) d x,$

where $\phi$ is any real valued function.

Does this justify calling length an extensive quantity? Won’t you say as above

It seems to me that “length” in the sense used in “length is an extensive quantity” refers to the total length of a substance?

So $(b - a)$ is the total length of the interval running from $a$ to $b$, the result of applying that distribution to the constant 1 function. This is being discussed at prop. 2 of Calculus of extensive quantities. It results from the covariant nature of the distribution functor applied to the terminal morphism $X \to 1$.

Whereas pulling back along the terminal morphism gives the constant intensive quantities on $X$, pushing forward along it provides the means to produce totals for extensive quantities. So when you say

It seems to me that “volume” in the sense used in “volume is an extensive quantity” refers to the total volume of a substance,

the fact you could form a total, in a way you can’t do for density, indicates that there’s an extensive quantity about.

Posted by: David Corfield on October 26, 2011 4:56 PM | Permalink | Reply to this

### Re: Measuring Diversity

David wrote:

the fact you could form a total, in a way you can’t do for density, indicates that there’s an extensive quantity about.

That seems to me to be the closest anyone’s got here to resolving this. “There’s an extensive quantity about” indicates a rather loose connection, but still. Fortunately, I’ll see Anders Kock in a month, and I’ll try to gain some further understanding from him.

Posted by: Tom Leinster on October 26, 2011 11:29 PM | Permalink | Reply to this

### Re: Measuring Diversity

the fact you could form a total, in a way you can’t do for density, indicates that there’s an extensive quantity about.

You can’t form a total for density? Can’t you integrate it and get something like “mass”, which is an extensive quantity?

Posted by: Mike Shulman on October 27, 2011 3:55 AM | Permalink | Reply to this

### Re: Measuring Diversity

But what are you going to integrate your density against? You need to have a measure to hand. If you’re in $\mathbb{R}^n$ or some other context where there’s a standard measure, then no problem, but otherwise forming a total for density seems like it would be problematic.

I guess the point is that on an arbitrary “space” $X$, you have a canonical pairing

$F(X) \times M(X) \to \mathbb{R}$

where $F(X)$ is the set of “functions on” $X$ and $M(X)$ is the set of “measures on” $X$. (All the terms in quotation marks are rather flexible.) The pairing is integration.

But the situation isn’t symmetric, because there is a canonical nontrivial function on $X$ (the one with constant value $1$), but there isn’t a canonical measure on $X$. Applying the pairing, that means that there is a canonical nontrivial map

$M(X) \to \mathbb{R}$

(integrate against the constant function $1$, i.e. take the measure of the whole of $X$) but there isn’t a canonical nontrivial map

$F(X) \to \mathbb{R}.$

Posted by: Tom Leinster on October 27, 2011 5:24 AM | Permalink | Reply to this

### Re: Measuring Diversity

I guess the point is that on an arbitrary “space” $X$

I feel like this is the source of some of the confusion. It sounds to me that with the standard physics meaning, intensive and extensive quantities are not properties of a space, but of a substance which exists in space. So it does seem reasonable to me to assume that there is a background volume form on space against which you can integrate the density (to obtain mass) or the constant function (to obtain total volume).

Posted by: Mike Shulman on October 27, 2011 1:07 PM | Permalink | Reply to this

### Re: Measuring Diversity

So the thought that homotopy is extensive to cohomology’s being intensive amounts to the fact that cohomology is about arrows out of an object (to, e.g., an Eilenberg-Mac Lane space) and so can be pulled back, is contravariant, etc., while homotopy is about arrows in (from, e.g., a sphere) and so can be pushed forward, is covariant?

Seems to be what’s being said here.

Posted by: David Corfield on October 25, 2011 4:33 PM | Permalink | Reply to this

### Re: Measuring Diversity

Tom wrote:

However, the Wikipedia page was a revelation to me. Until I saw it, I’d been under the impression that “intensive” and “extensive” were words of Lawvere’s invention.

Mike wrote:

Wow! Thanks for communicating that revelation!

Yikes!

You guys should have told me these terms were meaningless gibberish. One learns the intensive/extensive distinction in a physics class. I guess you know that now, but I can’t resist saying more.

Pressure is intensive because you can stick a tiny little pressure gauge in a tire and measure its pressure. Volume is extensive because you can’t measure volume that way: you need to look at the whole tire.

Temperature is intensive because you can stick a little thermometer in a bowl of soup and measure its temperature. Energy is extensive because you need to look at the whole bowl of soup to know its energy.

Back in the 1300’s, the great scientist Oresme noticed this sort of distinction and wrote about it in his main math book, the same book where he invented the technique of graphing a function and proved that the harmonic series diverged:

In a quality, or accidental form, such as heat, he distinguished the intensio (the degree of heat at each point) and the extensio (as the length of the heated rod).

Later, in thermodynamics, people noticed that quantities come in ‘conjugate pairs’ with one being extensive and the other intensive. Up to some fudge functions, the conjugate of volume is pressure, the conjugate of energy is temperature, and so on. The relation is this:

$\frac{\partial S}{\partial e} = i$

where $S$ is entropy, $e$ is your extensive quantity, and $i$ is its conjugate, an intensive quantity.

But what does this relation mean? It means that if you have a system that’s maximizing entropy for a given value of $e$, and you see how the entropy changes as you change $e$, you can define an intensive quantity $i$ by this formula.

For example, your bowl of soup is probably doing a pretty good job of maximizing entropy for a given value of energy. So, take your extensive quantity $e$ to be energy. If you put the soup in a microwave for a minute, that will change its energy a small amount $\Delta e$. If you could then measure how much its entropy changed, say $\Delta S$, you could work out

$\frac{\Delta S}{\Delta e}$

and that would be close to

$\frac{\partial S}{\partial e} = i$

where $i$ is the intensive quantity corresponding to energy—roughly, temperature.

(In fact it’s the reciprocal of temperature, because someone goofed when defining temperature. Ultimately the reciprocal of temperature, or coolness, is more important than temperature.)
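John’s thought experiment can be checked numerically. Here is a sketch with an invented two-level system, in units where $k_B = 1$: the Boltzmann distribution at coolness $\beta = 1/T$ maximizes entropy for a given mean energy, and nudging the system between nearby equilibria recovers $\beta$ as the ratio $\Delta S / \Delta e$.

```python
import numpy as np

# Invented two-level system with energy levels 0 and 1 (k_B = 1).
levels = np.array([0.0, 1.0])

def boltzmann(beta):
    """Entropy-maximizing distribution at coolness beta = 1/T."""
    w = np.exp(-beta * levels)
    return w / w.sum()

entropy = lambda p: -np.sum(p * np.log(p))
energy = lambda p: np.sum(p * levels)

beta = 2.0
# Move between two nearby equilibria and compare the changes in
# entropy and mean energy.
lo, hi = boltzmann(beta + 1e-6), boltzmann(beta - 1e-6)
ratio = (entropy(hi) - entropy(lo)) / (energy(hi) - energy(lo))
# ratio is close to beta: the intensive conjugate of energy is the
# coolness 1/T, as John says.
```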

So, Tom, if we were sufficiently clever, we could probably take Lawvere’s ideas on intensive and extensive quantities and our reflections on entropy and combine them to get something nice. In fact Lawvere wrote about entropy… so maybe he was sufficiently clever. I looked at his paper, and it seemed rather orthogonal to ours, but I didn’t understand it very well.

By the way, my treatment of thermodynamic conjugates differs from the Wikipedia treatment I linked to, which is more conventional but somewhat less nice, because it treats energy as fundamental to this game, whereas I claim that entropy is more fundamental, being applicable whenever you have any probability distribution.

Posted by: John Baez on October 27, 2011 6:00 AM | Permalink | Reply to this

### Re: Measuring Diversity

Lawvere talks of intensive quantities being like ratios of extensive quantities. So is the idea that if we can fix on one extensive quantity in a domain, here entropy for thermodynamics, then we can pair together further extensive quantities with intensive quantities?

In a bunch of situations you choose the action as the designated extensive quantity. So how do you choose that designated extensive quantity, e.g., how do you know to choose entropy over energy?

Hmm, if entropy is extensive, why is Tom’s diversity intensive?

Posted by: David Corfield on October 27, 2011 8:58 AM | Permalink | Reply to this

### Re: Measuring Diversity

David wrote:

if entropy is extensive, why is Tom’s diversity intensive?

Excellent question. I don’t know.

We can simplify the question by removing the word “Tom’s”. Presumably you’re adding my name to indicate the diversity measures that take similarity into account. The version that doesn’t take similarity into account is given by

$D_q^I(p) = \Bigl( \sum_i p_i^q \Bigr)^{1/(1 - q)},$

which has been used as a measure of diversity since a paper of Mark Hill in 1973. Some readers here will know this better as the exponential of Rényi entropy.

For me, the sticking point is that I don’t understand what physicists mean by entropy. So, I don’t know why you would think of entropy as extensive. I should go and read some stuff.

Posted by: Tom Leinster on October 28, 2011 4:56 PM | Permalink | Reply to this

### Re: Measuring Diversity

…I don’t understand what physicists mean by entropy

I wonder if Entropy – A Guide for the Perplexed could help. It’s written by two philosophers of physics.

Posted by: David Corfield on November 2, 2011 9:44 AM | Permalink | Reply to this

### Re: Measuring Diversity

David wrote:

Lawvere talks of intensive quantities being like ratios of extensive quantities.

That makes no sense in the usual physics meanings of these terms, unless “like” is being used to hint at something big.

I’d translate what you’re saying into this sentence: when it’s well-defined, the ratio of measures is a function.

This ratio is usually, a bit confusingly, called a ‘Radon-Nikodym derivative’. That’s because people like to write a measure as $d \mu$, so a ratio of them looks like $d \mu / d \nu$.

Posted by: John Baez on October 29, 2011 10:30 AM | Permalink | Reply to this

### Re: Measuring Diversity

Lawvere talks of intensive quantities being like ratios of extensive quantities.

That makes no sense in the usual physics meanings of these terms

Sure it does. Example: density is the ratio of mass to volume.

Of course, you can't simply divide the total mass by the total volume (unless you're content to get a volume-weighted average density). Really, to get the density at a given point, you have to divide the mass of an infinitesimal region around that point by the volume of that region. As you noticed, this is the Radon–Nikodym ‘derivative’.

I think that when Lawvere says that an intensive quantity is a function, he means that literally; when he says that an extensive quantity is a measure, he means that the extensive quantity is the integral (against the constant function 1) of a measure, but the measure is really what's important, so he's going to talk about that.

Posted by: Toby Bartels on November 15, 2011 11:18 PM | Permalink | Reply to this

### Re: Measuring Diversity

I think that when Lawvere says that an intensive quantity is a function, he means that literally; when he says that an extensive quantity is a measure, he means that the extensive quantity is the integral (against the constant function 1) of a measure, but the measure is really what’s important, so he’s going to talk about that.

John’s comment here gave me the impression that in this situation, a physicist would call the measure itself an intensive quantity, and only use “extensive” for its integral.

In other words, what I’m hearing is that in physics, we have local quantities which are called “intensive”, some of which are (contravariant) functions and some of which are (covariant) measures, and the integrals of the intensive measures are global quantities called “extensive”. Whereas Lawvere is ignoring the global quantities entirely, reusing the word “extensive” for the local covariant quantities which can be integrated to give global ones, and narrowing the word “intensive” to the local contravariant quantities. Is that right?

Posted by: Mike Shulman on November 16, 2011 8:25 AM | Permalink | Reply to this

### Re: Measuring Diversity

I don’t know about the physicists’ use, but I think you have Lawvere’s position right. There’s more to be understood about the codomains of these quantities. Lawvere points to the use of linear categories, as contrasted with distributive ones, in this way.

Posted by: David Corfield on November 16, 2011 12:32 PM | Permalink | Reply to this

### Re: Measuring Diversity

David suggests fixing an extensive quantity and then pairing up intensive and extensive quantities by taking ratios. This sounds very much like what I was saying to Mike about measures and functions. Suppose you have a space $X$ and fix a measure $\mu$ on it (e.g. $X = \mathbb{R}^n$ with Lebesgue measure). Then any nice enough function $f$ on $X$ gives rise to a measure $f\mu$ (or “$f d\mu$”). The opposite process is to take a measure $\nu$ that’s absolutely continuous with respect to $\mu$ and form the “ratio” $d\nu/d\mu$, i.e. the Radon–Nikodym derivative. This gives a pairing between (suitable) functions and measures.
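On a finite set the pairing is completely transparent; here is a toy sketch in Python (all names and numbers are mine, purely for illustration):

```python
# A 'measure' on a finite set is a dict of nonnegative point masses;
# a 'function' is just a dict of values.

mu = {'a': 2.0, 'b': 1.0, 'c': 0.5}     # reference measure on X = {a, b, c}
f = {'a': 3.0, 'b': 4.0, 'c': 10.0}     # a function on X

def times(f, mu):
    """A function f and a measure mu give a new measure f*mu."""
    return {x: f[x] * mu[x] for x in mu}

def radon_nikodym(nu, mu):
    """The 'ratio' d(nu)/d(mu): a function, defined where mu > 0."""
    return {x: nu[x] / mu[x] for x in mu if mu[x] > 0}

nu = times(f, mu)                        # nu is absolutely continuous wrt mu
print(radon_nikodym(nu, mu) == f)        # the two processes are inverse
```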

On the other hand, some of us are sceptical that measures in the standard mathematical sense are really extensive quantities.

Posted by: Tom Leinster on October 27, 2011 12:23 PM | Permalink | Reply to this

### Re: Measuring Diversity

You guys should have told me these terms were meaningless gibberish. One learns the intensive/extensive distinction in a physics class.

So whenever I hear a mathematician say meaningless gibberish, I should assume it’s something one learns in a physics class and ask you about it? Or is it just Lawvere? (-:

Does Lawvere’s use of the terms make sense to you, as a physicist?

Posted by: Mike Shulman on October 27, 2011 1:09 PM | Permalink | Reply to this

### Re: Measuring Diversity

No need to guess. Lawvere never claimed to have invented these terms. From the introduction to “Categories in Continuum Physics” (LNM 1174), page 7:

The important distinction between intensive and extensive quantities can also be exemplified in any category X of the kind under consideration. While these terms, of philosophical origin, are costumarily [typo?] employed only in thermodynamics, (contrasting temperature, pressure, and density with energy, volume, and mass), they are actually applicable throughout continuum physics and indeed in mathematics generally.

He then gives further explanations, partly along the lines already given by John, which I am too lazy to reproduce in full. But I think his introductions usually make great reading. It is a pity that they perhaps do not receive the attention they deserve because of the heading “Introduction”.

Posted by: Marc Olschok on October 28, 2011 8:33 PM | Permalink | Reply to this

### Re: Measuring Diversity

Mike wrote:

So whenever I hear a mathematician say meaningless gibberish, I should assume it’s something one learns in a physics class and ask you about it? Or is it just Lawvere? (-:

Well, since Lawvere started out working on continuum mechanics, and developed topos theory largely in order to formalize his ideas in that subject, there’s a reasonable chance anything of his that sounds like “meaningless gibberish” to an expert in category theory will be clarified by pestering someone who knows continuum mechanics.

Luckily, you don’t need to pester me. Continuum mechanics is a branch of field theory, and Urs knows that subject well, so you can pester him instead. As you know, he’s been taking Lawvere’s ideas, replacing the categories by $(\infty,1)$-categories, and getting large hunks of modern physics to fall out quite naturally.

Does Lawvere’s use of the terms make sense to you, as a physicist?

He’s stretching them, as is his usual wont, but I always found this intensive/extensive stuff to be one of the more comprehensible aspects of Lawvere’s work.

(I’ll let you interpret that however you want.)

Posted by: John Baez on October 29, 2011 10:02 AM | Permalink | Reply to this

### Re: Measuring Diversity

Does Lawvere’s use of the terms make sense to you, as a physicist?

He’s stretching them, as is his usual wont, but I always found this intensive/extensive stuff to be one of the more comprehensible aspects of Lawvere’s work.

Okay, so can you explain where his usage comes from? As David said, Lawvere’s identification of intensive with contravariant seems to contradict what you said about some intensive quantities being covariant and others contravariant.

Posted by: Mike Shulman on October 29, 2011 9:43 PM | Permalink | Reply to this

### Re: Measuring Diversity

We also had a discussion here of “extensional” vs. “intensional”. Wikipedia also has a brief explanation of what some people mean by an extensional definition. It seems related to the “extensive” vs. “intensive” dichotomy.

Posted by: Eugene Lerman on October 25, 2011 3:27 PM | Permalink | Reply to this

### Re: Measuring Diversity

That’s a different sense of the terms, isn’t it? As explained here

…‘creature with a heart’ and ‘creature with a kidney’ have the same extension because they are true of the same individuals: all the creatures with a kidney are creatures with a heart. But the two expressions have different intensions because the word ‘heart’ does not have the same extension, let alone the same meaning, as the word ‘kidney.’

Posted by: David Corfield on October 25, 2011 4:03 PM | Permalink | Reply to this

### Re: Measuring Diversity

David wrote:

That’s a different sense of the terms, isn’t it?

Probably! I certainly wouldn’t recommend anyone trying to understand the extensive/intensive distinction to think about the extensional/intensional distinction, which lives in a whole different realm of discourse—the philosophy of language, not continuum mechanics.

Often “language is wiser than us”, as Heidegger said, so it makes sense to look for relations between similar terms even if they’re not apparent. For example, there’s a good reason why certain integrals take integral values. But to zeroth order, the extensive/intensive distinction is completely different from the extensional/intensional one. So at the very least one had better learn what they mean before trying to relate them!

Posted by: John Baez on October 29, 2011 9:53 AM | Permalink | Reply to this

### Re: Measuring Diversity

I certainly wouldn’t recommend anyone trying to understand the extensive/intensive distinction to think about the extensional/intensional distinction, which lives in a whole different realm of discourse

And yet by your links, that appears to be precisely what you suggest …

Posted by: Toby Bartels on November 15, 2011 11:23 PM | Permalink | Reply to this

### Re: Measuring Diversity

Hi Tom,

I’ve been meaning to tell you how much I enjoy reading your work on this subject. The topic of “diversity” is also very important in financial risk management. You might hear that a portfolio should be “diversified” to reduce risk. But what does that mean? If you hold 100 assets, but they all have identical risk characteristics, your portfolio consists of effectively a single species and you are, in fact, not diversified at all.

As a first attempt, I’ve applied some of your ideas to time series analysis, since it’s such low-hanging fruit, and the results are interesting. During a crisis, the effective number of species in the market is reduced and things start behaving similarly.

Not sure if you’re interested, but that could be a possible new (and more lucrative, in terms of funding) frontier for research.

Looking forward to following your work on this regardless of the applications.

Posted by: Eric on October 25, 2011 11:15 AM | Permalink | Reply to this

### Re: Measuring Diversity

Thanks for the kind words, Eric.

Diversity has been studied quantitatively in economics, if not finance. In fact, there’s a book that I’ve never managed to track down:

L. Hannah and J. Kay, Concentration in Modern Industry: Theory, Measurement, and the U.K. Experience. Macmillan, London, 1977.

In it, they apparently introduce a quantity now known by some as the “Hannah–Kay index of concentration”, which measures how concentrated an industry is. What does “concentration” mean? The extreme of concentration is a monopoly: all activity concentrated into one company. The opposite extreme is to have many small, equally active companies.

This Hannah–Kay index is

$\Bigl( \sum_i p_i^q \Bigr)^{1/(q - 1)}$

where $p_i$ is the market share of the $i$th company. It’s nothing but $1/D_q^I(p)$, the reciprocal of what ecologists call the Hill number and information theorists call the exponential of the Rényi entropy (of order $q$).

If you wanted to introduce some notion of the similarity/difference between companies, you could replace $I$ by another matrix $Z$. For example, if the activity in an industry is spread across many different companies, but they’re all owned by the same parent company, then you’d probably regard the industry as being more concentrated (closer to a monopoly) than if those companies were unrelated. You could model this by using coefficients $Z_{i j}$ measuring the extent to which companies $i$ and $j$ are related.
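As a sketch of both quantities (the helper names and numbers are mine; the similarity-sensitive formula used is $D_q^Z(p) = \bigl(\sum_i p_i (Z p)_i^{q-1}\bigr)^{1/(1-q)}$ for $q \neq 1$, as in our paper):

```python
def hannah_kay(p, q):
    """Hannah-Kay concentration of order q: (sum_i p_i^q)^(1/(q-1))."""
    return sum(x ** q for x in p if x > 0) ** (1.0 / (q - 1.0))

def diversity_Z(p, Z, q):
    """Similarity-sensitive diversity D_q^Z(p), q != 1:
    (sum_i p_i * (Zp)_i^(q-1))^(1/(1-q))."""
    Zp = [sum(Z[i][j] * p[j] for j in range(len(p))) for i in range(len(p))]
    s = sum(p[i] * Zp[i] ** (q - 1.0) for i in range(len(p)) if p[i] > 0)
    return s ** (1.0 / (1.0 - q))

p = [0.5, 0.3, 0.2]                      # made-up market shares
I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]    # naive model: all companies distinct
print(hannah_kay(p, 2))                  # about 0.38: reciprocal of D_2^I(p)
print(diversity_Z(p, I, 2))              # about 2.63 effective companies

ones = [[1, 1, 1]] * 3                   # all owned by one parent: Z_ij = 1
print(diversity_Z(p, ones, 2))           # 1.0: effectively a monopoly
```

With $Z = I$ the two functions are reciprocal; with all companies fully related, the effective number collapses to 1.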

I haven’t heard of the Hannah–Kay/Hill/Rényi formula being used to measure diversity of a portfolio, but I don’t see why not. In general, regardless of application, it appears to be the only measure of diversity with all the sensible properties that one would like.

Posted by: Tom Leinster on October 26, 2011 11:46 PM | Permalink | Reply to this

### Re: Measuring Diversity

A much more commonly used measure of industry concentration is the Herfindahl–Hirschman Index (HHI), which is the sum of the squares of the firms’ shares.

It has the cute property of being equal to the average industry price-cost margin times the market demand elasticity in a Cournot oligopoly model and it is still used by some antitrust authorities to provide indicative thresholds for the analysis of mergers.

It is not, however, used as an index of diversity (other than diversity in size). For markets where firms produce substantially heterogeneous products, no similar index is used in competition policy analysis.

A number of economists have worked on more general measures of diversity, e.g.,

http://www.jstor.org/stable/2118476

http://www.jstor.org/stable/3132144

and

http://www.jstor.org/stable/2692310

Posted by: valter on November 23, 2011 5:45 PM | Permalink | Reply to this

### Re: Measuring Diversity

Reading “sum of squares” suddenly reminded me of something. In the board game Diplomacy seven players compete to capture “supply centres” on a map of Europe.

Each player starts with 3 supply centres (except Russia, which has 4). There are a total of 34 available. When you calculate the sum of the squares of the number of supply centres held by each player, you essentially get what fans of the game call the “Index”. They say:

“Index is the sum of squares of the number of supply centers divided by the number of players. It is a measure of how far the game has progressed.”

The idea is that as the game goes on, a few players tend to dominate, so the Index increases with time even as the entropy, or diversity, decreases.
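A quick sketch of the arithmetic (the late-game position is invented for illustration):

```python
def diplomacy_index(centres):
    """Sum of squares of supply-centre counts, divided by the number of players."""
    return sum(c * c for c in centres) / len(centres)

start = [3, 3, 3, 3, 3, 3, 4]    # six powers on 3 centres, Russia on 4
print(diplomacy_index(start))     # (6*9 + 16) / 7 = 10.0

late = [17, 17, 0, 0, 0, 0, 0]    # a two-way carve-up of all 34 centres
print(diplomacy_index(late))      # 578 / 7, about 82.6
```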

Posted by: Tom Ellis on November 25, 2011 8:55 AM | Permalink | Reply to this

### Re: Measuring Diversity

That’s really funny! But not hugely surprising, I suppose, in that $\sum_i p_i^2$ is somehow the most obvious among all the quantities $\sum_i p_i^q$ from which entropy and diversity are derived.

$\sum p_i^2$ has a simple probabilistic interpretation: the probability that two randomly chosen individuals come from the same group. And $1/\sum p_i^2$, which is the diversity of order $2$, is nearly as straightforward: if you repeatedly pick pairs of individuals at random, it’s the expected number of pairs you have to pick before getting a pair from the same group.

I seem to remember “$\sum \alpha^2$” appearing in the complicated formula for assessing Cambridge undergraduate mathematics exams; it’s the sum of the squares of the number of alpha grades on each paper. This encourages people to do whole questions, not lots of parts. If they’d wanted to encourage that further, they could have used $\sum \alpha^q$ for some $q \gg 2$.

Posted by: Tom Leinster on November 25, 2011 11:34 AM | Permalink | Reply to this

### Re: Measuring Diversity

Thanks! That’s really interesting. I had no idea these things were used in antitrust proceedings. And I’d read (well, browsed) the Weitzman paper before; it seems to be quite widely cited. But I hadn’t come across the other two.

I hadn’t heard of the Herfindahl–Hirschman index (HHI). In the notation we’re using here, it’s

$\sum_i p_i^2.$

Here’s how I’d understand it. Suppose all our companies make soap, and the market share of each company is defined to be the proportion of all soap sold that’s made by that company. Then the HHI is the probability that two randomly-chosen bars of soap will have been made by the same company. It’s high if soap is only made by a couple of huge corporations, and low if lots of little companies have similar market shares.
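That probabilistic reading is easy to check by simulation (a quick sketch; the market shares are made up):

```python
import random

shares = [0.6, 0.2, 0.1, 0.1]     # invented market shares of four soap makers
hhi = sum(s * s for s in shares)  # 0.36 + 0.04 + 0.01 + 0.01 = 0.42

# Estimate the probability that two randomly chosen bars share a maker.
random.seed(0)
trials = 100_000
firms = range(len(shares))
matches = sum(
    random.choices(firms, weights=shares)[0]
    == random.choices(firms, weights=shares)[0]
    for _ in range(trials)
)
print(hhi, matches / trials)      # the simulated frequency should be close to 0.42
```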

In the notation above, $\sum p_i^2$ is

$1/D_2^I(p)$

— in other words, it’s the Hannah–Kay index of order 2. So HHI is a special case of the Hannah–Kay index.

Maybe the Hannah–Kay index is a really obscure thing. I confess I’ve only seen it discussed in one economics paper; I first read about it in an ecology paper. Is it something you’d heard of previously?

Posted by: Tom Leinster on November 23, 2011 6:06 PM | Permalink | Reply to this

### Re: Measuring Diversity

A classic industrial organization textbook (Scherer and Ross, 1990) briefly mentions the book by Hannah and Kay in a footnote on the HHI, but I had forgotten about it. I have never seen it anywhere else.

Even the HHI is being replaced by more “economic” methods that look directly at the incentives to raise prices rather than relying on concentration measures.

Posted by: valter on November 23, 2011 11:40 PM | Permalink | Reply to this

### Re: Measuring Diversity

Thanks, Valter. I haven’t tried enormously hard to track down Hannah and Kay’s book, but I have tried a little bit. From the difficulty of the task, I began to suspect that their work might not be terribly well known. It’s good to get confirmation of that from an insider.

Posted by: Tom Leinster on November 24, 2011 10:32 PM | Permalink | Reply to this

### Re: Measuring Diversity

Measuring biodiversity includes all the following:

- Number of individuals
- Number of species
- Number of individuals within each species
- Percentage of the total number of individuals that each species represents
- Genetic diversity between species
- Genetic diversity between individuals within each species

It seems as though this last one is not mentioned in your post. Some species might have large genetic diversity within the species. Other species might have small genetic diversity within the species.

You should also make some reference to the number of chromosomes, or genes, or DNA base pairs each species has. A species with more genes would have more possible genetic diversity within the species than a species with fewer genes. Also, a change in a few genes in a species with a small number of genes would represent a greater percentage change in the total genes than a change in the same small number of genes in a species with a larger number of genes.

Posted by: Jeffery Winkler on October 26, 2011 8:49 PM | Permalink | Reply to this

### Re: Measuring Diversity

Jeffery wrote:

Measuring biodiversity includes all the following:

That’s a more dogmatic statement than I think is wise. In the ecological community, there’s been 50 years of debate about what biological diversity really is. Sometimes this debate hasn’t been very productive, because of people talking at cross-purposes: to one person, “diversity” means one thing, and to another person, it means something else. There are many different opinions on how we should use the word.

I don’t know whether your sentence is a declaration that measuring biodiversity should involve measuring all the things you list, or whether it’s a statement that measuring diversity always does involve measuring these things. I guess you’re saying what you believe it should involve. (If you meant what it does involve, that’s incorrect: people measuring biodiversity usually ignore many of the factors you list.)

Genetic diversity between individuals within each species

It seems as though this last one is not mentioned in your post. Some species might have large genetic diversity within the species. Other species might have small genetic diversity within the species.

This is an important point, and one we’ve thought about. I believe our diversity measures cope with this well. Remember that

“species” need not mean species.

It could mean “genus”, or “subspecies”, or even “individual”: any grouping you like. As I wrote in the post,

We divide them into $S$ groups, conventionally called species, though they needn’t be species in the ordinary sense

and as we wrote in the paper,

The word ‘species’ can stand for any unit thought biologically meaningful.

As you take finer and finer divisions of your community, the diversity calculated will go up and up, because of the variation within each class.

If you’re dealing with a community of microbes, you’re probably going to have to work at the level of individuals from the start. That’s partly because most microbes haven’t been classified taxonomically, and partly because a test-tube of goo will contain far more microbes than can be examined individually. What people normally do is put them in a centrifuge and look at the DNA variation, in some very specific and often rather technical sense.

Incidentally, if it happens that taking finer and finer divisions of your community doesn’t result in significant increases in diversity, something funny’s happening. It indicates that you’ve reached a level of subdivision where each class is very homogeneous.

For example, there’s a sense in which there are only about 50 cows in Canada. That is, the effective number of cows is about 50. (I forget the exact number, but it’s astoundingly low: something of that order.) It’s so low because cattle have been intensively bred for a long time, to maximize meat yield, milk yield, etc. One consequence is that a large herd might not have appreciably more diversity than an individual.

You should also make some reference to the number of chromosomes, or genes, or DNA base pairs, each species has.

I disagree. We’re deliberately non-prescriptive. It’s for individual scientists to decide which measure of inter-species similarity is most appropriate to their situation. Even restricting to genetic techniques, there are many ways that you could quantify the similarity $Z_{i j}$ between two species.

If you read the Discussion of our paper, you’ll see that we address this point:

We also anticipate, and answer, a possible objection to our own diversity measures. In order to compute $D_q^Z(p)$, one has to assign a similarity coefficient $Z_{i j}$ to each pair of species. There is no canonical way to do this, so it might be objected that this makes the quantification of diversity too subjective.

Our answer is that diversity is subjective: it depends on which characteristics of organisms are taken to be important. This flexibility is, in fact, an advantage. If community A is genetically more diverse, but functionally less diverse, than community B, that is not a contradiction but a point of interest. Different ways of quantifying similarity lead to different measures of diversity. The word ‘diversity’ means little until one has specified the biological characteristics with which one is concerned.

Then a couple of paragraphs later:

Similarity and diversity vary according to perspective. Suppose, for example, that we are interested in the antigenic diversity of a collection of strains of the parasite Plasmodium. If similarity is measured using a nucleotide comparison of the entire genome then any two strains will look near-identical, giving the collection a very low diversity. But since we wish to measure antigenic diversity, we are really only concerned with the part of the genome that determines antigenicity. A nucleotide comparison localized to that region will reveal the sought-after differences, producing lower similarities and higher diversity.

Posted by: Tom Leinster on October 27, 2011 2:06 AM | Permalink | Reply to this

### Re: Measuring Diversity

John kindly let me write a version of this post for Azimuth. It’s actually quite different from what I’ve written here. The major thing that it has and this post doesn’t is an illustration of how taking species similarity into account can change your judgement on which of two communities is the more diverse.

Posted by: Tom Leinster on November 9, 2011 2:56 PM | Permalink | Reply to this

### Re: Measuring Diversity

Sometimes one has, instead of a measure of inter-species similarity (measured on a scale of $0$ to $1$), a measure of inter-species distance (measured on a scale of $0$ to $\infty$).

I know, I should read your paper, but do you ever consider requiring these distances to obey the triangle inequality? (That would make $Z$ the specification of a Lawvere metric space, so you in particular must have thought of this.) I would find it very odd if $i$ is 90% similar to $j$ and $j$ is 90% similar to $k$ but $i$ is only 20% similar to $k$; imposing the triangle inequality on their antilogarithms would prevent this.

Posted by: Toby Bartels on November 15, 2011 11:31 PM | Permalink | Reply to this

### Re: Measuring Diversity

Actually, we don’t discuss this kind of thing in the paper. It’s written for ecologists, and although more and more math gets used in ecology these days, we kept it to a minimum. Proposition A5 in the appendix is relevant to your question, but it still doesn’t answer it directly.

So: no, we don’t require any kind of triangle inequality. Quite simply, we never needed to assume it for anything we wanted to prove about diversity measures, so we didn’t.

I would find it very odd if $i$ is 90% similar to $j$ and $j$ is 90% similar to $k$ but $i$ is only 20% similar to $k$; imposing the triangle inequality on their antilogarithms would prevent this.

I see what you mean, though I think you mean “logarithms” (or conceivably “antiexponentials”): the formula is $Z_{i j} = e^{-d_{i j}}$, or “similarity is the negative exponential of distance”. So a similarity of $0.9$ corresponds to a distance of $-\log(0.9)$.

Here’s a reasonable situation in which the triangle inequality on the logarithms is violated. In other words, here’s an example of a reasonable notion of similarity that doesn’t come from a metric.

Suppose we assign to each species a 4-bit word. The bits in the word might be the answers to four yes/no questions (“does it lay eggs?” etc). The similarity between two species is defined to be the number of questions for which the species have the same answers, divided by 4. Suppose that species 1, 2 and 3 have words

$1010, \qquad 1100, \qquad 0101$

respectively. Then

$Z_{12} = Z_{23} = 0.5, \qquad Z_{13} = 0.$

But if we define $d_{i j}$ by $Z_{i j} = e^{-d_{i j}}$, then

$d_{12} = d_{23} = \log 2, \qquad d_{13} = \infty.$

So this $d$ does not satisfy the triangle inequality.

The similarity measure in this example does satisfy some kind of triangle inequality, namely $Z_{i k} \geq Z_{i j} + Z_{j k} - 1$. But there are probably other sensible measures of similarity that don’t satisfy this.
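The whole example fits in a few lines of Python, which also checks both inequalities (the helper names are mine):

```python
import math

words = {1: '1010', 2: '1100', 3: '0101'}

def similarity(u, v):
    """Fraction of bit positions on which the two words agree."""
    return sum(a == b for a, b in zip(u, v)) / len(u)

Z = {(i, j): similarity(words[i], words[j]) for i in words for j in words}
print(Z[1, 2], Z[2, 3], Z[1, 3])         # 0.5 0.5 0.0

def d(i, j):
    """Candidate distance d_ij = -log Z_ij (infinite when Z_ij = 0)."""
    return -math.log(Z[i, j]) if Z[i, j] > 0 else math.inf

print(d(1, 3) <= d(1, 2) + d(2, 3))      # False: the triangle inequality fails
print(Z[1, 3] >= Z[1, 2] + Z[2, 3] - 1)  # True: the weaker inequality holds
```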

Posted by: Tom Leinster on November 16, 2011 2:00 AM | Permalink | Reply to this