April 26, 2012

The Mathematics of Biodiversity

Posted by Tom Leinster

Interested in biological diversity? Want to know more about how diversity can be quantified? Maybe diversity comes up in your work. Maybe you’ve heard rumours that there’s serious mathematics involved, and you want to know more. Or maybe you’re just curious.

If so, come to a meeting in Barcelona! It’s running 2-6 July, and there are grants to cover attendance expenses. (If you want one, please apply as soon as possible.) We also have free slots for contributed talks.

We’ve assembled what is already a head-spinningly varied group of people, from livestock breeding experts to ostensibly pure mathematicians to evolutionary ecologists. Two or three of your Café hosts will be there. Details follow.

THE MATHEMATICS OF BIODIVERSITY

  • Exploratory Conference
  • Centre de Recerca Matemàtica, Barcelona
  • 2-6 July 2012
  • www.crm.cat/CBIO

What is diversity? How do we measure it? This event brings together life scientists and mathematicians to advance our understanding of diversity and its measurement. We welcome everyone with an interest in measuring diversity, from microbial biologists to pure mathematicians to conservation ecologists.

Grants for attendance expenses are available. The official deadline to apply is 30 April. If you want to apply but think you will miss the deadline (or have already missed it), all is not lost: let us know at the addresses below.

We are also taking offers of contributed talks. If you would like to give one, please contact us as soon as possible.

Our current list of speakers is:

Scientific enquiries: Tom,Leinster#glasgow,ac,uk. Administrative enquiries: NPortet#crm,cat (Ms Neus Portet).

Posted at April 26, 2012 6:48 PM UTC

20 Comments & 0 Trackbacks

Re: The Mathematics of Biodiversity

Your ‘Measuring diversity’ paper is referred to by Pavlovic here in a paper which prefers ‘proxets’, categories enriched over the multiplicative monoid $[0, 1]$, to the equivalent generalized metric spaces.

Posted by: David Corfield on April 27, 2012 10:57 AM | Permalink | Reply to this

Re: The Mathematics of Biodiversity

Thanks, David. I think a bit of your browsing history accidentally peeped through there (don’t worry, nothing embarrassing) — you link to the wrong paper. I think you must mean this one.

Dusko sent me a copy of that paper some months ago, but I’m afraid I didn’t get round to either reading it properly or replying to him, sad to say.

Posted by: Tom Leinster on April 29, 2012 11:13 PM | Permalink | Reply to this

Re: The Mathematics of Biodiversity

Dusko’s paper uses FCA (= Formal Concept Analysis). There was a Dagstuhl meeting a few years ago at which various people working in FCA met people working in domain theory and Chu spaces. (For instance, one of the papers there was Zhang, G.Q.: Chu spaces, concept lattices, and domains. In Brookes, S., Panangaden, P., eds.: Electronic Notes in Theoretical Computer Science. Volume 83. Elsevier (2004).) Given the link between Chu, domains and logic, is there a ‘logic of bio-diversity’?

Posted by: Tim Porter on April 30, 2012 7:45 AM | Permalink | Reply to this

Re: The Mathematics of Biodiversity

We’ve assembled what is already a head-spinningly varied group of people,

Can you quantify their diversity?

Posted by: Urs Schreiber on April 27, 2012 11:53 AM | Permalink | Reply to this

Re: The Mathematics of Biodiversity

Can you quantify their diversity?

Yes, he can, and in many different ways! First, you should specify what kind of similarity between these people is most relevant to your interest: academic, geographical, genetic, etc. Next, give percentages to quantify that similarity — just how similar are experts on livestock breeding and category theory, or Cambridge and the other Cambridge? Finally, are you more interested in the total number of different specialties/cities/genotypes/etc. represented, or do you find that only the most common ones are relevant, or do you want to graph the entire diversity profile?
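
The three choices above (a notion of similarity, numbers quantifying it, and how much weight to give rare types) can be sketched in code. This is a minimal sketch, assuming the similarity-sensitive diversity of order $q$ has the form $\bigl(\sum_i p_i (Zp)_i^{q-1}\bigr)^{1/(1-q)}$; that formula comes from the diversity literature and is not stated in this thread, and the function name and the example matrices are hypothetical.

```python
import math

def similarity_diversity(p, Z, q):
    """Diversity of order q for abundances p and similarity matrix Z.

    Assumed formula (not from this thread): (sum_i p_i * (Zp)_i**(q-1))**(1/(1-q)),
    with the usual limiting cases at q = 1 and q = infinity.
    """
    n = len(p)
    # (Zp)_i: how "ordinary" type i is, given everything similar to it.
    ordinariness = [sum(Z[i][j] * p[j] for j in range(n)) for i in range(n)]
    support = [i for i in range(n) if p[i] > 0]
    if q == 1:
        return math.exp(-sum(p[i] * math.log(ordinariness[i]) for i in support))
    if math.isinf(q):
        return 1.0 / max(ordinariness[i] for i in support)
    total = sum(p[i] * ordinariness[i] ** (q - 1) for i in support)
    return total ** (1.0 / (1.0 - q))

# Hypothetical example: three specialties with relative abundances p.
p = [0.5, 0.3, 0.2]
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]      # nothing resembles anything else
all_same = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]      # everything counts as identical
partial = [[1, 0.5, 0], [0.5, 1, 0], [0, 0, 1]]   # first two specialties overlap by 50%

# A diversity profile is the diversity plotted against q.
orders = [0, 1, 2, math.inf]
profile = [similarity_diversity(p, partial, q) for q in orders]
```

Small $q$ counts rare types almost as heavily as common ones, while large $q$ looks mainly at the most common ones, which corresponds to the last question in the comment.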

Posted by: Mark Meckes on April 27, 2012 5:48 PM | Permalink | Reply to this

Re: The Mathematics of Biodiversity

Can you quantify their diversity?

Yes, he can, and in many different ways!

Not to say: in diverse ways. ;-)

Seriously, it is clear that there are many ways to assign numbers to natural phenomena and to declare that these numbers mean something. But I am imagining that the mathematics of diversity can somehow give more universal answers?

Can you give me a rough idea of what “mathematics of diversity” can accomplish? What would be a typical theorem in diversity-theory? What do we learn from diversity theory? How is diversity theory different from just statistics?

Posted by: Urs Schreiber on April 29, 2012 10:02 PM | Permalink | Reply to this

Re: The Mathematics of Biodiversity

>>>Can you quantify their diversity?

>>Yes, he can, and in many different ways!

>Not to say: in diverse ways. ;-)

Actually, can we take that idea seriously? If we pick a probability distribution, its Renyi extropy gives a family of quantities measuring “diversity” depending on a continuous parameter alpha, which (if we fix a distribution from which to draw alpha) we can interpret as another probability distribution on the set of possible diversities. Could we iterate this process, producing not just effective numbers, but effective numbers of effective numbers, and so on?

For example, if a population is very evenly divided into n equal populations, then all of its Renyi extropies will be approximately n, and so the diversity of extropies would be only slightly more than 1. On the other hand, if the Shannon extropy of an arbitrary distribution is equal to n, we would expect the Renyi extropies to be more varied, and the diversity of extropies would be larger.

Perhaps one could even recover, in some special circumstances, the original distribution from knowledge of, say, the Shannon extropy, the Shannon extropy of the Rényi extropies, the Shannon extropy of the Rényi extropies of the Rényi extropies, etc.?
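
The contrast between an even and an uneven population can be checked numerically. This is a minimal sketch, assuming that “extropy” here means the exponential of the Rényi entropy (an effective number, also known as a Hill number); the function name and the example distributions are hypothetical.

```python
import math

def hill_number(p, q):
    """Exponential of the Renyi entropy of order q for a finite distribution p."""
    p = [x for x in p if x > 0]
    if q == 1:
        # Limiting case: exponential of the Shannon entropy.
        return math.exp(-sum(x * math.log(x) for x in p))
    if math.isinf(q):
        # Limiting case: reciprocal of the largest probability.
        return 1.0 / max(p)
    return sum(x ** q for x in p) ** (1.0 / (1.0 - q))

orders = [0, 0.5, 1, 2, math.inf]

uniform = [0.25] * 4            # evenly divided into n = 4 parts
skewed = [0.7, 0.1, 0.1, 0.1]   # one dominant part

profile_uniform = [hill_number(uniform, q) for q in orders]
profile_skewed = [hill_number(skewed, q) for q in orders]
```

For the uniform distribution every entry of the profile equals 4 up to rounding, so the family of values has almost no spread. For the skewed distribution the values fall from 4 at $q = 0$ towards $1/0.7$ as $q \to \infty$.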

Posted by: Owen Biesel on April 30, 2012 1:27 AM | Permalink | Reply to this

Re: The Mathematics of Biodiversity

Thanks for picking this up in a constructive way, Owen.

I don’t know about you (you all), but I found it amusingly self-referential to say of a conference about biodiversity that there is going to be a

head-spinningly varied group of people

and then to exclaim that this can be made precise

in many different ways!

And, yes, while amusing, it immediately seems to raise serious questions for a “theory of diversity” to be.

Posted by: Urs Schreiber on May 2, 2012 8:26 PM | Permalink | Reply to this

Re: The Mathematics of Biodiversity

I don’t know about Tom’s original post, but the self-referentiality in my comment was deliberate. I thought about writing “in many diverse ways”, but that seemed just a little too unsubtle.

More seriously, I haven’t had a chance to think carefully about it, but what Owen is suggesting seems closely related to the way entropies of different flavors appear as rate functions in large deviations theory.

Posted by: Mark Meckes on May 3, 2012 1:45 PM | Permalink | Reply to this

Re: The Mathematics of Biodiversity

Interesting, Owen, interesting.

I don’t have a direct answer, but here are two things that feel related.

First, if you know all the Rényi extropies of a (finite) probability distribution then you can recover the distribution itself, up to permutation of the points (or ‘species’). In fact, you don’t even have to know the Rényi extropies of all orders $q$: it’s enough to know them for some sequence of orders $q$ converging to $\infty$.

Second, there’s the thing known as the Giry monad. I don’t know how much category theory you know, but if the answer is “not much” then fear not: there is a direct intuitive explanation, as follows.

Suppose we have a space $X$ of some kind. There are many ways of choosing a point randomly from $X$. (Interpret this, if you like, as “there are many probability distributions on $X$”, though I don’t want to be too precise or formal here.) Now, suppose you have a random way of choosing a random way of choosing points from $X$. For example, maybe $X = \mathbb{R}$, you choose a number $\sigma \in \{1, 2, 3, 4, 5, 6\}$ by throwing a fair die, and then you choose a point from $\mathbb{R}$ according to the normal distribution $N(0, \sigma)$ with mean $0$ and standard deviation $\sigma$.

The point is that this extra layer of randomness doesn’t really make the process any more random: it still just reduces to a random way of choosing points from $X$. In other words:

a random way of choosing a random way of choosing points from $X$

gives rise canonically to

a random way of choosing points from $X$.

This is, in essence, the idea behind the Giry monad. As I said, I don’t know if it really has any relevance to your comment; it’s just a hunch.
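
For finite distributions the collapse of two layers of randomness into one is just the formation of a mixture. This is a minimal sketch, assuming a distribution is represented as a dictionary from points to probabilities; the name `flatten` and the two coins are hypothetical, standing in for the die and the normal distributions in the example above.

```python
from collections import defaultdict
from fractions import Fraction

def flatten(dist_of_dists):
    """Turn a random choice of distributions into a single distribution.

    dist_of_dists is a list of (weight, distribution) pairs, where each
    distribution is a dict mapping points to probabilities.
    """
    out = defaultdict(Fraction)
    for weight, dist in dist_of_dists:
        for point, prob in dist.items():
            # Probability of picking this distribution, then this point.
            out[point] += weight * prob
    return dict(out)

fair_coin = {"H": Fraction(1, 2), "T": Fraction(1, 2)}
biased_coin = {"H": Fraction(9, 10), "T": Fraction(1, 10)}

# First choose a coin at random, then toss it.
two_stage = [(Fraction(1, 3), fair_coin), (Fraction(2, 3), biased_coin)]
mixture = flatten(two_stage)
```

The result is an ordinary distribution on the same two points: heads with probability 23/30 and tails with probability 7/30.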

Posted by: Tom Leinster on May 9, 2012 5:20 AM | Permalink | Reply to this

Re: The Mathematics of Biodiversity

This is, in essence, the idea behind the Giry monad.

I haven’t yet got around to understanding the Giry monad, but I feel like this gives me a good place to start, because it is a familiar idea to me which I already know how to think about in my own terms. Topological subtleties aside, it just says that the space of probability measures on $X$ is convex.

Posted by: Mark Meckes on May 10, 2012 12:50 PM | Permalink | Reply to this

Re: The Mathematics of Biodiversity

Topological subtleties aside, it just says that the space of probability measures on $X$ is convex.

Yes, I suppose it does!

If I were to give a slightly more detailed sketch of the idea behind the Giry monad, I’d add the following two things:

  • each point of $X$ gives rise canonically to a random way of choosing points from $X$ (namely, “always choose that point”)
  • any map $X \to Y$ (whatever that means) gives rise canonically to a map from (random ways of choosing points from $X$) to (random ways of choosing points from $Y$).

But these are kind of trivial compared to what I said in my previous comment.
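
Both of these constructions are easy to write down for finite distributions. This is a minimal sketch, assuming a distribution is a dictionary from points to probabilities; the names `point_mass` and `pushforward` and the die example are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

def point_mass(x):
    """The distribution that always chooses the point x."""
    return {x: Fraction(1)}

def pushforward(f, dist):
    """Transport a distribution on X along a map f: X -> Y."""
    out = defaultdict(Fraction)
    for point, prob in dist.items():
        # Points with the same image pool their probability.
        out[f(point)] += prob
    return dict(out)

die = {k: Fraction(1, 6) for k in range(1, 7)}

# The map "k -> k mod 2" sends a fair die to a fair coin on {0, 1}.
parity = pushforward(lambda k: k % 2, die)
```

The two constructions are compatible: pushing forward the point mass at `x` along `f` gives the point mass at `f(x)`.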

Posted by: Tom Leinster on May 10, 2012 5:41 PM | Permalink | Reply to this

Re: The Mathematics of Biodiversity

You might want to take a look at the article I link to here – A Categorical Foundation for Bayesian Probability.

Posted by: David Corfield on May 11, 2012 9:19 AM | Permalink | Reply to this

Re: The Mathematics of Biodiversity

Thanks — I did catch that link in the other thread and put the paper high on my to-read list. (Unfortunately, my progress through that list is likely to be quite slow for the foreseeable future (but then, isn’t everyone’s?).)

Posted by: Mark Meckes on May 11, 2012 1:51 PM | Permalink | Reply to this

Re: The Mathematics of Biodiversity

If you literally have a to-read list, and actually use it, I think you’re already much more organized than most of us.

Posted by: Tom Leinster on May 11, 2012 4:46 PM | Permalink | Reply to this

Re: The Mathematics of Biodiversity

I do literally have a to-read list, in fact two of them (papers and books). Whether I actually use them in a meaningful way is another matter.

Posted by: Mark Meckes on May 11, 2012 5:03 PM | Permalink | Reply to this

Re: The Mathematics of Biodiversity

Urs wrote…

But I am imagining that the mathematics of diversity can somehow give more universal answers?

I hope not. I view it as part of the general idea of decategorification.

Can you give me a rough idea of what “mathematics of diversity” can accomplish? What would be a typical theorem in diversity-theory? What do we learn from diversity theory? How is diversity theory different from just statistics?

These are excellent questions that deserve long answers. I’ll try to restrain myself.

The first thing is that there are deep and difficult questions that have nothing to do with statistics. In fact, my own involvement in the mathematics of diversity is entirely non-statistical. There are very serious statistical challenges in measuring diversity: for example, how can you possibly guess the number of species too rare to show up in your sample? But personally, my interests lie elsewhere.

Earlier this year, I gave a talk about diversity measurement at the (British) National Centre for Statistical Ecology. (I had to come clean and begin by admitting that I knew almost no statistics and almost no ecology.) At some point in the talk I showed a slide saying something like “we assume that our community is fully censused”, causing laughter from the audience. But the point is that even if you know about every last beetle in your community, producing meaningful invariants is still non-trivial.

So, if it’s not statistical, what is it?

Maybe I’ll say more another time, but for now I’ll just point out the information-theoretic connections. Urs, you’ve probably seen enough posts about this to know that there’s a close connection with entropy. An ecological community of $n$ species can be crudely modelled as a probability distribution

$$p = (p_1, \ldots, p_n),$$

where $p_i$ represents the relative abundance (or proportion) of the $i$th species. Ecologists often measure the diversity of the community as the Shannon entropy of the distribution, namely

$$H(p) = -\sum p_i \log(p_i),$$

or (better) as its exponential,

$$e^{H(p)} = p_1^{-p_1} p_2^{-p_2} \cdots p_n^{-p_n}.$$
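
One commonly cited reason for preferring the exponential is that it behaves like a count of species: pooling two equally large communities with no species in common doubles it, whereas the entropy itself only increases by $\log 2$. This is a minimal sketch of that check; the function names and the example community are hypothetical.

```python
import math

def shannon_entropy(p):
    """H(p) = -sum p_i log p_i, skipping absent species."""
    return -sum(x * math.log(x) for x in p if x > 0)

def effective_number(p):
    """e^{H(p)} computed directly as the product of p_i ** (-p_i)."""
    return math.prod(x ** (-x) for x in p if x > 0)

community = [0.5, 0.25, 0.25]

# Two equally large copies of the community with disjoint species lists:
# every relative abundance is halved and the species list is doubled.
pooled = [x / 2 for x in community] * 2

single = effective_number(community)
double = effective_number(pooled)
```

Here `single` agrees with `math.exp(shannon_entropy(community))`, and `double` is twice `single`.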

Then there are various related entropies, such as the Rényi and Tsallis entropies, appearing in the physics, information theory and statistics literature. Many of them have appeared in the ecology literature too.

Are these quantities really relevant ecologically? Well, that’s been the subject of lots of debate. Ultimately what we want is some theorem saying: “any diversity measure satisfying conditions X, Y and Z must be one of the following”. There’s a long tradition of such theorems in information theory, tapping into the theory of functional equations. Some of the conditions involved are clearly well-motivated ecologically, and some are more tenuous.

In my opinion, the most incisive work on ecological diversity measurement rises high above ecology. It becomes about something far more general, something mathematically universal. That’s really why I’m interested, attractive as the ecological applications may be.

Here’s a recent example. Take an ecological community partitioned into $m$ geographical areas or “subcommunities”. Ecologists have long asked: how much of the whole community’s diversity can be attributed to the diversity within the individual subcommunities, and how much to the variation between the subcommunities? You can imagine this might affect decisions on how to allocate resources for conservation.

The average diversity within the subcommunities is traditionally called the $\alpha$-diversity, and the diversity between the subcommunities is called the $\beta$-diversity. Those are loose descriptions only. To turn them into precise quantities is no mean feat, and in fact people did it wrongly for several decades.

It was shown in 2007 (by sometime Café contributor Lou Jost) that, in fact, if you want $\alpha$- and $\beta$-diversity to be independent in an intuitively obvious sense, then there’s only one possible way to define them. It’s essentially a theorem about functional equations, but as far as I know it’s not one in the functional equation literature. And it’s a definitive answer; it’s the canonical way of partitioning diversity.

(I should mention that a similar result had been obtained in the late 1970s, by the Canadian statistician Rick Routledge, unknown to Jost at the time. But either Routledge didn’t realize the significance of his own work, or he was rather too quiet about it.)
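
To make the partition concrete, here is a minimal sketch of the Shannon (order 1) case. It assumes the multiplicative form, in which the diversity of the pooled community ($\gamma$) factors as $\alpha \times \beta$ and $\alpha$ is the exponential of the weighted mean of the subcommunity entropies. That form is an assumption drawn from the literature on partitioning diversity, not something stated in this comment, and the function names and example data are hypothetical.

```python
import math

def exp_shannon(p):
    """Exponential of the Shannon entropy of a distribution p."""
    return math.exp(-sum(x * math.log(x) for x in p if x > 0))

def partition_order_1(subcommunities, weights):
    """Return (alpha, beta, gamma) for the Shannon case.

    subcommunities: list of species distributions, one per area.
    weights: relative sizes of the areas, summing to 1.
    Assumed decomposition: gamma = alpha * beta.
    """
    n = len(subcommunities[0])
    # Species distribution of the whole community.
    pooled = [sum(w * p[i] for w, p in zip(weights, subcommunities)) for i in range(n)]
    gamma = exp_shannon(pooled)
    # Exponential of the weighted average entropy within areas.
    alpha = math.exp(sum(w * math.log(exp_shannon(p)) for w, p in zip(weights, subcommunities)))
    beta = gamma / alpha
    return alpha, beta, gamma

# Two equally large areas sharing no species.
disjoint = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]
# Two equally large areas with identical composition.
identical = [[0.5, 0.5, 0.0, 0.0], [0.5, 0.5, 0.0, 0.0]]

alpha_d, beta_d, gamma_d = partition_order_1(disjoint, [0.5, 0.5])
alpha_i, beta_i, gamma_i = partition_order_1(identical, [0.5, 0.5])
```

In the disjoint case $\beta$ comes out as 2 (two effectively distinct subcommunities), and in the identical case it is 1, while $\alpha$ is 2 in both. In these two examples, at least, $\beta$ changes while $\alpha$ stays fixed.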

I want to say much more, especially about how this links in to the theory of magnitude/Euler characteristic of enriched categories and lax colimits. But I think this comment is long enough already.

Posted by: Tom Leinster on April 30, 2012 3:53 AM | Permalink | Reply to this

Re: The Mathematics of Biodiversity

Thanks for the reply, Tom!

I wrote:

But I am imagining that the mathematics of diversity can somehow give more universal answers?

Your first reaction to this is to say…

I hope not.

…but I gather there is some misunderstanding between us at this point about which hopes are being discussed, because right afterwards you do allude to precisely such more universal answers, when you write, explicitly:

the most incisive work on ecological diversity measurement rises high above ecology. It becomes about something far more general, something mathematically universal.

and before that

Ultimately what we want is some theorem saying: “any diversity measure satisfying conditions X, Y and Z must be one of the following”.

Concerning this last point: how would you describe to an information-theorist the difference between information theory and diversity theory?

(That’s probably the question I should have asked instead of “How is diversity theory different from just statistics?”)

Posted by: Urs Schreiber on May 2, 2012 8:22 PM | Permalink | Reply to this

Re: The Mathematics of Biodiversity

I gather there is some misunderstanding

Ah, yes. I wasn’t clear. When I wrote “I hope not”, I meant that I hoped you weren’t merely imagining that the mathematics of diversity can somehow give more universal answers. I hope it really can. Like, if Elvis walks into the room, you say “is that Elvis, or am I imagining it?”, and I reply, “no, you’re not imagining it!”

Concerning this last point: how would you describe to an information-theorist the difference between information theory and diversity theory?

One difference is that there isn’t really a known thing called “diversity theory” — yet.

Another is in the applications that shape the two areas. There is, of course, a lot of biological literature on diversity measures, with varying degrees of mathematical sensitivity and varying degrees of biological relevance. (I don’t think those two things are in opposition; on the contrary, I think they pull in the same direction. But, of course, people have different backgrounds.) There’s also related work in other fields, especially economics (which I know roughly zero about).

On the other hand, information theory grew out of communication theory, and has been nourished by its interactions with statistical mechanics.

Forgetting the applications and thinking about the pure mathematics of it, I have a certain vision of what “diversity theory” is/could be, though there’s probably almost no one else on earth who sees it the same way. My personal view is that it’s a part of some general story about cardinality-like invariants. It’s the part concerning probability distributions.

I said a lot about how diversity fits into a general story about invariants of size in my two posts on “Entropy, diversity and cardinality”, back in 2008. A bit more recently, a theorem emerged confirming that there’s a substantial connection here. To a zeroth approximation, it says that “magnitude is maximum diversity”: the magnitude of a metric space equals the maximum diversity of a probability distribution on it. That’s not quite right, but it conveys the general flavour.
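
The smallest non-trivial case shows the flavour. The following is a sketch under assumptions taken from the magnitude literature rather than from this comment: a finite metric space has similarity matrix $Z_{ij} = e^{-d(i,j)}$, a weighting is a vector $w$ with $Zw = (1, \ldots, 1)^T$, and the magnitude is the sum of the entries of $w$.

```latex
% Two-point metric space X = \{a, b\} with d(a, b) = d > 0.
Z = \begin{pmatrix} 1 & e^{-d} \\ e^{-d} & 1 \end{pmatrix},
\qquad
Z w = \begin{pmatrix} 1 \\ 1 \end{pmatrix}
\;\Longrightarrow\;
w_a = w_b = \frac{1}{1 + e^{-d}},
\qquad
|X| = w_a + w_b = \frac{2}{1 + e^{-d}}.
```

The magnitude tends to 1 as $d \to 0$ (the two points merge) and to 2 as $d \to \infty$ (the two points are entirely distinct). For the uniform distribution $p = (1/2, 1/2)$ we get $(Zp)_a = (Zp)_b = (1 + e^{-d})/2$, whose reciprocal is the same number $2/(1 + e^{-d})$, consistent with the slogan in this small case.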

Thanks for asking!

Posted by: Tom Leinster on May 3, 2012 6:02 AM | Permalink | Reply to this

Re: The Mathematics of Biodiversity

Hello all,

It seems like there could be a few people here who might be interested in our paper “Hyperconvexity and Tight Span Theory for Diversities” which is available here:

http://arxiv.org/abs/1006.1095

I wrote it a couple of years ago with Paul Tupper (Simon Fraser), and it is slowly creeping through the journal acceptance process.

I’m afraid the term ‘diversity’ might be getting a bit overloaded. For us, a diversity is a pair $(X, \delta)$ where $X$ is a set and $\delta$ is a non-negative function defined on finite subsets satisfying

$$\delta(A) = 0 \iff |A| \leq 1$$

and

if $B \neq \emptyset$ then $\delta(A \cup C) \leq \delta(A \cup B) + \delta(B \cup C)$.

You’ll see that restricting $\delta$ to 2-sets gives the standard metric axioms. We show that this is the ‘natural’ abstraction for phylogenetic diversities (hence the name).
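
The two axioms can be checked by brute force on a small example. This is a minimal sketch, assuming the diameter of a finite subset of a metric space as the candidate $\delta$; the function names and the sample points are hypothetical, and the check covers only this one finite example.

```python
from itertools import combinations

def subsets(points):
    """All subsets of a finite list of points, as frozensets."""
    for r in range(len(points) + 1):
        for combo in combinations(points, r):
            yield frozenset(combo)

def diameter(A, d):
    """Largest distance between two points of A (0 for empty or one-point sets)."""
    return max((d(a, b) for a in A for b in A), default=0)

def is_diversity(points, delta):
    """Check both axioms over every choice of subsets A, B, C of points."""
    all_sets = list(subsets(points))
    # Axiom 1: delta(A) = 0 exactly when A has at most one element.
    for A in all_sets:
        if (delta(A) == 0) != (len(A) <= 1):
            return False
    # Axiom 2: the triangle-type inequality, required whenever B is non-empty.
    for A in all_sets:
        for B in all_sets:
            if not B:
                continue
            for C in all_sets:
                if delta(A | C) > delta(A | B) + delta(B | C):
                    return False
    return True

def dist(a, b):
    return abs(a - b)

def squared_dist(a, b):
    return (a - b) ** 2

line_points = [0, 1, 3, 7]
diameter_ok = is_diversity(line_points, lambda A: diameter(A, dist))

# Squared distance is not a metric, so the second axiom fails on {0}, {1}, {2}.
squared_ok = is_diversity([0, 1, 2], lambda A: diameter(A, squared_dist))
```

On two-element sets the diameter is just the distance between the two points, which is the restriction to 2-sets mentioned above.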

The main contribution, however, is that the useful (and rather beautiful) theory of injective hulls for metric spaces generalises quite naturally to diversities. In phylogenetics (my ‘home turf’) the injective hull has led to all sorts of methods for analysing and visualising evolutionary data. Originally, the idea came from work trying to extend the Hahn-Banach theory to arbitrary metric spaces.

Incidentally, the injective hull is exactly the injective hull of category theory (and we introduce the category theory of diversities).

We balance up a fairly large chunk of theory with some applications and links to other areas.

Sorry to blow my (our) own trumpet, but there had been questions about combining diversity and category theory…..

cheers,

David.

Posted by: David Bryant on May 8, 2012 4:41 AM | Permalink | Reply to this
