## August 18, 2013

### The Shannon Capacity of a Graph, 2

#### Posted by Tom Leinster

Graph theorists have many measures of how ‘big’ a graph is: order, size, diameter, radius, circumference, girth, chromatic number, and so on. Last time I told you about two of them: independence number and Shannon capacity. This time I’ll bring in two other measures of size that I’ve written about previously in contexts other than graph theory: maximum diversity and magnitude.

The post will end not with a bang, but a whimper. Although there’s something simple and clean to say about the connection between maximum diversity and standard graph invariants, the relationship with magnitude is much more mysterious, and I don’t have much more to say than ‘what’s going on?’. So there will be no satisfying conclusion. Consider yourself warned!

A crash course in diversity measurement  Imagine an ecosystem consisting of a number of species. We know what proportions the various species are present in, and we know how similar the species are to one another. How can we quantify, in a single number, the ‘diversity’ of the ecosystem?

Over the years, many people have answered this question in many different ways. The answer I’m about to explain is the one given in a paper of mine with Christina Cobbold, called Measuring diversity: the importance of species similarity. I also posted about it here and on John’s blog. But I won’t assume you’ve read any of that.

Let’s say we have $n$ species, in respective proportions $p_1, \ldots, p_n$. Thus, $p = (p_1, \ldots, p_n)$ is a probability vector: a finite sequence of nonnegative reals summing to $1$. And let’s say that the similarity of one species to another is measured on a scale of $0$ to $1$, with $0$ meaning ‘completely different’ and $1$ meaning ‘identical’. These similarities are the entries of an $n \times n$ matrix $Z$, the similarity matrix, which I’ll assume is symmetric and has $1$s down the diagonal (because any species is identical to itself).

The expected similarity between an individual from the $i$th species and a randomly-chosen individual is $\sum_j Z_{i j} p_j$. This measures how ‘ordinary’ the $i$th species is. We can express this formula more neatly if we regard $p$ as a column vector: then the ordinariness of the $i$th species is simply $(Z p)_i$, the $i$th entry of the column vector $Z p$.

A community in which most species are very ordinary is not very diverse; conversely, a community in which most species are highly distinctive is very diverse. So the average ordinariness, across all species, measures the ecosystem’s lack of diversity. The average ordinariness is

$\sum_i p_i (Z p)_i.$

To get a measure of diversity itself, we take the reciprocal:

$D_2^Z(p) := 1\Big/ \sum_i p_i (Z p)_i.$

What’s that $2$ doing on the left-hand side? Well, there’s a certain $2$-ness to the right-hand side, since the denominator is the quadratic form $p^{\text{T}} Z p$. And in fact, this diversity measure is just one of a one-parameter family of measures.

The other members of the family come about by varying our interpretation of ‘average’ in the phrase ‘average ordinariness’. Just now, we interpreted it as ‘arithmetic mean’. But the arithmetic mean is only one member of a larger family: the power means. Let $t \in [-\infty, \infty]$. The power mean of order $t$ of positive reals $x_1, \ldots, x_n$, weighted by a probability vector $p = (p_1, \ldots, p_n)$, is

$M_t(p, x) := \Bigl( \sum_{i: p_i\gt 0} p_i x_i^t \Bigr)^{1/t}.$

In other words: transform each $x_i$ by taking the $t$th power, form the arithmetic mean, then apply the inverse transformation. Clearly the formula doesn’t make sense for $t = 0$ and $t = \pm \infty$, but we can fill in those gaps by taking the limit as $t \to 0$ or $t \to \pm \infty$. This gives

$M_0(p, x) = \prod_i x_i^{p_i}, \qquad M_\infty(p, x) = \max_{i: p_i \gt 0} x_i, \qquad M_{-\infty}(p, x) = \min_{i: p_i \gt 0} x_i.$

In this notation, the formula for diversity given above is

$D_2^Z(p) = 1/M_1(p, Z p).$

Generally, for any $q \in [0, \infty]$, the diversity of order $q$ of the ecosystem is

$D_q^Z(p) = 1/M_{q - 1}(p, Z p).$

This turns out to be a quantity between $1$ and $n$, and can be thought of as the ‘effective number of species’. It takes its minimum value of $1$ when only one species is actually present, that is, $p_i = 1$ for some $i$. It takes its maximum value of $n$ when the $n$ species are all completely dissimilar (that is, $Z$ is the identity matrix) and they’re present in equal proportions.

Different values of $q$ attach different amounts of significance to rare species, in a way that I won’t explain. But it’s important to recognize that the diversities of different orders really do measure different things. For example, it often happens that ecosystem A has greater diversity of order $1$ than ecosystem B, but less diversity of order $2$.

If you know some information theory, you can recognize among these diversity measures some classical measures of entropy. For example, when $Z = I$ (different species are deemed to have nothing whatsoever in common), we get the exponential of Rényi entropy:

$D_q^I(p) = \Bigl( \sum_i p_i^q \Bigr)^{1/(1 - q)}.$

The even more special case $Z = I$, $q = 1$ gives the exponential of Shannon entropy.

Maximum diversity  Now suppose we have a list of $n$ species of known similarities, and also suppose that we have control over their proportions or frequencies. In other words, we’ve fixed $n$ and $Z$ and we can vary $p$. How can we choose $p$ to maximize the diversity? And what is the value of that maximum diversity?

Well, these questions don’t entirely make sense. Remember that we have not just one diversity measure, but a one-parameter family of them. Perhaps the $p$ that maximizes diversity of order $1$ comes nowhere near maximizing diversity of order $2$. And perhaps $\sup_p D_1^Z(p)$, the maximum diversity of order $1$, is different from $\sup_p D_2^Z(p)$. After all, different values of $q \in [0, \infty]$ give genuinely different measures.

So it comes as a surprise that, in fact, there’s a ‘best of all possible worlds’: a probability vector that maximizes diversity in all these uncountably many senses at once. And it’s a double surprise: all these different diversity measures have the same maximum value. In other words:

Theorem   Let $n \geq 1$ and let $Z$ be an $n \times n$ similarity matrix. Then:

• There is a probability vector $p_{max}$ such that for all $q \in [0, \infty]$, we have $D_q^Z(p_{max}) = \sup_p D_q^Z(p)$;

• $\sup_p D_q^Z(p)$ is independent of $q \in [0, \infty]$.

In particular, associated to any similarity matrix $Z$ is a real number, its maximum diversity, defined as

$D_{max}(Z) = \sup_p D_q^Z(p)$

for any $q \in [0, \infty]$. Which $q$ makes no difference: that’s the second part of the theorem.

How graphs come in  Now consider the special case where all inter-species similarities are either $0$ or $1$: each pair of species is deemed to be either totally dissimilar or totally identical. The matrix $Z$ of similarities is then simply a symmetric matrix of $0$s and $1$s, with $1$s down the diagonal.

A matrix with these properties is the same thing as a finite, reflexive, undirected graph — which in these posts I’m just referring to as graphs. That’s because a graph $G$ can be seen in terms of its adjacency matrix $A_G$.

So, given a graph $G$, we can ask: what is the maximum diversity of the matrix $A_G$? The official notation for this is $D_{max}(A_G)$, but I’ll write it as $D_{max}(G)$ instead, and call it the maximum diversity of $G$. What is it, explicitly?

It turns out that it’s exactly the independence number of $G$. In Part 1 I mentioned that a set of vertices in a graph is said to be independent if no two elements have an edge between them, and that the independence number $\alpha(G)$ is the largest cardinality of an independent set in $G$. So, the result is that

$D_{max}(G) = \alpha(G)$

for all graphs $G$. Or in words:

Maximum diversity is independence number.

There are uncountably many proofs. Because $D_{max}(G)$ is equal to $\sup_p D_q^{A_G}(p)$ for any $q \in [0, \infty]$, you can choose whichever $q$ is most convenient and maximize the diversity of that order. If you take $q = \infty$, it becomes pretty simple — I reckon that’s the easiest proof. (In Example 4.2 of my paper, I did it with $q = 2$, but that was probably less efficient.)

Back to Shannon capacity  Last time, I gave you the definition of the Shannon capacity of a graph $G$: it’s

$\Theta(G) = \sup_{k \geq 1} \alpha\bigl(G^{\boxtimes k}\bigr)^{1/k} = \lim_{k \to \infty} \alpha\bigl(G^{\boxtimes k}\bigr)^{1/k},$

where $\boxtimes$ is the ‘strong product’ of graphs: the product in the category of (reflexive, as always) graphs. If we want, we can write the definition of $\Theta$ with $D_{max}$ in place of $\alpha$, since they’re the same thing.

I mentioned that computing Shannon capacity is really hard: no one even knows the Shannon capacity of the 7-cycle! So people have put a lot of effort into finding quantities that are similar to $\Theta$, but easier to compute.

For example, suppose we can find a tight but easy-to-compute upper bound on $\Theta(G)$. We already have plenty of lower bounds, namely, $\alpha\bigl(G^{\boxtimes k}\bigr)^{1/k}$ for any $k$ we like. So we’d then have sandwiched $\Theta(G)$ between two easy-to-compute bounds. For some graphs, the upper and lower bounds might even be equal, giving us the exact value of $\Theta(G)$.

The best-known substitute for the Shannon capacity is called the the Lovász number, $\vartheta(G)$. I confess, I don’t have any feel for the definition (which we won’t need here), but here are some of its good properties:

• It’s much more efficient to compute than the Shannon capacity.

• It’s an upper bound: $\Theta(G) \leq \vartheta(G)$. (Yup, the quantity denoted by a small letter is bigger than the quantity denoted by a big letter.)

• It’s multiplicative: $\vartheta(G \boxtimes H) = \vartheta(G)\cdot\vartheta(H)$. This formula helps us to compute the Lovász number of any graph of the form $G \boxtimes H$. Neither the independence number $\alpha$ nor the Shannon capacity $\Theta$ is multiplicative.

As Omar Antolín Camarena pointed out last time, even the Shannon capacity of the 5-cycle $C_5$ was unknown until Lovász defined his number. Here’s how Lovász did it. Having proved that $\vartheta$ is an upper bound for $\Theta$, he then showed that $\vartheta(C_5) = \sqrt{5}$. But

$\alpha\bigl( C_5^{\boxtimes 2} \bigr)^{1/2} \leq \Theta(C_5) \leq \vartheta(C_5) = \sqrt{5},$

and we calculated last time that $\alpha\bigl( C_5^{\boxtimes 2} \bigr)^{1/2} = \sqrt{5}$. Hence $\Theta(C_5)$ can only be $\sqrt{5}$.

(There’s at least one other useful upper bound for $\vartheta(G)$, defined by Haemers and pointed out by Tobias last time, but I won’t say anything about it.)

Maximum diversity and magnitude  We do know one nice quantity to which maximum diversity is closely related, and that’s magnitude.

I’ve written a lot about magnitude of metric spaces and other structures, but at its most concrete, magnitude is simply an invariant of matrices. A weighting on a matrix $A$ is a column vector $w$ such that each entry of the column vector $A w$ is $1$, and it’s an easy lemma that if both $A$ and its transpose have at least one weighting then $\sum_i w_i$ is independent of the weighting $w$ chosen. This quantity $\sum_i w_i$ is the magnitude $\left|A\right|$ of $A$. If $A$ happens to be invertible then it has a unique weighting, and $\left|A\right|$ is simply the sum of all the entries of $A^{-1}$.

As a trivial example, if $A$ is the $n \times n$ identity matrix then $\left|A\right| = n$.

There are a couple of possible meanings for ‘the magnitude of a graph’, because there are (at least!) a couple of ways of turning a graph into a matrix. Not so long ago, I wrote about one of them, in which the matrix encodes the shortest-path metric on the graph. But today, I’ll do something different: when I say ‘the magnitude of $G$’, I’ll mean the magnitude of the adjacency matrix of $G$. That is, by definition,

$\left|G\right| := \left|A_G\right| \in \mathbb{Q}$

for today. (I don’t know whether $\left|G\right|$ is always defined, that is, whether $A_G$ always admits a weighting.)

It’s easy to show that magnitude has one of the good properties satisfied by the Lovász number but not the Shannon capacity: it’s multiplicative. That is,

$\left| G \boxtimes H\right| = \left| G\right| \cdot \left| H \right|$

for all graphs $G$ and $H$. So if we were to replace maximum diversity by magnitude in the definition

$\Theta(G) = \sup_{k \geq 1} D_{max}\bigl(G^{\boxtimes k})^{1/k}$

of Shannon capacity, we’d just end up with magnitude again.

How is maximum diversity related to magnitude? The first fact, which I won’t prove here, is as follows: given a similarity matrix $Z$,

the maximum diversity of $Z$ is the magnitude of some principal submatrix of $Z$.

(Recall that a ‘principal submatrix’ of an $n \times n$ matrix is one obtained by choosing a subset $S$ of $\{1, \ldots, n\}$, then selecting just those rows and columns indexed by $S$.) In particular, given a graph $G$,

the maximum diversity of $G$ is the magnitude of some induced subgraph of $G$.

(Recall that an ‘induced subgraph’ of a graph is one consisting of a subset of the vertices and all edges between them.) We can take that subgraph to be any independent set of maximal size: for this subgraph has $\alpha(G)$ vertices and no nontrivial edges, so by the ‘trivial example’ above, its magnitude is $\alpha(G) = D_{max}(G)$.

Digression  If we had to compute the maximum diversity of a similarity matrix, how could we do it? For the present post I don’t think it matters, but the answer’s short enough that I might as well tell you. Given an $n\times n$ similarity matrix $Z$, let’s say that $S \subseteq \{1, \ldots, n\}$ is ‘good’ if the resulting $S \times S$ submatrix $Z_S$ admits at least one weighting all of whose entries are nonnegative. Then

$D_{max}(Z) = \max \{ \left|Z_S\right| : S \text{ is good} \}.$

This gives an algorithm for computing maximum diversity. It’s slow for large $n$, since we have to run through all $2^n$ subsets of $\{1, \ldots, n\}$, but at least each step is fast.

In some significant cases, maximum diversity is equal to magnitude. For example, suppose we decide to give two species a similarity of $1$ if they’re the same, $1/2$ if they’re different but belong to the same genus, $1/4$ if they’re in different genera but the same family, and so on up the taxonomic hierarchy. Then $D_{max}(Z) = \left|Z\right|$. The numbers $1, 1/2, 1/4, \ldots$ are unimportant here; any other descending sequence would do. In general, given any finite ultrametric space — one satisfying the strengthened triangle inequality

$d(x, z) \leq \max \{ d(x, y), d(y, z) \}$

— the matrix $Z$ with entries $Z_{x y} = e^{-d(x, y)}$ has maximum diversity equal to its magnitude.

Let’s have an example: once more, the 5-cycle $C_5$.

• A weighting on the adjacency matrix $A_G$ of a graph $G$ is a way of assigning to each vertex $x$ a real number $w(x)$, such that $\sum_{y: y \text{ adjacent to } x} w(y) = 1$ for each vertex $x$. There’s a unique weighting on $C_5$, giving weight $1/3$ to each vertex. The magnitude is the sum of the weights, so $\left|C_5\right| = 5/3 = 1.66\ldots$.

• The maximum diversity is the independence number: $D_{max}(C_5) = \alpha(C_5) = 2$.

• The Shannon capacity is $\Theta(C_5) = \sqrt{5} = 2.23\ldots$.

• The Lovász number is the same: $\vartheta(C_5) = \sqrt{5} = 2.23\ldots$.

It’s immediately clear that unlike the Lovász number, magnitude is not a helpful upper bound for the Shannon capacity — it’s not an upper bound at all. Indeed, it’s not even an upper bound for the independence number.

So for $G = C_5$,

$\left|G\right| \leq D_{max}(G) = \alpha(G) \leq \Theta(G) \leq \vartheta(G).$

The equality and the second and third inequalities are true for all $G$. I don’t know whether the first inequality is true in general— that is, whether the magnitude of a graph is always less than or equal to its maximum diversity. Or if not, is it at least true that $\left|G\right| \leq \Theta(G)$ for all $G$? Again, I don’t know.

Nor do I know where to go from here. Most of my understanding of magnitude and maximum diversity comes from the context of metric spaces. (For a finite metric space $X$, one uses the similarity matrix $Z$ given by $Z_{x y} = e^{-d(x, y)}$.) For subspaces of Euclidean space, maximum diversity is always less than or equal to magnitude — which is the opposite way round from what we just saw. That’s because in Euclidean space, passing to a subspace always decreases the magnitude, and by the principle I mentioned earlier, the maximum diversity of a space is equal to the magnitude of some subspace.

Mark Meckes has done a great deal to illuminate the concept of maximum diversity and its relationship to magnitude, putting it into a satisfying analytical setting, not just for finite spaces but also for compact subspaces of Euclidean space (and more generally, so-called spaces of negative type). You can read about this in Positive definite metric spaces and a new paper — Edit: out now! Magnitude, diversity, capacities, and dimensions of metric spaces. But all that’s in a geometrical context quite unlike the graph-theoretic world.

I don’t know whether these ideas lead anywhere interesting. The root of the problem is that I don’t understand much about the magnitude of (the adjacency matrix of) a graph. If you have any insights or observations, I’d very much like to hear them.

Posted at August 18, 2013 1:52 PM UTC

TrackBack URL for this Entry:   http://golem.ph.utexas.edu/cgi-bin/MT-3.0/dxy-tb.fcgi/2647

### Calculating the independence number

Although it doesn’t matter for this post, computing the independence number for an arbitrary “real world” graph in practice is a very interesting problem if you like that kind of thing. It also involves these inequalities, so if you’ll excuse the digression…

Most of the literature on the computing side discusses finding a maximum clique rather than a maximum independent set, so I’ll stick to that to reduce my chances of getting an inequality backwards. We denote this $\omega$, because it’s the “opposite” of $\alpha$.

The best known way of finding a maximum clique in an arbitrary graph $G$ goes something like this: let $v$ be some vertex. Then any clique in $G$ either does not contain $v$, or contains only $v$ and possibly some more vertices adjacent to $v$. We use this to grow a clique, which is conventionally called $C$. Initially $C$ is empty. We also have a set of vertices which could potentially be added to be added to $C$. Conventionally this set is called $P$, and initially it contains every vertex in the graph. Now, while $P$ isn’t empty, we pick a vertex $v$ from $P$. Then we recursively solve the problem with $v$ added to $C$ and $P$ filtered to include only vertices adjacent to $v$ (and hence to everything else in $C$), and then with $v$ in neither $C$ nor $P$.

Here’s where the inequalities come in. We keep track of the biggest clique found so far, which is usually called the incumbent. At each step in the search, we want a quick way to know whether it’s worth continuing, i.e. whether we might be able to find a clique big enough to unseat the incumbent. One way of doing that is to look at $|C| + |P|$. But $|P|$ isn’t a very good bound.

The best known algorithms right now use a graph colouring as a bound instead: if we can colour the vertices of $P$ using $k$ colours, $P$ cannot contain a clique with greater than k vertices, so if $|C| + k$ isn’t larger than the size of the incumbent then search at the current point can be abandoned. Calculating the minimum colour number $\chi$ of a graph is also hard, though, but a greedy colouring $\chi^*$ can be computed very quickly.

There are other considerations (which vertex to select from $P$, which greedy colouring to use, some clever tricks involving bitset encodings, a way of reducing the number of colourings that need to be done, parallelisation, discrepancy based search, …) which aren’t of interest here. But the use of a greedy colouring as a bound works because

$\omega \le \chi \le \chi^* \le |P| .$

The Lovász theta function (of the complement graph) is interesting from an algorithmic perspective because

$\omega \le \overline{\theta} \le \chi \le \chi^* \le |P| .$

Also, unlike $\chi$, $\overline{\theta}$ can theoretically be computed to within $\epsilon$ in polynomial time. This would suggest that using a nearest-integer approximation of $\overline{\theta}$ as a bound would be better than a greedy colouring. Unfortunately calculating the theta function for an arbitrary graph in practice is an open problem…

It doesn’t help that all of these approximations can be arbitrarily bad, and if $P \ne NP$ then there’s no good polynomial time approximation for either $\omega$ or $\chi$ at all.

Still, it’s not all bad news: the smallest graph I’ve found where it’s not possible to compute the clique number exactly has 500 vertices and was designed specifically to screw up these kinds of algorithms. For reasonably sparse graphs we can take it to millions of vertices. So it’s not as horrific in practice as some hard graph problems.

Posted by: Ciaran McCreesh on August 18, 2013 8:48 PM | Permalink | Reply to this

### Re: Calculating the independence number

Thanks for all that, Ciaran.

Although it doesn’t matter for this post, computing the independence number for an arbitrary “real world” graph in practice is a very interesting problem if you like that kind of thing.

Actually, I’m more interested in that than you might guess. In my post, I pointed out in passing that if one uses the most obvious algorithm for calculating the maximum diversity of an $n \times n$ matrix, it takes $2^n$ steps. Obviously, it would be good to find a faster way.

Since calculating the independence number is a special case of calculating the maximum diversity (namely, the case where the matrix consists of just 0s and 1s), knowledge about the former problem is likely to be useful in attacking the latter problem. And fortunately, as your post makes clear, the problem of calculating the independence number is well studied.

(For anyone reading who doesn’t see the connection between the clique number and the independence number, here it is. Every graph $G$ has a complement $\overline{G}$: the vertices are the same, but two vertices are adjacent in $\overline{G}$ exactly when they’re not adjacent in $G$. A clique in a graph is a set of vertices each of which is adjacent to each other, and the clique number $\omega(H)$ of a graph $H$ is the largest cardinality of a clique in it. It’s immediate from the definition that $\alpha(G) = \omega(\overline{G})$: the independence number of a graph is the clique number of its complement.)

Christina Cobbold implemented the maximum diversity algorithm without any attempt at optimizing either the code or the algorithm itself, and on bog-standard hardware. I think it was done in Matlab. For 20-something species, it took only a couple of minutes to calculate the maximum diversity: for while 2 to the power of 20-something is large, each step is fast. Since no attempt at optimization was made, I’m sure this could be improved — and maybe you know how to.

But inevitably, there’s a limit to how much it can be improved while the algorithm still takes an exponential number of steps. That’s why I’m interested in finding a better algorithm. Can you see one?

Posted by: Tom Leinster on August 19, 2013 2:50 PM | Permalink | Reply to this

### Re: Calculating the independence number

The bad news is, in the worst case, it’s likely that there is no polynomial time algorithm for the problem. Here’s why: independence number is NP-hard, so it has a polynomial time algorithm iff $P = NP$. But we don’t know of a polynomial time algorithm for any NP-hard problem, and it’s generally believed that $P \ne NP$. Since a polynomial algorithm for maximum diversity could compute independence number in polynomial time, maximum diversity is also at least NP-hard.

There’s a small chance there’s a loophole. It could be that all interesting similarity matrices have some special extra property which simplifies the problem. For example, independence number is easy if you know that a graph consists of the disjoint union of some number of cliques: you construct a maximum independent set by picking a vertex, eliminating any adjacent vertices, and repeating until there’s nothing left.

If there’s not a loophole, the good news is that for some algorithms with exponential worst case behaviour, the worst case rarely happens, so wanting to compute maximum diversity for large matrices isn’t necessarily a lost cause.

The best algorithms for computing independence number in practice are branch and bound algorithms. The state of the art begins with three algorithms due to Tomita: 1, 2, 3. Then there’s a clever bitset encoding due to San Segundo: 1, 2. Prosser did a comparison of all of these, and published code. Finally, if you’ve got some really tricky graphs and access to expensive hardware, I’ve got code and a paper under review for a parallel version of these algorithms. It’s worth noting that these all use (the complement of) a greedy colouring as a bound, rather than the (complement of the) Lovász theta function.

As to whether these can be adapted to maximum diversity: maybe… San Segundo’s bitset encodings definitely can’t be, but branch and bound is worth considering. The key properties of independence number that make branch and bound work are:

• There’s a link between independence number and independent sets. Suppose I claim that a graph has independence number at least $a$. Then I can always prove it to you by showing you an independent set with $a$ elements, and you can verify this proof quickly. (Incidentally, if I claim a graph has independence number less than $a$, there may sometimes not be a short proof of this fact if certain widely held assumptions about computational complexity are correct.) We need this because branch and bound assumes there’s a well-behaved decision problem (does graph $G$ contain an independent set of size $a$?) associated with the optimisation problem (what is the largest independent set of $G$?).

• I can incrementally construct every independent set, and given an independent set I can easily tell you what the independence number would be if such an independent set is maximum (it’s just the cardinality of the independent set).

• If I know the independence number of an induced subgraph, adding in an extra vertex from the original graph cannot decrease the independence number.

• Suppose I have a set of independent vertices. Then I can easily select which of the remaining vertices in the graph could individually be added to this set (vertices which are not individually adjacent to anything in the set), and I can get a usually-good-in-practice upper bound on how much this could help.

• The independence number of a graph doesn’t change if the vertices are permuted. (Not strictly necessary, but very convenient.)

If these properties, or something similar, also hold for maximum diversity then it’s likely there’s an algorithm which won’t behave exponentially in practice for most cases. As a first guess: is what you call a “good” subset of $\{1, \ldots, n\}$ in some ways like being an independent set? In other words, are all subsets of a “good” subset also “good”? And does the magnitude of a “good” subset behave like the cardinality of an independent set?

Posted by: Ciaran McCreesh on August 20, 2013 6:23 PM | Permalink | Reply to this

### Re: The Shannon Capacity of a Graph, 2

Firstly, when you’re using the adjacency matrix for your similarities, the notion of being totally identical is not transitive. So $A$ and $B$ can be totally identical (i.e. adjacent in some graph) and $B$ and $C$ can be totally identical, without $A$ and $C$ being totally identical, right?

Secondly, I seem to remember some slogan about Euler characteristic being preserved by left adjoint functors but not by right adjoint functors. So the Euler characteristic of a finite set, considered as a discrete topological space is its cardinality, but the cardinality of a topological space considered as a set is not equal to its Euler characteristic. Free functors are generally (by definition?) left adjoints. The usual metric space structure on a combinatorial graph can be thought of as the free $[0,\infty]$-category on the underlying $[0,\infty]$-graph(*). The underlying $[0,\infty]$-graph is basically the adjacency matrix. I realise this could be random word association, but is there any obvious relation between the magnitude of the adjacency matrix and the magnitude of the metric space?

(*) A $[0,\infty]$-graph is a set $X$ with a function $X\times X\to [0,\infty]$.

Posted by: Simon Willerton on August 20, 2013 9:22 AM | Permalink | Reply to this

### Re: The Shannon Capacity of a Graph, 2

Firstly, when you’re using the adjacency matrix for your similarities, the notion of being totally identical is not transitive. So $A$ and $B$ can be totally identical (i.e. adjacent in some graph) and $B$ and $C$ can be totally identical, without $A$ and $C$ being totally identical, right?

Thanks — I was going to say something about that in my post, but decided against in order not to strain the reader’s attention too much. So I was secretly hoping that someone might bring it up.

Yes, right, being totally identical isn’t transitive. This isn’t as implausible as it sounds.

For instance, one of the many conflicting definitions of “species” is this: two individuals are of the same species if they can interbreed. (I guess we’re talking about sexual organisms here, and some caveats need to be added in order to avoid stupidities about infertile individuals or individuals of the same sex.) There’s no obvious reason why this should be an equivalence relation: perhaps gruffaloes can breed with buffaloes, and buffaloes with bison, but not gruffaloes with bison. In principle, you could have a ring of interbreeding organisms encircling the globe, with “holonomy”, if you see what I mean. Perhaps there are genetic reasons why this wouldn’t happen, but it doesn’t seem impossible from the definition alone.

Perhaps more convincingly, suppose we’re looking at microbial communities, where (as you know) one often has nowhere near a complete taxonomic classification of all the microbes present, and anyway it’s practically impossible to examine the microbes one by one. Instead, you extract their DNA and look at the variation. One might declare two individual microbes to be of the same species if they have a genetic similarity of $\gt 99$%, say. And this isn’t an equivalence relation.

(Microbial biologists are well aware of this, and have a bunch of tricks to make it an equivalence relation. This leads to the notion of an operational taxonomic unit.)

Secondly, I seem to remember some slogan about Euler characteristic being preserved by left adjoint functors but not by right adjoint functors […]

Interesting question. I don’t know, but it seems to have a lot to do with David’s ideas here.

My first thought is that passing from the adjacency matrix $A_G$ to the similarity matrix $Z_G$ isn’t quite a free-type construction, because the metric doesn’t tell you how many paths of a given length there are between a given two points — just the length of the shortest one.

That seems to suggest that David’s $P_G$ is the thing to consider. The formula for it, $P_G(q) = \sum_{d = 0}^\infty A_G^d q^d$, even looks like the formula for a free monoid. Its magnitude at $q = 1$ is the Euler characteristic of the graph $G$, if we interpret the undirected graph $G$ as a directed graph with each undirected edge replaced by $\stackrel{\longrightarrow}{\leftarrow}$. (That’s very close to the little lemma in “The Euler characteristic of a category” saying that the EC of the free category on a directed graph is equal to the EC of that graph.)

On the other hand, $A_G q$ and $P_G(q)$ simply don’t have the same magnitude.

Posted by: Tom Leinster on August 20, 2013 11:21 AM | Permalink | Reply to this

### Re: The Shannon Capacity of a Graph, 2

In principle, you could have a ring of interbreeding organisms encircling the globe, with “holonomy”, if you see what I mean. Perhaps there are genetic reasons why this wouldn’t happen, but it doesn’t seem impossible from the definition alone.

This happens in a few ring species of birds that live around geographical barriers like the North Pole, Himalayas, or the Sierra Nevada.

Posted by: Stuart Presnell on August 22, 2013 8:17 AM | Permalink | Reply to this

### Re: The Shannon Capacity of a Graph, 2

Thanks! From the Wikipedia article you link to:

Formally, the issue is that interfertility (ability to interbreed) is not a transitive relation

That’s exactly what I meant. The article also include an interesting quote from Richard Dawkins: ring species “are only showing us in the spatial dimension something that must always happen in the time dimension.”

Posted by: Tom Leinster on August 22, 2013 12:03 PM | Permalink | Reply to this

### Re: The Shannon Capacity of a Graph, 2

The real world, as always, is even more complicated, because “ability to interbreed” is not $\{0,1\}$-valued. My understanding of ring species and the like is that, roughly, the farther apart two (fertile, opposite-sex) individuals live, the less successfully they can interbreed. For example, for animals in which the female lays large numbers of eggs which are then fertilized by the male, the number of eggs which are successfully fertilized decreases, such that near neighbors have a very high fertilization rate, but widely separated individuals, should they somehow happen to mate, have a fertilization rate of 0.

Posted by: Mark Meckes on August 23, 2013 4:23 PM | Permalink | Reply to this

### Re: The Shannon Capacity of a Graph, 2

Hmm, the second half of my last comment was over-hasty: I wasn’t distinguishing properly between free categories and free $[0, \infty]$-categories.

Here’s a basic question. Consider finite graphs, directed or not as you prefer. What’s the simplest way of associating a matrix $M_G$ to each graph $G$, in such a way that the magnitude of $M_G$ is always equal to the Euler characteristic (vertices minus edges) of $G$?

We already know one way, at least for directed graphs: take $M_G$ to be the matrix of cardinalities of hom-sets of the free category on $G$. But I’m wondering whether there’s some simpler way of doing it, not involving a free construction.

Posted by: Tom Leinster on August 20, 2013 4:30 PM | Permalink | Reply to this

### Re: The Shannon Capacity of a Graph, 2

If “Euler characteristic” means the trace of the identity map of a dualizable object in a symmetric monoidal category, then it is preserved by any strong monoidal functor. For some reason, though, left adjoints seem to have more of a tendency to be strong monoidal than right ones: strong/lax doctrinal adjunctions seem more common than oplax/strong ones.

Posted by: Mike Shulman on August 21, 2013 5:15 PM | Permalink | Reply to this

### Re: The Shannon Capacity of a Graph, 2

Here’s a rather trivial observation, which may or may not be useful. The set of $n \times n$ similarity matrices is a compact convex set, and the adjacency matrices of graphs (in the sense of this post) with $n$ vertices are precisely its extreme points.

Posted by: Mark Meckes on August 24, 2013 10:45 AM | Permalink | Reply to this

### Re: The Shannon Capacity of a Graph, 2

Interesting point.

Posted by: Tom Leinster on August 24, 2013 1:32 PM | Permalink | Reply to this

### Re: The Shannon Capacity of a Graph, 2

In my post, I said Mark had been working a lot on the relationship between magnitude and maximum diversity. He’s just posted a new paper about this (and several other things!), illuminating that relationship in the context of suitable compact metric spaces.

Mark Meckes, Magnitude, diversity, capacities, and dimensions of metric spaces. arXiv:1308.5407, 2013.

Posted by: Tom Leinster on August 27, 2013 2:11 PM | Permalink | Reply to this

### Re: The Shannon Capacity of a Graph, 2

I don’t know what you have learned about this topic over the last two years, but I have a connection between the magnitude of the adjacency matrix of a graph and the independence number.

A set of vertices in a graph is dominating if every vertex in the graph is either a member of the set or adjacent to a member. Any maximal independent set is a dominating set, so the minimum size of a dominating set in a graph is a lower bound for the maximum size of an indepedent set. The size of the smallest dominating set can be expressed as an integer linear program: $\min_{x \in \mathbb{Z}^n} \mathbf{1}^T x$ such that $x \geq \mathbf{0}$ and $Ax \geq \mathbf{1}$. The fractional dominating set number is the fractional relaxation of that program: $\min_{x \in \mathbb{R}^n} \mathbf{1}^T x$ such that $x \geq \mathbf{0}$ and $Ax \geq \mathbf{1}$. It is also equal to the value of the dual program $\max_{x \in \mathbb{R}^n} \mathbf{1}^T x$ such that $x \geq \mathbf{0}$ and $Ax \leq \mathbf{1}$. If there is some vector $w \geq \mathbf{0}$ such that $Aw = \mathbf{1}$, then $w$ is feasible in both the primal and dual linear programs and $\mathbf{1}^T w$ is the fractional dominating set number. For $d$-regular graphs, $A\mathbf{1} = (d+1)\mathbf{1}$ so both the fractional dominating set number and the magnitude are equal to $1/(d+1)$.

The magnitude can differ from the fractional domination number. Consider the following seven vertex tree. The central vertex has degree two. Both of its neighbors have degree three. The final four vertices are leaves. The vector that is equal to $-1/3$ on the central vertex, $2/3$ on its neighbors, and $1/3$ on the leaves is a weighting so the magnitude is $7/3$. The dominating set number and fractional dominating set number are both $2$.

Posted by: Daniel Cullina on April 14, 2015 8:16 PM | Permalink | Reply to this

### Re: The Shannon Capacity of a Graph, 2

It is also equal to the value of the dual program $\max_{x \in \mathbb{R}^n} \mathbf{1}^T x$ such that $x \geq \mathbf{0}$ and $A x \leq \mathbf{1}$.

Because I’ve never done a course on linear programming, I didn’t know this, but Wikipedia is telling me that this is an instance of “strong duality” — and that in principle, the two occurrences of $\mathbf{1}$ in the original problem have switched positions on passing to the dual problem.

For $d$-regular graphs, $A\mathbf{1} = (d+1)\mathbf{1}$ so both the fractional dominating set number and the magnitude are equal to $1/(d+1)$.

That must be a typo for $n/(d + 1)$, where $n$ is the number of vertices.

Here’s a picture of your example:

Posted by: Tom Leinster on April 15, 2015 6:03 PM | Permalink | Reply to this

### Re: The Shannon Capacity of a Graph, 2

Yes, I meant to write $n/(d+1)$.

Not only have the two occurrences of $\mathbf{1}$ switched places, one program uses $A$ and the other uses $A^T$. In general, the primal and dual polytopes live in different spaces and the $A$ matrix is not square, let alone symmetric. Fractional dominating set has an unusually symmetric linear program.

Posted by: Daniel Cullina on April 15, 2015 6:53 PM | Permalink | Reply to this

Post a New Comment