Random Permutations (Part 5)
Posted by John Baez
To go further in understanding the properties of random permutations, we need to use ‘bivariate’ generating functions — that is, generating functions involving 2 variables. Here’s a good introduction to these:
- Phillipe Flajolet and Robert Sedgewick, Analytic Combinatorics, Part III: Combinatorial Parameters and Multivariate Generating Functions.
I will try to explain these in a way that takes advantage of Joyal’s work on species. Then I’ll use them to tackle this puzzle from Part 3.
Puzzle 5. What is the expected number of cycles in a random permutation of an -element set?
There are two main kinds of generating functions in combinatorics, which apply to functors from two different groupoids to :
- Let be the groupoid of finite sets and bijections, or equivalently the groupoid with natural numbers as objects and permutations as morphisms from to . Let be a functor. (We call this sort of functor a finite species.) Then the exponential generating function of is the formal power series defined by where is the cardinality of applied to .
- Let be the groupoid of totally ordered finite sets and order-preserving bijections, or equivalently the groupoid with natural numbers as objects and only identity morphisms. Let be a functor. Then the ordinary generating function of is the formal power series defined by where is the cardinality of applied to .
Why is there an in the denominator in the first case but not the second? It comes from the theory of groupoid cardinality; using this theory we could define generating functions for functors for any groupoid . But we only need these two cases so I’ll resist the temptation to explain such generalities.
I’ve spent a lot more time thinking about functors and their exponential generating functions than functors and their ordinary generating functions. The former are quite rich and interesting; the latter seem a bit pathetic in comparison. A functor is determined up to isomorphism by the cardinality of for each , so it’s really just a function from the natural numbers to itself. Still, converting it into a formal power series — its ordinary generating function — is a useful trick for solving all sorts of problems. So we shouldn’t scorn the ordinary generating functions. Furthermore, we can restrict any functor to a functor from , or left Kan extend any functor to a functor , and this sets up some nice relations between exponential and ordinary generating functions.
The fun really starts when we combine the two ideas. Suppose we have a functor
Then we can define its generating function , which is a formal power series in two variables and , by
This is called a bivariate generating function. All the usual tricks with generating functions work for this kind too. The systematic mathematician in me wants to explain all these tricks, but that would take a while. So with no further ado let me just illustrate them by using them to answer a question:
Puzzle 5. What is the expected number of cycles in a random permutation of an -element set?
This is Example III.4 in Analytic Combinatorics, but I’ll explain the method from scratch.
We start by making up a functor
that assigns to the pair the set of permutations of with cycles. Note that we’re treating as a finite set but as a natural number.
What is the generating function ? To answer that it’s best to make up functors
that assign to each finite set the set of permutations of with cycles. We have
so
Thus, to figure out we just need to know for each .
A permutation of with cycles is the same as a way to chop up into unlabeled pieces and put a cyclic ordering on each piece, so
where is the species of cyclic orderings. We thus have
But the generating function for cyclic orderings is famous: there are cyclic orderings on an -element set, so
Thus, we get
This is fairly complicated: if we actually wrote this out as a power series, the coefficients would involve Stirling numbers of the first kind. That shouldn’t be surprising, since these numbers count the permutations of having cycles. But it’s easier to work with , which is why this ‘bivariate generating function’ idea is a good thing. Let’s see why:
See? It’s much more pretty to combine all the into a single this way.
The function is not exactly what we need to solve our puzzle, but we’re very close now. We want to know the average number of cycles of a random permutation of a -element set. Call this . Since there are permutations with cycles, we have
But notice, this is something we can compute using , since
So, to work out the average we just differentiate with respect to , set , and work out the coefficients of its Taylor series as a function of . Let’s do it!
We saw
For those whose calculus is rusty this could be the hardest step in the whole computation:
This gives
Now we just need to work out the Taylor series of this function. I take it back: this is the hardest step. We have
where is the harmonic number
I’ll show this in a minute. Given this, we get
Or, in something more like plain English: the average number of cycles in a random permutation of an -element set is
Asymptotically this is just . For a better estimate, use , where is Euler’s constant.
So, we’re done! This offers more evidence that a random permutation of a huge set has rather few, mostly rather large cycles. For example, suppose we take a random permutation of a set with a million elements. Then our calculation show that on average it has only about 14 cycles. From last time, we know that the median size of the largest cycle is roughly 606,000, while its mean size is roughly 624,000.
But how do we do that last step in our computation — the one that led to the harmonic numbers? For this, note that if we multiply any formal power series
by the geometric series
we get a series whose coefficients are the partial sums of the sequence :
Since the power series for has coefficients starting at , when we multiply it by we get a power series whose coefficients are the harmonic numbers!
Some readers may be disappointed at how a calculation that started so nobly with functors devolved into some clever tricks with calculus. Others may feel the opposite way.
Personally I like the combination of functors and calculus. I’d like to get more of the calculations to proceed at the level of functors from to , or even from to . But the main ‘trick’ here, the method of computing the mean of a family of probability distributions by differentiating a bivariate generating function with respect to and then setting , is very pretty. And it’s not a one-off: it’s quite generally useful. See Analytic Combinatorics for more examples!
Re: Random Permutations (Part 5)
Here’s one part of the computation that’s easy to categorify:
where
As noted in the post, is the generating function of , the species of cyclic orderings. Its derivative is the generating function of , the species of total orderings. So,
A -structure on a finite set is a way of chopping it into two parts, totally ordering the first part and cyclically ordering the second. There’s a groupoid of -structured -element sets, and the above equations imply that the cardinality of this groupoid is
But this is easy to see directly! There are different kinds — different isomorphism classes — of -structured -element sets, one for each . The th kind consists of -element sets where we’ve cyclically ordered of the elements and totally ordered the rest. The automorphism group of this th kind is since we can ‘cycle’ the cyclically ordered part.
The cardinality of a groupoid is the sum over isomorphism classes of the reciprocals of the cardinalities of the automorphism groups. So, the cardinality of the groupoid of -structured -element sets is
The next question: why does the average length of a cycle in a random permutation of an -element set equal the cardinality of the groupoid of -structured -element sets?
My post explains why. But we can seek a deeper reason. Is this just an equation between numbers, or does the computation in my post lift to something like an equivalence of groupoids that explains this equation?