November 24, 2019

Random Permutations (Part 2)

Posted by John Baez Here’s a somewhat harder pair of problems on random permutations, with beautiful answers.

Puzzle 3. In the limit as $n \to \infty$, what is the probability that the shortest cycle in a randomly chosen permutation of an $n$-element set has length $\gt k$?

We mainly need to count the permutations of an $n$-element set whose shortest cycle has length $\gt k$. So, we define a functor

$Perm_{\gt k} \colon S \to FinSet$

from the groupoid of finite sets to the category of finite sets, sending any finite set $X$ to the finite set

$Perm_{\gt k}(X) = \{ permutations \; of \; X \; whose \; shortest \; cycle \; has \; length \gt k \}$

A functor from $S$ to $FinSet$ is called a finite species and this is what my course was about. We think of a finite species $G : S \to FinSet$ as a map sending any finite set $X$ to the set of ‘$G$-structures’ we can put on $X$. Any finite species $G$ has a ‘generating function’

$|G|(x) = \sum_{n \ge 0} \frac{|G(n)|}{n!} x^n$

where in $G(n)$ we are using $n$ as our notation for some $n$-element set. Our basic strategy will be to compute the generating function of the species $Perm_{\gt k}$ and use that to count permutations of an $n$-element set whose shortest cycle has length $k$. But in fact, to save work, we will just work out the asymptotics as $n \to \infty$.

We can compose species: a $G \circ H$ structure on $X$ consists of a way of writing $X$ as a union of disjoint parts, putting a $G$-structure on the set of parts, and putting an $H$-structure on each part. We have

$|G \circ H| = |G| \circ |H|$

where at right we’re composing generating functions.

So, if we can write $Perm_{\gt k}$ as a composite of two species whose generating functions we know, we can work out $|Perm_{\gt k}|$.

In fact $Perm_{\gt k}$ is the composite of two species called $Exp$ and $Cyc_{\gt k}$. The species $Exp$ is called ‘being a finite set’: we can put an $Exp$-structure on any finite set in exactly one way. The species $Cyc_{\gt k}$ is called ‘being a cyclically ordered set of cardinality $\gt k$’: we can put a $Cyc_{\gt k}$-structure on a finite set only if its cardinality is $\gt k$, and then a $Cyc_{\gt k}$-structure on it is just a cyclic ordering.

We have

$Perm_{\gt k} \cong Exp \circ Cyc_{\gt k}$

because a permutation with cycles of length $\gt k$ on a finite set $X$ is the same as a way of chopping $X$ into subsets of cardinality $\gt k$ and cyclically ordering each one. Thus, we have

$|Perm_{\gt k}| = |Exp| \circ |Cyc_{\gt k}|$

and we just need to compute the two generating functions on the right-hand side.

The generating function of $Exp$ is easy to compute:

$|Exp|(x) = \sum_{n \ge 0} \frac{1}{n!} x^n = exp(x)$

and this explains the name $Exp$. The generating function of $Cyc_{\gt k}$ is

$|Cyc_{\gt k}|(x) = \sum_{n \gt k} \frac{(n-1)!}{n!} x^n = \sum_{n \gt k} \frac{x^n}{n}$

since there are $(n-1)!$ cyclic orderings on an $n$-element set. We have

$\sum_{n \ge 1} \frac{x^n}{n} = \ln\left(\frac{1}{1-x}\right)$

(just integrate the geometric series) so we get

$|Cyc_{\gt k}|(x) = \ln\left(\frac{1}{1-x}\right) - \left(x + \frac{x^2}{2} + \cdots + \frac{x^k}{k}\right)$

Since

$|Perm_{\gt k}| = |Exp| \circ |Cyc_{\gt k}|$

we get

$|Perm_{\gt k}|(x) = e^{ \ln\left(\frac{1}{1-x}\right) - \left(x + \frac{x^2}{2} + \cdots + \frac{x^k}{k}\right) }$

or simply

$|Perm_{\gt k}|(x) = \frac{e^{-\left(x + \frac{x^2}{2} + \cdots + \frac{x^k}{k}\right)}}{1-x}$

This is really a kind of formula for $|Perm_{\gt k}(n)|$, the number of permutations of an $n$-element set with shortest cycle of length $\gt k$. After all, by the definition of generating function, this equation says

$\sum_{n \ge 0} \frac{|Perm_{\gt k}(n)|}{n!} x^n = \frac{e^{-\left(x + \frac{x^2}{2} + \cdots + \frac{x^k}{k}\right)}}{1-x}$

Even better, the coefficient

$\frac{|Perm_{\gt k}(n)|}{n!}$

is the probability we’re after: the probability that the shortest cycle in a randomly chosen permutation of an $n$-element set has length $\gt k$.

Unfortunately, working out these coefficients is hard.

Fortunately, we just want to know the limit of these coefficients as $n \to \infty$. And for this we can use a little result from complex analysis. I’ll prove it later:

Lemma. Suppose

$f(z) = \sum_{n \ge 0} c_n z^n$

is a meromorphic function defined on a disc of radius $\gt 1$ centered at the origin, which is analytic on this disc except for a simple pole at $z = 1$. Then $\lim_{n \to \infty} c_n$ equals the residue of $-f$ at $z = 1$.

If you want to understand this lemma, I urge that you consider an example: the function $1/(1-z)$. The coefficients $c_n$ of its power series are all $1$, and it has a pole at $z = 1$ with residue $-1$. This is why we need to take the residue of $-f$ in the lemma. In fact, proving the lemma consists of reducing the general problem to this special case.

Given this lemma, the quantity we’re trying to compute:

$\lim_{n \to \infty} \frac{|Perm_{\gt k}(n)|}{n!}$

is the residue of the function

$\frac{e^{-\left(z + \frac{z^2}{2} + \cdots + \frac{z^k}{k}\right)}}{z-1}$

at the point $z = 1$. This function is just minus the generating function of $Perm_{\gt k}$, but I’ve changed $x$ to $z$ to emphasize that now we’re thinking of it as a function on the complex plane. It’s actually analytic everywhere except $z = 1$, and since $1/(z-1)$ has a simple pole of residue $1$ at this point, the residue of

$\frac{e^{-\left(z + \frac{z^2}{2} + \cdots + \frac{z^k}{k}\right)}}{z-1}$

at $z = 1$ is just the numerator evaluated at $z = 1$, namely

$e^{-\left(1 + \frac{1}{2} + \cdots + \frac{1}{k}\right)}$

So this number is the probability that the shortest cycle in a randomly chosen permutation of an $n$-element set has length $\gt k$, in the $n \to \infty$ limit.

This is beautiful, I think. And it leads immediately to another beautiful fact.

Puzzle 4. Give a simpler asymptotic formula for the above probability as $k \to \infty$.

Here we use the fact that

$1 + \frac{1}{2} + \cdots + \frac{1}{k} = \ln k + \gamma + o(1)$

where $\gamma \approx 0.57721$ is Euler’s constant. So, as $k \to \infty$, the probability we’ve just computed is

$e^{-\left(1 + \frac{1}{2} + \cdots + \frac{1}{k}\right)} \sim e^{-\left(\ln k + \gamma \right)} = \frac{1}{e^\gamma \, k}$

When Euler discovered his constant $\gamma$ in his 1776 paper “On a memorable number naturally occurring in the summation of the harmonic series”, he voiced his suspicion that $e^\gamma$ would be interesting… and here is some evidence that it is!

By the way, I learned about these questions here:

They were first answered in 1952:

To wrap up, let me prove that lemma. It’s just a special case of a much more general useful result on the asymptotic behavior of the coefficients of a power series:

• Philippe Flajolet and Robert Sedgewick, Analytic Combinatorics, Cambridge University Press, Cambridge, 2009.

But once you’ve seen the proof of this special case, it’s easy to generalize.

Lemma. Suppose

$f(z) = \sum_{n \ge 0} c_n z^n$

is a meromorphic function defined on a disc of radius $\gt 1$ centered at the origin, which is analytic on this disc except for a simple pole at $z = 1$. Then $\lim_{n \to \infty} c_n$ equals the residue of $-f$ at $z = 1$.

Proof. The idea is to ‘subtract off the singularity’. We can write

$f(z) = g(z) + \frac{a}{1-z}$

where $g$ is analytic on a disc of radius $\gt 1$ and $a$ is the residue of $-f$ at $z = 1$. If we write

$g(z) = \sum_{n \ge 0} d_n z^n$

then we have

$c_n = d_n + a$

for all $n$. But the Cauchy–Hadamard theorem says that

$\limsup_{n \to \infty} |d_n|^{1/n} = 1/R$

where $R$ is the radius of convergence of $\sum_{n \ge 0} d_n z^n$, and since $R \gt 1$ we have $d_n \to 0$ as $n \to \infty$. Thus

$c_n \to a$

as was to be shown.   █

Posted at November 24, 2019 2:43 AM UTC

TrackBack URL for this Entry:   https://golem.ph.utexas.edu/cgi-bin/MT-3.0/dxy-tb.fcgi/3162

Re: Random Permutations (Part 2)

Wow, this is a gorgeous application of generating functors and generating functions. It highlights the differences in the approaches and where they both shine: the functors rely on explicit bijections and looking into the process of building up the thing you’re counting (so they feel very concrete and constructive), while the functions are truly analytical objects that can be worked with complex analytically (and so they partake in the magic that is complex analysis)!

Posted by: David Jaz Myers on November 24, 2019 10:22 PM | Permalink | Reply to this

Re: Random Permutations (Part 2)

Thanks! Yes, I’m looking for examples that really illustrate the interaction between species and complex analysis. I’m also hoping there’s more than meets the eye here: maybe a kind of categorification of certain pieces of complex analysis.

For example, for any open set $U \subseteq \mathbb{C}$ there’s a ring $A(U)$ of analytic functions on $U$. But also for any $U$ containing the origin there’s a 2-rig of finite species whose generating functions lie in $A(U)$, where the 2-rig operations are the usual addition and multiplication of species (that is, coproduct or more generally finite colimits, and the Day convolution tensor product). I wish this were a sheaf (or stack) of 2-rigs on $\mathbb{C}$, but it seems I only have access to these generating functions through their Taylor series at the origin. I need to think about how analytic continuation works at the categorified level.

Posted by: John Baez on November 24, 2019 10:30 PM | Permalink | Reply to this

Re: Random Permutations (Part 2)

Here’s a question I don’t know the answer to.

Preamble: any map $f: X \to X$ from a finite set to itself — not necessarily a permutation! — gives rise canonically to a permutation $f|_{X_f}$ of a subset $X_f$ of $X$. It’s constructed like this: the chain of subsets

$f X \supseteq f^2 X \supseteq f^3 X \supseteq \cdots$

can’t keep getting smaller forever (since they’re all finite), so it must stabilize somewhere. Call the set at which it stabilizes $X_f$. Thus,

$X_f = \bigcap_{n \geq 0} f^n X.$

(This is the set of “eventually periodic” points of $f$.) Then $f|_{X_f}: X_f \to X_f$ is surjective and therefore bijective.

Now the question: when $X$ is large and $f: X \to X$ is a function chosen uniformly at random, how big do we expect $X_f$ to be?

In words only: what is the expected cardinality of the set of eventually periodic points of a random endomorphism?

In notation: for a finite set $n$, let $a_n$ denote the mean cardinality of $X_f$, taken over all endomorphisms $f$ of an $n$-element set $X$. What can be said about $a_n$ in terms of $n$, for large $n$?

I’ve phrased this in an open-ended way, because I don’t have a good sense of what a good precise question is. Obviously $a_n$ grows with $n$, so a first attempt at a precise question is this: what is $\lim_{n \to \infty} \frac{a_n}{n}$ (if it exists)?

Posted by: Tom Leinster on November 25, 2019 2:45 AM | Permalink | Reply to this

Re: Random Permutations (Part 2)

Thanks for a great puzzle, Tom!

I should be able to figure this out: it’s a good test of my generating function expertise. But if someone knows, they should just say.

Right now I’m guessing $a_n$ is of order $\sqrt{n}$.

One reason I’m guessing this is that if we take a random permutation of a large $n$-element set, the expected size of its largest cycle is $\lambda n$ where $\lambda \cong 0.624$ is the Golomb–Dickman constant, while if we take a random endofunction, the expected size of its largest cycle (“eventual cycle”) is much smaller, just $\lambda \sqrt{\pi n/2}$.

But all this “largest cycle” stuff is distracting clutter when it comes to your question; your question is more fundamental.

Posted by: John Baez on November 25, 2019 3:29 AM | Permalink | Reply to this

Re: Random Permutations (Part 2)

By the way, here’s a fun thing I covered in my class.

There’s a species $End$ where $End(X)$ is the set of endomorphisms of a finite set $X$. There’s a species $Perm$ where $Perm(X)$ is the set of permutations of $X$. And there’s a species $Tree$ where $Tree(X)$ is the set of rooted trees with $X$ as vertices.

We can compose species, and we have

$End \cong Perm \circ Tree$

This says an endomorphism of $X$ is a way of chopping $X$ into parts, putting a permutation on the set of parts, and putting a rooted tree structure on each part. This expresses how as we repeatedly apply the endomorphism to the elements of $X$, they flow down to the roots of trees, and from then on they get permuted.

Indeed there’s a species $Cyc$ where $Cyc(X)$ is the set of cyclic orderings of $X$ and a species $Exp$ where $Exp(X)$ is the one-element set, and we have

$Perm \cong Exp \circ Cyc$

Combining these facts we get

$End \cong Exp \circ \Cyc \circ Tree$

This expresses how an endofunction is a disjoint union of ‘connected components’, each of which consists of a cycle and trees rooted at each point of the cycle.

I would like to use this to solve your question.

Posted by: John Baez on November 25, 2019 5:21 AM | Permalink | Reply to this

Re: Random Permutations (Part 2)

That is a fun thing!

I feel as if my question should be answerable, but I don’t have your skills in generating functions, sadly, nor can I think of any sneaky argument. Or not yet.

Posted by: Tom Leinster on November 25, 2019 8:00 PM | Permalink | Reply to this

Re: Random Permutations (Part 2)

I shouldn’t overpromise: I’m trying to build up my skills with generating functions, but I’m not yet a master.

Even if this identity

$End \cong \Exp \circ \Cyc \circ \Tree$

doesn’t help solve your problem, I think it’s something more mathematicians should know: perhaps not phrased in the language of species, but simply in pictures. I bet more grad students know the classification of finite abelian groups than the structure of endofunctions on finite sets! That’s wack.

Posted by: John Baez on November 25, 2019 9:59 PM | Permalink | Reply to this

Re: Random Permutations (Part 2)

I bet more grad students know the classification of finite abelian groups than the structure of endofunctions on finite sets! That’s wack.

That is, indeed, wack. I guess we teach group theory rather than monoid theory because there are so many more gems to be found in the former. (As far as I know; though maybe I wouldn’t, as like almost everyone else, I know more about groups than monoids.) But it seems like the balance has tilted too far towards groups/permutations and away from monoids/endofunctions.

Posted by: Tom Leinster on November 25, 2019 10:14 PM | Permalink | Reply to this

Re: Random Permutations (Part 2)

I suspect your initial guess was right.

To recap: given an endofunction $f$ of a finite set $X$, put

$\im^\infty(f) = \bigcap_{n \geq 0} f^n X.$

(I called it $X_f$ before.) For $n \geq 1$, let $a_n$ denote the expected cardinality of $\im^\infty(f)$ for a random endofunction $f$ of an $n$-element set. You more or less conjectured that

$\frac{a_n}{\sqrt{n}} \to \frac{1}{\lambda\sqrt{\pi/2}} = 1.27594\ldots$

as $n \to \infty$, where $\lambda$ is the Golomb-Dickman constant. I believe that’s correct.

My argument is a combination of mathematical argument and computer experiment. Here goes.

For the mathematical part, I’ll want some terminology. Let $f$ be an endofunction of a finite set $X$. I mentioned before that $f$ restricts to a permutation of $\im^\infty(f)$. I’ll call this the permutation induced by $f$, and I’ll call $\im^\infty(f)$ the eventual image of $f$.

Given $1 \leq k \leq n$ and a permutation $\sigma$ of $\{1, \ldots, k\}$, how many endofunctions of $\{1, \ldots, n\}$ have $\sigma$ as their induced permutation? By what we know about the structure of endofunctions on a finite set, the answer is actually independent of $\sigma$. It’s the number $T_{n, k}$ of rooted forests with vertex-set $\{1, \ldots, n\}$ and with $\{1, \ldots, k\}$ as the set of roots.

Cayley’s formula easily implies that $T_{n, k} = k n^{n - k - 1}$. At least, so says Wikipedia, though I didn’t check.

So, given $1 \leq k \leq n$ and a prescribed permutation $\sigma$ of $\{1, \ldots, k\}$, the number of endofunctions of $\{1, \ldots, n\}$ whose induced permutation is $\sigma$ is $k n^{n - k - 1}$.

Since there are $k!$ permutations of $\{1, \ldots, k\}$, the number of endofunctions of $\{1, \ldots, n\}$ with eventual image $\{1, \ldots, k\}$ is $k! \cdot k n^{n - k - 1}$.

Of course, the same is true for any other $k$-element subset of $\{1, \ldots, n\}$. So, the number of endofunctions of $\{1, \ldots, n\}$ whose eventual image has cardinality $k$ is

$\binom{n}{k} k! \cdot k n^{n - k - 1} = \frac{n! k n^{n - k - 1}}{(n - k)!}.$

This is true for all $1 \leq k \leq n$.

There are $n^n$ endofunctions of $\{1, \ldots, n\}$. So, choosing one uniformly at random, the expected cardinality of its eventual image is

$\frac{1}{n^n} \sum_{k = 1}^n \frac{n! k n^{n - k - 1}}{(n - k)!} \cdot k = \sum_{k = 1}^n \frac{k^2 (n - 1)!}{n^k (n - k)!}.$

That expected cardinality is, by definition, $a_n$. So we’ve shown that

$a_n = \sum_{k = 1}^n \frac{k^2 (n - 1)!}{n^k (n - k)!}.$

I was feeling too tired/stupid to attempt to find a closed formula for $a_n$, or to try to figure out its asymptotics without the aid of a closed formula. So instead I just put some values of $n$ into sage.

What came out suggested that, as you predicted, $a_n$ grows like $\sqrt{n}$. And so I made a table showing $a_n$ and $a_n/\sqrt{n}$ for various values of $n$:

n          a_n          a_n/sqrt(n)

1    1.000000000000  1.00000000000000
2    1.500000000000  1.06066017177982
3    1.888888888889  1.09055050846929
4    2.218750000000  1.10937500000000
5    2.510400000000  1.12268501014309
10    3.660215680000  1.15746182762620
100   12.209960630216  1.22099606302160
1000   39.303212926178  1.24287672209294
10000  124.999121868081  1.24999121868081
20000  176.912788799720  1.25096232638906

It looks as if $a_n/\sqrt{n}$ is converging. What to? Well, you mentioned the constant $\lambda\sqrt{\pi/2}$, whose reciprocal is $1.27594\ldots$. The numerical data above makes it plausible that this is the limit of $a_n/\sqrt{n}$. And if so, that’s the conjectured result.

Posted by: Tom Leinster on November 25, 2019 11:53 PM | Permalink | Reply to this

Re: Random Permutations (Part 2)

Wow, Tom! You did what I wanted to do, but without explicitly talking about species — just sort of diving in there and thinking about all the ways you could flesh out a permutation of a subset of $\{1,\dots, n\}$ to an endofunction on all of $\{1,\dots, n\}$. Great!

Now the fun part is studying the asymptotic behavior of your sum

$a_n = \sum_{k = 1}^n \frac{k^2 (n-1)!}{n^k (n-k)!}$

as $n \to \infty$.

If you look at Theorem 3.12.1 in Lagarias’ paper, which is about random endofunctions, you’ll notice the proof contains the identity

$\sum_{k = 1}^n \frac{k (n-1)!}{n^k (n-k)!} = 1$

(after a trivial change of variables). This is not exactly your sum, since it contains a $k$ in the numerator instead of a $k^2$. And of course your sum is growing as $n$ increases. But they’re scarily similar!

By the way, this proof is even more concerned with showing that

$\sum_{k = 1}^n \frac{ln(k)\,(n-1)!}{n^k (n-k)!} \; \sim \; \frac{1}{2} \ln n$

as $n \to \infty$.

So, I suspect that these folks — Lagarias, and the people whose work he’s discussing — could also handle the asymptotics of

$a_n = \sum_{k = 1}^n \frac{k^2 (n-1)!}{n^k (n-k)!}$

as $n \to \infty$. Whether we can is another matter.

By the way, I strongly believe that

$a_n = \sum_{k = 1}^n \frac{k^2 (n-1)!}{n^k (n-k)!} \sim c \sqrt{n}$

but I’m not convinced yet that the constant $c$ is

$\frac{1}{\lambda \sqrt{\pi/2}}$

Posted by: John Baez on November 26, 2019 1:23 AM | Permalink | Reply to this

Re: Random Permutations (Part 2)

In his discussion of random endofunctions, Lagarias refers to this paper:

They compute the $m$th moment of this random variable: the length of the $i$th longest cycle of a randomly chosen endofunction on an $n$-element set, asymptotically as $n \to \infty$.

On page 549 they define a function

$Q_n(k) = \sum_{j=1}^n \frac{(n-1)! j^k}{(n-j)! n^j}$

I’m going to stick with their names for variables so I don’t get confused, but your $a_n$ is their $Q_n(2)$.

On page 550 they say

$Q_n(2) = n Q_n(0)$

They also give an exact formula for $Q_n(0)$ in terms of a special function called the incomplete gamma function, and also a simpler asymptotic formula good for $n \to \infty$:

$Q_n(0) \sim \sqrt{\pi / 2n}$

Their formula says more — it includes lower-order terms — but it implies this.

Putting all this together we get

$a_n \sim c \sqrt{n}$

where

$c = \sqrt{\pi / 2} \approx 1.25331413732 \dots$

This matches your computer calculations quite nicely. The Golomb–Dickman constant $\lambda$ is a red herring here!

So, for those who haven’t been following carefully, we’ve proved this:

Theorem. Let $a_n$ be the expected number of periodic points for a randomly chosen endofunction of an $n$-element set. Then

$a_n \sim \sqrt{\frac{\pi n}{2}}$

as $n \to \infty$.

By the way, this gives an extremely impractical but pretty way to compute $\pi$, sort of like Buffon’s needle but using random endofunctions.

Posted by: John Baez on November 26, 2019 6:39 AM | Permalink | Reply to this

Re: Random Permutations (Part 2)

Splendid, so we’re done!

And your correction to my constant shows that convergence is faster than it would have been had my constant been correct (e.g. at $n = 20,000$ we’re only about 0.002 off the actual value of $\lim_{n \to \infty} a_n/\sqrt{n}$, whereas it would have been further off with the constant I thought it was).

A couple of remarks:

You did what I wanted to do, but without explicitly talking about species — just sort of diving in there

Maybe I should be ashamed of not using species :-) I’m not as skilled with them as you are. But I think my argument was fundamentally species-theoretic, and could be nicely phrased that way. E.g. where I said this —

By what we know about the structure of endofunctions on a finite set…

— that’s probably best formalized with species. So maybe someone can take my argument and express it more nicely in terms of species.

You wrote:

If you look at Theorem 3.12.1 in Lagarias’ paper, which is about random endofunctions, you’ll notice the proof contains the identity $\sum_{k = 1}^n \frac{k(n - 1)!}{n^k(n - k)!} = 1$

and then observed that if you change the $k$ in the numerator of the left-hand side to a $k^2$, you get the expression for $a_n$ in my previous comment.

Actually, I implicitly (re)proved this identity too. In my argument, I found the probability $p_k$ that a random endofunction of an $n$-element set has eventual image of cardinality $k$, for each $1 \leq k \leq n$. It’s

$p_k = \frac{k(n - 1)!}{n^k(n - k)!}.$

Two things follow. First, since they’re probabilities,

$\sum_{k = 1}^n p_k = 1.$

That’s the identity from Lagarias’s paper that you mention. Second, the expected cardinality of the eventual image of a random endofunction of an $n$-element set is

$\sum_{k = 1}^n p_k \cdot k.$

That’s the formula for $a_n$ proved in my comment.

Posted by: Tom Leinster on November 26, 2019 1:57 PM | Permalink | Reply to this

Re: Random Permutations (Part 2)

By the way, at first I was surprised that the eventual image (the set of periodic points) of a random endofunction would be so small. For instance, our result tells us that on a set with a million elements, the eventual image of typical endofunction has only about 1250 elements. When we apply our function over and over again, everything in the set eventually gets sucked into this subset that takes up little more than 0.1% of the whole.

But then I mentioned this to a friend who’s expert in dynamical systems, and she wasn’t surprised at all. Her point of view was that in general, attractors of dynamical systems take up only a small amount of the space concerned. (And an attractor of a dynamical system is exactly what the eventual image is.)

But she had no sense for why it should be a square root relationship — that is, why the number of periodic points of a random endofunction on a large set should be proportional to the square root of the cardinality of that set. I’d like to hear a good intuitive explanation of this!

Posted by: Tom Leinster on November 26, 2019 2:04 PM | Permalink | Reply to this

Re: Random Permutations (Part 2)

I wrote:

But she had no sense for why it should be a square root relationship — that is, why the number of periodic points of a random endofunction on a large set should be proportional to the square root of the cardinality of that set. I’d like to hear a good intuitive explanation of this!

David Speyer sheds some light over here.

Posted by: Tom Leinster on December 27, 2019 3:25 PM | Permalink | Reply to this

Re: Random Permutations (Part 2)

Tom wrote:

Maybe I should be ashamed of not using species :-)

Don’t be.

But I think my argument was fundamentally species-theoretic, and could be nicely phrased that way.

Probably. I think doing it may require the use of 2-variable generating functions or something like that. I’m not an expert at these! In future posts I want to show how people use them to tackle questions like this:

• What is the expected number of cycles in a random permutation of an $n$-element set?

• What is the expected number of cycles of length $k$ in a random permutation of an $n$-element set?

• If we choose an element of an $n$-element set, what is the probability that it lies in a cycle of length $k$?

Roughly speaking, they use one variable to keep track of elements and another to keep track of cycles. I’ll try to give a better explanation.

I think these fancier techniques might let us tackle questions like:

• How many iterations of $f$ are required, on average, before $f^k(x)$ lies in the eventual image?

• For each point in the eventual image, how many points not in the eventual image map to it under some power of $f$, on average?

(The second one is asking about the average size of the trees in the $End \cong Exp \circ Cyc \circ Tree$ picture of an endofunction.)

By the way, thanks for pointing out that

$\sum_{k = 1}^n \frac{k(n - 1)!}{n^k(n - k)!} = 1$

is just saying probabilities sum to one! That certainly helps demystify that!

And another puzzle:

But she had no sense for why it should be a square root relationship — that is, why the number of periodic points of a random endofunction on a large set should be proportional to the square root of the cardinality of that set. I’d like to hear a good intuitive explanation of this!

Here’s a crude first stab at it. What’s the average size of the image of an endofunction of an $n$-element set? I’m sad to say I have no idea. But it’s clearly an upper bound on the average size of the eventual image. If it’s already $O(\sqrt{n})$, that would explain why the eventual image is so small — though not why it’s not even smaller!

Sufficiently skilled mathematicians might have fun computing the average size of the image of $f$, the image of $f^2$, etc., and seeing how they converge to the average size of the eventual image.

Posted by: John Baez on November 27, 2019 7:31 AM | Permalink | Reply to this

Re: Random Permutations (Part 2)

What’s the average size of the image of an endofunction of an $n$-element set?

The probability that a given element is not in the image is $\left(\frac{n-1}{n}\right)^n,$ so the average size of the image is $n \left(1 - \left(\frac{n-1}{n}\right)^n\right) \approx n(1-e^{-1}).$

The average size of the image of $f^2$ and beyond would take more thought than I have time for at the moment.

Posted by: Mark Meckes on November 27, 2019 3:16 PM | Permalink | Reply to this

Re: Random Permutations (Part 2)

Thanks! I got stuck right away because I didn’t think of counting the points that are not in the image.

Posted by: John Baez on November 27, 2019 5:55 PM | Permalink | Reply to this

Re: Random Permutations (Part 2)

I figured this result on the size of the eventual image was unlikely to be new. Over on Twitter I’m starting to get some useful feedback from Louigi Addario-Berry:

You are veering into my territory! You can find some related results in Chapter 9 of Jim Pitman’s Combinatorial Stochastic Processes: https://www.stat.berkeley.edu/~aldous/206-Exch/Papers/pitman_CSP.pdf

Valentin Kolchin literally wrote the book on random mappings in 1986 (title: Random Mappings). His Lemma 3.1.4 gives the limiting distribution of $C_n/\sqrt{n}$, where $C_n$ is the # of cyclic points of a random function $f:[n] \to [n]$. He attributes this result to Bernard Harris (1960).

Harris’s paper: https://projecteuclid.org/euclid.aoms/1177705677

Harris efficiently crushes many simple questions about random endofunctions in Section 3 of this paper.

Posted by: John Baez on November 27, 2019 4:11 PM | Permalink | Reply to this

Re: Random Permutations (Part 2)

I’ve seen this result (and other interesting results of similar type) in Random mapping statistics by P. Flajolet and A. M. Odlyzko (EUROCRYPT 1989). Crypto people are interested in this due to hash collisions and Pollard rho.

This is a very well-written article, which makes extensive use of generating functions (and thus implicitly species). It gives a much slicker proof than Tom’s and mine that if $f \colon X \to X$ is a random endofunction on an $n$-element set, the expected number of periodic points of $f$ is $\sim \sqrt{\pi n / 2}$. This proof uses generating functions!

This article also says that the expected cardinality of the image of $f^k$ is asymptotically

$(1 - \tau_k) n$

where

$\tau_0 = 0, \quad \tau_{k+1} = e^{\tau_k - 1}$

So, the expected size of the image of $f$ is

$\sim (1- e^{-1})n \approx 0.63 n$

as Mark Meckes pointed out, while the image of $f^2$ has expected size

$\sim (1 - e^{e^{-1} - 1}) n \approx 0.47 n$

and so on. Since each of these asymptotic formulae are only good in the $n \to \infty$ limit, and it typically takes more and more iterations of $f$ to reach the eventual image (the set of periodic points) as $n$ gets larger, we never notice from these formulae that the eventual image has average cardinality $O(\sqrt{n})$.

Posted by: John Baez on November 28, 2019 5:06 AM | Permalink | Reply to this

Re: Random Permutations (Part 2)

Here’s a question in a similar spirit to the one I asked before, but maybe less likely to have been answered before.

Fix a finite field $k$. Let $T$ be a random linear endomorphism of an $n$-dimensional vector space over $k$. What is the expected dimension of its eventual image

$im^\infty(T) = \bigcap_{i = 0}^\infty im(T^i)?$

Or maybe a better question: what is the expected cardinality of the eventual image?

These are two different questions: expectations are arithmetic means, and the arithmetic mean of dimensions corresponds to the geometric mean of cardinalities. For example, if we take subspaces $U_1$ and $U_2$ of $k^n$ and write $q = \# k$ then $\# U_i = q^{\dim U_i}$, so

$\sqrt{\# U_1 \cdot \# U_2} = q^{\tfrac{1}{2}(\dim U_1 + \dim U_2)}.$

My dream is that if you take the formula for the expected cardinality of the eventual image of a linear endomorphism of an $n$-dimensional space over $k$, and evaluate at $q = 1$, you get the formula for the expected cardinality of the eventual image of a set-theoretic endomorphism of an $n$-element set — in the usual $\mathbb{F}_1$ way.

Posted by: Tom Leinster on November 28, 2019 5:02 PM | Permalink | Reply to this

Re: Random Permutations (Part 2)

We can do better:

Theorem. Let $a_n$ be the expected number of periodic points for a randomly chosen endofunction of an $n$-element set. Then

$\displaystyle{ a_n = \sqrt{ \frac{\pi}{2}} n^{\frac{1}{2}} - 3 - \frac{1}{12} \sqrt{\frac{\pi}{2}} n^{-\frac{1}{2}} - \frac{4}{135} n^{-1} + \frac{1}{288} \sqrt{ \frac{\pi}{2}} n^{-\frac{3}{2}} + O(n^{-2}) }$

Most of this follows easily from work of Purdom and Williams, who give this formula: However, Christopher D. Long, whom I trust, say that Purdom and Williams are wrong about the term involving the number 91/540. He says it should be 16/540, which equals 4/135.

Curiously, 16 upside down is 91. I wonder how this mistake crept in. Could dyslexia be involved?

Purdom and Williams cite Knuth, and now I’m wondering what Knuth actually wrote.

Posted by: John Baez on November 30, 2019 1:16 AM | Permalink | Reply to this

Re: Random Permutations (Part 2)

I have a much more naive comment. The Euler constant $\gamma$ (which I’m used to calling the Euler-Mascheroni constant to distinguish it from $e$) has the value $\gamma \approx 0.577$. On the other hand, the value $e^{-\gamma}$ which appears in the answer to Puzzle 4 has the value $e^{-\gamma} \approx 0.561$. Is there “reason” why these numbers are “close” – within 2% of each other? Or is this just an instance of the law of small numbers?

By the way, according to Wikipedia the number $e^\gamma$ is known to be important in number theory. Wikipedia lists a few identities involving it, including the following: $e^\gamma = \lim_{n \to \infty} \frac 1 {\log p_n} \prod_{i=1}^n \frac{p_i}{p_i - 1}$ where the sum is over primes.

Posted by: Tim Campion on December 14, 2019 4:40 PM | Permalink | Reply to this

Re: Random Permutations (Part 2)

The only ‘reason’ I know that $\gamma$ is close to $e^{-\gamma}$ is that both these numbers are close to

$x = 0.5671432904097838729999686622103555497538157 \dots$

which is the positive number such that

$x = e^{-x}$

Indeed $x = W(1)$, where $W$ is the Lambert $W$ function, a function such that

$x = W(x) e^{W(x)}$

This was not very helpful, I know… but the Lambert $W$ function is nice to know about.

The formula you wrote relating $e^\gamma$ and prime numbers is much more interesting! It’s called Merten’s third theorem.

It’s interesting because there’s a deep set of analogies between how a randomly chosen large natural number is expressed as a product of primes and how a randomly chosen large permutation is decomposed into cycles:

I want to discuss this later in this series of posts, though it’s a bit of a digression.

The number $e^{-\gamma}$ shows up on both sides of this analogy. On the one side it shows up in a formula for the probability that the shortest cycle in a large permutation has length $\gt k$ — that’s what I discussed in this blog article. On the other side it shows up in a formula for the probability that the smallest prime factor in a large natural number is $\gt k$ — this follows from Merten’s third theorem.

Granville discusses this with some awe on page 5 of his paper.

As an analytic number theorist I find this particular piece of evidence rather special. You have to understand that the appearance of $\gamma$ in Mertens’ theorem is somewhat mysterious…

Posted by: John Baez on December 14, 2019 6:32 PM | Permalink | Reply to this

Re: Random Permutations (Part 2)

I’m also charmed by Joyal’s bijective proof of the Cayley formula that Tom uses in the comments. Joyal’s proof is found in Proofs from THE BOOK, Ch. 30 – I don’t know if Joyal himself published it?

Cayley’s formula says that if $T_n$ is the number of tree structures on a labeled $n$-element set, then $T_n = n^{n-2}$, which is very suggestive of some relation to endofunctions, since there are $n^n$ endofunctions. Joyal makes this suggestion precise. Let $T_n^{++}$ be the number of trees on the set $\underline n = \{1,\dots,n\}$ equipped with an ordered pair of distinguished points (which need not be distinct). Then clearly $T_n^{++} = n^2 T_n$, so the goal is to show that $T_n^{++} = n^n$, which Joyal does via an explicit bijection to the set of endofunctions of $\underline n$.

The bijection uses, of course, the “structure theory of endofunctions” discussed in the comments above. Think of an endofunction $f: \underline n \to \underline n$ as a forest on $\underline n$ equipped with a root $r_i$ for each connected component of the forest and a permutation $\sigma$ of these roots (the set $I = \{r_1,\dots,r_k\}$ of roots is the eventual image of $f$). Then note that permutations are in bijection with linear orders: we send a permutation $\sigma: I \to I$ to the linear order $L: \sigma(r_1) \lt \sigma(r_2) \lt \dots, \lt \sigma(r_k)$ where the roots $r_1,\dots,r_k$ are ordered via the natural order on $\underline n$. To construct a tree on $\underline n$ with two distinguished points, use the forest structure for the points not in $I$, and for the points in $I$ use the linear order $L$; the distinguished points are the top and bottom of $L$.

So it really all boils down to the bijection between permutations and linear orders. Note that this bijection (or the one I wrote down above at any rate) is not equivariant under relabeling. So although there is a bijection between permutations and linear orders, I think there is not an isomorphism of species between the species of permutations and the species of linear orders! Assuming I’ve gotten that right, it’s a pretty striking fact to me. What are some other important examples of species with the same generating function which are not isomorphic?

Posted by: Tim Campion on December 14, 2019 6:39 PM | Permalink | Reply to this

Re: Random Permutations (Part 2)

You are correct that those species aren’t isomorphic. Species come with an $S_n$-action on objects of size $n$; the action on total orders is transitive, while the action on permutations is by conjugation and so has many orbits.

I’m not aware of any natural examples of nonisomorphic species with the same generating function that aren’t built from that one in some way.

Posted by: lambda on December 14, 2019 8:54 PM | Permalink | Reply to this

Re: Random Permutations (Part 2)

lambda wrote:

I’m not aware of any natural examples of nonisomorphic species with the same generating function that aren’t built from that one in some way

…where “that one” means total orders vs. permutations.

That’s a good challenge. Here’s what might be an example.

Any finite group has the same number of conjugacy classes as irreducible representations (over $\mathbb{C}$), but there’s no canonical bijection between the two things. So let’s define species $F$ and $H$ as follows: for a finite set $X$,

• $F(X)$ is the set of pairs $(\ast, C)$ where $\ast$ is a group structure on $X$ and $C$ is a conjugacy class of $(X, \ast)$

• $H(X)$ is the set of pairs $(\ast, \alpha)$ where $\ast$ is a group structure on $X$ and $\alpha$ is an isomorphism class of irreducible representations of $(X, \ast)$.

Then $F$ and $H$ have the same generating function, but I think they’re not isomorphic as species.

If one wanted to prove this, one would have to rule out the possibility that $F$ and $H$ are isomorphic as species in a way that changes the group structure. The point is that given a finite group $(X, \ast)$, conceivably there’s a different group structure $\star$ on $X$ such that the conjugacy classes of $(X, \ast)$ correspond naturally to the irreducible representations of $(X, \star)$.

I have a memory of someone telling me that for any finite group $(X, \ast)$, there is some associated group with the property just described, a kind of dual of $(X, \ast)$. (Is it the Langlands dual?) But it seems unlikely that this would have the same order as the original group, which it would have to in order to give an isomorphism between $F$ and $H$.

Posted by: Tom Leinster on December 15, 2019 5:48 PM | Permalink | Reply to this

Re: Random Permutations (Part 2)

Posted by: David Corfield on December 16, 2019 9:50 AM | Permalink | Reply to this

Re: Random Permutations (Part 2)

Posted by: Tom Leinster on December 16, 2019 1:39 PM | Permalink | Reply to this

Re: Random Permutations (Part 2)

Tom wrote:

I have a memory of someone telling me that for any finite group $(X, \ast)$, there is some associated group with the property just described, a kind of dual of $(X, \ast)$.

David wrote:

You were wowed by that observation by David Ben-Zvi back here.

To be precise, Ben-Zvi didn’t say this was true for every finite group, just some large class, maybe reductive algebraic groups over finite fields. It still deserves a “wow!”

Posted by: John Baez on December 16, 2019 4:45 PM | Permalink | Reply to this

Re: Random Permutations (Part 2)

Tim wrote:

I don’t know if Joyal himself published it?

He did. You can find a reference in my recent post about Joyal’s proof of Cayley’s theorem.

Posted by: Tom Leinster on December 14, 2019 11:22 PM | Permalink | Reply to this

Re: Random Permutations (Part 2)

lambda wrote:

Species come with an $S_n$-action on objects of size $n$; the action on total orders is transitive, while the action on permutations is by conjugation and so has many orbits.

I’m not aware of any natural examples of nonisomorphic species with the same generating function that aren’t built from that one in some way.

This is indeed the source of all the most exciting examples I know, except perhaps the example Tom mentioned. But it’s good to realize that we can crank out infinitely many examples at will. It’s enough to find two sets $X$ and $Y$ with actions of some $S_n$ that are isomorphic as sets but not as sets with $S_n$ action.

So, for example, $S_3$ acts on a 3-element set in the usual way, as permutations — but it also acts trivially on a 3-element set. These actions give two nonisomorphic species with the same generating function

$\frac{x^3}{2}$

The first of these is isomorphic to the species I call ‘being a pointed 3-element set’. This is a structure that can only be put on a set if it has cardinality 3, and then consists of choosing a point in that set. The second is isomorphic to ‘being a 3-element set all of whose elements are colored the same way: either red, green, or blue’.

In my combinatorics class I showed an easy result: every species is a coproduct of ‘indecomposable’ species — i.e., those that are not coproducts of other species in a nontrivial way. Moreover, every indecomposable species comes from a transitive action of $S_n$ on some set. Further, every transitive action of $S_n$ on a set is isomorphic to its left action on some set $S_n/H$, where $H \subseteq S_n$ is a subgroup. Two such transitive actions are isomorphic iff the subgroups are conjugate.

So, ‘classifying’ species boils down to classifying subgroups of symmetric groups up to conjugacy. This latter task is hopeless unless we restrict to symmetric groups $S_n$ with fairly small $n$. But this viewpoint on species is still a great alternative to the usual one! Instead of looking for charismatic examples of species and studying those in detail one at a time, we can treat species as built out of ‘atoms’ — transitive actions of symmetric groups — and take a more ‘reductionistic’ approach.

Posted by: John Baez on December 15, 2019 6:28 PM | Permalink | Reply to this

Re: Random Permutations (Part 2)

Since this is the $n$-Category Café (incorporating the $1$-Category Café), let me make a categorical observation.

Earlier in this thread, we were discussing functions $f$ from a finite set $X$ to itself, and a certain subset of $X$ canonically associated with $f$. In that discussion, I usually caledl that subset the “eventual image” and thought of it as

$\bigcap_{n \geq 0} f^n X.$

John more often called it the set of periodic points. It’s a short exercise to see that the eventual image is, indeed, the set of periodic points.

The point I want to make now is that in a fairly precise sense, these two descriptions of the same set are dual. Here’s what I mean.

For any category $C$, there’s a category $Endo(C)$ in which an object is an object of $C$ equipped with an endomorphism, and a map is a map in $C$ making an obvious square commute. It has a full subcategory $Auto(C)$ consisting of just the automorphisms of objects of $C$. And there’s an evident inclusion functor

$Auto(C) \to Endo(C).$

For many categories $C$, this inclusion has a left adjoint and a right adjoint (for reasons to do with Kan extensions). But when $C = FinSet$, more is true: the left and right adjoints are the same. This two-sided adjoint takes as input an endomorphism $f$ of a finite set $X$, and produces as output the restriction of $f$ to the set of periodic points — or equivalently, the eventual image.

If you want to prove that this process defines a left adjoint, it’s easiest to think of it as outputting the eventual image. If you want to prove it defines a right adjoint, it’s easiest to think of it as outputting the set of periodic points. That’s what I mean about them being dual points of view.

I think you blogged about this stuff when I had just gone to the Centre of Quantum Technologies and just started my blog Azimuth. I was feeling very “applied” back then, trying not to think about pure math. So I was avoiding the $n$-Café. Now I’m regressing toward the mean — my own personal mean, that is.