
November 24, 2019

Random Permutations (Part 2)

Posted by John Baez

Here’s a somewhat harder pair of problems on random permutations, with beautiful answers.

Puzzle 3. In the limit as $n \to \infty$, what is the probability that the shortest cycle in a randomly chosen permutation of an $n$-element set has length $\gt k$?

We mainly need to count the permutations of an $n$-element set whose shortest cycle has length $\gt k$. So, we define a functor

$$ Perm_{\gt k} \colon S \to FinSet $$

from the groupoid of finite sets to the category of finite sets, sending any finite set $X$ to the finite set

$$ Perm_{\gt k}(X) = \{ \text{permutations of } X \text{ whose shortest cycle has length } \gt k \} $$

A functor from $S$ to $FinSet$ is called a finite species, and this is what my course was about. We think of a finite species $G \colon S \to FinSet$ as a map sending any finite set $X$ to the set of ‘$G$-structures’ we can put on $X$. Any finite species $G$ has a ‘generating function’

$$ |G|(x) = \sum_{n \ge 0} \frac{|G(n)|}{n!} x^n $$

where in $G(n)$ we are using $n$ as our notation for some $n$-element set. Our basic strategy will be to compute the generating function of the species $Perm_{\gt k}$ and use that to count permutations of an $n$-element set whose shortest cycle has length $\gt k$. But in fact, to save work, we will just work out the asymptotics as $n \to \infty$.

We can compose species: a $G \circ H$-structure on $X$ consists of a way of writing $X$ as a union of disjoint parts, putting a $G$-structure on the set of parts, and putting an $H$-structure on each part. We have

$$ |G \circ H| = |G| \circ |H| $$

where at right we’re composing generating functions.

So, if we can write $Perm_{\gt k}$ as a composite of two species whose generating functions we know, we can work out $|Perm_{\gt k}|$.

In fact $Perm_{\gt k}$ is the composite of two species called $Exp$ and $Cyc_{\gt k}$. The species $Exp$ is called ‘being a finite set’: we can put an $Exp$-structure on any finite set in exactly one way. The species $Cyc_{\gt k}$ is called ‘being a cyclically ordered set of cardinality $\gt k$’: we can put a $Cyc_{\gt k}$-structure on a finite set only if its cardinality is $\gt k$, and then a $Cyc_{\gt k}$-structure on it is just a cyclic ordering.

We have

$$ Perm_{\gt k} \cong Exp \circ Cyc_{\gt k} $$

because a permutation with cycles of length $\gt k$ on a finite set $X$ is the same as a way of chopping $X$ into subsets of cardinality $\gt k$ and cyclically ordering each one. Thus, we have

$$ |Perm_{\gt k}| = |Exp| \circ |Cyc_{\gt k}| $$

and we just need to compute the two generating functions on the right-hand side.

The generating function of $Exp$ is easy to compute:

$$ |Exp|(x) = \sum_{n \ge 0} \frac{1}{n!} x^n = \exp(x) $$

and this explains the name $Exp$. The generating function of $Cyc_{\gt k}$ is

$$ |Cyc_{\gt k}|(x) = \sum_{n \gt k} \frac{(n-1)!}{n!} x^n = \sum_{n \gt k} \frac{x^n}{n} $$

since there are $(n-1)!$ cyclic orderings on an $n$-element set. We have

$$ \sum_{n \ge 1} \frac{x^n}{n} = \ln\left(\frac{1}{1-x}\right) $$

(just integrate the geometric series) so we get

$$ |Cyc_{\gt k}|(x) = \ln\left(\frac{1}{1-x}\right) - \left(x + \frac{x^2}{2} + \cdots + \frac{x^k}{k}\right) $$

Since

$$ |Perm_{\gt k}| = |Exp| \circ |Cyc_{\gt k}| $$

we get

$$ |Perm_{\gt k}|(x) = e^{\ln\left(\frac{1}{1-x}\right) - \left(x + \frac{x^2}{2} + \cdots + \frac{x^k}{k}\right)} $$

or simply

$$ |Perm_{\gt k}|(x) = \frac{e^{-\left(x + \frac{x^2}{2} + \cdots + \frac{x^k}{k}\right)}}{1-x} $$

This is really a kind of formula for $|Perm_{\gt k}(n)|$, the number of permutations of an $n$-element set whose shortest cycle has length $\gt k$. After all, by the definition of generating function, this equation says

$$ \sum_{n \ge 0} \frac{|Perm_{\gt k}(n)|}{n!} x^n = \frac{e^{-\left(x + \frac{x^2}{2} + \cdots + \frac{x^k}{k}\right)}}{1-x} $$

Even better, the coefficient

$$ \frac{|Perm_{\gt k}(n)|}{n!} $$

is the probability we’re after: the probability that the shortest cycle in a randomly chosen permutation of an $n$-element set has length $\gt k$.

Unfortunately, working out these coefficients is hard.

Fortunately, we just want to know the limit of these coefficients as $n \to \infty$. And for this we can use a little result from complex analysis. I’ll prove it later:

Lemma. Suppose

$$ f(z) = \sum_{n \ge 0} c_n z^n $$

is a meromorphic function defined on a disc of radius $\gt 1$ centered at the origin, which is analytic on this disc except for a simple pole at $z = 1$. Then $\lim_{n \to \infty} c_n$ equals the residue of $-f$ at $z = 1$.

If you want to understand this lemma, I urge you to consider an example: the function $1/(1-z)$. The coefficients $c_n$ of its power series are all $1$, and it has a pole at $z = 1$ with residue $-1$. This is why we need to take the residue of $-f$ in the lemma. In fact, proving the lemma consists of reducing the general problem to this special case.

Given this lemma, the quantity we’re trying to compute:

$$ \lim_{n \to \infty} \frac{|Perm_{\gt k}(n)|}{n!} $$

is the residue of the function

$$ \frac{e^{-\left(z + \frac{z^2}{2} + \cdots + \frac{z^k}{k}\right)}}{z-1} $$

at the point $z = 1$. This function is just minus the generating function of $Perm_{\gt k}$, but I’ve changed $x$ to $z$ to emphasize that now we’re thinking of it as a function on the complex plane. It’s actually analytic everywhere except $z = 1$, and since $1/(z-1)$ has a simple pole of residue $1$ at this point, the residue of

$$ \frac{e^{-\left(z + \frac{z^2}{2} + \cdots + \frac{z^k}{k}\right)}}{z-1} $$

at $z = 1$ is just the numerator evaluated at $z = 1$, namely

$$ e^{-\left(1 + \frac{1}{2} + \cdots + \frac{1}{k}\right)} $$

So this number is the probability that the shortest cycle in a randomly chosen permutation of an $n$-element set has length $\gt k$, in the $n \to \infty$ limit.
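This limit can be checked numerically. Here is a short sketch (my own code, not from the post; the function name `prob_shortest_cycle_gt` is made up) that extracts the coefficient of $x^n$ from $e^{-(x + x^2/2 + \cdots + x^k/k)}/(1-x)$ with exact rational arithmetic:

```python
from fractions import Fraction
import math

def prob_shortest_cycle_gt(n, k):
    """Probability that a uniformly random permutation of an n-element set
    has shortest cycle of length > k, read off as the coefficient of x^n
    in exp(-(x + x^2/2 + ... + x^k/k)) / (1 - x)."""
    # Series g(x) = -(x + x^2/2 + ... + x^k/k), truncated at degree n.
    g = [Fraction(0)] * (n + 1)
    for j in range(1, min(k, n) + 1):
        g[j] = Fraction(-1, j)
    # e(x) = exp(g(x)) via the recurrence m*e_m = sum_j j*g_j*e_{m-j},
    # obtained by differentiating e = exp(g).
    e = [Fraction(0)] * (n + 1)
    e[0] = Fraction(1)
    for m in range(1, n + 1):
        e[m] = sum(Fraction(j, m) * g[j] * e[m - j] for j in range(1, m + 1))
    # Multiplying by 1/(1-x) turns coefficients into partial sums.
    return sum(e)
```

For $k = 1$ this recovers the derangement probabilities (for instance $9/24$ when $n = 4$), which converge very quickly to $e^{-1} = e^{-H_1}$; for general $k$ the coefficients approach $e^{-(1 + 1/2 + \cdots + 1/k)}$.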

This is beautiful, I think. And it leads immediately to another beautiful fact.

Puzzle 4. Give a simpler asymptotic formula for the above probability as $k \to \infty$.

Here we use the fact that

$$ 1 + \frac{1}{2} + \cdots + \frac{1}{k} = \ln k + \gamma + o(1) $$

where $\gamma \approx 0.57721$ is Euler’s constant. So, as $k \to \infty$, the probability we’ve just computed is

$$ e^{-\left(1 + \frac{1}{2} + \cdots + \frac{1}{k}\right)} \sim e^{-(\ln k + \gamma)} = \frac{1}{e^\gamma \, k} $$

When Euler discovered his constant $\gamma$ in his 1776 paper “On a memorable number naturally occurring in the summation of the harmonic series”, he voiced his suspicion that $e^\gamma$ would be interesting… and here is some evidence that it is!
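The asymptotics are easy to sanity-check numerically; here is a sketch (my own code, with made-up names `harmonic` and `EULER_GAMMA`):

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, gamma

def harmonic(k):
    """H_k = 1 + 1/2 + ... + 1/k."""
    return sum(1.0 / i for i in range(1, k + 1))

# Since e^{-H_k} ~ 1/(e^gamma k), the product k * e^{-H_k}
# should approach e^{-gamma} = 0.5614... as k grows.
for k in (10, 1000, 100000):
    print(k, k * math.exp(-harmonic(k)))
```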

By the way, I learned about these questions here:

They were first answered in 1952:

To wrap up, let me prove that lemma. It’s just a special case of a much more general and useful result on the asymptotic behavior of the coefficients of a power series:

  • Philippe Flajolet and Robert Sedgewick, Analytic Combinatorics, Cambridge University Press, Cambridge, 2009.

But once you’ve seen the proof of this special case, it’s easy to generalize.

Lemma. Suppose

$$ f(z) = \sum_{n \ge 0} c_n z^n $$

is a meromorphic function defined on a disc of radius $\gt 1$ centered at the origin, which is analytic on this disc except for a simple pole at $z = 1$. Then $\lim_{n \to \infty} c_n$ equals the residue of $-f$ at $z = 1$.

Proof. The idea is to ‘subtract off the singularity’. We can write

$$ f(z) = g(z) + \frac{a}{1-z} $$

where $g$ is analytic on a disc of radius $\gt 1$ and $a$ is the residue of $-f$ at $z = 1$. If we write

$$ g(z) = \sum_{n \ge 0} d_n z^n $$

then we have

$$ c_n = d_n + a $$

for all $n$. But the Cauchy–Hadamard theorem says that

$$ \limsup_{n \to \infty} |d_n|^{1/n} = 1/R $$

where $R$ is the radius of convergence of $\sum_{n \ge 0} d_n z^n$, and since $R \gt 1$ we have $d_n \to 0$ as $n \to \infty$. Thus

$$ c_n \to a $$

as was to be shown.   █

Posted at November 24, 2019 2:43 AM UTC


Re: Random Permutations (Part 2)

Wow, this is a gorgeous application of generating functors and generating functions. It highlights the differences in the approaches and where they both shine: the functors rely on explicit bijections and looking into the process of building up the thing you’re counting (so they feel very concrete and constructive), while the functions are truly analytical objects that can be worked with complex analytically (and so they partake in the magic that is complex analysis)!

Posted by: David Jaz Myers on November 24, 2019 10:22 PM

Re: Random Permutations (Part 2)

Thanks! Yes, I’m looking for examples that really illustrate the interaction between species and complex analysis. I’m also hoping there’s more than meets the eye here: maybe a kind of categorification of certain pieces of complex analysis.

For example, for any open set $U \subseteq \mathbb{C}$ there’s a ring $A(U)$ of analytic functions on $U$. But also for any $U$ containing the origin there’s a 2-rig of finite species whose generating functions lie in $A(U)$, where the 2-rig operations are the usual addition and multiplication of species (that is, coproduct or more generally finite colimits, and the Day convolution tensor product). I wish this were a sheaf (or stack) of 2-rigs on $\mathbb{C}$, but it seems I only have access to these generating functions through their Taylor series at the origin. I need to think about how analytic continuation works at the categorified level.

Posted by: John Baez on November 24, 2019 10:30 PM

Re: Random Permutations (Part 2)

Here’s a question I don’t know the answer to.

Preamble: any map $f \colon X \to X$ from a finite set to itself — not necessarily a permutation! — gives rise canonically to a permutation $f|_{X_f}$ of a subset $X_f$ of $X$. It’s constructed like this: the chain of subsets

$$ f X \supseteq f^2 X \supseteq f^3 X \supseteq \cdots $$

can’t keep getting smaller forever (since they’re all finite), so it must stabilize somewhere. Call the set at which it stabilizes $X_f$. Thus,

$$ X_f = \bigcap_{n \geq 0} f^n X. $$

(This is the set of “eventually periodic” points of $f$.) Then $f|_{X_f} \colon X_f \to X_f$ is surjective and therefore bijective.

Now the question: when $X$ is large and $f \colon X \to X$ is a function chosen uniformly at random, how big do we expect $X_f$ to be?

In words only: what is the expected cardinality of the set of eventually periodic points of a random endomorphism?

In notation: for an $n$-element set $X$, let $a_n$ denote the mean cardinality of $X_f$, taken over all endomorphisms $f$ of $X$. What can be said about $a_n$ in terms of $n$, for large $n$?

I’ve phrased this in an open-ended way, because I don’t have a good sense of what a good precise question is. Obviously $a_n$ grows with $n$, so a first attempt at a precise question is this: what is $\lim_{n \to \infty} \frac{a_n}{n}$ (if it exists)?

Posted by: Tom Leinster on November 25, 2019 2:45 AM

Re: Random Permutations (Part 2)

Thanks for a great puzzle, Tom!

I should be able to figure this out: it’s a good test of my generating function expertise. But if someone knows, they should just say.

Right now I’m guessing $a_n$ is of order $\sqrt{n}$.

One reason I’m guessing this is that if we take a random permutation of a large $n$-element set, the expected size of its largest cycle is $\lambda n$, where $\lambda \approx 0.624$ is the Golomb–Dickman constant, while if we take a random endofunction, the expected size of its largest cycle (“eventual cycle”) is much smaller, just $\lambda \sqrt{\pi n/2}$.

But all this “largest cycle” stuff is distracting clutter when it comes to your question; your question is more fundamental.

Posted by: John Baez on November 25, 2019 3:29 AM

Re: Random Permutations (Part 2)

By the way, here’s a fun thing I covered in my class.

There’s a species $End$ where $End(X)$ is the set of endomorphisms of a finite set $X$. There’s a species $Perm$ where $Perm(X)$ is the set of permutations of $X$. And there’s a species $Tree$ where $Tree(X)$ is the set of rooted trees with $X$ as vertices.

We can compose species, and we have

$$ End \cong Perm \circ Tree $$

This says an endomorphism of $X$ is a way of chopping $X$ into parts, putting a permutation on the set of parts, and putting a rooted tree structure on each part. This expresses how, as we repeatedly apply the endomorphism to the elements of $X$, they flow down to the roots of trees, and from then on they get permuted.

Indeed there’s a species $Cyc$ where $Cyc(X)$ is the set of cyclic orderings of $X$ and a species $Exp$ where $Exp(X)$ is the one-element set, and we have

$$ Perm \cong Exp \circ Cyc $$

Combining these facts we get

$$ End \cong Exp \circ Cyc \circ Tree $$

This expresses how an endofunction is a disjoint union of ‘connected components’, each of which consists of a cycle and trees rooted at each point of the cycle.

I would like to use this to solve your question.

Posted by: John Baez on November 25, 2019 5:21 AM

Re: Random Permutations (Part 2)

That is a fun thing!

I feel as if my question should be answerable, but I don’t have your skills in generating functions, sadly, nor can I think of any sneaky argument. Or not yet.

Posted by: Tom Leinster on November 25, 2019 8:00 PM

Re: Random Permutations (Part 2)

I shouldn’t overpromise: I’m trying to build up my skills with generating functions, but I’m not yet a master.

Even if this identity

$$ End \cong Exp \circ Cyc \circ Tree $$

doesn’t help solve your problem, I think it’s something more mathematicians should know: perhaps not phrased in the language of species, but simply in pictures. I bet more grad students know the classification of finite abelian groups than the structure of endofunctions on finite sets! That’s wack.

Posted by: John Baez on November 25, 2019 9:59 PM

Re: Random Permutations (Part 2)

I bet more grad students know the classification of finite abelian groups than the structure of endofunctions on finite sets! That’s wack.

That is, indeed, wack. I guess we teach group theory rather than monoid theory because there are so many more gems to be found in the former. (As far as I know; though maybe I wouldn’t, as like almost everyone else, I know more about groups than monoids.) But it seems like the balance has tilted too far towards groups/permutations and away from monoids/endofunctions.

Posted by: Tom Leinster on November 25, 2019 10:14 PM

Re: Random Permutations (Part 2)

I suspect your initial guess was right.

To recap: given an endofunction $f$ of a finite set $X$, put

$$ \mathrm{im}^\infty(f) = \bigcap_{n \geq 0} f^n X. $$

(I called it $X_f$ before.) For $n \geq 1$, let $a_n$ denote the expected cardinality of $\mathrm{im}^\infty(f)$ for a random endofunction $f$ of an $n$-element set. You more or less conjectured that

$$ \frac{a_n}{\sqrt{n}} \to \frac{1}{\lambda\sqrt{\pi/2}} = 1.27594\ldots $$

as $n \to \infty$, where $\lambda$ is the Golomb–Dickman constant. I believe that’s correct.

My argument is a combination of mathematical argument and computer experiment. Here goes.

For the mathematical part, I’ll want some terminology. Let $f$ be an endofunction of a finite set $X$. I mentioned before that $f$ restricts to a permutation of $\mathrm{im}^\infty(f)$. I’ll call this the permutation induced by $f$, and I’ll call $\mathrm{im}^\infty(f)$ the eventual image of $f$.

Given $1 \leq k \leq n$ and a permutation $\sigma$ of $\{1, \ldots, k\}$, how many endofunctions of $\{1, \ldots, n\}$ have $\sigma$ as their induced permutation? By what we know about the structure of endofunctions on a finite set, the answer is actually independent of $\sigma$. It’s the number $T_{n,k}$ of rooted forests with vertex set $\{1, \ldots, n\}$ and with $\{1, \ldots, k\}$ as the set of roots.

Cayley’s formula easily implies that $T_{n,k} = k n^{n-k-1}$. At least, so says Wikipedia, though I didn’t check.

So, given $1 \leq k \leq n$ and a prescribed permutation $\sigma$ of $\{1, \ldots, k\}$, the number of endofunctions of $\{1, \ldots, n\}$ whose induced permutation is $\sigma$ is $k n^{n-k-1}$.

Since there are $k!$ permutations of $\{1, \ldots, k\}$, the number of endofunctions of $\{1, \ldots, n\}$ with eventual image $\{1, \ldots, k\}$ is $k! \cdot k n^{n-k-1}$.

Of course, the same is true for any other $k$-element subset of $\{1, \ldots, n\}$. So, the number of endofunctions of $\{1, \ldots, n\}$ whose eventual image has cardinality $k$ is

$$ \binom{n}{k} k! \cdot k n^{n-k-1} = \frac{n! \, k \, n^{n-k-1}}{(n-k)!}. $$

This is true for all $1 \leq k \leq n$.
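This count can be checked by brute force for small $n$. Here is a sketch (my own code, not from the thread; `eventual_image` and `check` are made-up names) that enumerates all $n^n$ endofunctions of $\{0, \ldots, n-1\}$ and tallies eventual-image sizes:

```python
from fractions import Fraction
from itertools import product
from math import factorial

def eventual_image(f):
    """The stabilized set f^m(X), for f given as a tuple (f[0], ..., f[n-1])."""
    s = frozenset(range(len(f)))
    while True:
        t = frozenset(f[x] for x in s)
        if t == s:          # the chain f(X) >= f^2(X) >= ... has stabilized
            return s
        s = t

def check(n):
    """Tally endofunctions by eventual-image size and compare with the
    predicted count n!/(n-k)! * k * n^(n-k-1)."""
    counts = [0] * (n + 1)
    for f in product(range(n), repeat=n):
        counts[len(eventual_image(f))] += 1
    for k in range(1, n + 1):
        predicted = Fraction(factorial(n) * k, factorial(n - k)) * Fraction(n) ** (n - k - 1)
        assert counts[k] == predicted
    return counts
```

For $n = 4$ the tally is $[0, 64, 96, 72, 24]$: the 24 endofunctions whose eventual image is everything are exactly the permutations, and the counts sum to $4^4 = 256$.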

There are $n^n$ endofunctions of $\{1, \ldots, n\}$. So, choosing one uniformly at random, the expected cardinality of its eventual image is

$$ \frac{1}{n^n} \sum_{k=1}^n \frac{n! \, k \, n^{n-k-1}}{(n-k)!} \cdot k = \sum_{k=1}^n \frac{k^2 (n-1)!}{n^k (n-k)!}. $$

That expected cardinality is, by definition, $a_n$. So we’ve shown that

$$ a_n = \sum_{k=1}^n \frac{k^2 (n-1)!}{n^k (n-k)!}. $$

I was feeling too tired/stupid to attempt to find a closed formula for $a_n$, or to try to figure out its asymptotics without the aid of a closed formula. So instead I just put some values of $n$ into Sage.

What came out suggested that, as you predicted, $a_n$ grows like $\sqrt{n}$. And so I made a table showing $a_n$ and $a_n/\sqrt{n}$ for various values of $n$:

   n          a_n          a_n/sqrt(n)

    1    1.000000000000  1.00000000000000
    2    1.500000000000  1.06066017177982
    3    1.888888888889  1.09055050846929
    4    2.218750000000  1.10937500000000
    5    2.510400000000  1.12268501014309
   10    3.660215680000  1.15746182762620
  100   12.209960630216  1.22099606302160
 1000   39.303212926178  1.24287672209294
10000  124.999121868081  1.24999121868081
20000  176.912788799720  1.25096232638906

It looks as if $a_n/\sqrt{n}$ is converging. What to? Well, you mentioned the constant $\lambda\sqrt{\pi/2}$, whose reciprocal is $1.27594\ldots$. The numerical data above makes it plausible that this is the limit of $a_n/\sqrt{n}$. And if so, that’s the conjectured result.
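For readers who want to reproduce the table: the sum for $a_n$ can be evaluated with exact rational arithmetic by accumulating the ratio between successive terms. A sketch (my own code; the function name `a` mirrors the notation $a_n$):

```python
from fractions import Fraction

def a(n):
    """a_n = sum_{k=1}^n k^2 (n-1)! / (n^k (n-k)!), the expected size of the
    eventual image of a random endofunction of an n-element set."""
    total = Fraction(0)
    t = Fraction(1, n)           # t_k = (n-1)! / (n^k (n-k)!);  t_1 = 1/n
    for k in range(1, n + 1):
        total += k * k * t
        t *= Fraction(n - k, n)  # ratio t_{k+1} / t_k = (n-k)/n
    return total
```

For example, `a(3)` is $17/9 = 1.888\ldots$, matching the third row of the table.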

Posted by: Tom Leinster on November 25, 2019 11:53 PM

Re: Random Permutations (Part 2)

Wow, Tom! You did what I wanted to do, but without explicitly talking about species — just sort of diving in there and thinking about all the ways you could flesh out a permutation of a subset of $\{1, \ldots, n\}$ to an endofunction on all of $\{1, \ldots, n\}$. Great!

Now the fun part is studying the asymptotic behavior of your sum

$$ a_n = \sum_{k=1}^n \frac{k^2 (n-1)!}{n^k (n-k)!} $$

as $n \to \infty$.

If you look at Theorem 3.12.1 in Lagarias’ paper, which is about random endofunctions, you’ll notice the proof contains the identity

$$ \sum_{k=1}^n \frac{k (n-1)!}{n^k (n-k)!} = 1 $$

(after a trivial change of variables). This is not exactly your sum, since it contains a $k$ in the numerator instead of a $k^2$. And of course your sum grows as $n$ increases. But they’re scarily similar!

By the way, this proof is even more concerned with showing that

$$ \sum_{k=1}^n \frac{\ln(k)\,(n-1)!}{n^k (n-k)!} \; \sim \; \frac{1}{2} \ln n $$

as $n \to \infty$.

So, I suspect that these folks (Lagarias, and the people whose work he’s discussing) could also handle the asymptotics of

$$ a_n = \sum_{k=1}^n \frac{k^2 (n-1)!}{n^k (n-k)!} $$

as $n \to \infty$. Whether we can is another matter.

By the way, I strongly believe that

$$ a_n = \sum_{k=1}^n \frac{k^2 (n-1)!}{n^k (n-k)!} \sim c \sqrt{n} $$

but I’m not convinced yet that the constant $c$ is

$$ \frac{1}{\lambda \sqrt{\pi/2}} $$

Posted by: John Baez on November 26, 2019 1:23 AM

Re: Random Permutations (Part 2)

In his discussion of random endofunctions, Lagarias refers to this paper:

They compute the $m$th moment of this random variable: the length of the $i$th longest cycle of a randomly chosen endofunction on an $n$-element set, asymptotically as $n \to \infty$.

On page 549 they define a function

$$ Q_n(k) = \sum_{j=1}^n \frac{(n-1)! \, j^k}{(n-j)! \, n^j} $$

I’m going to stick with their names for variables so I don’t get confused, but your $a_n$ is their $Q_n(2)$.

On page 550 they say

$$ Q_n(2) = n Q_n(0) $$

They also give an exact formula for $Q_n(0)$ in terms of a special function called the incomplete gamma function, and also a simpler asymptotic formula good for $n \to \infty$:

$$ Q_n(0) \sim \sqrt{\pi / 2n} $$

Their formula says more (it includes lower-order terms), but it implies this.

Putting all this together we get

$$ a_n \sim c \sqrt{n} $$

where

$$ c = \sqrt{\pi / 2} = 1.25331413732\ldots $$

This matches your computer calculations quite nicely. The Golomb–Dickman constant $\lambda$ is a red herring here!

So, for those who haven’t been following carefully, we’ve proved this:

Theorem. Let $a_n$ be the expected number of periodic points for a randomly chosen endofunction of an $n$-element set. Then

$$ a_n \sim \sqrt{\frac{\pi n}{2}} $$

as $n \to \infty$.

By the way, this gives an extremely impractical but pretty way to compute $\pi$, sort of like Buffon’s needle but using random endofunctions.
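In that spirit, here is a sketch of the impractical π-computer (my own code, not from the thread; `eventual_image_size` and `estimate_pi` are made-up names): simulate random endofunctions, measure the mean size of the eventual image, and invert $a_n \approx \sqrt{\pi n / 2}$.

```python
import random

def eventual_image_size(f):
    """Size of the eventual image of f : {0,...,n-1} -> {0,...,n-1}."""
    s = set(range(len(f)))
    while True:
        t = {f[x] for x in s}
        if len(t) == len(s):   # the chain f(X) >= f^2(X) >= ... has stabilized
            return len(s)
        s = t

def estimate_pi(n, trials, seed=0):
    """Monte Carlo estimate of pi from mean eventual-image size ~ sqrt(pi*n/2)."""
    rng = random.Random(seed)
    mean = sum(
        eventual_image_size([rng.randrange(n) for _ in range(n)])
        for _ in range(trials)
    ) / trials
    return 2 * mean * mean / n
```

With a few hundred trials on a set of a couple thousand elements this lands in the rough vicinity of 3.14, though the convergence is, as promised, extremely slow.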

Posted by: John Baez on November 26, 2019 6:39 AM

Re: Random Permutations (Part 2)

Splendid, so we’re done!

And your correction to my constant shows that convergence is faster than it would have been had my constant been correct (e.g. at $n = 20000$ we’re only about $0.002$ off the actual value of $\lim_{n \to \infty} a_n/\sqrt{n}$, whereas it would have been further off with the constant I thought it was).

A couple of remarks:

You did what I wanted to do, but without explicitly talking about species — just sort of diving in there

Maybe I should be ashamed of not using species :-) I’m not as skilled with them as you are. But I think my argument was fundamentally species-theoretic, and could be nicely phrased that way. E.g. where I said this —

By what we know about the structure of endofunctions on a finite set…

— that’s probably best formalized with species. So maybe someone can take my argument and express it more nicely in terms of species.

You wrote:

If you look at Theorem 3.12.1 in Lagarias’ paper, which is about random endofunctions, you’ll notice the proof contains the identity $\sum_{k=1}^n \frac{k(n-1)!}{n^k(n-k)!} = 1$

and then observed that if you change the $k$ in the numerator of the left-hand side to a $k^2$, you get the expression for $a_n$ in my previous comment.

Actually, I implicitly (re)proved this identity too. In my argument, I found the probability $p_k$ that a random endofunction of an $n$-element set has eventual image of cardinality $k$, for each $1 \leq k \leq n$. It’s

$$ p_k = \frac{k(n-1)!}{n^k(n-k)!}. $$

Two things follow. First, since they’re probabilities,

$$ \sum_{k=1}^n p_k = 1. $$

That’s the identity from Lagarias’s paper that you mention. Second, the expected cardinality of the eventual image of a random endofunction of an $n$-element set is

$$ \sum_{k=1}^n p_k \cdot k. $$

That’s the formula for $a_n$ proved in my comment.

Posted by: Tom Leinster on November 26, 2019 1:57 PM

Re: Random Permutations (Part 2)

By the way, at first I was surprised that the eventual image (the set of periodic points) of a random endofunction would be so small. For instance, our result tells us that on a set with a million elements, the eventual image of a typical endofunction has only about 1250 elements. When we apply our function over and over again, everything in the set eventually gets sucked into this subset that takes up little more than 0.1% of the whole.

But then I mentioned this to a friend who’s an expert in dynamical systems, and she wasn’t surprised at all. Her point of view was that in general, attractors of dynamical systems take up only a small amount of the space concerned. (And an attractor of a dynamical system is exactly what the eventual image is.)

But she had no sense for why it should be a square root relationship — that is, why the number of periodic points of a random endofunction on a large set should be proportional to the square root of the cardinality of that set. I’d like to hear a good intuitive explanation of this!

Posted by: Tom Leinster on November 26, 2019 2:04 PM

Re: Random Permutations (Part 2)

Tom wrote:

Maybe I should be ashamed of not using species :-)

Don’t be.

But I think my argument was fundamentally species-theoretic, and could be nicely phrased that way.

Probably. I think doing it may require the use of 2-variable generating functions or something like that. I’m not an expert at these! In future posts I want to show how people use them to tackle questions like this:

  • What is the expected number of cycles in a random permutation of an $n$-element set?

  • What is the expected number of cycles of length $k$ in a random permutation of an $n$-element set?

  • If we choose an element of an $n$-element set, what is the probability that it lies in a cycle of length $k$?

Roughly speaking, they use one variable to keep track of elements and another to keep track of cycles. I’ll try to give a better explanation.

I think these fancier techniques might let us tackle questions like:

  • How many iterations of $f$ are required, on average, before $f^k(x)$ lies in the eventual image?

  • For each point in the eventual image, how many points not in the eventual image map to it under some power of $f$, on average?

(The second one is asking about the average size of the trees in the $End \cong Exp \circ Cyc \circ Tree$ picture of an endofunction.)

By the way, thanks for pointing out that

$$ \sum_{k=1}^n \frac{k(n-1)!}{n^k(n-k)!} = 1 $$

is just saying probabilities sum to one! That certainly helps demystify it!

And another puzzle:

But she had no sense for why it should be a square root relationship — that is, why the number of periodic points of a random endofunction on a large set should be proportional to the square root of the cardinality of that set. I’d like to hear a good intuitive explanation of this!

Here’s a crude first stab at it. What’s the average size of the image of an endofunction of an $n$-element set? I’m sad to say I have no idea. But it’s clearly an upper bound on the average size of the eventual image. If it’s already $O(\sqrt{n})$, that would explain why the eventual image is so small — though not why it’s not even smaller!

Sufficiently skilled mathematicians might have fun computing the average size of the image of $f$, the image of $f^2$, etc., and seeing how they converge to the average size of the eventual image.

Posted by: John Baez on November 27, 2019 7:31 AM

Re: Random Permutations (Part 2)

What’s the average size of the image of an endofunction of an $n$-element set?

The probability that a given element is not in the image is $\left(\frac{n-1}{n}\right)^n$, so the average size of the image is $n \left(1 - \left(\frac{n-1}{n}\right)^n\right) \approx n(1 - e^{-1})$.
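Mark’s computation is easy to confirm by simulation; a sketch (my own code, with the made-up name `mean_image_fraction`):

```python
import random

def mean_image_fraction(n, trials, seed=0):
    """Average |image(f)| / n over random endofunctions f of an n-element set.
    The set of n random values drawn below *is* the image of a random f."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += len({rng.randrange(n) for _ in range(n)})
    return total / (trials * n)

# Should be close to 1 - (1 - 1/n)^n, which tends to 1 - 1/e = 0.632...
```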

The average size of the image of $f^2$ and beyond would take more thought than I have time for at the moment.

Posted by: Mark Meckes on November 27, 2019 3:16 PM

Re: Random Permutations (Part 2)

Thanks! I got stuck right away because I didn’t think of counting the points that are not in the image.

Posted by: John Baez on November 27, 2019 5:55 PM

Re: Random Permutations (Part 2)

I figured this result on the size of the eventual image was unlikely to be new. Over on Twitter I’m starting to get some useful feedback from Louigi Addario-Berry:

You are veering into my territory! You can find some related results in Chapter 9 of Jim Pitman’s Combinatorial Stochastic Processes: https://www.stat.berkeley.edu/~aldous/206-Exch/Papers/pitman_CSP.pdf

Valentin Kolchin literally wrote the book on random mappings in 1986 (title: Random Mappings). His Lemma 3.1.4 gives the limiting distribution of $C_n/\sqrt{n}$, where $C_n$ is the number of cyclic points of a random function $f \colon [n] \to [n]$. He attributes this result to Bernard Harris (1960).

Harris’s paper: https://projecteuclid.org/euclid.aoms/1177705677

Harris efficiently crushes many simple questions about random endofunctions in Section 3 of this paper.

Posted by: John Baez on November 27, 2019 4:11 PM

Re: Random Permutations (Part 2)

On Twitter Steven Galbraith writes:

I’ve seen this result (and other interesting results of similar type) in Random mapping statistics by P. Flajolet and A. M. Odlyzko (EUROCRYPT 1989). Crypto people are interested in this due to hash collisions and Pollard rho.

This is a very well-written article, which makes extensive use of generating functions (and thus implicitly species). It gives a much slicker proof than Tom’s and mine that if $f \colon X \to X$ is a random endofunction on an $n$-element set, the expected number of periodic points of $f$ is $\sim \sqrt{\pi n / 2}$. This proof uses generating functions!

This article also says that the expected cardinality of the image of $f^k$ is asymptotically

$$ (1 - \tau_k) n $$

where

$$ \tau_0 = 0, \quad \tau_{k+1} = e^{\tau_k - 1} $$

So, the expected size of the image of $f$ is

$$ \sim (1 - e^{-1}) n \approx 0.63\, n $$

as Mark Meckes pointed out, while the image of $f^2$ has expected size

$$ \sim (1 - e^{e^{-1} - 1}) n \approx 0.47\, n $$

and so on. Since each of these asymptotic formulae is only good in the $n \to \infty$ limit, and it typically takes more and more iterations of $f$ to reach the eventual image (the set of periodic points) as $n$ gets larger, we never notice from these formulae that the eventual image has average cardinality $O(\sqrt{n})$.
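The recursion for $\tau_k$ is easy to iterate; here is a sketch (my own code, with the made-up name `tau`) of the expected image fractions $1 - \tau_k$:

```python
import math

def tau(k):
    """tau_0 = 0, tau_{j+1} = exp(tau_j - 1). The image of f^k has expected
    cardinality ~ (1 - tau_k) n for a random endofunction of a large n-set."""
    t = 0.0
    for _ in range(k):
        t = math.exp(t - 1.0)
    return t

# 1 - tau(1) = 1 - 1/e = 0.632..., 1 - tau(2) = 0.468..., and tau_k -> 1
# (the fixed point of t -> e^{t-1}) as k -> infinity, but only slowly.
```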

Posted by: John Baez on November 28, 2019 5:06 AM

Re: Random Permutations (Part 2)

Here’s a question in a similar spirit to the one I asked before, but maybe less likely to have been answered before.

Fix a finite field kk. Let TT be a random linear endomorphism of an nn-dimensional vector space over kk. What is the expected dimension of its eventual image

im (T)= i=0 im(T i)? im^\infty(T) = \bigcap_{i = 0}^\infty im(T^i)?

Or maybe a better question: what is the expected cardinality of the eventual image?

These are two different questions: expectations are arithmetic means, and the arithmetic mean of dimensions corresponds to the geometric mean of cardinalities. For example, if we take subspaces U 1U_1 and U 2U_2 of k nk^n and write q=#kq = \# k then #U i=q dimU i\# U_i = q^{\dim U_i}, so

#U 1#U 2=q 12(dimU 1+dimU 2). \sqrt{\# U_1 \cdot \# U_2} = q^{\tfrac{1}{2}(\dim U_1 + \dim U_2)}.

My dream is that if you take the formula for the expected cardinality of the eventual image of a linear endomorphism of an nn-dimensional space over kk, and evaluate at q=1q = 1, you get the formula for the expected cardinality of the eventual image of a set-theoretic endomorphism of an nn-element set — in the usual 𝔽 1\mathbb{F}_1 way.
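For what it's worth, the smallest nontrivial case can be checked by brute force. The sketch below (my own illustration, not from the thread) enumerates all 16 matrices over 𝔽 2\mathbb{F}_2 for n=2n = 2; since TT acts on a 2-dimensional space, the eventual image is already im(T 2)im(T^2).

```python
from itertools import product

# Multiply 2x2 matrices over F_2.
def matmul2(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) % 2
                       for j in range(2)) for i in range(2))

def rank2(M):
    # Over F_2, two nonzero rows of length 2 are independent iff distinct.
    rows = [r for r in M if any(r)]
    if not rows:
        return 0
    return 2 if len(rows) == 2 and rows[0] != rows[1] else 1

dims, cards = [], []
for e in product((0, 1), repeat=4):
    T = (e[0:2], e[2:4])
    d = rank2(matmul2(T, T))   # dimension of the eventual image
    dims.append(d)
    cards.append(2 ** d)       # its cardinality, with q = 2

print(sum(dims) / 16, sum(cards) / 16)  # 1.125 and 2.5
```

For comparison, the expected cardinality of the eventual image of a set-theoretic endofunction of a 2-element set is 3/2: the four functions have eventual images of sizes 2, 2, 1, 1.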

Posted by: Tom Leinster on November 28, 2019 5:02 PM | Permalink | Reply to this

Re: Random Permutations (Part 2)

We can do better:

Theorem. Let a na_n be the expected number of periodic points for a randomly chosen endofunction of an nn-element set. Then

a n=π2n 123112π2n 124135n 1+1288π2n 32+O(n 2)\displaystyle{ a_n = \sqrt{ \frac{\pi}{2}} n^{\frac{1}{2}} - \frac{1}{3} + \frac{1}{12} \sqrt{\frac{\pi}{2}} n^{-\frac{1}{2}} - \frac{4}{135} n^{-1} + \frac{1}{288} \sqrt{ \frac{\pi}{2}} n^{-\frac{3}{2}} + O(n^{-2}) }

Most of this follows easily from a formula of Purdom and Williams.

However, Christopher D. Long, whom I trust, says that Purdom and Williams are wrong about the term involving the number 91/540. He says it should be 16/540, which equals 4/135.

Curiously, 16 upside down is 91. I wonder how this mistake crept in. Could dyslexia be involved?

Purdom and Williams cite Knuth, and now I’m wondering what Knuth actually wrote.
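The theorem is easy to test numerically, since the exact expectation is the classic birthday-type sum a n= k=1 nn(n1)(nk+1)/n ka_n = \sum_{k=1}^n n(n-1)\cdots(n-k+1)/n^k (essentially Ramanujan's QQ-function). Here is a sketch of that check; the code is my own illustration.

```python
import math

def expected_cyclic(n):
    # Exact: a_n = sum_{k=1}^{n} n(n-1)...(n-k+1) / n^k.
    term, total = 1.0, 0.0
    for k in range(1, n + 1):
        term *= (n - k + 1) / n
        total += term
    return total

def asymptotic(n):
    # The five-term asymptotic series from the theorem above.
    c = math.sqrt(math.pi / 2)
    return (c * math.sqrt(n) - 1 / 3 + (c / 12) / math.sqrt(n)
            - (4 / 135) / n + (c / 288) * n ** -1.5)

for n in (2, 100, 1000):
    print(n, expected_cyclic(n), asymptotic(n))
```

Even at n=2n = 2, where the exact value is 3/23/2, the series is off by only about 3×10 43 \times 10^{-4}.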

Posted by: John Baez on November 30, 2019 1:16 AM | Permalink | Reply to this

Re: Random Permutations (Part 2)

I have a much more naive comment. The Euler constant γ\gamma (which I’m used to calling the Euler-Mascheroni constant to distinguish it from ee) has the value γ0.577\gamma \approx 0.577. On the other hand, the number e γe^{-\gamma}, which appears in the answer to Puzzle 4, has the value e γ0.561e^{-\gamma} \approx 0.561. Is there a “reason” why these numbers are “close” – within about 3% of each other? Or is this just an instance of the law of small numbers?

By the way, according to Wikipedia the number e γe^\gamma is known to be important in number theory. Wikipedia lists a few identities involving it, including the following: e γ=lim n1logp n i=1 np ip i1e^\gamma = \lim_{n \to \infty} \frac 1 {\log p_n} \prod_{i=1}^n \frac{p_i}{p_i - 1} where the product is over the primes p ip_i.
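Convergence in that identity is quite slow (the relative error decays only like 1/logp n1/\log p_n), but one can watch it numerically. A little sketch of mine, with an arbitrary cutoff, gets within a couple of percent:

```python
import math

# Sieve the primes up to N, form prod p/(p-1), and divide by log p_n.
N = 100_000
is_prime = bytearray([1]) * (N + 1)
is_prime[0] = is_prime[1] = 0
for i in range(2, int(N ** 0.5) + 1):
    if is_prime[i]:
        is_prime[i * i::i] = bytearray(len(range(i * i, N + 1, i)))
primes = [i for i in range(2, N + 1) if is_prime[i]]

prod = 1.0
for p in primes:
    prod *= p / (p - 1)

estimate = prod / math.log(primes[-1])
e_gamma = math.exp(0.5772156649015329)  # e^gamma = 1.7810724...
print(estimate, e_gamma)
```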

Posted by: Tim Campion on December 14, 2019 4:40 PM | Permalink | Reply to this

Re: Random Permutations (Part 2)

The only ‘reason’ I know that γ\gamma is close to e γe^{-\gamma} is that both these numbers are close to

x=0.5671432904097838729999686622103555497538157 x = 0.5671432904097838729999686622103555497538157 \dots

which is the positive number such that

x=e x x = e^{-x}

Indeed x=W(1)x = W(1), where WW is the Lambert WW function, a function such that

x=W(x)e W(x) x = W(x) e^{W(x)}

This was not very helpful, I know… but the Lambert WW function is nice to know about.
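If you want the digits of xx yourself, naive iteration works, since the map te tt \mapsto e^{-t} is a contraction near the root (its derivative there has absolute value x<1x \lt 1). A tiny sketch:

```python
import math

# Fixed-point iteration for x = exp(-x); the iterates oscillate
# around the root but converge geometrically (ratio about 0.567).
x = 0.5
for _ in range(200):
    x = math.exp(-x)
print(x)  # 0.567143290409783...
```

As a sanity check, the limit satisfies xe x=1x e^x = 1, i.e. x=W(1)x = W(1).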

The formula you wrote relating e γe^\gamma and prime numbers is much more interesting! It’s called Mertens’ third theorem.

It’s interesting because there’s a deep set of analogies between how a randomly chosen large natural number is expressed as a product of primes and how a randomly chosen large permutation is decomposed into cycles.

I want to discuss this later in this series of posts, though it’s a bit of a digression.

The number e γe^{-\gamma} shows up on both sides of this analogy. On the one side it shows up in a formula for the probability that the shortest cycle in a large permutation has length >k\gt k — that’s what I discussed in this blog article. On the other side it shows up in a formula for the probability that the smallest prime factor in a large natural number is >k\gt k — this follows from Mertens’ third theorem.

Granville discusses this with some awe on page 5 of his paper.

As an analytic number theorist I find this particular piece of evidence rather special. You have to understand that the appearance of γ\gamma in Mertens’ theorem is somewhat mysterious…

Posted by: John Baez on December 14, 2019 6:32 PM | Permalink | Reply to this

Re: Random Permutations (Part 2)

I’m also charmed by Joyal’s bijective proof of the Cayley formula that Tom uses in the comments. Joyal’s proof is found in Proofs from THE BOOK, Ch. 30 – I don’t know if Joyal himself published it?

Cayley’s formula says that if T nT_n is the number of tree structures on a labeled nn-element set, then T n=n n2T_n = n^{n-2}, which is very suggestive of some relation to endofunctions, since there are n nn^n endofunctions. Joyal makes this suggestion precise. Let T n ++T_n^{++} be the number of trees on the set n̲={1,,n}\underline n = \{1,\dots,n\} equipped with an ordered pair of distinguished points (which need not be distinct). Then clearly T n ++=n 2T nT_n^{++} = n^2 T_n, so the goal is to show that T n ++=n nT_n^{++} = n^n, which Joyal does via an explicit bijection to the set of endofunctions of n̲\underline n.

The bijection uses, of course, the “structure theory of endofunctions” discussed in the comments above. Think of an endofunction f:n̲n̲f: \underline n \to \underline n as a forest on n̲\underline n equipped with a root r ir_i for each connected component of the forest and a permutation σ\sigma of these roots (the set I={r 1,,r k}I = \{r_1,\dots,r_k\} of roots is the eventual image of ff). Then note that permutations are in bijection with linear orders: we send a permutation σ:II\sigma: I \to I to the linear order L:σ(r 1)<σ(r 2)<<σ(r k)L: \sigma(r_1) \lt \sigma(r_2) \lt \dots \lt \sigma(r_k) where the roots r 1,,r kr_1,\dots,r_k are ordered via the natural order on n̲\underline n. To construct a tree on n̲\underline n with two distinguished points, use the forest structure for the points not in II, and for the points in II use the linear order LL; the distinguished points are the top and bottom of LL.

So it really all boils down to the bijection between permutations and linear orders. Note that this bijection (or the one I wrote down above at any rate) is not equivariant under relabeling. So although there is a bijection between permutations and linear orders, I think there is not an isomorphism of species between the species of permutations and the species of linear orders! Assuming I’ve gotten that right, it’s a pretty striking fact to me. What are some other important examples of species with the same generating function which are not isomorphic?
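Just to see Cayley's formula with my own eyes, here is a brute-force count (an illustration of mine, feasible only for tiny nn): it enumerates the (n1)(n-1)-edge subsets of the complete graph and keeps the acyclic ones, which are then automatically connected.

```python
from itertools import combinations

def count_labeled_trees(n):
    # An acyclic graph with n vertices and n-1 edges is a spanning tree.
    edges = list(combinations(range(n), 2))
    count = 0
    for subset in combinations(edges, n - 1):
        parent = list(range(n))

        def find(a):  # union-find with path compression
            while parent[a] != a:
                parent[a] = parent[parent[a]]
                a = parent[a]
            return a

        acyclic = True
        for u, v in subset:
            ru, rv = find(u), find(v)
            if ru == rv:
                acyclic = False  # adding this edge would close a cycle
                break
            parent[ru] = rv
        count += acyclic
    return count

for n in (3, 4, 5):
    t = count_labeled_trees(n)
    print(n, t, t == n ** (n - 2), n * n * t == n ** n)
```

The last column checks Joyal's identity T n ++=n 2T n=n nT_n^{++} = n^2 T_n = n^n directly.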

Posted by: Tim Campion on December 14, 2019 6:39 PM | Permalink | Reply to this

Re: Random Permutations (Part 2)

You are correct that those species aren’t isomorphic. Species come with an S nS_n-action on objects of size nn; the action on total orders is transitive, while the action on permutations is by conjugation and so has many orbits.

I’m not aware of any natural examples of nonisomorphic species with the same generating function that aren’t built from that one in some way.

Posted by: lambda on December 14, 2019 8:54 PM | Permalink | Reply to this

Re: Random Permutations (Part 2)

lambda wrote:

I’m not aware of any natural examples of nonisomorphic species with the same generating function that aren’t built from that one in some way

…where “that one” means total orders vs. permutations.

That’s a good challenge. Here’s what might be an example.

Any finite group has the same number of conjugacy classes as irreducible representations (over \mathbb{C}), but there’s no canonical bijection between the two things. So let’s define species FF and HH as follows: for a finite set XX,

  • F(X)F(X) is the set of pairs (*,C)(\ast, C) where *\ast is a group structure on XX and CC is a conjugacy class of (X,*)(X, \ast)

  • H(X)H(X) is the set of pairs (*,α)(\ast, \alpha) where *\ast is a group structure on XX and α\alpha is an isomorphism class of irreducible representations of (X,*)(X, \ast).

Then FF and HH have the same generating function, but I think they’re not isomorphic as species.

If one wanted to prove this, one would have to rule out the possibility that FF and HH are isomorphic as species in a way that changes the group structure. The point is that given a finite group (X,*)(X, \ast), conceivably there’s a different group structure \star on XX such that the conjugacy classes of (X,*)(X, \ast) correspond naturally to the irreducible representations of (X,)(X, \star).

I have a memory of someone telling me that for any finite group (X,*)(X, \ast), there is some associated group with the property just described, a kind of dual of (X,*)(X, \ast). (Is it the Langlands dual?) But it seems unlikely that this would have the same order as the original group, which it would have to in order to give an isomorphism between FF and HH.

Posted by: Tom Leinster on December 15, 2019 5:48 PM | Permalink | Reply to this

Re: Random Permutations (Part 2)

You were wowed by that observation by David Ben-Zvi back here.

Posted by: David Corfield on December 16, 2019 9:50 AM | Permalink | Reply to this

Re: Random Permutations (Part 2)

Tim wrote:

I don’t know if Joyal himself published it?

He did. You can find a reference in my recent post about Joyal’s proof of Cayley’s theorem.

Posted by: Tom Leinster on December 14, 2019 11:22 PM | Permalink | Reply to this

Re: Random Permutations (Part 2)

lambda wrote:

Species come with an S nS_n-action on objects of size nn; the action on total orders is transitive, while the action on permutations is by conjugation and so has many orbits.

I’m not aware of any natural examples of nonisomorphic species with the same generating function that aren’t built from that one in some way.

This is indeed the source of all the most exciting examples I know, except perhaps the example Tom mentioned. But it’s good to realize that we can crank out infinitely many examples at will. It’s enough to find two sets XX and YY with actions of some S nS_n that are isomorphic as sets but not as sets with S nS_n action.

So, for example, S 3S_3 acts on a 3-element set in the usual way, as permutations — but it also acts trivially on a 3-element set. These actions give two nonisomorphic species with the same generating function

x 32 \frac{x^3}{2}

The first of these is isomorphic to the species I call ‘being a pointed 3-element set’. This is a structure that can only be put on a set if it has cardinality 3, and then consists of choosing a point in that set. The second is isomorphic to ‘being a 3-element set all of whose elements are colored the same way: either red, green, or blue’.
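These two actions are easy to compare by machine. Here is a minimal sketch of mine that counts orbits, the invariant separating them:

```python
from itertools import permutations

X = (0, 1, 2)
group = list(permutations(X))  # S_3, acting on the 3-element set X

def count_orbits(act):
    seen, orbits = set(), 0
    for x in X:
        if x not in seen:
            orbits += 1
            seen |= {act(g, x) for g in group}
    return orbits

natural = lambda g, x: g[x]  # the usual action by permutations
trivial = lambda g, x: x     # the trivial action

# Same underlying 3-element set, so the same generating function x^3/2,
# but different orbit counts, hence nonisomorphic species.
print(count_orbits(natural), count_orbits(trivial))  # 1 3
```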

In my combinatorics class I showed an easy result: every species is a coproduct of ‘indecomposable’ species — i.e., those that are not coproducts of other species in a nontrivial way. Moreover, every indecomposable species comes from a transitive action of S nS_n on some set. Further, every transitive action of S nS_n on a set is isomorphic to its left action on some set S n/HS_n/H, where HS nH \subseteq S_n is a subgroup. Two such transitive actions are isomorphic iff the subgroups are conjugate.

So, ‘classifying’ species boils down to classifying subgroups of symmetric groups up to conjugacy. This latter task is hopeless unless we restrict to symmetric groups S nS_n with fairly small nn. But this viewpoint on species is still a great alternative to the usual one! Instead of looking for charismatic examples of species and studying those in detail one at a time, we can treat species as built out of ‘atoms’ — transitive actions of symmetric groups — and take a more ‘reductionistic’ approach.
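As a small illustration of this reductionistic viewpoint (my own sketch), one can enumerate the ‘atoms’ at n=3n = 3 by brute force: S 3S_3 has six subgroups in four conjugacy classes (trivial, the three conjugate order-2 subgroups, A 3A_3, and S 3S_3 itself), so there are exactly four indecomposable species concentrated on 3-element sets.

```python
from itertools import permutations, combinations

elems = list(permutations(range(3)))  # S_3 as tuples, order 6
identity = tuple(range(3))

def compose(g, h):                    # (g h)(x) = g(h(x))
    return tuple(g[h[x]] for x in range(3))

def inverse(g):
    inv = [0] * 3
    for i, gi in enumerate(g):
        inv[gi] = i
    return tuple(inv)

# A finite subset containing the identity and closed under composition
# is a subgroup; by Lagrange its order divides 6.
subgroups = []
for r in (1, 2, 3, 6):
    for sub in combinations(elems, r):
        s = set(sub)
        if identity in s and all(compose(g, h) in s for g in s for h in s):
            subgroups.append(frozenset(s))

# Sort the subgroups into conjugacy classes H ~ g H g^{-1}.
classes = []
for H in subgroups:
    if not any(H in c for c in classes):
        classes.append({frozenset(compose(compose(g, h), inverse(g))
                                  for h in H) for g in elems})

print(len(subgroups), len(classes))  # 6 4
```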

Posted by: John Baez on December 15, 2019 6:28 PM | Permalink | Reply to this
