Skip to the Main Content

Note:These pages make extensive use of the latest XHTML and CSS Standards. They ought to look great in any standards-compliant modern browser. Unfortunately, they will probably look horrible in older browsers, like Netscape 4.x and IE 4.x. Moreover, many posts use MathML, which is, currently only supported in Mozilla. My best suggestion (and you will thank me when surfing an ever-increasing number of sites on the web which have been crafted to use the new standards) is to upgrade to the latest version of your browser. If that's not possible, consider moving to the Standards-compliant and open-source Mozilla browser.

December 20, 2019

A Joyal-Type Proof Of The Number Of Nilpotents

Posted by Tom Leinster

Let VV be a finite set. What’s the probability that a random endofunction of VV is eventually constant — that is, becomes constant after you iterate it enough times? It’s 1/#V1/\#V. That’s basically Cayley’s tree formula, Joyal’s beautiful proof of which I showed you recently.

Now suppose that VV has the structure of a vector space. What’s the probability that a random linear operator on VV is eventually constant — or as one usually says, nilpotent? It’s still 1/#V1/\#V. I recently showed you a proof of this using generating functions, but it gave no indication of why the answer is as simple as it is. There was some miraculous cancellation.

Today I’ll show you a much more direct proof of the probability of getting a nilpotent, or equivalently the number of nilpotents. The argument shadows Joyal’s proof of Cayley’s formula.

First let me explain what I just said about the probability of a random endofunction being eventually constant, and why the answer is essentially Cayley’s formula.

Let VV be a finite set. A function T:VVT : V \to V is eventually constant if T nT^n is constant for some n0n \geq 0, or equivalently T nT^n is constant for all n0n \gg 0, or equivalently TT has only one periodic point (which must then be a fixed point). An eventually constant function looks like this:

rooted tree with directed edges

Here I’ve drawn an arrow from vv to T(v)T(v) for every vVv \in V, except when vv is the eventual constant value, which I’ve circled and coloured in pink. Actually, the arrows add no information, because they all flow in the direction towards the pink vertex. So we might as well remove them:

rooted tree

What we have now is just a rooted tree with VV as its set of vertices. Cayley’s formula tells us how many unrooted trees there are with vertex-set VV: there are n n2n^{n - 2} of them, where n=#Vn = \#V. Hence there are n n1n^{n - 1} rooted trees, or equivalently n n1n^{n - 1} eventually constant endofunctions of VV. But since there are n nn^n endofunctions of VV in all, the probability that a randomly-selected endofunction is eventually constant is

n n1/n n=1/n=1/#V. n^{n - 1}/n^n = 1/n = 1/\#V.

That’s the situation for plain, unadorned sets. But this post is about vector spaces. Of course, when we’re dealing with vector spaces we only consider the linear endomorphisms, and by linearity, the eventually constant ones are exactly the nilpotents. So:

nilpotent operators are the linear analogue of rooted trees.

The root of the tree corresponds to the origin of the vector space.

nilpotent operator

I’m going to show you a proof that a random linear operator on a finite vector space VV is eventually constant (nilpotent) with probability 1/#V1/\#V.

First, though, let me mention a mystery, in the hope that someone can shed some light. You’ll have noticed from the beginning of the post that two probabilities are the same: for a finite vector space VV, the probability of a random endomorphism being eventually constant isn’t affected by whether we’re consider all endofunctions or just the linear ones. In probabilistic notation,

P(eventually constant | linear) = P(eventually constant).

So the conditions “eventually constant” and “linear” on an endofunction are independent, in the sense of probability theory. I have no intuition for why this should be true. Does anyone else?

Background linear algebra

For the proof, we’ll need a few linear algebra facts that are absolutely elementary and general, but not as well-known as they could be.

One way in which vector spaces are more complex than sets is that every subset of a set has just one complement, a subspace of a vector space has many. By a “complement” of a linear subspace YY I mean what’s usually called a complementary subspace, that is, another subspace XX such that XY={0}X \cap Y = \{0\} and X+YX + Y is the whole space.

So, given a subspace YY of VV and any two subspaces X,WX, W complementary to YY

XY=V=WY X \oplus Y = V = W \oplus Y

— it may be that XX and WW are not equal. However, they are canonically isomorphic. Indeed, we have projection maps

V=XYX,V=WYW V = X \oplus Y \to X, \qquad V = W \oplus Y \to W

both with kernel YY, so by the first isomorphism theorem, there’s a unique isomorphism i:XWi: X \to W commuting with the projections:

isomorphism between X and W

For a point xx of XX, we can characterize i(x)i(x) as the unique point of WW such that i(x)xYi(x) - x \in Y. So ii moves points in a direction parallel to YY.

Now here’s a crucial fact: for vector spaces XX and YY,

Lin(X,Y){Lin(X, Y) \cong \{subspaces of XYX \oplus Y complementary to Y}Y\}.

This correspondence is given from left to right by the graph construction. That is, given a linear map f:XYf: X \to Y, we have its graph

W f={(x,y)XY:y=f(x)}, W_f = \{(x, y) \in X \oplus Y : y = f(x) \},

which is a subspace of XYX \oplus Y complementary to YY.

Conversely, given a subspace WW of XYX \oplus Y complementary to YY, we have the canonical isomorphism i:XWi: X \to W defined above — and as we noted, it gives a map xi(x)xx \mapsto i(x) - x from XX to YY. All we’re doing here is reading a function off from its graph in the usual way. Starting from a point xXx \in X, you take the unique point (x,y)(x, y) on the graph (WW) that’s above xx, and then y=f(x)y = f(x).

Aside From this correspondence we can easily work out how many complements any given subspace has, if we’re talking about finite-dimensional spaces over finite fields. Let YY be a kk-dimensional subspace of an nn-dimensional space VV over a field with qq elements. Choose, arbitrarily, a complementary subspace XX of YY in VV. Then the fact just proved implies that there are the same number of complementary subspaces of YY in VV as there are linear maps XYX \to Y. Since dimX=nkdim X = n - k, there are q k(nk)q^{k(n - k)} such maps. Hence YY has q k(nk)q^{k(n - k)} complements.

This formula fits with the folklore idea that a vector space over the mythical field with one element is a set (or should I say pointed set?). For if we put q=1q = 1 in the formula just established, we come to the conclusion that a subset of a set has exactly 11 complement — which is indeed correct.

The next linear algebra fact we’ll need is this:

endomorphism=nilpotentautomorphism. endomorphism = nilpotent \oplus automorphism.

That is, for any linear operator TT on a finite-dimensional vector space VV (over any field), there are unique subspaces WW and YY of VV such that:

  • V=WYV = W \oplus Y
  • TT restricts to a nilpotent operator on WW
  • TT restricts to an automorphism of YY.

For today it doesn’t matter what WW and YY actually are, but if you’re interested you can find the answer here or here: they’re the eventual kernel and eventual image, respectively.

Finally, it’s a fact that for any finite-dimensional vector space YY, there’s a (non-canonical) bijection

Aut(Y){Aut(Y) \cong \{ordered bases of Y}Y\}.

This is closely analogous to the fact that for a finite sets, the permutations are in bijection with the total orders. Both are proved using a torsor argument: we’ve got a group acting freely and transitively on a nonempty set, which implies that the group and set are in bijection.

Concretely, in this case, the proof goes like this. Choose an ordered basis y=(y 1,,y k)\mathbf{y} = (y_1, \ldots, y_k) of YY. For any automorphism SS of YY, we get a new ordered basis Sy=(S(y 1),,S(y k))S\mathbf{y} = (S(y_1), \ldots, S(y_k)). And it’s easy to see that the assignment SSyS \mapsto S\mathbf{y} defines a bijection between automorphisms and ordered bases.

Pre-proof throat-clearing

Let VV be a finite-dimensional vector space over any field. Write Nil(V)Nil(V) and End(V)End(V) for the set of nilpotent operators on VV and endomorphisms of (linear operators on) VV. We’re going to show that there’s a bijection

Nil(V)×VEnd(V). Nil(V) \times V \cong End(V).

When the base field is finite, this immediately implies that the probability of a random linear operator on VV being nilpotent is 1/#V1/#V. It also gives a formula for the number of nilpotents on VV. For if VV is nn-dimensional over a field with qq elements, there are (q n) n=q n 2(q^n)^n = q^{n^2} linear operators on VV, since to specify one we have to choose for each of nn basis elements one of the q nq^n elements of VV. Hence there are

q n 2/#V=q n 2n=q n(n1) q^{n^2}/\#V = q^{n^2 - n} = q^{n(n - 1)}

nilpotents on VV.

The bijection is not canonical. We’re going to need to choose, for each linear subspace YY of VV:

  • a complementary subspace Y Y^\bot
  • a bijection between the ordered bases of YY and the automorphisms of YY.

If we were working over \mathbb{R} or \mathbb{C} then we could choose the complementary subspaces by choosing an inner product on VV — but we’re probably not. In any case, it’s certainly true that every subspace has at least one complement, and we can just choose arbitrarily.

Aside The situation in Joyal’s proof of Cayley’s formula is very similar. There, to get a bijection between endomorphisms and vertebrates (bipointed trees), we have to choose for each subset YY a bijection between the total orders on YY and the permutations of YY. It’s unsurprising that we don’t also have to choose a complement of YY, since in the world of sets, subsets have a unique complement anyway!

Once these choices have been made, we get a bijection between Nil(V)×VNil(V) \times V and End(V)End(V) as follows.

The proof

Start with a pair (T,y)(T, y), consisting of a nilpotent TT on VV and an element yy of VV:

nilpotent and element

(I don’t want to clutter up this proof with further discussion of the analogy with Joyal’s proof of Cayley’s formula, but I’ve chosen the colours in these diagrams to match the ones in my presentation of Joyal’s proof. For full effect, you might want to open that post in another window and look at the two arguments side by side.)

Since TT is nilpotent, there’s a least k0k \geq 0 such that T ky=0T^k y = 0. It’s easy to show that

y,Ty,T 2y,,T k1y y, T y, T^2 y, \ldots, T^{k - 1}y

are linearly independent, so they form a basis for the subspace YY that they span. Here’s a picture with k=4k = 4:

nilpotent and iteration

(This picture shouldn’t be taken too seriously: I’ve drawn YY as if it were one-dimensional, in which case there’s no way we can have k=4k = 4.)

Let’s write y i=T iyy_i = T^i y:

nilpotent and ordered basis

It looks as if we’ve thrown away yy, but we haven’t: we recover it from the ordered basis y=(y 0,,y k1)\mathbf{y} = (y_0, \ldots, y_{k - 1}) as y 0y_0 if k>0k \gt 0, or as 00 otherwise.

Now TT is completely determined on YY by y\mathbf{y}, since it’s given by y iy i+1y_i \mapsto y_{i + 1} if i<k1i \lt k - 1, and by y k10y_{k - 1} \mapsto 0. So given YY and its ordered basis y\mathbf{y}, to specify TT we only need to specify its restriction to Y Y^\bot. This restriction is a map Y V=Y YY^\bot \to V = Y^\bot \oplus Y, which in turn is determined by its two components

T Y Y :Y Y ,T Y Y:Y Y. T_{Y^\bot Y^\bot} : Y^\bot \to Y^\bot, \qquad T_{Y^\bot Y} : Y^\bot \to Y.

So, our original pair (T,y)(T, y) is uniquely determined by the subspace YY, its ordered basis y\mathbf{y}, and the linear maps T Y Y T_{Y^\bot Y^\bot} and T Y YT_{Y^\bot Y}. The original operator TT is nilpotent iff T Y Y T_{Y^\bot Y^\bot} is nilpotent, as you can easily check. (The background fact is this: given a linear operator TT on a direct sum XYX \oplus Y such that TYYT Y \subseteq Y, and writing (P 0 R S)\begin{pmatrix} P &0 \\ R & S\end{pmatrix} for its block decomposition, TT is nilpotent iff PP and SS are nilpotent.) So what we’ve now got is data like this:

ordered basis and two maps

We saw earlier that linear maps Y YY^\bot \to Y correspond one-to-one with subspaces of Y Y=VY^\bot \oplus Y = V complementary to YY, via the graph construction. So we can equivalently replace T Y YT_{Y^\bot Y} by a subspace WW of VV complementary to YY:

ordered basis, complementary subspace and nilpotent on another

On the other hand, we also saw that Y Y^\bot and WW — both being subspaces of VV complementary to YY — are canonically isomorphic. So we can equivalently replace the nilpotent operator T Y Y T_{Y^\bot Y^\bot} on Y Y^\bot by a nilpotent operator RR on WW:

ordered basis, complementary subspace, and nilpotent on it

Next, ordered bases of YY are in bijection with automorphisms of YY. (There’s no canonical bijection, but that’s OK: we chose one earlier.) So we can equivalently replace the ordered basis y\mathbf{y} by an automorphism SS of y\mathbf{y}:

nilpotent and automorphism

So far we’ve shown that

pairs (nilpotent TT on VV, element yy of VV)

are in bijection with

quadruples (subspace WW of VV, nilpotent RR on WW, subspace YY of VV, automorphism SS of YY) with WY=VW \oplus Y = V.

But since every linear operator on VV decomposes uniquely as the direct sum of a nilpotent and an automorphism, such quadruples are in canonical bijection with linear operators on VV:

linear operator

Hence pairs (nilpotent TT on VV, element yy of VV) are in bijection with endomorphisms of VV:

Nil(V)×VEnd(V), Nil(V) \times V \cong End(V),

as required.

Posted at December 20, 2019 3:43 PM UTC

TrackBack URL for this Entry:

13 Comments & 0 Trackbacks

Re: A Joyal-Type Proof Of The Number Of Nilpotents

Some historical comments: the theorem on the number of nilpotents on a finite vector space (or equivalently the probability that a random linear operator is nilpotent) was apparently first proved by Fine and Herstein:

Nathan Jacob Fine and Israel Nathan Herstein, The probability that a matrix be nilpotent. Illinois Journal of Mathematics 2 (1958), 499–504.

Their proof is surprisingly complicated and computational. By 2006, things were much more streamlined, as shown by this paper of Crabb:

Michael C. Crabb, Counting nilpotent endomorphisms. Finite Fields and Their Applications 12 (2006), 151–154.

Crabb gives a quick slick proof there, which I haven’t yet absorbed.

Posted by: Tom Leinster on December 20, 2019 10:35 PM | Permalink | Reply to this

Re: A Joyal-Type Proof Of The Number Of Nilpotents

Here is a presentation of vector spaces and finite sets on the same footing, inspired by the paper of Crabb which Tom points out. Let XX be either a finite set or a vector space over a finite field, and consider either arbitrary or linear self maps of XX. I’ll use the term “size” to refer to “cardinality” or “dimension”, accordingly.

Fix a maximal chain F F_{\bullet} of nonempty subobjects of XX, where F iF_i has size ii.

Let ϕ:XX\phi : X \to X. Here are two ways to define a decreasing chain of subobjects. First,  define X 0=XX_0 = X and X k+1=ϕ(X k)X_{k+1} = \phi(X_k). Second, define Y 0=XY_0 = X and Y k+1=ϕ(F size(Y k))Y_{k+1} = \phi(F_{\mathrm{size}(Y_k)}).

I claim that the probability that a particular chain X=Z 0Z 1X = Z_0 \supseteq Z_1 \supseteq \cdots occurs as X X_{\bullet} or Y Y_{\bullet} are equal. Let ψ\psi be an automorphism of XX with ψ(F size(Z k))=Z k\psi(F_{\mathrm{size}(Z_k)}) = Z_k. We have Z =X Z_{\bullet} = X_{\bullet} if and only if ϕ\phi restricts to a surjection Z kZ k+1Z_{k} \to Z_{k+1} for each kk; we have Z =X Z_{\bullet} = X_{\bullet} if and only if ϕ\phi restricts to a surjection F size(Z k)Z k+1F_{\mathrm{size}(Z_k)} \to Z_{k+1} for each kk.  We see that ϕ\phi obeys the first condition if and only if ϕψ\phi \circ \psi obeys the second.

By definition, ϕ\phi is eventually constant/nilpotent if and only if X X_{\bullet} is eventually size zero/one. By the above, we can compute the probability that Y Y_{\bullet} is eventually size zero/one.

I claim that the eventual size of Y Y_{\bullet} is the minimal kk for which ϕ| F k+1\phi|_{F_{k+1}} is noninjective. Indeed, if ϕ| F k\phi|_{F_k} is injective, then it is easy to prove that ϕ(F k)Y j\phi(F_k) \subseteq Y_j for all jj so Y jY_j has size at least kk. However, if ϕ| F k\phi|_{F_k} is not injective, then it is easy to prove that the size of Y Y_{\bullet} strictly decreases until it drops below kk.

So, the probability that a random self map of [n][n] is eventually constant is the same as the probability that a random map [2][n][2] \to [n] is not injective, namely 1/n1/n.

And the probability that a random linear self map of 𝔽 q n\mathbb{F}_q^n is eventually zero is the same as the probability that a random map 𝔽 q𝔽 q n\mathbb{F}_q \to \mathbb{F}_q^n is not injective, namely 1/q n1/q^n.

A few posts back, I remember someone saying that the eventual image of a self map of [n][n] was of order n\sqrt{n} and wanting intuition for this. From this perspective, this is just the birthday paradox: Once knk \approx \sqrt{n}, one expects a map [k][n][k] \to [n] not to be injective.

Posted by: David Speyer on December 27, 2019 4:42 AM | Permalink | Reply to this

Re: A Joyal-Type Proof Of The Number Of Nilpotents

The q-binomial coefficients are generated by a noncommutative version of the binomial theorem. That is: suppose BA=qABBA=qAB. Then the coefficient of A iB niA^i B^{n-i} in (A+B) n(A+B)^n is the number of ii-dimensional subspaces of an nn-dimensional vector space over a field with q elements. Maybe this particular hare is running in the wrong direction, but it would be nice to see this noncommutativity used in a simple proof somehow.

Posted by: Gavin Wraith on December 21, 2019 1:27 PM | Permalink | Reply to this

Re: A Joyal-Type Proof Of The Number Of Nilpotents

That’s maybe related to a part of the proof above that I find slightly dissatisfying.

Lemma: let TT be a linear operator on a direct sum of vector spaces XYX \oplus Y, with block decomposition (P 0 R S)\begin{pmatrix} P&0\\R&S\end{pmatrix}. (The 00 in the top-right position means that YY is TT-invariant.) Then

TT is nilpotent \iff PP and SS are nilpotent.

The lemma follows from the formula

T r=(P r 0 i+j=r1S iRP j S r), T^r = \begin{pmatrix} P^r &0\\ \sum_{i + j = r - 1} S^i R P^j&S^r \end{pmatrix},

which is easily proved by induction. QED.

What I find dissatisfying is that it’s the only part of the proof that requires calculation. The rest of the proof, you could explain to a friend on a walk in the park, in such a way that they’d be completely convinced and wouldn’t have to take anything on trust.

At some point I did try breaking this calculation down into steps that were something like the fact you mention: e.g. if AA and BB are nilpotent and commute up to a scalar factor, then A+BA + B is nilpotent. In the case at hand, we’re taking A=(P 0 0 0)A = \begin{pmatrix} P&0\\0&0\end{pmatrix} and B=(0 0 R 0)B = \begin{pmatrix} 0&0\\R&0\end{pmatrix}, etc. But that didn’t really help.

qq-binomial coefficients and similar qq-things appear in the other proof of the number of nilpotents that I wrote about a couple of weeks ago.

Posted by: Tom Leinster on December 21, 2019 2:31 PM | Permalink | Reply to this

Re: A Joyal-Type Proof Of The Number Of Nilpotents

Is the only dissatisfying part having to compute the lower-left block? The diagonals of powers of a triangular matrix are well-known and can be worked out without matrix notation.

(With some abuse of notation) the block decomposition specifies that π XTπ Y=0\pi_X T \pi_Y = 0, so that π XT=π XTπ X=P\pi_X T = \pi_X T \pi_X = P Tπ Y=π YTπ Y=ST \pi_Y = \pi_Y T \pi_Y = S This is enough to show P r=π XT rP^r = \pi_X T^r, S r=T rπ YS^r = T^r\pi_Y. If P r=S r=0P^r=S^r=0 then T 2r=T r(π X+π Y)T r=0T^{2r}=T^r(\pi_X + \pi_Y)T^r=0.

Posted by: unekdoud on December 23, 2019 8:44 AM | Permalink | Reply to this

Re: A Joyal-Type Proof Of The Number Of Nilpotents

Is the only dissatisfying part having to compute the lower-left block?

Yes, that’s what I meant. It’s not that the calculation is hard, just that it stretches the important criterion “you can convince someone while walking in the park”.

Incidentally, one thing about this fact — that

(P 0 R S) \begin{pmatrix} P &0 \\ R &S \end{pmatrix}

is nilpotent iff PP and SS are — is that we’re doing noncommutative algebra. So although “everyone” knows that over a commutative ring, a triangular matrix with 0s on the diagonal is nilpotent, and knowing that, it’s not hard to guess that over a commutative ring, a triangular matrix is nilpotent iff the diagonal entries are nilpotent, it’s maybe not so obvious that this is also true over noncommutative rings. At least, I don’t know how to explain it without simply doing the calculation — either in matrix notation or in the style you’ve done it.

All that said, I do like the way you’ve managed to avoid computing the lower-left entry. That’s nice! Thanks for that.

Posted by: Tom Leinster on December 23, 2019 4:19 PM | Permalink | Reply to this

Isn’t it just that if XX and YY are the spaces giving the block decomposition then the TT sends (x,y)(x,y) to (Px,y)(Px,y') for some yy' in YY. So if P n=0P^n=0 then the image of T nT^n is contained in YY. But TT acts on YY like SS so if S m=0S^m=0 then the T n+mT^{n+m} acts as zero on XYX\oplus Y?

Posted by: Simon Wadsley on January 8, 2020 4:27 PM | Permalink | Reply to this

Re: A Joyal-Type Proof Of The Number Of Nilpotents

Thanks, I think that’s the best way to put it. When I wrote it up I put something very similar as the proof of Lemma 2.3.

Posted by: Tom Leinster on January 8, 2020 4:53 PM | Permalink | Reply to this

Re: A Joyal-Type Proof Of The Number Of Nilpotents

I’m curious how the refinements of the number of Cayley trees and (its further refinement) might play into your perspective. (The entry for the number of Cayley trees is

Posted by: Thomas C Copeland on December 22, 2019 2:42 PM | Permalink | Reply to this

Re: A Joyal-Type Proof Of The Number Of Nilpotents

Thanks for the thought, but I’m having trouble understanding that first sequence you mention. E.g. what does it mean for a partial self-map of a finite set to be nilpotent? My guess: some iterate of the map is everywhere undefined. Is that right?

In any case, the proof in my post does establish a more refined result than I actually stated. It tells us that for a finite-dimensional vector space VV and an integer kk, the pairs

(T,y) (T, y)

where T:VVT: V \to V is nilpotent and yVy \in V has nilpotence degree kk are in bijection with the endomorphisms of VV whose eventual image has dimension kk. By the “nilpotence degree” of an element yVy \in V I mean the smallest kk such that T ky=0T^k y = 0.

Posted by: Tom Leinster on December 22, 2019 6:45 PM | Permalink | Reply to this

Re: A Joyal-Type Proof Of The Number Of Nilpotents

One reason I asked the question is that I can’t follow the arguments in the cited papers for that entry. Certainly, the entry is related to trees as is Lagrange inversion, but the nilpotent arguments in this case are unfamiliar to me. Was hoping you could point me towards enlightenment and/or add to the entry.

Posted by: Thomas C Copeland on December 22, 2019 8:23 PM | Permalink | Reply to this

Re: A Joyal-Type Proof Of The Number Of Nilpotents

For Joyal’s proof, you only need to make one arbitrary choice: a total order on the full set. This gives you a total order on each subset, which gives the needed bijection between permutations and total orders on that subset.

I guess you can do a similar thing here. Fix an ordered basis v\mathbf v of VV, and this gives you a canonical ordered basis for each subspace (namely the unique one which is in reduced echelon form when expanded in terms of v\mathbf v). Once you have an ordered basis, you get the bijection between automorphisms and ordered bases. You can get complements easily enough too: given a subspace YY, you can always find a complement spanned by a subset of v\mathbf v, so just take the lexicographically first subset that works.

This is complicated enough that just saying “pick all of these things in some unspecified way” is probably the way to go though.

Posted by: lambda on December 23, 2019 4:36 PM | Permalink | Reply to this

Re: A Joyal-Type Proof Of The Number Of Nilpotents

Thanks, lambda, for setting that out. I ummed and ahhed about how to present the choice aspect.

A funny thing is that in the linear case, I figured out an algorithm producing a basis of any given subspace of VV from a basis of VV itself… but I didn’t realize that the algorithm was simply reduced echelon form!

Posted by: Tom Leinster on December 23, 2019 6:26 PM | Permalink | Reply to this

Post a New Comment