## October 4, 2021

### Stirling’s Formula

#### Posted by John Baez Stirling’s formula says

$\displaystyle{ n! \sim \sqrt{2 \pi n} \, \left(\frac{n}{e}\right)^n }$

where $\sim$ means that the ratio of the two quantities goes to $1$ as $n \to \infty.$

Where does this formula come from? In particular, how does the number $2\pi$ get involved? Where is the circle here?

To understand these things, I think a nonrigorous argument that can be made rigorous is more useful than a rigorous proof with all the ε’s dotted and the δ’s crossed. It’s important, I think, to keep the argument short. So let me do that.

The punchline will be that the $2\pi$ comes from this formula:

$\displaystyle{ \int_{-\infty}^\infty e^{-x^2/2} \, d x = \sqrt{2 \pi} }$

And this, I hope you know, comes from squaring both sides and converting the left side into a double integral that you can do in polar coordinates, pulling out a factor of $2 \pi$ because the thing you’re integrating only depends on $r,$ not $\theta.$

$\displaystyle{ \int_0^\infty x^n e^{-x} \, d x = n! }$

This is easy to show using repeated integration by parts.

Next, we do this:

$\begin{array}{ccl} n! &=& \displaystyle{ \int_0^\infty x^n e^{-x} \, d x } \\ \\ &=& \displaystyle{ \int_0^\infty e^{n \ln x -x} \, d x } \\ \\ &=& \displaystyle{ n\int_0^\infty e^{n \ln (n y) -n y} \, d y } \\ \\ \end{array}$

In first step we’re writing $x^n$ as $e^{n \ln x}.$ In the second we’re changing variables: $x = n y.$

Next we use $\ln(n y) = \ln n + \ln y$ to bust things up:

$\displaystyle{ n! = n e^{n \ln n} \int_0^\infty e^{n \ln y -n y} \, d y }$

All the hard work will come in showing this:

$\displaystyle{ \int_0^\infty e^{n \ln y -n y} \, d y \sim \sqrt{\frac{2 \pi}{n}} \; e^{-n} }$

Given this, we get

$\displaystyle{ n! \sim n e^{n \ln n}\; \sqrt{\frac{2 \pi}{n}} \; e^{-n} }$

and simplifying we get Stirling’s formulas:

$\displaystyle{ n! \sim \sqrt{2 \pi n} \, \left(\frac{n}{e}\right)^n}$

### Laplace’s method

So to prove Stirling’s formula, the big question is: how do we get

$\displaystyle{ \int_0^\infty e^{n \ln y -n y} \, d y \sim \sqrt{\frac{2 \pi}{n}} \; e^{-n} } \; ?$

Let’s write it like this:

$\displaystyle{ \int_0^\infty e^{-n (y - \ln y)} \, d y \sim \sqrt{\frac{2 \pi}{n}} \; e^{-n} }$

The trick is to note that as $n$ gets big, the integral will become dominated by the point where $y - \ln y$ is as small as possible. We can then approximate the integral by a Gaussian peaked at that point!

Notice that

$\displaystyle{ \frac{d}{d y} (y - \ln y) = 1 - y^{-1} }$

$\displaystyle{ \frac{d^2}{d y^2} (y - \ln y) = y^{-2} }$

so the function $y - \ln y$ has a critical point at $y = 1$ and its second derivative is $1$ there, so it’s a local minimum. Indeed this point is the unique minimum of our function on the whole interval $(0,\infty).$

Then we use this:

Laplace’s Method. Suppose $f \colon [a,b] \to \mathbb{R}$ has a unique minimum at some point $x_0 \in (a,b)$ and $f\prime\prime(x_0) > 0.$ Then

$\displaystyle{ \int_a^b e^{-n f(x)} d x \sim \sqrt{\frac{2 \pi}{n f\prime\prime(x_0)}} \; e^{-n f(x_0)} }$

as $n \to \infty.$

This says that asymptotically, the integral equals what we’d get if we replaced $f$ by the quadratic function whose value, first derivative and second derivative all match that of $f$ at the point $x_0.$ With this quadratic replacing $f,$ you can do the integral by hand—it’s the integral of a Gaussian—and you get the right hand side.

Applying this formula to the problem at hand we get

$\displaystyle{ \int_a^b e^{-n (y - \ln y)} d y \sim \sqrt{\frac{2 \pi}{n f\prime\prime(y_0)}} \; e^{-n f(y_0)} }$

where $f(y) = y - \ln y,$ $y_0 = 1,$ $f(y_0) = 1$ and $f\prime\prime(y_0) = 1.$ So we get

$\displaystyle{ \int_a^b e^{-n (y - \ln y)} d y \sim \sqrt{\frac{2 \pi}{n}} \; e^{-n} }$

and then letting $a = 0, b \to +\infty$ we get what we want.

So, from this viewpoint—and there are others—the key to Stirling’s formula is Laplace’s method of approximating an integral like

$\displaystyle{ \int_a^b e^{-n f(x)} d x }$

with a Gaussian integral. And in the end, the crucial calculation is where we do that Gaussian integral, using

$\displaystyle{ \int_{-\infty}^\infty e^{-x^2/2} \, d x = \sqrt{2 \pi} }$

You can see the whole proof of Laplace’s method here:

Physicists who have done quantum field theory will know that when push comes to shove it’s largely about Gaussian integrals. The limit $n \to \infty$ we’re seeing here is like a ‘classical limit’ where $\hbar \to 0.$ So they will be familiar with this idea.

There should be some deeper moral here, about how $n!$ is related to a Gaussian process of some sort, but I don’t know it—even though I know how binomial coefficients approximate a Gaussian distribution. Do you know some deeper explanation, maybe in terms of probability theory and combinatorics, of why $n!$ winds up being asymptotically described by an integral of a Gaussian?

For a very nice account of some cruder versions of Stirling’s formula, try this blog article:

His ‘note’, which you can find there, will give you more intuition for why something like Stirling’s formula should be true. But I think the above argument explains the $\sqrt{2 \pi}$ better than Ahlfors’ argument.

Posted at October 4, 2021 2:52 AM UTC

TrackBack URL for this Entry:   https://golem.ph.utexas.edu/cgi-bin/MT-3.0/dxy-tb.fcgi/3356

### Re: Stirling’s Formula

Nice! Now that you say it, I suppose it’s inevitable that the $\sqrt{\pi}$ in Stirling’s formula comes from the integral of $e^{-x^2}$.

Typos: just before the heading “Laplace’s method”, there are a couple of equals signs that should be $\sim$.

Posted by: Tom Leinster on October 4, 2021 11:55 AM | Permalink | Reply to this

### Re: Stirling’s Formula

Posted by: John Baez on October 4, 2021 4:39 PM | Permalink | Reply to this

### Re: Stirling’s Formula

Regarding probabilistic interpretations, Dan Piponi’s comment here gives me a small scrap of an idea. Stirling’s formula can be rearranged as

$\frac{n!}{n^n} \sim \sqrt{2\pi n}\,e^{-n},$

the left-hand side being the probability that a random endomorphism of an $n$-element set is an automorphism. But why should the right-hand side be what it is? I have no idea. We need a Gaussian!

Posted by: Tom Leinster on October 4, 2021 12:01 PM | Permalink | Reply to this

### Re: Stirling’s Formula

I can’t believe I didn’t notice that combinatorial interpretation of

$\frac{n!}{n^n}$

given how much we were talking about similar things.

I did think of this post as a first small step toward finding a category whose cardinality is $\sqrt{2 \pi}$, or something like that.

I think

$\frac{n^n}{n!}$

is the cardinality of the groupoid whose objects are functions $f: n \to n$, where a morphism from $f$ to $f'$ is a bijection $g: n \to n$ such that

$f' = g \circ f \circ g^{-1}$

That is, the groupoid of “endofunctions weakly mod conjugation by permutations”.

So it seems the cardinality of this groupoid is asymptotic to

$\frac{e^n}{\sqrt{2 \pi n}}$

Can you get the reciprocal

$\frac{n!}{n^n} \sim \frac{\sqrt{2 \pi n}}{e^n}$

as the cardinality of a category in some nice way, Tom?

Posted by: John Baez on October 4, 2021 5:00 PM | Permalink | Reply to this

### Re: Stirling’s Formula

I wonder if, in line with the common effusion that “$e^{i\pi} + 1 = 0$ unifies the five most important constants in mathematics”, you can manage to fit Cayley’s formula in there somewhere ….

Posted by: L Spice on October 4, 2021 11:39 PM | Permalink | Reply to this

### Re: Stirling’s Formula

Posted by: John Baez on October 5, 2021 4:32 PM | Permalink | Reply to this

### Re: Stirling’s Formula

You’re attributing Stirling’s formula to Cayley?

No—I’m sorry if I wrote so carelessly that I seemed to suggest that (or if you were making a joke that I missed). Rather, I was wondering if a formula that involved $n^n$ could be made to involve $n^2\cdot n^{n - 2}$, and to be interpreted somehow in terms of Cayley’s formula.

Posted by: L Spice on October 5, 2021 8:36 PM | Permalink | Reply to this

### Re: Stirling’s Formula

A weak form of Stirling’s approximation can essentially be derived from Cayley’s formula plus some basic complex analysis. The generating function $R(z)$ for rooted trees satisfies the functional equation $R(z) = z e^{R(z)}$, which can be rearranged to show that $R(z)$ is the compositional inverse of the function $f(z) = z e^{-z}$. Now $f'(z) = (z - 1)e^{-z}$, which vanishes only at $z = 1$. Thus the inverse becomes multivalued at the point $f(z) = 1/e$, so this is the radius of convergence of $R(z)$.

On the other hand, Cayley’s formula tells us the number of rooted trees on $n$ vertices is $n^{n-1}$, so $R(z) = \sum_{n \ge 1} \frac{n^{n-1}}{n!}z^n$ and the fact that the radius of convergence is $1/e$ is equivalent to $\frac{n^{n-1}}{n!} \sim e^n \cdot \text{(subexponential factors)}$ where Stirling’s formula tells us that the subexponential part is $\frac{1}{\sqrt{2\pi n^3}}$.

Posted by: lambda on October 7, 2021 6:25 AM | Permalink | Reply to this

### Re: Stirling’s Formula

Uh, I guess that should actually say $f'(z) = (1-z)e^{-z}$, not that this affects anything else!

Posted by: lambda on October 7, 2021 6:29 AM | Permalink | Reply to this

### Re: Stirling’s Formula

Oh. I completely misunderstood you: I thought you were suggesting Stirling’s formula should replace $e^{i\pi} + 1 = 0$ as our favorite formula… which perhaps it should… but then accidentally calling it Cayley’s formula!

Your actual comment is much more interesting.

As you may know, the connection between Cayley’s formula and the $n^n$ maps from the $n$-element set to itself is real and very exciting:

Joyal notes that there’s a bijection between maps $f : n \to n$ and bipointed trees with $n$ vertices.

Maybe one can get permutations into the game here somehow.

Posted by: John Baez on October 5, 2021 8:45 PM | Permalink | Reply to this

### Re: Stirling’s Formula

As you may know, the connection between Cayley’s formula and the n nn^n maps from the nn-element set to itself is real and very exciting:

Joyal notes that there’s a bijection between maps $f : n \to n$ and bipointed trees with $n$ vertices.

Indeed, that post, and Tom’s earlier comment about the probability that an endomorphism is an automorphism, are exactly what made me think of Cayley’s formula here.

Posted by: L Spice on October 5, 2021 9:40 PM | Permalink | Reply to this

### Re: Stirling’s Formula

By the way, we’ve probably all heard about the birthday problem, where you take a bunch of randomly chosen people and ask if any two have the same birthday. Let’s ignore leap years. Then by the time you get 23 people, the probability of a shared birthday is over 50%.

But what if you’ve got 365 people? Then two share a birthday except when the map from people to days of the year is a bijection. So, the probability that none share a birthday is the probability that a randomly chosen map from a 365-element set to itself is a bijection, namely

$\frac{365!}{365^{365}}$

Now Stirling’s formula kicks in and says this is close to

$\sqrt{2 \pi \cdot 365}\; e^{-365} \approx 1.45 \cdot 10^{-157}$

Posted by: John Baez on October 5, 2021 4:44 PM | Permalink | Reply to this

### Re: Stirling’s Formula

This post contains a quick probabilistic proof of Stirling’s formula using the central limit theorem. From the perspective of this proof, the deeper explanation is very roughly that:

1. $n!$ is a value of the $\Gamma$ function,

2. which is used to describe the $\Gamma$ distribution,

3. which is the distribution of a sum of independent exponential random variables,

4. which (by the CLT) is approximately Gaussian in distribution.

Posted by: Mark Meckes on October 4, 2021 1:39 PM | Permalink | Reply to this

### Re: Stirling’s Formula

Ah, excellent. And I was hoping you might appear and explain what’s going on probabilistically.

In particular, that post shows how the $\sqrt{n}$ in Stirling’s formula comes from the $\sqrt{n}$ in the central limit theorem.

Posted by: Tom Leinster on October 4, 2021 2:59 PM | Permalink | Reply to this

### Re: Stirling’s Formula

I haven’t looked at the post yet, so I’ll mention this silly idea I had and see if it’s at all related:

One could try to estimate $n!$ by estimating the mean of the product of $n$ independent random variables each chosen uniformly from $[1,n]$.

One problem is deciding what sort of mean to use. Arithmetic? Geometric might be easier, since the geometric mean of a product of random variables is the exponential of the arithmetic mean of their logarithms.

Another problem is estimating how close this mean is to $n!$.

So I don’t see this as a good way to estimate $n!$. The way I presented is a lot easier. But at least it gets a bunch of independent random variables in the picture.

Okay, now I’ll look at Aditya Ghosh’s post.

Posted by: John Baez on October 4, 2021 4:49 PM | Permalink | Reply to this

### Re: Stirling’s Formula

This idea isn’t so far from what’s done in Aditya Ghosh’s post (which incidentally is an exposition of a 1974 proof by Rasul Khan). But instead of the mean of the product of uniform random variables, it looks at the mean of the sum of recentered exponential random variables.

This unfortunately makes the connection with $n!$ more indirect.

Posted by: Mark Meckes on October 4, 2021 5:00 PM | Permalink | Reply to this

### Re: Stirling’s Formula

Another simple way to connect $n!$ to a Gaussian distribution is that the Euclidean norm squared of a two-dimensional Gaussian random vector has an exponential distribution, which has factorial moments. (The former of these two observations is closely related to the computation John mentioned of the Gaussian integral by squaring it and using polar coordinates.)

Therefore, if $X$ and $Y$ are independent Gaussian random variables with mean $0$ and variance $1/2$, then $n! = \E (X^2 + Y^2)^n.$ I don’t immediately see a way to get from this to Stirling’s formula, though.

Posted by: Mark Meckes on October 4, 2021 3:12 PM | Permalink | Reply to this

I am reminded of this post on Joyal’s isomorphism between doubly rooted labeled trees and endomorphisms. https://golem.ph.utexas.edu/category/2019/12/avisualtellingofjoyals_pro.html

Posted by: swill stroganoff on October 5, 2021 3:19 PM | Permalink | Reply to this

### Re: Stirling’s Formula

A theorem of Harer and Zagier asserts that the orbifold Euler characteristic of the moduli space $\M_{g,1}$ of genus $g$ smooth curves with one marked point equals $ζ(1 − 2g) = − B_{2g}/2g ,$ where $ζ$ is the Riemann zeta function and $B_{2g}$ is a Bernoulli number (indexing conventions vary). On the other hand Stirling’s formula comes from the leading term in an asymptotic expansion $\log \Gamma (z) = z \log z - z + 1/2 \log (2\pi/z) - \sum_{g \geq 1} \zeta(1 - 2g) \frac{z^{1-2g}}{2g-1} + \dots$ of the Gamma function. Kontsevich [Comm. Math. Phys. 92] and Mirzakhani [JAMS 07] and others have made good use of this by interpreting the orbifold Euler characteristic as a volume. Homotopy theorists will recognize $\zeta(1- 2g)/2$ mod $Z$ as the image of the $J$-homomorphism in the stable homotopy group $\pi^S_{4g-1}$ of spheres. These are widely thought not to be coincidences.
Posted by: jackjohnson on October 5, 2021 9:39 PM | Permalink | Reply to this

### Re: Stirling’s Formula

Cool! So nobody knows for sure what the relationship is?

I spent a little time trying to learn how Bernoulli numbers showed up in describing the image of the $J$-homomorphism:

But I mainly learned that:

• Adams proved that the order of $im J$ inside the $(4n-1)$st stable homotopy group of spheres is the denominator of $B_{2n}/4n$.

• He proved this using Von Staudt’s theorem, which says that the denominator of $B_{2n}$ is the product of all primes $p$ such that $p-1$ divides $2n$.

So the connection is somewhat indirect, mediated by Von Staudt’s theorem. That’s not satisfying to me!

I suspect he was just using Von Staudt’s theorem to prove something that people already suspected for other reasons — other reasons that more directly involve Bernoulli numbers, not just the prime factors of their denominators. But I haven’t managed to learn those other reasons yet. Does anyone here know what they are?

I wouldn’t be surprised if there was more progress after Adams. But I have some catching up to do.

Just for curious bystanders, let’s take $n = 1$. Then we get: the image of the $J$-homomorphism in the 3rd stable homotopy group of spheres has order equal to 4 times the product of all primes $p$ such that $p-1$ divides 2. Those primes are 2 and 3, so the order is $2 \times 3 \times 4 = 24$. And in this case the J-homomorphism is onto. So: the 3rd stable homotopy group of spheres has order 24.

This accounts for many appearances of the number 24 in mathematics.

And what Jack just said shows that this 24 is deeply connected to the 12 here:

$n! = \sqrt{2 \pi n} \, \left( \frac{n}{e} \right)^n \, \left(1 + \frac{1}{12n} + o\left(\frac{1}{n}\right) \right)$

which can be shown using a more refined version of Laplace’s method.

Posted by: John Baez on October 6, 2021 4:48 PM | Permalink | Reply to this

### Re: Stirling’s Formula

I have always wondered this too, and I don’t claim to have a compelling answer for why. But I can say a little bit about where the answer comes from.

As you know, Adams’s explanation involves giving an LOWER bound, using an invariant which relies on the Atiyah-Bott-Shapiro orientation of K-theory and Adams operations in K-theory, which is how the Bernoulli numbers $B_{2k}/4k$ come in, and an UPPER bound, via another argument (the “Adams conjecture”) which also relies on K-theory and Adams operations. The LOWER bound turns out to be the denominator of $B_{2k}/4k$, while the UPPER bound turns out to be the GCD of the set of integers of the form

(1)$m^{\text{large}} (m^{2k}-1), \qquad (m\geq 1).$

That these are the same is just a number theoretic observation (related to the von Staudt theorem). That they are produced by such similar techniques, but are otherwise not obviously the same number except after the fact, seems mysterious. Maybe someone has a good explanation for this, but I have not seen one.

The funny thing is that there is another famous result which Adams had a hand in, whose nature has always seemed mysterious to me, but whose proof uses the same kinds of ideas: the vector fields on spheres problem. Here, there is a LOWER bound on the number of independent vector fields on a sphere (the “Radon-Hurwitz numbers”), obtained by an explict construction of such vector fields from modules over Clifford algebras (and so related to K-theory), and an UPPER bound (proved by Adams), by topological arguments using K-theory and Adams operations. Again, it’s a mystery to me why the two bounds are the same.

In fact, these problems are very closely related: Adams’ proof of the vector field problem can be expressed in terms of J(X) and solved by Adams’ ideas about it, as he was very aware. I highly recommend Adams’ announcement of his results about J(X) at the 1962 ICM:

Applications of the Grothendieck-Atiyah-Hirzebruch functor K(X)

which makes the connection between the J-homomorphism and vector fields on spheres precise.

Posted by: Charles Rezk on October 31, 2021 6:58 PM | Permalink | Reply to this

### Re: Stirling’s Formula

Thanks, Charles! I’ve been poring over your book on K-theory, hoping the answer to the Bernoulli mystery lay between the lines somewhere, and not finding it — but learning a lot of great stuff in the meantime. It’s useful to hear your perspective.

Since I’m not a good homotopy theorist, I think I should concentrate on the lower bound, which is provided by an explicit construction that I can try to relate to other branches of mathematics, rather than worrying about the upper bound, which is ultimately all about why some things don’t exist.

This strategy has paid off well for me when it comes to vector fields on spheres. As an undergrad I wrote a paper about the construction of vector fields on spheres from Clifford algebra representations, which provides the lower bound in that problem. I did this in a class on Lie groups with Valentine Bargmann (formerly Einstein’s assistant, though I had no idea at the time), and I got all my information from Husemoller’s book Fibre Bundles. I’ve been fond of Clifford algebras and the algebraic aspects of Bott periodicity ever since, and that helped me a lot when I started trying to understand the octonions. But I never got around to fully understanding the proof that there aren’t more vector fields on spheres than provided by Clifford algebra representations. I can’t say I regret it — though the math surrounding this proof looks great.

So, I know about three nice proofs that there are only four finite-dimensional normed real division algebras, but I’ve never gotten around to fully understanding the proof that there are only four finite-dimensional real division algebras.

On the other hand, there’s a continuum of two-dimensional real division algebras, and I’ve never need to use any of them except the normed one!

So, I will focus on the Atiyah–Bott–Shapiro orientation of K-theory and the Adams operations in K-theory.

Posted by: John Baez on October 31, 2021 7:47 PM | Permalink | Reply to this

### Re: Stirling’s Formula

Thanks. Though I don’t think I’ve ever written a book on K-theory: are you thinking of someone else?

Posted by: Charles Rezk on November 2, 2021 2:56 PM | Permalink | Reply to this

### Re: Stirling’s Formula

Posted by: John Baez on November 3, 2021 2:38 AM | Permalink | Reply to this

### Re: Stirling’s Formula

In Adams’ third paper on $\mathrm{im} J$ (IIRC) there’s an argument using the Chern character (i.e. over $\mathbb{Q}$ rather than working one prime at a time) which shows that the image of the subgroup generated by half $\zeta(1-2k)$ in $\mathbb{Q}/\mathbb{Z}$ is isomorphic to $\mathrm{im} J_{4k-1}$ on the nose: not just that the two groups have the same order, but that the number-theoretic generator maps to the image of the Bott generator. Speaking of things that are not understood, this one gives me goosebumps.

High five from afar BTW re 24. Also the point about Laplace’s method is very well-taken. Kontsevich’s argument (which IMHO got him the Fields medal) is basically Laplace’s method exploited via Wick’s theorem (about power series expansions of $\exp f(x)$ indexed by graphs).

Posted by: jackjohnson on October 7, 2021 2:58 AM | Permalink | Reply to this

### Re: Stirling’s Formula

PS re 24: the stable homotopy group $\pi^S_3 \cong \mathbb{Z}_{24}$ is generated by the Hopf fibration $S^7 \to S^4$ defined by taking the quotient of the unit Cayley numbers by the unit quaternions $S^3 =$ SU(2). Similarly, $\pi^S_1 \cong \mathbb{Z}_2$ is generated by the fibration $S^3 \to S^2$ defined by the quotient of the unit quaternions by the unit complex numbers $S^1$.

OTOH the Hurewicz homomorphism

$\pi^S_1 \ni -1 \mapsto -1 \in K^{alg}_1(\mathbb{Z}) = \mathbb{Z}^\times = \{\pm 1\}$

to the algebraic K-theory of the integers is an isomorphism in degree one, while in degree three the Hurewicz homomorphism

$\pi^S_3 \cong \mathbb{Z}_{24} \to K^{alg}_3(\mathbb{Z}) \cong \mathbb{Z}_{48}$

is injective. Ausoni, Dundas, and Rognes, Divisibility of the Dirac magnetic monopole as a two-vector bundle over the three-sphere, Doc. Math. 2008 propose an interesting interpretation of this.

Posted by: jackjohnson on October 7, 2021 3:00 AM | Permalink | Reply to this

### Re: Stirling’s Formula

For the Schreiberian understanding in M/F-Theory as Mf-Theory concerning

the interpretation of $\mathbb{Z} \subset \pi_7(S^4) \to \pi_3(\mathbb{S}) \simeq \mathbb{Z}/24$ as reflecting, under Hypothesis H, the folklore result that in M-theory on K3 must have M-branes appearing in multiples of 24.

In the introduction this is p. 16, then in the conclusion it’s p. 86 with an index of further cross-relations on p. 85.

On p. 83 is a remark (turned into a footnote) on how this reflects, as far as one can say this, an argument by McNamara & Vafa.

Posted by: David Corfield on October 7, 2021 12:10 PM | Permalink | Reply to this

### Re: Stirling’s Formula

Thanks very much for posting this link!

It’s a beautiful and fascinating fact [?] that a twenty-four times punctured compact homotopy K3 four-manifold (arguably a quaternionic analog of an elliptic curve in complex geometry) has a stably trivializable tangent bundle. I’m only peripherally a member of the differential topology community but I believe it is not widely understood there that this has a brane-theoretic interpretation.

I’m only guessing, but I suspect this may be because the mathematical literature around branes centers around questions of mirror symmetry (complex versus symplectic geometry, derived categories of coherent sheaves, stuff like that) which seems to me pretty disjoint from the language used closer to physics (eg ADS/CFT ?). Sorting things out around this particular beautiful example could possibly be extremely helpful.

[I hope I’m not just shooting my mouth off.]

Posted by: jackjohnson on October 7, 2021 11:40 PM | Permalink | Reply to this

### Re: Stirling’s Formula

You’d think the lesson from the Atiyah-Witten partnership was that mathematics and physics can’t get close enough.

Posted by: David Corfield on October 8, 2021 7:23 AM | Permalink | Reply to this

### branes and algebraic topology

(prompted by David C.’s relaying of the previous message to the nForum, here):

The trouble with focusing on homological mirror symmetry (in the sense of equivalence of derived/$A_\infty$-categories associated with CYs) is that the interpretation of these derived categories as categories of D-branes in string theory remains based on particularly vague and hand-wavy folklore arguments:

What is fairly well established is that these derived/$A_\infty$ categories encode the branes of the “topological string” (that’s due to an argument by Konsevich which was seminally elaborated on by Costello (here). But to get from there to the physical string of string theory one needs to identify Bridgeland stability conditions with actual stability of physical D-branes. The state of the art of the argument for that seems not to have progressed since the original articles of Douglas and Aspinwall (here). With all due respect to these authors and their vision, back then, I believe that anyone who reads these articles closely will find that it is fair to say that there are large leaps of faith driven by much handwaving in there.

The section Bridgeland stability condition – As stability of BPS D-branes in the nLab entry on Bridgeland stability is my rendering of what I think, after having spent some time with his articles, is what Aspinwall’s intuition was getting at, back then. I find this a cute argument (at least the way I tried to lay this out in the entry – it is not so immediate to extract this clear picture from the old articles), but cute as the story may be, anyone will agree that this is a really vague and arguably simplistic physics story, with just the most handwaving relation to any first-principles analysis of D-brane stability by what would be their defining worldsheet CFT derivation in perturbative string theory, let alone any enhancement of that to the elusive non-perturbative string theory that is expected to be necessary to really get to the bottom of any of the nature of branes.

Also notice that another part of the community, of people who did worry more explicitly about physical stability of D-branes, ended up adopting another folklore altogether, namely that of D-brane charge quantization in K-theory. This, too, remains a handwavy conjecture with various indications that it’s not completely correct (here), but at least here people did make concrete worldsheet CFT computations in simple examples in order to check if this first-principles derivation from the axioms of (perturbative!) string theory matches the conjectured K-theory classification (upshot: sometimes it seems that it does).

So now there are two traditional proposals: the old one says that D-brane charge is in Bridgeland-stable subcategories of derived categories of the compactfication space, and a less old one says that it is in the (twisted equivariant differential) K-theory of that compatification space.

There is essentially no existing discussion of the relation of these two proposals. What one does see from time to time is an author claiming that “derived categories are better/richer/cooler than K-theory” or that “K-theory is better/richer/cooler than derived categories”.

But all this is moot until there is an actual theoretical foundation of what “D-brane” really means, in non-perturbative string theory, which is where their actual home must be. This is the glaring gap in all contemporary string theory ever since its “second revolution”: It’s meant to be all about non-perturbative effects of branes, but there is no actual theory of these, just a web (vast and suggestive as it may be) of more or less vague plausibility arguments.

The hypothesis that instead of K-theory one must use Cobordism cohomology (ultimately unstable and framed) is of course new, this is Hypothesis H, though the relevance of Cobordism and its plausbible relation to 24 brane punctures in K3-compactifications of M/F-theory also appears in the recent field of “Swampland studies” around Vafa, as referenced on that p. 83.

Posted by: Urs Schreiber on October 8, 2021 10:59 AM | Permalink | Reply to this

### Re: branes and algebraic topology

…but there is no actual theory of these, just a web (vast and suggestive as it may be) of more or less vague plausibility arguments’:

Thanks for the remarks about stability conditions (and much else) which I am completely ignorant of. I believe your assessment of the situation is quite reasonable, so far as an outsider can tell.

What I was trying to suggest is that connecting the differential topology literature of homotopy K3s with the physics literature of branes in/on them would be good for people on both sides of the math/physics boundary. I’m afraid this sounds naive because smart people have been exploring these questions for decades, but clarifying the state of play could be very helpful, especially in the current stage of uncertainty in high-energy physics.

BTW thanks to the n-maintainers for coping with my fumbled attempts at posting here.

Posted by: jackjohnson on October 8, 2021 12:06 PM | Permalink | Reply to this

### Re: Stirling’s Formula

All the way back in 2007, Dan Piponi listed appearances of 24 and its friends and suggested that there could be a whole book in the subject. Interestingly, occurrences in the stable homotopy groups of spheres were in the “not sure if this is related but it’s still interesting” part of the list. I guess that now 24-ology has actually made an advance since then!

Posted by: Blake Stacey on October 10, 2021 7:19 PM | Permalink | Reply to this

### Re: Stirling’s Formula

Some years ago Mikhail Kapranov vouchsafed me his temptation to write a book called ‘24’ (not to be confused with the TV series!), perhaps inspired by Serge Lang’s book ‘$SL_2(\mathbb{R})$’. We wonders, yes we wonders…

Posted by: jackjohnson on October 10, 2021 9:24 PM | Permalink | Reply to this

### Re: Stirling’s Formula

Posted by: Blake Stacey on October 10, 2021 10:16 PM | Permalink | Reply to this

### Re: Stirling’s Formula

Also check out Section 3 here, which discusses many different articles on Stirling’s formula in the American Mathematical Monthly:

• Jonathan M. Borwein and Robert M. Corless, Gamma and factorial in the Monthly.

Judging by the number of articles in the Monthly on the subject, Stirling’s formula approximating $n!$ for large $n$ is by far the most popular aspect of the $\Gamma$ function. There are “some remarks”, “notes”, more “remarks”; there are “simple proofs”, “direct proofs”, “new proofs”, “corrections”, “short proofs”, “very short proofs”, “elementary” proofs, “probabilistic” proofs, “new derivations”, and (our favourite title) “The (n+1)th proof”.

That should be “(n+1)st”. But I want to check out the probabilistic proofs!

Posted by: John Baez on October 6, 2021 4:11 PM | Permalink | Reply to this

### Re: Stirling’s Formula

It turns out Ilya Razenshteyn offered a probabilistic proof of Stirling’s formula on Twitter on April 6, 2021. (I’m lucky to have a friend who noticed it!)

Here is a short proof of Stirling’s formula n! ~ (n/e)^n √2πn using probability (using an appropriate version of the Central Limit Theorem (CLT)). How do we end up with e and π approximating a factorial? Please like and retweet if you like such content. (1/n)

The starting point is to consider a sum of n iid random variables distributed according to the Poisson distribution with parameter 1. Denote their sum by X. On one hand, X is just Pois(n) (easy to show by induction). So, Pr[X = n] = n^n * e^{-n} / n! . Looks familiar? (2/n)

This expression looks almost like Stirling. And we would be done if we can show Pr[X = n] ~ 1 / √2πn. This follows from a version of CLT. Heuristically, X should behave like a Gaussian G with mean n and variance n, so Pr[X = n] should be ≈ to pdf of G in n (3/n)

… which is exactly equal to 1 / √2πn. To make this argument rigorous, we need a local discrete version of CLT, which can be found here (Theorem 7): https://terrytao.wordpress.com/2015/11/19/275a-notes-5-variants-of-the-central-limit-theorem/#more-8566 We are done (modulo a few remarks)! (4/n)

First, local discrete CLT uses Fourier analysis and is not a trivial statement. One needs to carefully check that we don’t use the Stirling’s formula anywhere in its proof (we don’t). (5/n)

Second, this probabilistic proof of the Stirling’s formula is arguably quite unnatural (how did we know we need to look at the sum of n iid Poissons?). The standard textbook proof (based on numerical integration and Wallis integrals) is clumsier but arguably more natural. (6/n)

Let me know if something is not explained well (having some side effects from the vaccine…), I’ll try to expand the relevant part. The end! (7/7)

Posted by: John Baez on October 7, 2021 3:19 AM | Permalink | Reply to this

### Re: Stirling’s Formula

On a heuristic level, this is almost exactly the same idea as Khan’s proof I outlined above, using Poisson random variables as a discrete analogue of Gamma random variables. The discreteness means on the one hand that the basic idea is a bit simpler to explain, but on the other hand we need a more sophisticated version of the CLT to make the argument work.

However, Khan’s proof is also susceptible to the criticism that relying on any version of the CLT makes the whole argument rather less elementary than other approaches to Stirling. In fact, I see in the review by Borwein and Corless that you linked to above that exactly that criticism was leveled against Khan’s paper, among others.

Posted by: Mark Meckes on October 7, 2021 4:07 AM | Permalink | Reply to this

### Re: Stirling’s Formula

I think the proof I presented strikes a nice balance between being elementary and clear; Laplace’s Method takes a little work to prove but it’s quite believable, and the rest is basic calculus and not much of it.

What I’d hope for from a probabilistic proof is something else: a deeper insight into what Stirling’s formula ‘really means’ (whatever that means). I think the proof using a sum of Poisson distributions might be on the right track but it needs a better story to go along with it.

I can’t help but think about how the number of cycles of length $k$ in a random permutation of an $n$-element set approaches a Poisson distribution with mean $1/k$ as $n \to \infty$, and how these different random variables (the number of $k$ cycles for different $k$) become independent in the $n \to \infty$ limit. But I don’t see how to connect this up.

Posted by: John Baez on October 7, 2021 4:26 AM | Permalink | Reply to this

### Re: Stirling’s Formula

I think from a probabilist’s point of view you couldn’t hope for a better story than the sum of Poissons. This argument says that Stirling’s formula is simply a manifestation of the fact that $Poisson(n) \approx N(n,\sqrt{n})$ for large $n$, one of the most fundamental distributional approximations in probability.

Although this approximation can be seen as a special case of the central limit theorem since $Poisson(n)$ is the distribution of the sum of $n$ independent $Poisson(1)$ random variables, it can be proved independently. In fact the usual way to give a direct proof of a version of this approximation would be to use Stirling’s formula. So in this sense, Stirling’s formula is equivalent to that distributional approximation.

Posted by: Mark Meckes on October 7, 2021 2:09 PM | Permalink | Reply to this

### Re: Stirling’s Formula

Mark wrote:

So in this sense, Stirling’s formula is equivalent to that distributional approximation.

Great! So let me check to see if I’ve got this right. This is supposed to be a statement that’s equivalent to Stirling’s formula (or at least implies it). I’m trying to phrase it a bit informally:

Suppose raindrops are randomly landing on your patio at a constant average rate that you know. You start your stopwatch, wait until you expect $n$ drops have landed, and record how many have actually landed. Do this lots of times and get a probability distribution of results. In the limit as $n \to \infty$, this probability distribution is a Gaussian with mean $n$ and standard deviation $\sqrt{n}$.

Here of course I’m alluding to a Poisson distribution, and I’m being sloppy about the sense in which the probability distribution is converging.

If this works, what’s nice is that it doesn’t even mention factorials, $e$, or $\sqrt{2\pi}$… but then if a mathematician makes it precise and tries to check it, they’ll find Stirling’s formula (or something that implies it).

Posted by: John Baez on October 7, 2021 8:07 PM | Permalink | Reply to this

### Re: Stirling’s Formula

Sounds good to me! And I agree, it’s especially nice that this is a “reason” for Stirling’s formula that doesn’t need to mention $n!$ or any funny constants explicitly.

I do still find it disappointing that it implicitly relies on a quite strong form of distributional approximation, although as I said it’s not necessary to rely on the central limit theorem here. In fact, now that I think of it, I bet one could give a proof of a sufficiently strong version of Poisson–Gaussian approximation, without using Stirling’s formula, that’s about as elementary as your Laplace’s method argument.

Posted by: Mark Meckes on October 7, 2021 9:28 PM | Permalink | Reply to this

### Re: Stirling’s Formula

Don’t forget the regularized formula “$\infty!=\sqrt{2\pi}$” coming from $\zeta'(0)=-\log\sqrt{2\pi}$.

(FWIW this can be thought of as the determinant of an operator with eigenvalues 1, 2, 3, … so it arises when regularising certain path integrals…)

Posted by: Dan Piponi on October 7, 2021 4:41 AM | Permalink | Reply to this

### Re: Stirlings Formula

Can $\infty^{\infty}$ be regularized too?

Posted by: David Corfield on October 7, 2021 9:14 AM | Permalink | Reply to this

### Re: Stirlings Formula

Indulging in some numerology…

In the spirit of Tao’s excellent article The Euler-Maclaurin formula, Bernoulli numbers, the zeta function, and real-variable analytic continuation we want to somehow regularise $\log(N^N)=N\log N$ as $N\rightarrow\infty$.

Note that $N\log N=\int_1^N(1+\log x)d x$. Define

$I(a,N)=N\log N=\int_1^N(1+\log x)\exp(-a x)d x.$

We don’t have the limit $lim_{N\rightarrow\infty}I(0,N)$ but we can find the next best thing and study $lim_{N\rightarrow\infty}I(a,N)$ for $a$ in a neighbourhood of $0$.

Mathematica tells me that $lim_{N\rightarrow\infty}I(a,N)=(\exp(-a)+Γ(0,a))/a$. ($Γ$ is the incomplete gamma integral.) Near zero we get

$I(a,\infty)=(1-γ-\log a)/a+a/4-a^2/9+....$

The constant term is zero. So the answer, in Tao’s sense, is zero.

But $I(a,\infty)$ isn’t meromorphic in $a$ around zero so I don’t know how meaningful this is. (Eg. does it vary if I change the smoother $\exp(-a x)$? A few random examples I tried also gave zero, sometimes in an interesting way.)

Posted by: Dan Piponi on October 7, 2021 4:48 PM | Permalink | Reply to this

### Re: Stirlings Formula

Should $I(a, N) = N\log N = \int_1^N (1 + \log x)\exp(-a x)d x$ be $I(a, N) = N\log N - \int_1^N (1 + \log x)\exp(-a x)d x$?

Posted by: L Spice on October 8, 2021 5:27 PM | Permalink | Reply to this

### Re: Stirling’s Formula

Does this imply

$\infty_C! = 2\pi \; ?$

[I think $2\pi i$ is known in some circles as Deligne’s motive’…]

Posted by: jackjohnson on October 7, 2021 11:53 AM | Permalink | Reply to this

### Re: Stirling’s Formula

Posted by: James Borger on October 20, 2021 9:40 PM | Permalink | Reply to this

### Re: Stirling’s Formula

H. Suyari (2004) published a paper on a q-algebraic derivation of Stirling’s formula based on a q-version of the factorial (ie. Tsallis statistics in non-extensive versions of thermodynamics).. This generalization may provide you with some additional insights relating to q-generalizations of the Gaussian distribution.

Posted by: James Juniper on October 17, 2021 2:20 AM | Permalink | Reply to this

### Re: Stirling’s Formula

Posted by: John Baez on October 19, 2021 4:44 AM | Permalink | Reply to this

### Re: Stirling’s Formula

I’d been thinking about Stirling’s formula recently, and then remembered this Café posting. I’ll take this opportunity to record another reasonably simple proof of Stirling’s formula. The stuff in the beginning is really well-known and appears in many calculus textbooks. The end is not something I’ve seen put quite this way before, although the basic plan of attack seems fairly natural to me.

Trapezoidal sum

My own favorite way of remembering Stirling’s formula is by working through a trapezoidal sum estimate for the integral

$I_n = \int_1^n \log x\; dx = \left. x\log x - x\right|_1^n = n \log n - n + 1 = 1 + \log \frac{n^n}{e^n}$ where $\log$ is of course the natural logarithm. Partitioning the interval $[1, n]$ into subintervals of unit length, the corresponding trapezoidal estimate is

$T_n = \frac{\log 1 + \log 2}{2} + \frac{\log 2 + \log 3}{2} + \cdots + \frac{\log (n-1) + \log n}{2} = \log (n!) - \frac1{2} \log n = \log \frac{n!}{\sqrt{n}}.$

$T_n$ is an underestimate of $I_n$ because the graph of $\log$ is concave down. In other words, the error terms

$\int_k^{k+1} \log x\; dx - \frac{\log k + \log (k+1)}{2} \qquad (1)$ are positive, and it follows that the sum of these error terms from $k=1$ to $n-1$, which is $I_n - T_n$, increases with $n$. However, using the error term estimate for the trapezoidal rule, the term (1) is on the order of $1/k^2$ (the maximum value of $|(\log)''(x)| = 1/x^2$ over $[k, k+1]$). Since $\sum 1/k^2$ converges, we see that the increasing sequence $I_n - T_n$ has a uniform upper bound and therefore a limit. Hence

$\underset{n \to \infty}{\lim} (I_n - 1) - T_n = \underset{n \to \infty}{\lim} \log \frac{n^n \sqrt{n}}{e^n \cdot n!}$ exists.

We have thus shown that $n! \sim C\frac{n^n \sqrt{n}}{e^n}$ for some constant $C$.

Connection with Wallis’s formula

It remains to see that $C = \sqrt{2\pi}$. By some scaling analysis, $n! \sim C\frac{n^n \sqrt{n}}{e^n}$ leads to

$\frac{(2n)!}{n! \cdot n!} \sim \frac{2^{2n} \sqrt{2n}}{C n}.$ We have

$\frac1{C} \sim \frac{(2n)!}{2^{2n} n! \cdot n!} \cdot \frac{n}{\sqrt{2n}} = \frac{1 \cdot 3 \cdot 5 \cdot \ldots \cdot (2n-1)}{2 \cdot 4 \cdot 6 \cdot \ldots \cdot 2n} \cdot \frac{n}{\sqrt{2n}}\; , \qquad (2)$

$\frac{2}{C} \sim \frac{3 \cdot 5 \cdot \ldots \cdot (2n-1)(2n+1)}{2 \cdot 4 \cdot 6 \cdot \ldots \cdot 2n} \cdot \frac1{\sqrt{2n}}\; ,$ and now by multiplying the previous two lines,

$\frac{2}{C^2} \sim \frac{1 \cdot 3}{2 \cdot 2} \cdot \frac{3 \cdot 5}{4 \cdot 4} \cdot \ldots \cdot \frac{(2n-1)(2n+1)}{2n \cdot 2n} \cdot \frac1{2}.$

This last is recognizable as a product appearing in a ‘Wallis formula’ for $\pi$. One way of evaluating its limit is first by rewriting it as

$\frac1{2} \prod_{k=1}^n \frac{(2k-1)(2k+1)}{(2k)^2} = \frac1{2}\prod_{k=1}^n \left(1 - \frac1{(2k)^2}\right)$ and then by using the product formula for the sine function,

$\frac{\sin \pi x}{\pi x} = \prod_{k = 1}^\infty \left(1 - \frac{x^2}{k^2}\right).$ But the product formula is a whole other story, involving things like the reflection formula for the Gamma function. Let’s not get too sidetracked: remember that the original goal was to connect $C$ directly to the Gaussian integral

$\int_{-\infty}^\infty e^{-x^2/2}\; dx$ and that’s what we should do here as well.

Calculation with Gaussian integrals and the Gamma function

Let’s start afresh. The Gaussian integral is a somewhat protean object, and is inevitably connected with the Gamma function, for example with the value $\Gamma(\frac1{2})$. Indeed,

$\Gamma\left(\frac1{2}\right) = \int_0^\infty x^{-1/2} e^{-x}\; dx = 2\int_0^\infty e^{-u^2} \; du = \int_{-\infty}^\infty e^{-u^2}\; du$ (substitute $u^2$ for $x$) and here I think we may take it for granted that we know the classic and beautiful proof that

$\int_{-\infty}^\infty e^{-u^2}\; du = \sqrt{\pi}$ by squaring both sides and changing to polar coordinates. We can also take for granted the functional equation

$\Gamma(x+1) = x\Gamma(x).$ Thinking asymptotically, one may then guess the interpolation

$\Gamma(x + \frac1{2}) \approx \sqrt{x} \cdot \Gamma(x).$ Let’s see where this takes us. For large integers $n$, consider

$\frac{\Gamma(n + \frac1{2})}{\Gamma(n)\sqrt{n}} = \frac{(n-\frac1{2})(n-\frac{3}{2}) \ldots \frac1{2}\Gamma(1/2)}{(n-1)!\sqrt{n}} = \frac{(2n-1)(2n-3) \ldots 1 \cdot \sqrt{\pi}}{(2n-2)(2n-4)\ldots 2 \cdot \sqrt{n}}. \qquad (3)$ We know the limit of (3) as $n \to \infty$ exists, based on our similar expression for $\frac1{C}$:

$\frac1{C} = \underset{n \to \infty}{\lim} \frac{1 \cdot 3 \cdot 5 \cdot \ldots \cdot (2n-1)}{2 \cdot 4 \cdot 6 \cdot \ldots \cdot 2n} \cdot \frac{n}{\sqrt{2n}}. \qquad (2)$

All that remains is to show the limit of (3) is $1$ (as we suspect from our asymptotic guess), because this is equivalent to $C = \sqrt{2\pi}$. For that, all we need to do is show that we get the same limit

$L = \underset{x \to \infty}{\lim} \frac{\Gamma(x+ \frac1{2})}{\sqrt{x} \cdot \Gamma(x)} \qquad (4)$ for half-integers $x$ as we do for whole integers $x$, for then

$\frac{\Gamma(n+1)}{\sqrt{n+(1/2)}\cdot \Gamma(n + (1/2))} \cdot \frac{\Gamma(n + (1/2))}{\sqrt{n}\Gamma(n)} = \frac{n\Gamma(n)}{\sqrt{(n+1/2)n}\cdot \Gamma(n)} = \frac{n}{\sqrt{n^2 + (n/2)}} \to 1$ implies $L^2 = 1$, whence $L = 1$: exactly what we want.

The crucial fact we need here is the log-convexity of $\Gamma$, i.e., the fact that the function $\log \Gamma(x)$ is convex over $0\lt x \lt \infty$. For the moment we take this for granted. The immediate consequence of log-convexity is that for $x \gt 0$,

$\Gamma(x+ 1/2)^2 \leq \Gamma(x)\Gamma(x+1)$ so that

$\frac{\Gamma(x-1/2)}{\Gamma(x - 1)} \leq \frac{\Gamma(x)}{\Gamma(x-1/2)} \leq \frac{\Gamma(x+1/2)}{\Gamma(x)}.$ This means that these values for half-integers $x$ are nestled between values for whole integers $x$ on either side. It follows easily that we get the same limit (4) for half-integers $x$ as we do for whole integers. And now we’re done.

Log-convexity of Gamma function

This is actually not very difficult, but the basic fact on which it rests is in some sense profound:

Theorem: Log-convex functions $(a, b) \to [0, \infty)$ form a rig over the rig of nonnegative real numbers, closed under sups.

That log-convex functions are closed under pointwise multiplication is very easy. Another easy fact that can be left as an exercise is that log-convex functions are closed under taking pointwise suprema. But the real kicker is that log-convex functions are closed under pointwise addition!

I’ll just point to the nLab for that. In essence, it boils down to an application of Hölder’s inequality.

Note that for fixed positive $x$, the function $t \mapsto x^{t-1}$ is log-convex (it is log-linear after all). The integral

$\Gamma(t) = \int_0^\infty x^{t-1} e^{-x}\;dx$ can be expressed as a sup over Riemann sums

$\sum_{i=1}^n x_i^{t-1} e^{x_i} \Delta x_i$ which are positive linear combinations of log-convex functions. It follows from the theorem that $\Gamma(t)$ is log-convex.

One last thing I’ll recall is that $\Gamma$ is uniquely characterized as a real-valued function $f(x)$ defined for all real $x$ except $x = 0, -1, -2, \ldots$ and satisfying log-convexity, the functional equation $f(x+1) = x f(x)$, and $f(1) = 1$. This is the Bohr-Mollerup theorem. Various convexity arguments are put to use with a kind of terrifying vise-like effect, to squeeze down $f$ to a single possibility. Emil Artin made the Bohr-Mollerup characterization the centerpiece of his famous notes on the Gamma function, which may be found here.

Afterword

While I was thinking about this and researching this, I came across some interesting articles. One of them, “From the Wallis formula to the Gaussian distribution” by Mark Reeder, appears to be a write-up for students that is rich in historical notes, for example following the trail of Wallis in his amazing efforts toward “quadrature of the circle”, and connecting up with Euler, Gauss, and even Stieltjes. I didn’t know, for instance, that the symbol $\pi$ was first introduced by a Welsh mathematician named William Jones in 1706, after Wallis had died. (Sheesh, really? What did the old codgers like Newton and Leibniz use?) Wallis himself denoted the quantity we’d call $4/\pi$ by a symbol which, if I had to name it in TeX, I’d call \boxdot! A box with a dot inside it.

Another is a very short proof of Stirling’s formula due to Dan Romik that appeared in the American Mathematical Monthly. Normally I’m not a huge fan of “World’s Shortest Proofs”, like Don Zagier’s “one-line proof” of the two-squares theorem, because (a) usually they look a little show-offy, and (b) they often leave a lot of work to the reader to actually grok the damned thing. Rabbits out of hats and so forth. [Incidentally, a much kinder explanation of Zagier’s proof can be found somewhere on YouTube, maybe by Mathologer; I forget. IIRC it’s about 30 minutes long, but it really gives the intuition, which is indeed lovely.] But anyway, in the case of Romik, I’ll admit that my jaw dropped. To me his proof has a very Euler-MacLaurin feel to it, but very slickly arranged/optimized.

Finally, I’ll mention A Probabilistic Proof of Wallis’s Formula for $\pi$ by Stephen J. Miller, which involves Student’s t- and seems to have something in common with Romik’s set-up, although I’d need to think longer to be clear on that.

Posted by: Todd Trimble on June 17, 2022 7:35 PM | Permalink | Reply to this

### Re: Stirling’s Formula

Nice stuff, Todd! It kicked off this random thought.

Recently Owen Lynch, Joe Moeller and I came up with a general operadic approach to thermodynamics based on convex sets:

The key idea was that the set of states of a thermodynamic system should be a convex space $X$, defined as an algebra of the Lawvere theory where the operations are convex linear combinations, equipped with a concave function

$S : X \to [-\infty,\infty]$

that describes the entropy of each state. (For this to make sense, we need to make $[-\infty,\infty]$ into a convex space in a certain way. There are a couple of ways to do it extending the usual convex structure on $(-\infty,\infty)$, but you need to pick the right one.)

Now, if $S$ is concave, $\exp(-S)$ should be log-convex. We could try to work with $\exp(-S)$ rather than $S$, and there might be an advantage, now that you’ve told us log-convex functions form a rig.

However, I have no idea what it means, physically, to add two exponentiated entropy functions $\exp(-S_1)$ and $\exp(-S_2)$.

Posted by: John Baez on June 18, 2022 11:16 PM | Permalink | Reply to this

### Re: Stirling’s Formula

Thanks! Some of this operadic stuff has also been a bit on my mind recently, and somehow Stirling’s formula got folded into the mix. (I’ve been meaning to give Tai-Danae’s recent talk a listen as well.)

I’m not equipped to say what the physical significance of adding exponentiated entropy functions would be.

I like how you use the phrase “concave function”, and it makes me think how you (or at least I) rarely hear grown-up mathematicians say, outside of a calculus class, that a function is “concave up” or “concave down”, as I did in my comment. It reminds me of something Halmos once said in his automathography:

I’d […] tell my students that the exponential that $2$ is the logarithm of is not $10^2$ but $e^2$; that’s how mathematicians use the language. The use of $\ln$ is a textbook vulgarization. Did you ever hear a mathematician speak of the Riemann surface of $\ln z$?

Posted by: Todd Trimble on June 19, 2022 5:21 PM | Permalink | Reply to this

### Re: Stirlings Formula

Boltzmann’s $S = \log W$ (in natural units where $k_B = 1$) means $\exp(-S) = 1/W$ right? So you are taking a harmonic mean of the number of microstates?

Posted by: Steve Huntsman on June 20, 2022 2:36 PM | Permalink | Reply to this

Post a New Comment