

September 7, 2024

The Space of Physical Frameworks (Part 2)

Posted by John Baez

I’m trying to work out how classical statistical mechanics can reduce to thermodynamics in a certain limit. I sketched out the game plan in Part 1 but there are a lot of details to hammer out. While I’m doing this, let me stall for time by explaining more precisely what I mean by ‘thermodynamics’. Thermodynamics is a big subject, but I mean something more precise and limited in scope.

Thermostatic systems

A lot of what we call ‘thermodynamics’, or more precisely ‘classical thermodynamics’, has nothing to do with dynamics. It’s really about systems in equilibrium, not changing, so it actually deserves to be called ‘thermostatics’. Here’s one attempt to formalize a core idea:

Definition. A thermostatic system is a convex space $X$ together with a concave function $S \colon X \to [-\infty,\infty]$. We call $X$ the space of states, and call $S(x)$ the entropy of the state $x \in X$.

There’s a lot packed into this definition:

  1. The general concept of convex space: it’s roughly a set where you can take convex combinations of points $x, y$, like $a x + (1-a) y$ where $0 \le a \le 1$.
  2. How we make $[-\infty,\infty]$ into a convex space: it’s pretty obvious, except that $-\infty$ beats $\infty$ in convex combinations, like $\frac{1}{3} (-\infty) + \frac{2}{3} \infty = -\infty$.
  3. What is a ‘concave’ function $S \colon X \to [-\infty,\infty]$: it’s a function with

$$ S(a x + (1-a) y) \ge a S(x) + (1-a) S(y) \qquad \text{for} \; 0 \le a \le 1 $$
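
To make this concrete, here’s a little example of my own (not one from the sources below) that I’ll keep returning to. Take $X = \mathbb{R}$ and define

$$ S(E) = \begin{cases} \ln E & \text{if } E > 0 \\ -\infty & \text{if } E \le 0 \end{cases} $$

This is concave: $\ln$ is concave on $(0,\infty)$, and extending a concave function by $-\infty$ outside a convex set preserves concavity. So this is a thermostatic system.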

To see all the details spelled out with lots of examples, try this:

  • John Baez, Owen Lynch and Joe Moeller, Compositional thermostatics.

We actually defined a category of thermostatic systems and maps between them.

What you can do with a thermostatic system

For now I will only consider thermostatic systems where $X = \mathbb{R}$, made into a convex space in the usual way. In these examples a state is determined solely by its energy $E \in \mathbb{R}$. I’m trying to keep things as simple as possible, and generalize later only if my overall plan actually works.

Here’s what people do in this very simple setting. Our thermostatic system is a concave function

$$ S \colon \mathbb{R} \to [-\infty, \infty] $$

describing the entropy $S(E)$ of our system when it has energy $E$. But often entropy is also a strictly increasing function of energy, with $S(E) \to \infty$ as $E \to \infty$. In this case, it’s impossible for a system to literally maximize entropy. What it does instead is maximize ‘entropy minus how much it spends on energy’, just as you might try to maximize the pleasure you get from eating doughnuts minus your displeasure at spending money. Thus, if $C$ is the ‘cost’ of energy, our system tries to maximize

$$ S(E) - C E $$

The cost $C$ is the reciprocal of a quantity called temperature:

$$ C = \frac{1}{T} $$

So, $C$ should be called inverse temperature, and the rough intuition you should have is this. When it’s hot, energy is cheap and our system’s energy can afford to be high. When it’s cold, energy costs a lot and our system will not let its energy get too high.

If $S(E) - C E$ as a function of $E$ is differentiable and has a maximum, the maximum must occur at a point where

$$ \frac{d}{d E} \left(S(E) - C E \right) = 0 $$

or

$$ \frac{d}{d E} S(E) = C $$

This gives the fundamental relation between energy, entropy and temperature:

$$ \frac{d}{d E} S(E) = \frac{1}{T} $$
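
In my running example with $S(E) = \ln E$, this fundamental relation says $1/E = 1/T$, so the energy maximizing $S(E) - C E$ is just $E = T$: the hotter the system, the more energy it holds.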

However, the math will work better for us if we use the inverse temperature.

Suppose we have a system maximizing $S(E) - C E$ for some value of $C$. The maximum value of $S(E) - C E$ is called free entropy and denoted $\Phi$. In short:

$$ \Phi(C) = \sup_E \left(S(E) - C E \right) $$

or if you prefer

$$ -\Phi(C) = \inf_E \left(C E - S(E) \right) $$
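
Continuing my running example $S(E) = \ln E$: the supremum of $\ln E - C E$ occurs where $1/E = C$, that is at $E = 1/C$, so

$$ \Phi(C) = \ln \frac{1}{C} - 1 = -\ln C - 1 $$

and the free entropy is finite for every inverse temperature $C > 0$.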

This way of defining $-\Phi$ in terms of $S$ is called a Legendre–Fenchel transform, though conventions vary about the precise definition of this transform, and also its name. Since I’m lazy, I’ll just call it the Legendre transform. For more, read this:

  • Simon Willerton, Enrichment and the Legendre–Fenchel transform: Part I, Part II.

The great thing about Legendre transforms is that if a function is convex and lower semicontinuous, when you take its Legendre transform twice you get that function back! This is part of the Fenchel–Moreau theorem. So under these conditions we automatically get another formula that looks very much like the one we’ve just seen:

$$ S(E) = \inf_C \left(C E + \Phi(C) \right) $$

When $C E + \Phi(C)$ has a minimum as a function of $C$ and it’s differentiable there, this minimum must occur at a point where

$$ \frac{d}{d C} \left(C E + \Phi(C) \right) = 0 $$

or

$$ \frac{d}{d C} \Phi(C) = -E $$
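
We can check both formulas in my running example, where $\Phi(C) = -\ln C - 1$. The infimum of $C E - \ln C - 1$ over $C$ occurs where $E - 1/C = 0$, i.e. $C = 1/E$, and its value is $1 + \ln E - 1 = \ln E = S(E)$, just as the Fenchel–Moreau theorem promises. And at that point $\Phi'(C) = -1/C = -E$, matching the formula above.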

Summary

I’m plotting a difficult course between sticking with historical conventions in thermodynamics and trying to make everything mathematically elegant. Everything above looks more elegant if we work with minus the free entropy, $\Psi = -\Phi$. Starting from a thermostatic system $S \colon \mathbb{R} \to [-\infty,\infty]$ we then get a beautifully symmetrical pair of relations:

$$ \Psi(C) = \inf_E \left(C E - S(E) \right) $$

$$ S(E) = \inf_C \left(C E - \Psi(C) \right) $$

If the first infimum is achieved at some energy $E$ and $S$ is differentiable there, then

$$ S'(E) = C $$

at this value of $E$, and this formula lets us compute the inverse temperature $C$ as a function of $E$. Similarly, if the second infimum is achieved at some $C$ and $\Psi$ is differentiable there, then

$$ \Psi'(C) = E $$

at this value of $C$, and this formula lets us compute $E$ as a function of $C$.
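
If you like checking such dualities numerically, here is a minimal sketch in Python (my own illustration, not from the post) that approximates both infima on a grid, using the running example $S(E) = \ln E$:

```python
import numpy as np

# Running example: S(E) = ln E for E > 0, a concave entropy function.
E_grid = np.linspace(0.01, 50.0, 20000)   # energies E > 0
C_grid = np.linspace(0.05, 5.0, 2000)     # inverse temperatures C > 0
S = np.log(E_grid)

# First relation: Psi(C) = inf_E ( C E - S(E) ), approximated on the grid.
Psi = np.array([np.min(C * E_grid - S) for C in C_grid])

# Second relation: recover S(E) = inf_C ( C E - Psi(C) ) at a test energy.
E_test = 3.0
S_recovered = np.min(C_grid * E_test - Psi)

print(S_recovered)      # approximately 1.0986
print(np.log(E_test))   # ln 3 = 1.0986...
```

In exact form $\Psi(C) = 1 + \ln C$ here, and the second infimum lands back at $S(E) = \ln E$, illustrating how each member of the pair determines the other.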

When we describe a thermostatic system as a limit of classical statistical mechanical systems, these are the formulas we’d like to see emerge in the limit!

Appendix: the traditional formalism

If you’ve never heard of ‘free entropy’, you may be relieved to hear it’s a repackaging of the more familiar concept of ‘free energy’. The free energy $F$, or more specifically the Helmholtz free energy, is related to the free entropy by

$$ F = - T \Phi $$

To see why, note that at the maximizing energy we have $\Phi = S - C E = S - E/T$, so $F = -T\Phi = E - T S$, which is the usual formula for the Helmholtz free energy.

Unless you’re a real die-hard fan of thermodynamics, don’t read the following stuff, since it will only further complicate the picture I’ve tried to paint above, which is already blemished by the fact that physicists prefer $\Phi$ to $-\Phi = \Psi$. I will not provide any profound new insights: I will merely relate what I’ve already explained to an equivalent but more traditional formalism.

I’ve been treating entropy as a function of energy: this is the so-called entropy scheme. But it’s traditional to treat energy as a function of entropy: this is called the energy scheme.

The entropy scheme generalizes better. In thermodynamics we often want to think about situations where entropy is a function of several variables: energy, volume, the amounts of various chemicals, and so on. Then we should work with a thermostatic system $S \colon X \to [-\infty,\infty]$ where $X$ is a convex subset of $\mathbb{R}^n$. Everything I did generalizes nicely to that situation, and now $\Psi$ will be one of $n$ quantities that arise by taking a Legendre transform of $S$.

But when entropy is a function of just one variable, energy, people often turn the tables and try to treat energy as a function of entropy, say $E(S)$. They then define the free energy as a function of temperature by

$$ F(T) = \inf_S (E(S) - T S) $$

This is essentially a Legendre transform, but notice that inside the parentheses we have $E(S) - T S$ instead of $T S - E(S)$. We can fix this by using a sup instead of an inf, and writing

$$ -F(T) = \sup_S (T S - E(S)) $$

It’s actually very common to define the Legendre transform using a sup instead of an inf, so that’s fine. The only wrinkle is that this Legendre transform gives us $-F$ instead of $F$.

When the supremum is achieved at a point where $E$ is differentiable we have

$$ \frac{d}{d S} E(S) = T $$
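
In my running example, inverting $S = \ln E$ gives $E(S) = e^S$, so

$$ F(T) = \inf_S \left( e^S - T S \right) $$

The infimum occurs where $e^S = T$, i.e. $S = \ln T$, giving $F(T) = T - T \ln T$. And indeed $dE/dS = e^S = T$ at that point.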

at that point. When EE is concave and lower semicontinuous, taking its Legendre transform twice gets us back where we started:

E(S)=sup T(TS+F(T)) E(S) = \sup_T (T S + F(T))

And when this supremum is achieved at a point where $F$ is differentiable, we have

$$ \frac{d}{d T} F(T) = - S $$

To top it off, physicists tend to assume $S$ and $T$ take values where the suprema above are achieved, and not explicitly write what is a function of what. So they would summarize everything I just said with these equations:

$$ F = E - T S , \qquad E = T S + F $$

$$ \frac{d F}{d T} = - S , \qquad \frac{d E}{d S} = T $$

If instead we take the approach I’ve described, where entropy is treated as a function of energy, it’s natural to focus on the negative free entropy $\Psi$ and inverse temperature $C$. If we write the equations governing these in the same slapdash style as those above, they look like this:

$$ \Psi = C E - S, \qquad S = C E - \Psi $$

$$ \frac{d \Psi}{d C} = E, \qquad \frac{d S}{d E} = C $$

Less familiar, but more symmetrical! The two approaches are related by

$$ C = \frac{1}{T}, \qquad \Psi = \frac{F}{T} $$
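
As a final check on my running example: there $\Psi(C) = 1 + \ln C$ and $F(T) = T - T \ln T$, so $F/T = 1 - \ln T = 1 + \ln C$ when $C = 1/T$, exactly as the relation $\Psi = F/T$ requires.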

Thermodynamics is a funny subject. The first time you go through it, you don’t understand it at all. The second time you go through it, you think you understand it, except for one or two points. The third time you go through it, you know you don’t understand it, but by that time you are so used to the subject, it doesn’t bother you anymore. — Arnold Sommerfeld

Posted at September 7, 2024 12:00 PM UTC


7 Comments & 2 Trackbacks

Re: The Space of Physical Frameworks (Part 2)

If you had the misfortune to read the above post before I wrote this comment here, please know that it had some serious mistakes which I’ve fixed now. I’ve also added a lot to the final section, including a funny quote at the end.

Posted by: John Baez on September 8, 2024 1:29 PM

Re: The Space of Physical Frameworks (Part 2)

I love the quote, which actually applies to many subjects. I frequently bring up the von Neumann quote:

“… in mathematics you don’t understand things. You just get used to them.”

when I’m talking to students who get so hung up on trying to “understand” a concept that they are held back from trying to work with it.

Posted by: Mark Meckes on September 8, 2024 2:44 PM

Re: The Space of Physical Frameworks (Part 2)

I never liked that von Neumann quote because I think sometimes in math you do understand things, and I treasure that. But maybe I was interpreting it wrong.

Posted by: John Baez on September 10, 2024 3:32 AM

Re: The Space of Physical Frameworks (Part 2)

The von Neumann quote is an overstatement for sure. Yes, there are many things we can understand in mathematics, and we should always strive for understanding. But frequently you need to get used to working with a definition / theorem / algorithm etc first, before you can achieve what understanding there is to be had. I suspect that’s the spirit in which von Neumann intended that aphorism.

Posted by: Mark Meckes on September 10, 2024 4:03 AM

Re: The Space of Physical Frameworks (Part 2)

As a youth I took it literally, felt confirmed in my opinion that von Neumann was a brutal calculating machine by anecdotes like “I summed the series”, and resolved to steer a different course. But of course von Neumann understood a lot of math.

Posted by: John Baez on September 10, 2024 1:34 PM

Re: The Space of Physical Frameworks (Part 2)

By the way, I always thought of the “I summed the series” anecdote (about the total distance flown by a fly oscillating between two trains on a collision course, or some such thing) as actually giving a cute real-world-ish argument to show what the sum of the series is, if you haven’t previously memorized it. In a similar spirit to your recent post on Stirling’s Formula from Statistical Mechanics.

Posted by: Mark Meckes on September 10, 2024 3:21 PM

Re: The Space of Physical Frameworks (Part 2)

I’m still worried that people won’t understand my next posts due to the unfamiliarity of free entropy $\Phi$, much less the negative free entropy $\Psi = -\Phi$. It’s probably unrealistic to hope people can enjoy explanations of research I’m still struggling with myself, much less expect category theorists to enjoy the equations of thermodynamics. But for the benefit of those who dare to try, I added a bit to the Appendix comparing the key formulas involving $\Psi$ to the more familiar ones involving the free energy $F$:

To top it off, physicists tend to assume $S$ and $T$ take values where the suprema above are achieved, and not explicitly write what is a function of what. So they would summarize everything I just said with these equations:

$$ F = E - T S , \qquad E = T S + F $$

$$ \frac{d F}{d T} = - S , \qquad \frac{d E}{d S} = T $$

If instead we take the approach I’ve described, where entropy is treated as a function of energy, it’s natural to focus on the negative free entropy $\Psi$ and inverse temperature $C$. If we write the equations governing these in the same slapdash style as those above, they look like this:

$$ \Psi = C E - S, \qquad S = C E - \Psi $$

$$ \frac{d \Psi}{d C} = E, \qquad \frac{d S}{d E} = C $$

Less familiar, but more symmetrical! The two approaches are related by

$$ C = \frac{1}{T}, \qquad \Psi = \frac{F}{T} $$

Posted by: John Baez on September 10, 2024 1:53 PM
Read the post The Space of Physical Frameworks (Part 4)
Weblog: The n-Category Café
Excerpt: I'll explain exactly what I mean by 'classical statistical mechanics', and how free entropy is defined in this subject. Then I'll show that as Boltzmann's constant approaches zero, this free entropy approaches the free entropy we've already seen in t...
Tracked: September 30, 2024 9:18 PM
Read the post The Space of Physical Frameworks (Part 5)
Weblog: The n-Category Café
Excerpt: Let's think about how classical statistical mechanics reduces to thermodynamics in the limit where Boltzmann's constant \(k\) approaches zero, by looking at an example.
Tracked: October 2, 2024 4:15 AM
