

July 25, 2022

How to Apply Category Theory to Thermodynamics

Posted by Emily Riehl

guest post by Nandan Kulkarni and Chad Harper

This blog post discusses the paper “Compositional Thermostatics” by John Baez, Owen Lynch, and Joe Moeller. The series of posts on Dr. Baez’s blog gives a more thorough overview of the topics in the paper, and is probably a better primer if you intend to read it. Like the posts on Dr. Baez’s blog, this blog post also explains some aspects of the framework in an introductory manner. However, it takes the approach of emphasizing particular interesting details, and concludes with the treatment of a particular quantum system using ideas from the paper.

Thermostatics is the study of thermodynamic systems at equilibrium. For example, we can treat a box of gas as a thermodynamic system by describing it using a few of its “characteristic” properties — the number of particles, average volume, and average temperature. The box is said to be at equilibrium when these properties do not seem to change — if you were to take a thermometer and measure the temperature of the box at equilibrium, the thermometer’s reading should not change over time.

The paper “Compositional Thermostatics” introduces a framework in category theory for doing thermostatics. Given thermodynamic systems whose properties can be represented as convex spaces, together with an entropy function for each system, this framework can be used to compute the entropy function of a new system formed by coupling these systems together.

In this post, we will offer an interpretation of this framework, and then apply the framework to a non-trivial problem in quantum mechanics involving density matrices. In other words, we will attempt to respond to the following questions, in order: How should a scientist think about this framework? Why and how would a scientist use this framework?

The importance of interpretation

If we wanted to, we could use the framework unthinkingly, as follows:

Flowchart of how the framework might be used. The state spaces and entropy functions, along with the way A and B are coupled together are the input. They pass through a black box called the framework, which gives an entropy function for the new system as the output.

Figure 1

In that case, the only “interpretive” work we’d have to do would be

  1. Figure out how to mathematically represent the physical systems using a convex state space and entropy function.
  2. Understanding what the output of the framework means.

For most simple thermodynamic systems, this would work pretty well, because the framework was built keeping these systems in mind. However, we would then lose out on the really interesting part — the inner workings of the framework itself.

A rectangle with the title Framework showing various equations and diagrams.

Figure 2

An “interpretation” of what all this machinery is doing offers an interpretation of the theory of thermostatics itself. I think this is what makes applied category theory really interesting. It is as much about figuring out how to think about a subject such as thermodynamics as building computational tools to study it. I hope that existing knowledge of category theory may then push our understanding of the subject even further through the interpretations we attach to the frameworks we build.

Setting up a thermostatic system: State spaces and entropy

When we describe the possible states of a thermodynamic system using points in a set $A$, such as using points in $\mathbb{N}\times \mathbb{R}_{\gt 0}\times \mathbb{R}_{\gt 0}$ to refer to the number of particles, volume, and temperature of a box of gas, we call $A$ the “state space” of the physical system we are studying.

An entropy function associated with state space $A$ is a function $S : A \to \overline{\mathbb{R}}$, where $\overline{\mathbb{R}}$ is the extended reals. Entropy can have many interpretations. According to the second law of thermodynamics, if the system can change from one state, represented by point $a \in A$, to another state, represented by $b \in A$, then $S(b) - S(a) \geq 0$. When dealing with certain thermodynamic systems, we may extend this to the maximum entropy principle: a thermodynamic system with state space $A$ will evolve to a subset of $A$ where the entropy function is maximized (a set of equilibrium states), and remain there. Imagine a stretched rubber band pulling itself back to its original shape. We might say that the entropy of the rubber band is maximized when it is in the state of its original shape.

Thermodynamic systems are often studied under the effects of various constraints that limit their state space to a subset of the full space. For example, suppose you begin with two tanks of incompressible fluid, with tank #1 described by its energy $U_1 \in \mathbb{R}_{\gt 0}$ and tank #2 similarly described by $U_2$. Physics tells us that if a valve is opened allowing free exchange between these tanks, the new thermodynamic system is described by the sum of their energies $U_{1,2} = U_1 + U_2$ due to energy conservation. The state space of the new system is $\mathbb{R}_{\gt 0}$.

When we consider the combined system, which we will call $Y$, made from tanks $X_1$ and $X_2$, each point in the state space of $Y$ corresponds to a whole range of states of $X_1$ and $X_2$: an energy of $u \in \mathbb{R}_{\gt 0}$ for $Y$ corresponds to $\{(u_1, u_2)\ |\ u_1 + u_2 = u\} \subset \mathbb{R}_{\gt 0} \times \mathbb{R}_{\gt 0}$, where a point in $\mathbb{R}_{\gt 0} \times \mathbb{R}_{\gt 0}$ represents a state of $X_1$ together with a state of $X_2$.
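As a concrete illustration (our own, not code from the paper), we can discretize the energies and enumerate the set of tank states compatible with an observed total energy, i.e. the fiber $\{(u_1, u_2)\ |\ u_1 + u_2 = u\}$ of the relation; the function name `fiber` and the grid step are assumptions for this sketch.

```python
# List discretized states (u1, u2) of the two tanks compatible with a
# total energy u of the combined system Y: the fiber of the relation R.

def fiber(u, step=0.25, tol=1e-9):
    """All grid pairs (u1, u2) with u1 + u2 == u and u1, u2 > 0."""
    n = int(round(u / step))
    pairs = []
    for i in range(1, n):          # keep u1 and u2 strictly positive
        u1 = i * step
        u2 = u - u1
        if u2 > tol:
            pairs.append((u1, u2))
    return pairs

states = fiber(2.0, step=0.5)
# every pair in the fiber sums to the observed total energy
assert all(abs(u1 + u2 - 2.0) < 1e-9 for u1, u2 in states)
```

Observing the single number $u$ for $Y$ tells us only that $(u_1, u_2)$ lies somewhere in this set.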

Two rectangles colored in blue, connected by a channel. One rectangle is labelled u1, and one is labelled u2. Both rectangles together with the channel are labelled u1 plus u2.

Figure 3

The way to write this in the framework for thermodynamic systems in general is as follows: suppose you build a system $Y$ with state space $B$ from a system $X$ with state space $A$. You can describe how $Y$ is built from $X$ using a relation $R : A \to B$, where $R \subset A \times B$. In Figure 4, the point $p \in B$ represents a state of $Y$. If we observe $p$, it could mean the system $X$ is in any state in the blue region. The knowledge of what exactly that blue region looks like comes from $R$, which carries information about how exactly $Y$ is formed from $X$. Thus, for now, we may think of systems as morphisms in $\mathsf{SetRel}$ — the category of sets and relations.

a circle labelled A and another circle labelled B. a cone is drawn from a single point in the circle B to a region of circle A.

Figure 4

So, knowing the entropy function $S$ on $A$, and knowing $R : A \to B$, can we calculate the entropy for system $Y$ in a state represented by $p \in B$? This is possible using our understanding of how physical systems behave: we can interpret the maximum entropy principle to say that if a system described by $R : A \to B$ is observed at a state represented by $p \in B$, then we expect it to exist at a point in the blue region with maximum entropy. Thus, the entropy function $S' : B \to \overline{\mathbb{R}}$ for the new system is

$$S'(p) = \sup_{x \in \{x'\ |\ (x',p) \in R\}} S(x).$$
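For finite state spaces the supremum above is just a maximum over the blue region, which we can spell out directly; this toy example (names `push_entropy`, `S`, `R` are our own) follows the extended-reals convention that the sup over an empty region is $-\infty$.

```python
# Push an entropy function S on a finite set A through a relation
# R ⊆ A × B (stored as a set of pairs): the entropy at p is the max of S
# over the "blue region" {a | (a, p) ∈ R}; sup of the empty set is -inf.

def push_entropy(S, R, p):
    region = [a for (a, q) in R if q == p]
    return max(S[a] for a in region) if region else float("-inf")

S = {"a1": 1.0, "a2": 3.0, "a3": 2.0}
R = {("a1", "p"), ("a2", "p"), ("a3", "q")}
print(push_entropy(S, R, "p"))  # max of S over {a1, a2}, i.e. 3.0
```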

A more nuanced interpretation of the maximum entropy principle is that the system can and will evolve from any point in the blue region to a subset of the blue region with maximum entropy. More generally, since many entropy functions may exist for different systems described by the same state spaces, such as the tanks of fluid $X_1$, $X_2$, and $Y$ all having state space $\mathbb{R}_{\gt 0}$ in the earlier example, we might interpret the maximum entropy principle more strongly as follows: a thermodynamic system can evolve from any state corresponding to a point in the blue region to any other state with its point in the blue region, and additionally, the system will evolve to a collection of states corresponding to a subset of the blue region that maximizes entropy and remain there.

In the paper’s framework, we deal with sets equipped with “canonical paths” between points: for points $a$ and $b$ in state space $A$, we have $c(a,b) : [0,1] \to A$ that follows certain rules:

Writing $c_\lambda(a,b)$ for $c(a,b)(\lambda)$,
$$\begin{aligned} c_1(a,b) &= a \\ c_\lambda(a,a) &= a \\ c_\lambda(a,b) &= c_{1-\lambda}(b,a) \end{aligned}$$
and, where $\lambda' = \lambda\mu$ and $1-\lambda = (1-\lambda')(1-\mu')$,
$$c_\lambda(c_\mu(a,b), c) = c_{\lambda'}(a, c_{\mu'}(b,c)).$$
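As a spot check (not a proof), the straight-line paths $c_\lambda(a,b) = (1-\lambda)b + \lambda a$ on the reals satisfy these axioms numerically; the $\lambda$-convention follows the display above, where $c_1(a,b) = a$.

```python
# Straight-line paths on the reals: c_λ(a, b) = (1 - λ)·b + λ·a.

def c(lam, a, b):
    return (1 - lam) * b + lam * a

a, b, x = 2.0, 5.0, 7.0
assert c(1, a, b) == a                      # c_1(a, b) = a
assert c(0.3, a, a) == a                    # c_λ(a, a) = a
assert c(0.3, a, b) == c(1 - 0.3, b, a)     # c_λ(a, b) = c_{1-λ}(b, a)

# the reparameterization axiom: pick λ and μ, then solve for λ' and μ'
lam, mu = 0.6, 0.5
lam_p = lam * mu                        # λ' = λ·μ
mu_p = 1 - (1 - lam) / (1 - lam_p)      # from 1 - λ = (1 - λ')(1 - μ')
lhs = c(lam, c(mu, a, b), x)
rhs = c(lam_p, a, c(mu_p, b, x))
assert abs(lhs - rhs) < 1e-12
```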

By requiring that the state spaces be convex (ie. every two points $a, b$ in state space $A$ are associated with a path $c(a,b) : [0,1] \to A$) and that relations $R : A \to B$ be convex relations, we guarantee that for any $R$, all “blue regions” of $R$ are convex as well. Thus, we choose to work in $\mathsf{ConvRel}$ rather than $\mathsf{SetRel}$. In this framework, if a subset of a state space is convex, then every state of the system in that subset is accessible from every other state.

An interpretation of this may be that the paths $c(a,b)$ represent thermodynamic processes that can occur. If $c(a,b)$ exists for some $a, b \in A$, then $a$ is accessible from $b$ by a known process, parameterized by $\lambda \in [0,1]$. These are not necessarily the only allowed processes, but they are sufficient to describe which states are accessible from which others. This interpretation has some limitations. For one, the framework does not allow in general for blue regions that are just the range of $c(a,b)$ for some $a, b$: the range of $c(a,b)$ is not necessarily convex, but a process from $b$ to $a$ would imply a process from $c_{\lambda_1}(a,b)$ to $c_{\lambda_2}(a,b)$ for some $\lambda_1$, $\lambda_2$.

Thus, we now have a full description of what should constitute a thermostatic system in the framework: a convex state space $B$ for the system $Y$ of interest, a convex relation $R : A \to B$ describing how $Y$ is constructed from some system $X$ with state space $A$, and an entropy function $S : A \to \overline{\mathbb{R}}$. If the system $X$ is used to build $X$ itself (ie. we are studying system $X$ on its own), let $B = A$ and $R = \{(a,a)\ |\ a \in A\}$, so $\mathrm{Ent}[R](S) = S$. This says the entropy function for $X$ is the entropy function for $X$.

Properties of entropy: Extensivity and concavity

The entropy function SS is not arbitrary. Two properties of entropy other than the maximum entropy principle that commonly appear are “extensivity” and “concavity”. There are thermodynamical theories which first assume extensivity and then prove concavity for certain classes of systems, but we assume both of these properties immediately and build them into the framework.

Extensivity is the property of how entropy functions of systems are composed when those systems are put together. It means the entropy of two systems together is the sum of the entropies of the individual systems. In the example of the tanks of incompressible fluid, if the entropy of X 1X_1 is S 1S_1 and the entropy of X 2X_2 is S 2S_2, then the entropy of the tanks coupled together is S 1+S 2S_1 + S_2. It is a prescription for the entropy function of a coupled system given the entropy functions on its individual components. This is built into the framework and is explained later in this blog post in the section on operads.

Concavity is a property that limits the shapes entropy functions can have. An entropy function $S : A \to \overline{\mathbb{R}}$ is concave if for all $\lambda$, $S(c_\lambda(a,b)) \geq c_\lambda(S(a), S(b))$. Then, having the following convex structure on the extended reals,

$$\begin{aligned} c_\lambda(a,b) &= (1-\lambda)b + \lambda a && \text{for } a, b \in \mathbb{R} \\ c_\lambda(a, \infty) &= \infty && \text{for } a \in \mathbb{R} \\ c_\lambda(a, -\infty) &= -\infty && \text{for } a \in \mathbb{R} \\ c_\lambda(-\infty, \infty) &= -\infty && \text{for } 0 \lt \lambda \lt 1 \end{aligned}$$
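One possible encoding of this convex structure (a sketch of ours, using IEEE infinities and covering only $0 \lt \lambda \lt 1$): combining the displayed rules with the symmetry axiom $c_\lambda(a,b) = c_{1-\lambda}(b,a)$, the value $-\infty$ dominates whenever it appears in either argument.

```python
import math

def c_ext(lam, a, b):
    """c_λ(a, b) on the extended reals for 0 < λ < 1; -inf dominates."""
    if -math.inf in (a, b):
        return -math.inf            # covers c_λ(a, -∞) and c_λ(-∞, ∞)
    if math.inf in (a, b):
        return math.inf             # covers c_λ(a, ∞)
    return (1 - lam) * b + lam * a  # ordinary convex combination

assert c_ext(0.5, 1.0, 3.0) == 2.0
assert c_ext(0.5, 1.0, math.inf) == math.inf
assert c_ext(0.5, -math.inf, math.inf) == -math.inf
```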

forces certain shapes for the allowed entropy functions:

several graphs. honestly not important.

Figure 5

While concavity of entropy is a property of many common thermodynamic systems, it has some limitations. In some theoretical foundations for thermodynamics, such as that in the book “Statistical Mechanics of Lattice Systems: A Concrete Mathematical Introduction” by Sacha Friedli and Yvan Velenik, concavity of entropy is not immediately assumed, but rather arises out of certain properties of the lattice systems they deal with and the extensivity of entropy.

Mathematical machinery: Functors and operads

The $\mathrm{Ent}$ functor

The functor $\mathrm{Ent} : \mathsf{ConvRel} \to \mathsf{Set}$ is defined as follows:

  • For a convex space $A$, $\mathrm{Ent}$ takes $A$ to $\mathrm{Ent}(A)$, the set of entropy functions on $A$, which is an object in $\mathsf{Set}$.
  • For a morphism $R : A \to B$, $\mathrm{Ent}[R]$ is a morphism $\mathrm{Ent}[R] : \mathrm{Ent}(A) \to \mathrm{Ent}(B)$. For an entropy function $S : A \to \overline{\mathbb{R}}$, we define

$$\mathrm{Ent}[R](S)(p) = \sup_{\{a\ |\ (a,p) \in R\}} S(a).$$ Thus, the maximum entropy principle is built into this framework.

When we say $\mathrm{Ent}$ is a functor, we mean that it has certain nice properties: it preserves identity morphisms and compositionality. Identity preservation means that for an unconstrained system, represented by $\{(a,a)\ |\ a \in A\} : A \to A$, and given an entropy function $S : A \to \overline{\mathbb{R}}$ on the system, the entropy function for the new system $A$ should still be $S$. This is of course reasonable. Preserving compositionality means that given a series of relations $R_1 : A \to B_1,\ R_2 : B_1 \to B_2,\ \ldots,\ R_n : B_{n-1} \to B_n$, and an entropy function $S$ on $A$, we have an equivalence between

  1. “Building” the entire system $R_n \circ R_{n-1} \circ \ldots \circ R_1$ and calculating its entropy function.
  2. Calculating the entropy function of the system based on the entropy functions for each component of the system considered one at a time, ie. $R_1$, then $R_2 \circ R_1$, …, then finally $R_n \circ \ldots \circ R_2 \circ R_1$.

This ensures $\mathrm{Ent}$ works consistently for all state spaces and relations. We will never have a situation where computing the entropy function for $R_n \circ \ldots \circ R_1$ one way will give a different result from computing it any other way.
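We can sanity-check this compositionality on a finite toy example of our own (functoriality in general follows from properties of sup; here every fiber is nonempty, so plain `max` suffices):

```python
# Check that pushing entropy through a composite relation agrees with
# pushing it through the pieces one at a time: Ent[R2 ∘ R1] = Ent[R2] ∘ Ent[R1].

def compose(R1, R2):
    """Relational composition: pairs (a, c) with some b linking them."""
    return {(a, c) for (a, b) in R1 for (b2, c) in R2 if b == b2}

def ent(R, S):
    """Entropy pushed through R, as a dict on the outputs of R."""
    outs = {q for (_, q) in R}
    return {q: max(S[a] for (a, p) in R if p == q) for q in outs}

S  = {"a1": 1.0, "a2": 4.0, "a3": 2.0}
R1 = {("a1", "b1"), ("a2", "b1"), ("a3", "b2")}
R2 = {("b1", "c"), ("b2", "c")}

# both routes give {"c": 4.0}
assert ent(compose(R1, R2), S) == ent(R2, ent(R1, S))
```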

Because they’re everywhere in mathematics, we know a lot about functors in general in category theory. Thus, using functors in the framework also suggests opportunities to apply all our knowledge of category theory to deepen our interpretation of the framework.


Operads

Once again, I have been lying. The places we do thermostatics aren’t $\mathsf{ConvRel}$ and $\mathsf{Set}$, but rather $Op(\mathsf{ConvRel})$ and $Op(\mathsf{Set})$ — the operads built over $\mathsf{ConvRel}$ and $\mathsf{Set}$. We simply use operads as a type of structure, like a category, where we can have state spaces and systems made from those state spaces represented by operations (the operad name for morphisms) between them.

The special thing about operads is that operations can take “inputs” of multiple types. For categories, we represent the morphisms from object $A$ to object $B$ as $A \to B$. For an operad $\mathbb{O}$, we represent the operations from types $A_1, \ldots, A_n$ to $B$ as $\mathbb{O}(A_1, \ldots, A_n; B)$. We can also compose operations.

Because $\mathsf{ConvRel}$ and $\mathsf{Set}$ are symmetric monoidal categories (with a notion of tensoring objects together and some “obvious” symmetries), there are “obvious” operads we can build from them.

For a symmetric monoidal category $\mathbf{C}$ with tensor product $\otimes$, we can construct $Op(\mathbf{C})$ as follows:

  1. The types in $Op(\mathbf{C})$ are exactly the objects of $\mathbf{C}$.
  2. For types $A_1, \ldots, A_n, B$, the operations of $Op(\mathbf{C})(A_1, \ldots, A_n; B)$ are exactly the morphisms $A_1 \otimes \ldots \otimes A_n \to B$.
  3. For operations
$$\begin{aligned} f_1 &\in Op(\mathbf{C})(A_{1,1}, A_{2,1}, \ldots, A_{n_1,1}; B_1)\ \text{(ie. } f_1 : A_{1,1} \otimes \ldots \otimes A_{n_1,1} \to B_1\text{)} \\ f_2 &\in Op(\mathbf{C})(A_{1,2}, A_{2,2}, \ldots, A_{n_2,2}; B_2) \\ \vdots & \\ f_m &\in Op(\mathbf{C})(A_{1,m}, A_{2,m}, \ldots, A_{n_m,m}; B_m) \\ g &\in Op(\mathbf{C})(B_1, \ldots, B_m; C) \end{aligned}$$
the composition of $g$ and $f_1, \ldots, f_m$ is $g(f_1, \ldots, f_m) \in Op(\mathbf{C})(A_{1,1}, A_{1,2}, \ldots, A_{n_m,m}; C)$, ie. $g(f_1, \ldots, f_m) : A_{1,1} \otimes \ldots \otimes A_{n_m,m} \to C$, defined as $$g(f_1, \ldots, f_m) = g \circ (f_1 \otimes \ldots \otimes f_m).$$

We now demonstrate how this is applied to ConvRel\mathsf{ConvRel} and Set\mathsf{Set}.

The tensor product on $\mathsf{ConvRel}$ is the convex product $\times$ defined as follows: for convex spaces $A_1$, $A_2$ (formed from sets also called $A_1$ and $A_2$) with convex structures defined by $c^1(\cdot,\cdot)$ and $c^2(\cdot,\cdot)$ respectively, $A_1 \times A_2$ is given by the set $\{(a_1, a_2)\ |\ a_1 \in A_1, a_2 \in A_2\}$ and the convex structure $c^{1,2}(\cdot,\cdot)$ defined by $c^{1,2}_\lambda((a_1, a_2), (b_1, b_2)) = (c^1_\lambda(a_1, b_1), c^2_\lambda(a_2, b_2))$. The tensor product on $\mathsf{Set}$ is just the set product, which we will also call $\times$ for simplicity. The meaning of $\times$ will be clear from context.

Thus, the types of $Op(\mathsf{ConvRel})$ are convex spaces, and the elements of $Op(\mathsf{ConvRel})(A_1, \ldots, A_n; B)$ are convex relations, in particular the convex subsets of $(A_1 \times \ldots \times A_n) \times B$. A convex relation $R \subset (A_1 \times \ldots \times A_n) \times B$ represents a way of coupling together the systems represented by state spaces $A_1, \ldots, A_n$ to form the system represented by state space $B$. The types in the operad $Op(\mathsf{Set})$ are sets, and the elements of $Op(\mathsf{Set})(A_1, \ldots, A_n; B)$ are the functions from $A_1 \times \ldots \times A_n$ to $B$.

The symmetric monoidal structure of ConvRel\mathsf{ConvRel} has certain symmetries that can be interpreted to state that

  1. Coupling thermodynamic system $X$ to thermodynamic system $Y$ is equivalent to coupling system $Y$ to system $X$ — the isomorphism “braiding” $\beta_{A,B} : A \times B \to B \times A$ implies that $Op(\mathsf{ConvRel})(A, B; C)$ and $Op(\mathsf{ConvRel})(B, A; C)$ can represent the same couplings of systems.
  2. Coupling thermodynamic system $X$ to system $Y$ and then coupling that new system to $Z$ is equivalent to coupling $Y$ to system $Z$ and then coupling $X$ to that new system — the isomorphism “associator” $\alpha_{A,B,C} : (A \times B) \times C \to A \times (B \times C)$ lets $Op(\mathsf{ConvRel})(A, B, C; D)$ be written unambiguously.

Suppose we have a system with state space $C_1$ built from $A_1, \ldots, A_n$, a system with state space $C_2$ built from $B_1, \ldots, B_m$, and a system with state space $D$ built from $C_1$ and $C_2$.

By itself, the category $\mathsf{ConvRel}$ does not have the right structure to make our framework handle coupling multiple systems in an elegant way. In $\mathsf{ConvRel}$, we could build $R_A : A_1 \times \ldots \times A_n \to C_1$, $R_B : B_1 \times \ldots \times B_m \to C_2$, and $R_C : C_1 \times C_2 \to D$, but there is no prescription in the framework for how to compose $R_C$ with both $R_A$ and $R_B$. We would have to build the system all at once “by hand”, with a relation $R : A_1 \times \ldots \times A_n \times B_1 \times \ldots \times B_m \to D$ calculated from $R_A$, $R_B$, and $R_C$.

The operadic approach, on the other hand, has the process of getting $R$ from $R_A$, $R_B$, and $R_C$ built into it through how composition is defined. Using $Op(\mathsf{ConvRel})$, the process of coupling $A_1, \ldots, A_n$ to form $C_1$ and $B_1, \ldots, B_m$ to form $C_2$, and then coupling $C_1$ and $C_2$ to form $D$, would look like this:

a wiring diagram with wires coming from A1 ... An to C1, from B1 ... Bm to C2, and from C1 C2 to D

Figure 6

Here, each group of arrows represents a relation. Much more intuitive and elegant.

The work we did in $\mathsf{ConvRel}$ was still important, because the functor $\mathrm{Ent}$ we defined can be used to define a similar “map of operads” (a functor between operads) from $Op(\mathsf{ConvRel})$ to $Op(\mathsf{Set})$. We’ll call this map of operads $\mathrm{OpEnt}$. We define it as follows:

  • For a type $A$, $\mathrm{OpEnt}$ takes $A$ to $\mathrm{Ent}(A)$, the set of entropy functions on $A$, which is a type in $Op(\mathsf{Set})$.
  • For an operation $R : Op(\mathsf{ConvRel})(A_1, \ldots, A_n; B)$, $\mathrm{OpEnt}[R]$ is an operation in $Op(\mathsf{Set})(\mathrm{Ent}(A_1), \ldots, \mathrm{Ent}(A_n); \mathrm{Ent}(B))$. For entropy functions $S_1, \ldots, S_n$ on $A_1, \ldots, A_n$, we define $$\begin{aligned}\mathrm{OpEnt}[R](S_1, \ldots, S_n)(p) &= \mathrm{Ent}[R] \circ \epsilon_{A_1, \ldots, A_n}(S_1, \ldots, S_n)(p) \\ &= \sup_{\{a_1, \ldots, a_n\ |\ ((a_1, \ldots, a_n), p) \in R\}} S_1(a_1) + \ldots + S_n(a_n).\end{aligned}$$ The details of how $\epsilon$ is defined are not important here, but $\epsilon$ is called the “laxator” and it’s what builds extensivity of entropy into the framework.
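A finite sketch of this formula (our own toy data; we flatten $((a_1, a_2), p)$ to triples for convenience): the laxator contributes the sum $S_1(a_1) + S_2(a_2)$, and $\mathrm{Ent}$ then takes the sup over each fiber.

```python
# OpEnt for an operation R ⊆ (A1 × A2) × B on finite spaces: entropy at p
# is the max of S1(a1) + S2(a2) over the fiber of R above p.

def op_ent(R, S1, S2):
    outs = {p for (_, _, p) in R}
    return {p: max(S1[a1] + S2[a2] for (a1, a2, q) in R if q == p)
            for p in outs}

S1 = {0: 0.0, 1: 1.0}
S2 = {0: 0.0, 1: 2.0}
# couple the systems so that the output state is the sum a1 + a2,
# like energies adding up when two systems are joined
R = {(a1, a2, a1 + a2) for a1 in S1 for a2 in S2}
assert op_ent(R, S1, S2) == {0: 0.0, 1: 2.0, 2: 3.0}
```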

One could say in summary that the operadic structure allows systems to be coupled together, $\mathrm{Ent}$ adds the maximum entropy principle to the framework, and $\epsilon$ adds extensivity.

Putting it all together

The final operad-based framework for doing thermostatics looks like this when applied to the example of tanks of incompressible fluid:

To reiterate, we can describe tanks $X_1$ and $X_2$ using the state space $\mathbb{R}_{\gt 0}$. We can couple these tanks to get a system also described by $\mathbb{R}_{\gt 0}$. This new system formed by coupling the two tanks is described by the relation

$$R = \{((u_1, u_2), u)\ |\ u \in \mathbb{R}_{\gt 0},\ u_1 \in \mathbb{R}_{\gt 0},\ u_2 \in \mathbb{R}_{\gt 0},\ u_1 + u_2 = u\} \subset (\mathbb{R}_{\gt 0} \times \mathbb{R}_{\gt 0}) \times \mathbb{R}_{\gt 0}.$$

This is an operation $R : Op(\mathsf{ConvRel})(\mathbb{R}_{\gt 0}, \mathbb{R}_{\gt 0}; \mathbb{R}_{\gt 0})$.

We have

$$\mathrm{OpEnt}[R] : Op(\mathsf{Set})(\mathrm{OpEnt}(\mathbb{R}_{\gt 0}), \mathrm{OpEnt}(\mathbb{R}_{\gt 0}); \mathrm{OpEnt}(\mathbb{R}_{\gt 0}))$$

given by

$$\begin{aligned}\mathrm{OpEnt}[R](S_1, S_2)(u) &= \mathrm{Ent}[R] \circ \epsilon_{\mathbb{R}_{\gt 0}, \mathbb{R}_{\gt 0}}(S_1, S_2)(u) \\ &= \sup_{((u_1, u_2), u) \in R} S_1(u_1) + S_2(u_2).\end{aligned}$$

A physicist tells us that the entropy functions $S_1$ and $S_2$ for $X_1$ and $X_2$ respectively are given by $S_1(x) = C_1 \log(x)$ and $S_2(x) = C_2 \log(x)$. Using the method of Lagrange multipliers, we can then calculate that $S_1(u_1) + S_2(u_2)$ is maximized when $u_1 = \frac{C_1}{C_1+C_2} u$ and $u_2 = \frac{C_2}{C_1+C_2} u$.

Thus, we can calculate that the entropy function on the new coupled system is

$$\begin{aligned}\mathrm{OpEnt}[R](C_1 \log(x), C_2 \log(x))(u) &= C_1 \log\left(\frac{C_1}{C_1+C_2}\, u\right) + C_2 \log\left(\frac{C_2}{C_1+C_2}\, u\right) \\ &= (C_1+C_2) \log(u) + C_1 \log\left(\frac{C_1}{C_1+C_2}\right) + C_2 \log\left(\frac{C_2}{C_1+C_2}\right). \end{aligned}$$
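We can check this numerically (our own spot check, with arbitrary constants): maximize $C_1 \log(u_1) + C_2 \log(u_2)$ subject to $u_1 + u_2 = u$ over a fine grid, and compare with the value obtained by plugging in the Lagrange-multiplier maximizer $u_1 = \frac{C_1}{C_1+C_2} u$.

```python
import math

C1, C2, u = 2.0, 3.0, 10.0

def total(u1):
    """Combined entropy S1(u1) + S2(u - u1) along the constraint."""
    return C1 * math.log(u1) + C2 * math.log(u - u1)

# brute-force maximum over a grid of allowed splits (0 < u1 < u)
grid_best = max(total(i * u / 10000) for i in range(1, 10000))

# value at the Lagrange-multiplier maximizer u1 = C1·u/(C1 + C2)
closed_form = (C1 + C2) * math.log(u) \
    + C1 * math.log(C1 / (C1 + C2)) + C2 * math.log(C2 / (C1 + C2))

assert abs(grid_best - closed_form) < 1e-6
```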

Having introduced the framework with some simple, expository examples, we will now explore how this framework can be applied to quantum systems.

Application: Quantum Systems And A Thermal Bath

With a picture of the category-theoretic framework in hand, we move to an application. We will compose two thermostatic systems: a heat bath and a system from the realm of quantum mechanics. However, before composing the two systems we will give a brief primer on some of the relevant concepts in quantum mechanics.

Quantum Systems

In quantum mechanics, physicists call a complex-valued function $f$, defined on $\mathbb{R}^1$ (if an electron is confined to a line, for example) or on $\mathbb{R}^2$ (if the electron is confined to a plane), satisfying $\int |f|^2 = 1$, a wave function.

Some examples of wave functions and their applications are as follows:

For a normalized position wave function $\psi(x)$ in the configuration space of – for example – an electron, $|\psi(x)|^2$ is the probability density for making measurements of that configuration on an ensemble of such systems.

Similarly, for a wave function $E(x,t)$ of a light wave, where $x$ is position and $t$ is time, $|E|^2$ is the energy density, where $E$ is the electric field intensity.

Given a wave function in position space $\psi(x)$, one can take a Fourier transform to get the associated wave function in momentum space, $\phi(p)$: $$\phi(p) = \frac{1}{\sqrt{2\pi\hbar}} \int e^{-ipx/\hbar}\, \psi(x)\, dx$$ This is true in the discrete or continuous case, though in the discrete case one would use the discrete Fourier transform.
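As a numeric illustration (our own, with $\hbar = 1$): the ground-state Gaussian $\psi(x) = \pi^{-1/4} e^{-x^2/2}$ transforms to the same Gaussian in momentum space, which we can verify by approximating the integral on a fine grid.

```python
import numpy as np

# Discretize ψ(x) = π^(-1/4) e^(-x²/2) and approximate
# φ(p) = (1/√(2π)) ∫ e^(-ipx) ψ(x) dx by a Riemann sum (ħ = 1).

x = np.linspace(-10.0, 10.0, 4001)
dx = x[1] - x[0]
psi = np.pi ** -0.25 * np.exp(-x ** 2 / 2)

def phi(p):
    return np.sum(np.exp(-1j * p * x) * psi) * dx / np.sqrt(2 * np.pi)

p = 1.3
expected = np.pi ** -0.25 * np.exp(-p ** 2 / 2)   # Gaussian maps to Gaussian
assert abs(phi(p) - expected) < 1e-6
```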

Physicists will refer to the space of normalized complex-valued functions (wave functions) as a Hilbert space. Our focus will be finite-dimensional spaces, which can still be called Hilbert spaces because completeness comes automatically in finite dimensions.

Hilbert Spaces

An inner product $\langle \cdot, \cdot \rangle$ on a vector space $H$ gives rise to a norm

$$\lVert x \rVert = \sqrt{\langle x, x \rangle}$$

If every Cauchy sequence in $H$ with respect to this norm converges in $H$, we say that $H$ is complete and is a Hilbert space. The first of the six postulates of quantum mechanics (pg. 18) states that every physical system is associated with a Hilbert space.

Completeness is important because it guarantees that we can approximate smooth functions and allows us to prove the Riesz Representation Theorem.

The Riesz Representation Theorem states that if $T$ is a bounded linear functional on a Hilbert space $H$, then there exists some $g \in H$ such that for every $f \in H$ we have $T(f) = \langle f, g \rangle$. Moreover, $\lVert T \rVert = \lVert g \rVert$.

This result establishes the correspondence between $H$ and its dual, and is why we can use bra-ket notation $\langle \psi | \psi \rangle$, where a “ket” $\left\vert \psi \right\rangle$ is a state vector representing some state of the system.

Hilbert spaces obey the Parallelogram Law and therefore are uniformly convex Banach spaces. That is, they are convex spaces.

The Density Matrix (Density Operator)

The Second Postulate of Quantum Mechanics: every state of a physical system is associated with a density operator $\rho$ acting on its Hilbert space, which is a Hermitian, nonnegative-definite operator of unit trace, $tr(\rho) = 1$.

An operator $\hat{A}$ is an “object” that maps one state vector, $\left\vert \psi \right\rangle$, into another, $\left\vert \phi \right\rangle$, so $\hat{A} \left\vert \psi \right\rangle = \left\vert \phi \right\rangle$.

For us, the primary concern is projection operators: an operator $P$ is a projection if it is an observable that satisfies $P = P^2$.

Density Operators. It turns out that state vector/wave function representations of physical states are subject to phase conventions and can only represent pure states in a Hilbert space. The interested reader can learn more about pure states here, but they can be thought of as elements of a Hilbert space with norm 1.

When a system is not in a pure state – maybe you are analyzing an ensemble of electrons that aren’t polarized in any particular direction – you cannot describe the system with a state vector. However, such mixed states can be described as convex combinations of projection operators.

In this case, a more general formalism is useful: associate our physical state with a positive semi-definite Hermitian operator of trace one acting on the Hilbert space of the system. This is called a density operator. It provides a useful way to characterize the state of an ensemble of quantum systems. $$\rho = \sum_i p_i \left\vert \psi_i \right\rangle \left\langle \psi_i \right\vert \ \text{ (discrete)}$$

$$\rho = \int f(\lambda) \left\vert \psi(\lambda) \right\rangle \left\langle \psi(\lambda) \right\vert \, d\lambda \ \text{ (continuous)}$$ When $\rho$ represents a pure state, $\rho^2 = \rho$, so its purity $tr(\rho^2) = tr(\rho) = 1$. For mixed states, the eigenvalues of the matrix representation of the associated density operator sum to one. For pure states, the principal eigenvalue will be 1 and the others will be 0. Pure states cannot be written as convex combinations of other states, but mixed states can be written as convex combinations of pure states.
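A toy qubit check of the pure-vs-mixed criteria above (our own example): the pure state $|0\rangle\langle 0|$ satisfies $\rho^2 = \rho$, while the equal mixture of $|0\rangle$ and $|1\rangle$ has purity $tr(\rho^2) = 1/2$.

```python
import numpy as np

ket0 = np.array([[1.0], [0.0]])
ket1 = np.array([[0.0], [1.0]])

rho_pure  = ket0 @ ket0.T                             # |0><0|
rho_mixed = 0.5 * ket0 @ ket0.T + 0.5 * ket1 @ ket1.T  # (|0><0| + |1><1|)/2

assert np.isclose(np.trace(rho_pure), 1.0)             # unit trace
assert np.allclose(rho_pure @ rho_pure, rho_pure)      # ρ² = ρ for pure
assert np.isclose(np.trace(rho_mixed @ rho_mixed), 0.5)  # purity 1/2
```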

In an interpretation of quantum mechanics, the density operator $\rho$, which is measurable on an ensemble of identically prepared systems, allows us to predict the expectation value (outcome) of any experiment performed on that system: $\langle A \rangle = tr(\rho A)$ for an observable $A$.


For a brief introduction to entropy see the blog post written by Manoja Namuduri and Lia Yeh.

The entropy of a density matrix looks analogous to the Shannon entropy of a probability distribution:

$$S(\rho_A) = -Tr\left[\rho_A \log(\rho_A)\right]$$
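In practice this trace is computed from the eigenvalues of $\rho$, since $S(\rho) = -\sum_i \lambda_i \log \lambda_i$ with the convention $0 \log 0 = 0$; a small sketch of ours (the function name is an assumption):

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(ρ) = -Σ λ_i log λ_i over the eigenvalues of ρ, with 0·log 0 = 0."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]        # drop (numerically) zero eigenvalues
    return float(-np.sum(evals * np.log(evals)))

# maximally mixed qubit I/2 has entropy log 2; a pure state has entropy 0
assert np.isclose(von_neumann_entropy(np.eye(2) / 2), np.log(2))
assert np.isclose(von_neumann_entropy(np.diag([1.0, 0.0])), 0.0)
```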

It’s shown here that this expression – called the von Neumann Entropy – is concave, ie.

$$t\, S(\rho_1) + (1-t)\, S(\rho_2) \leq S(t \rho_1 + (1-t) \rho_2) = S(\rho(t))$$

More generally,

$$\sum_i p_i S(\rho_i) \leq S(\rho)$$

Therefore we can see that when we mix systems, the entropy will only increase.


Coupling of two different thermostatic systems

For a quantum system, take an observable with some value on a mixed state. Coupling that observable to a heat bath should give a new system analogous to the canonical ensemble.

Following exactly the work done in Example 36, we consider two thermostatic systems:

1. A heat bath, with

\begin{aligned} &\text{State space:}\ \mathbb{R} \\ &\text{Entropy function:}\ S(U_{bath})={U_{bath}\over{T}} \end{aligned}

2. A quantum system, with

\begin{aligned} &\text{State space:}\ X \\ &\text{Entropy function:}\ S_{VN}=-Tr\left[\rho_A \log(\rho_A)\right] \end{aligned}

where X is the set of density matrices and \rho is the density operator for a discrete ensemble of pure states \left\vert \psi_i\right\rangle with statistical weights p_i.

We construct the convex relation by demanding that the expectation value of the Hamiltonian operator \mathbf{H} of our quantum system equal the heat lost by the heat bath:

\langle \mathbf{H}\rangle=-\Delta U

With our earlier definition of the expectation value, the above equation implies

{U\over{T}}=-{1\over{T}}\,Tr\left[\sum_{i}p_{i}\left\vert \psi_i\right\rangle \left\langle \psi_i\right\vert \mathbf{H}\right]

so that the new entropy is given by

\sup_{\rho}\left(S_{VN}-{1\over{T}}\,Tr\left[\mathbf{H}\sum_{i}p_{i}\left\vert \psi_i\right\rangle \left\langle\psi_i\right\vert \right]\right)
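As a sanity check (an illustration, not part of the paper's formalism): with k_B = 1, the supremum of S_{VN}(\rho) - Tr[\mathbf{H}\rho]/T is a standard fact of the canonical ensemble, attained by the Gibbs state \rho = e^{-\mathbf{H}/T}/Z, where it equals \log Z. A short numerical check, assuming a two-level system with energies 0 and 1:

```python
# Sketch: random trial states never beat the Gibbs state for the
# objective S(rho) - tr(H rho)/T, whose maximum value is log Z.
import numpy as np

def S(rho):
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-12]
    return float(-np.sum(p * np.log(p)))

T = 1.0
E = np.array([0.0, 1.0])         # energy levels (assumed for illustration)
H = np.diag(E)

w = np.exp(-E / T)
gibbs = np.diag(w / w.sum())     # rho = e^{-H/T} / Z

def objective(rho):
    return S(rho) - np.trace(rho @ H).real / T

rng = np.random.default_rng(0)
best_trial = -np.inf
for _ in range(1000):
    # Random positive semi-definite trace-one matrix: a valid density matrix.
    v = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
    rho = v @ v.conj().T
    rho /= np.trace(rho).real
    best_trial = max(best_trial, objective(rho))

print(best_trial <= objective(gibbs) + 1e-9)   # True
```

The Gibbs-state value of the objective is exactly \log Z = \log(1 + e^{-1}) here, matching the usual free-energy computation.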

Posted at July 25, 2022 9:54 PM UTC


5 Comments & 0 Trackbacks

Re: How to apply category theory to thermodynamics

Thanks for the post.

There’s a broken link here:

For a brief introduction to entropy see the blog post written by Manoja Namuduri and Lia Yeh.

The link destination is empty.

Posted by: Tom Leinster on July 25, 2022 11:56 PM | Permalink | Reply to this

Re: How to apply category theory to thermodynamics

Hi Dr. Leinster, thanks for the reminder. The post we’d like to link to does not exist yet. I’ll try to update the link once it is uploaded to the n-Category Café.

Posted by: Nandan K on August 3, 2022 7:11 AM | Permalink | Reply to this

Re: How to apply category theory to thermodynamics

Could nonequilibrium thermodynamics be made to fit within this categorical framework?

Posted by: Madeleine Birchfield on July 26, 2022 1:36 AM | Permalink | Reply to this

Re: How to apply category theory to thermodynamics

Sorry for taking so long to respond!

I think the nature of a particular framework for non-equilibrium thermodynamics may depend on the particular physical systems it is designed to study. What characteristics of these systems when out of equilibrium should we try to capture in the categorical framework? The time-evolution of a deterministic system? The possible steady/cyclic states of a system? How should the environment of a system be considered? Since I haven’t really worked closely with a variety of non-equilibrium thermodynamical systems, I hesitate to say.

However, in the context of the framework discussed in this blog post, my understanding of non-equilibrium thermodynamics is that we would have to sacrifice the maximum entropy principle, since the interpretation of that principle only applies to systems at equilibrium. Our construction of Op(ConvRel)Op(\mathsf{ConvRel}) only deals with state spaces and how they can be coupled together, so I would expect that to still make sense in the context of non-equilibrium thermodynamics. Of course, the Ent\mathrm{Ent} functor (or map of operads) would no longer apply, and I think we might need more information about the systems than just their state spaces to describe how they behave out of equilibrium.

I tried to generalize this framework a bit during research week, and I settled on characterizing thermodynamic systems by their states plus information about which states are accessible from which other states. Maybe including information about the environment of the system and the evolution of the system as its environment changes over time can help treat non-equilibrium thermodynamics with a similar framework.

In conclusion, a more capable framework for thermodynamics in general would require a lot more machinery than this thermostatics framework currently has. I would love to hear more perspectives on non-equilibrium thermodynamics from people.

Posted by: Nandan K on September 3, 2022 9:19 PM | Permalink | Reply to this

Re: How to Apply Category Theory to Thermodynamics

There’s a typo in the section “Putting it all together”. The entropy of the coupled system of tanks should read

\begin{aligned}\mathrm{OpEnt}[R](C_1\log(x), C_2\log(x)) &= C_1\log\left(\frac{C_1}{C_1+C_2} u\right) + C_2\log\left(\frac{C_2}{C_1+C_2} u\right) \\ &= (C_1+C_2)\log(u) + C_1\log\left(\frac{C_1}{C_1+C_2}\right) + C_2\log\left(\frac{C_2}{C_1+C_2}\right). \end{aligned}

Posted by: Nandan K on October 12, 2022 5:57 PM | Permalink | Reply to this
