How to Apply Category Theory to Thermodynamics

Posted by Emily Riehl

guest post by Nandan Kulkarni and Chad Harper

This blog post discusses the paper “Compositional Thermostatics” by John Baez, Owen Lynch, and Joe Moeller. The series of posts on Dr. Baez’s blog gives a more thorough overview of the topics in the paper, and is probably a better primer if you intend to read it. Like the posts on Dr. Baez’s blog, this blog post also explains some aspects of the framework in an introductory manner. However, it emphasizes particular interesting details, and concludes with the treatment of a particular quantum system using ideas from the paper.

Thermostatics is the study of thermodynamic systems at equilibrium. For example, we can treat a box of gas as a thermodynamic system by describing it using a few of its “characteristic” properties: the number of particles, average volume, and average temperature. The box is said to be at equilibrium when these properties do not seem to change. If you were to take a thermometer and measure the temperature of the box at equilibrium, the thermometer’s reading should not change over time.

The paper “Compositional Thermostatics” introduces a framework in category theory for doing thermostatics. Given some thermodynamic systems whose states can be represented as points of convex spaces, together with an entropy function for each system, this framework can be used to compute the entropy function of a new system formed by coupling these systems together in a specified way.

In this post, we will offer an interpretation of this framework, and then apply the framework to a non-trivial problem in quantum mechanics involving density matrices. In other words, we will attempt to respond to the following questions, in order: How should a scientist think about this framework? Why and how would a scientist use this framework?

The importance of interpretation

If we wanted to, we could use the framework unthinkingly, as follows:

Flowchart of how the framework might be used. The state spaces and entropy functions, along with the way A and B are coupled together are the input. They pass through a black box called the framework, which gives an entropy function for the new system as the output.

Figure 1

In that case, the only “interpretive” work we’d have to do would be

Figuring out how to mathematically represent the physical systems using a convex state space and entropy function.
Understanding what the output of the framework means.

For most simple thermodynamic systems, this would work pretty well, because the framework was built keeping these systems in mind. However, we would then lose out on the really interesting part — the inner workings of the framework itself.

A rectangle with the title Framework showing various equations and diagrams.

Figure 2

An “interpretation” of what all this machinery is doing offers an interpretation of the theory of thermostatics itself. I think this is what makes applied category theory really interesting. It is as much about figuring out how to think about a subject such as thermodynamics as building computational tools to study it. I hope that existing knowledge of category theory may then push our understanding of the subject even further through the interpretations we attach to the frameworks we build.

Setting up a thermostatic system: State spaces and entropy

When we describe the possible states of a thermodynamic system using points in a set $A$ , such as using points in $\mathbb{N}\times \mathbb{R}_{\gt 0}\times \mathbb{R}_{\gt 0}$ to refer to the number of particles, volume, and temperature of a box of gas, we call $A$ the “state space” of the physical system we are studying.

An entropy function associated with state space $A$ is a function $S : A \to \overline{\mathbb{R}}$ , where $\overline{\mathbb{R}}$ is the extended reals. Entropy can have many interpretations. According to the second law of thermodynamics, if the system can change from one state, represented by point $a \in A$ , to another state, represented by $b \in A$ , then $S(b) - S(a) \geq 0$ . When dealing with certain thermodynamic systems, we may extend this to the maximum entropy principle: A thermodynamic system with state space $A$ will evolve to a subset of $A$ where the entropy function is maximized (a set of equilibrium states), and remain there. Imagine a stretched rubber band pulling itself back to its original shape. We might say that the entropy of the rubber band is maximized when in the state of its original shape.

Thermodynamic systems are often studied under the effects of various constraints that limit their state space to a subset of the full space. For example, suppose you begin with two tanks of incompressible fluid, with tank #1 described by its energy $U_1 \in \mathbb{R}_{\gt 0}$ and tank #2 similarly described by $U_2$ . Physics tells us that if a valve is opened allowing free exchange between these tanks, the new thermodynamic system is described by the sum of their energies $U_{1,2} = U_1 + U_2$ due to energy conservation. The state space of the new system is $\mathbb{R}_{\gt 0}$ .

When we consider the combined system, which we will call $Y$ , made from tanks called $X_1$ and $X_2$ , each point in the state space of $Y$ corresponds to a whole range of states of $X_1$ and $X_2$ — An energy of $u \in \mathbb{R}_{\gt 0}$ for $Y$ corresponds to $\{(u_1, u_2)\ |\ u_1 + u_2 = u\} \subset \mathbb{R}_{\gt 0} \times \mathbb{R}_{\gt 0}$ , where a point in $\mathbb{R}_{\gt 0} \times \mathbb{R}_{\gt 0}$ represents the state of $X_1$ together with a state of $X_2$ .

two rectangles colored in with blue, connected by a channel. One rectangle is labelled u1, and one is labelled u2. Both rectangles together with the channel is labelled u1 plus u2

Figure 3

The way to write this in the framework for thermodynamic systems in general is as follows: Suppose you build a system $Y$ with state space $B$ from system $X$ with state space $A$ . You can describe how $Y$ is built from $X$ using a relation $R : A \to B$ , where $R\subset A \times B$ . In Figure 4, point $p\in B$ represents a state of $Y$ . If we observe $p$ , it could mean the system $X$ is at any state in the blue region. The knowledge of what exactly that blue region looks like comes from $R$ , which carries information about how exactly $Y$ is formed from $X$ . Thus, for now, we may think of systems as morphisms in the category $\mathsf{SetRel}$ , the category of sets and relations.

a circle labelled A and another circle labelled B. a cone is drawn from a single point in the circle B to a region of circle A.

Figure 4

So, knowing the entropy function $S$ on $A$ , and knowing $R : A \to B$ , can we calculate the entropy for system $Y$ with state represented by $p\in B$ ? This is possible using our understanding of how physical systems behave: We can interpret the maximum entropy principle to say that if a system described by $R:A\to B$ is observed at a state represented by $p \in B$ , then we expect it to exist at a point in the blue region with maximum entropy. Thus, the entropy function $S' : B \to \overline{\mathbb{R}}$ for the new system is

$S'(p) = \sup_{x \in \{x'\ |\ (x',p) \in R\}} S(x).$

A more nuanced interpretation of the maximum entropy principle is that the system can and will evolve from any point in the blue region to a subset of the blue region with maximum entropy. More generally, since many entropy functions may exist for different systems described by the same state spaces, such as the tanks of fluid $X_1$ , $X_2$ , and $Y$ all having state space $\mathbb{R}_{\gt 0}$ in the earlier example, we might interpret the maximum entropy principle more strongly as follows: A thermodynamic system can evolve from any state corresponding to a point in the blue region to any other state with its point in the blue region, and additionally, the system will evolve to a collection of states corresponding to a subset of the blue region that maximizes entropy and remain there.

In the paper’s framework, we deal with sets equipped with “canonical paths” between points: For points $a$ and $b$ in state space $A$ , we have $c(a,b) : [0,1] \to A$ that follows certain rules:

Writing $c_\lambda (a,b)$ for $c(a,b)(\lambda)$ , the rules are

$\begin{aligned} c_1(a,b)&= a \\ c_\lambda (a,a) &= a \\ c_\lambda (a,b) &= c_{1-\lambda} (b,a) \\ c_\lambda (c_\mu (a,b),c) &= c_{\lambda'} (a, c_{\mu'} (b,c)), \end{aligned}$

where the last rule holds whenever $\lambda' = \lambda \mu$ and $1-\lambda = (1-\lambda')(1-\mu')$ .
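As a quick sanity check, the rules above can be tested numerically for the most familiar convex structure, the straight-line path $c_\lambda(a,b) = \lambda a + (1-\lambda) b$ on the real line. This is only a sketch: the function names, sample ranges, and tolerance are our own choices, and $\mu'$ is obtained by solving $1-\lambda = (1-\lambda')(1-\mu')$ .

```python
import math
import random

def c(lam, a, b):
    """Straight-line path on the real line: c_lam(a, b) = lam*a + (1 - lam)*b."""
    return lam * a + (1 - lam) * b

def close(x, y):
    """Floating-point comparison with a small tolerance (an arbitrary choice)."""
    return math.isclose(x, y, rel_tol=1e-9, abs_tol=1e-9)

def satisfies_rules(a, b, z, lam, mu):
    """Check the four rules at one sample point; requires lam * mu != 1."""
    lam2 = lam * mu                    # lambda' = lambda * mu
    mu2 = 1 - (1 - lam) / (1 - lam2)   # solves 1 - lambda = (1 - lambda')(1 - mu')
    return (
        close(c(1, a, b), a)
        and close(c(lam, a, a), a)
        and close(c(lam, a, b), c(1 - lam, b, a))
        and close(c(lam, c(mu, a, b), z), c(lam2, a, c(mu2, b, z)))
    )

rng = random.Random(0)  # fixed seed so the check is repeatable
all_ok = all(
    satisfies_rules(
        rng.uniform(-10, 10), rng.uniform(-10, 10), rng.uniform(-10, 10),
        rng.uniform(0.01, 0.99), rng.uniform(0.01, 0.99),
    )
    for _ in range(1000)
)
print(all_ok)
```

A passing check on random samples is evidence, not a proof; the algebraic verification is a two-line expansion of both sides of the last rule.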

By requiring that the state spaces be convex (ie. every two points $a,b$ in state space $A$ are associated with a path $c(a,b) : [0,1] \to A$ ) and that relations $R : A\to B$ be convex relations, we guarantee that for any $R$ , all “blue regions” of $R$ are convex as well. Thus, we choose to work in $\mathsf{ConvRel}$ rather than $\mathsf{SetRel}$ . In this framework, if a subset of a state space is convex, then every state of the system in that subset is accessible from every other state.

An interpretation of this may be that the paths $c(a,b)$ represent thermodynamic processes that can occur. If $c(a,b)$ exists for some $a, b \in A$ , then $a$ is accessible from $b$ by a known process, parameterized by $\lambda \in [0,1]$ . These are not necessarily the only allowed processes. But they are sufficient to describe which states are accessible from which others. This interpretation has some limitations. For one, the framework does not allow in general for blue regions that are just the range of $c(a,b)$ for some $a,b$ — $c(a,b)$ is not necessarily convex, but a process from $b$ to $a$ would imply a process from $c_{\lambda_1}(a,b)$ to $c_{\lambda_2}(a,b)$ for some $\lambda_1$ , $\lambda_2$ .

Thus, we now have a full description of what should constitute a thermostatic system in the framework: A convex state space $B$ for the system $Y$ of interest, convex relation $R: A\to B$ describing how $Y$ is constructed from some system $X$ with state space $A$ , and an entropy function $S : A \to \overline{\mathbb{R}}$ . If the system $X$ is used to build $X$ itself (ie. we are studying system $X$ on its own), let $B=A$ and $R = \{(a,a)\ |\ a\in A\}$ , so $\mathrm{Ent}[R](S) = S$ . This says the entropy function for $X$ is the entropy function for $X$ .

Properties of entropy: Extensivity and concavity

The entropy function $S$ is not arbitrary. Two properties of entropy other than the maximum entropy principle that commonly appear are “extensivity” and “concavity”. There are thermodynamical theories which first assume extensivity and then prove concavity for certain classes of systems, but we assume both of these properties immediately and build them into the framework.

Extensivity is the property of how entropy functions of systems are composed when those systems are put together. It means the entropy of two systems together is the sum of the entropies of the individual systems. In the example of the tanks of incompressible fluid, if the entropy of $X_1$ is $S_1$ and the entropy of $X_2$ is $S_2$ , then the entropy of the tanks coupled together is $S_1 + S_2$ . It is a prescription for the entropy function of a coupled system given the entropy functions on its individual components. This is built into the framework and is explained later in this blog post in the section on operads.

Concavity is a property that limits the shape entropy functions can have. An entropy function $S:A \to \overline{\mathbb{R}}$ is concave if for all $\lambda$ , $S(c_\lambda (a,b)) \geq c_\lambda (S(a),S(b))$ . Then, having the following convex structure on the extended reals,

$\begin{aligned} c_{\lambda}(a,b) &= (1-\lambda)b + \lambda a &\ \text{ for }\ a,b\in \mathbb{R} \\ c_\lambda (a,\infty) &= \infty &\ \text{ for }\ a \in \mathbb{R} \\ c_\lambda (a, -\infty) &= -\infty &\ \text{ for }\ a\in \mathbb{R} \\ c_\lambda (-\infty, \infty) &= -\infty &\ \text{ for }\ 0 \lt \lambda \lt 1 \end{aligned}$

forces certain shapes for the allowed entropy functions:

several graphs. honestly not important.

Figure 5
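The convex structure on the extended reals and the concavity condition can be written out directly. The sketch below is illustrative only: `c_ext` and `is_concave_on` are names we made up, the endpoint cases $\lambda = 0, 1$ are filled in using $c_1(a,b) = a$ and the symmetry rule, and concavity is only tested on a finite sample of points of the real line.

```python
import math

def c_ext(lam, a, b):
    """c_lam(a, b) on the extended reals, following the table above."""
    if lam == 1:
        return a
    if lam == 0:
        return b
    if -math.inf in (a, b):
        return -math.inf   # -infinity wins, even against +infinity
    if math.inf in (a, b):
        return math.inf
    return lam * a + (1 - lam) * b

def is_concave_on(S, points, lambdas, tol=1e-12):
    """Test S(c_lam(a, b)) >= c_lam(S(a), S(b)) on sample points of the real line."""
    return all(
        S(lam * a + (1 - lam) * b) >= c_ext(lam, S(a), S(b)) - tol
        for a in points for b in points for lam in lambdas
    )

points = [0.5, 1.0, 2.0, 4.0]
lambdas = [0.0, 0.25, 0.5, 0.75, 1.0]
log_is_concave = is_concave_on(math.log, points, lambdas)
square_is_concave = is_concave_on(lambda x: x * x, points, lambdas)
print(log_is_concave, square_is_concave)
```

Here the logarithm passes the sampled test while $x \mapsto x^2$ fails it, matching the usual picture of which shapes are allowed.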

While concavity of entropy is a property of many common thermodynamic systems, it has some limitations. In some theoretical foundations for thermodynamics, such as that in the book “Statistical Mechanics of Lattice Systems: A Concrete Mathematical Introduction” by Sacha Friedli and Yvan Velenik, concavity of entropy is not immediately assumed, and rather arises out of certain properties of the lattice systems they deal with and the extensivity of entropy.

Mathematical machinery: Functors and operads

The $\mathrm{Ent}$ functor

The functor $\mathrm{Ent}:\mathsf{ConvRel}\to \mathsf{Set}$ is defined as follows:

For convex space $A$ , $\mathrm{Ent}(A)$ takes $A$ to the set of entropy functions on $A$ , which is an object in $\mathsf{Set}$ .
For morphism $R : A \to B$ , $\mathrm{Ent}[R]$ is a morphism $\mathrm{Ent}[R] : \mathrm{Ent}(A)\to \mathrm{Ent}(B)$ . For entropy function $S : A \to \overline{\mathbb{R}}$ , we define

$\begin{aligned}&\mathrm{Ent}[R](S)(p) = \sup_{\{a\ |\ (a,p)\in R\}} S(a).\end{aligned}$ Thus, the maximum entropy principle is built into this framework.

When we say $\mathrm{Ent}$ is a functor, we mean that it has certain nice properties: It preserves identity morphisms and compositionality. Identity preservation means that for an unconstrained system, represented by $\{(a,a)\ |\ a\in A\} : A \to A$ , and given an entropy function $S : A\to \overline{\mathbb{R}}$ on the system, the entropy function for the new system $A$ should still be $S$ . This is of course reasonable. Preserving compositionality means that given a series of relations $R_1 : A \to B_1,\ R_2 : B_1 \to B_2, ...,\ R_n : B_{n-1}\to B_n$ , and an entropy function $S$ on $A$ , we have an equivalence between

“Building” the entire system $R_n \circ R_{n-1} \circ ... \circ R_1$ and calculating its entropy function.
Calculating the entropy function of the system based on the entropy functions for each component of the system considered one at a time, ie. $R_1$ , then $R_2\circ R_1$ , …, then finally $R_n\circ ... \circ R_2\circ R_1$ .

This ensures $\mathrm{Ent}$ works consistently for all state spaces and relations. We will never have a situation where computing the entropy function for $R_n \circ ... \circ R_1$ one way will give a different result from computing it any other way.
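The two functor laws can be illustrated on a toy example. The sketch below makes simplifying assumptions: state spaces are finite sets of labels (so convexity is ignored entirely), entropy functions are dictionaries, and the supremum over an empty set is taken to be $-\infty$ . The names `ent` and `compose` and the sample data are hypothetical.

```python
import math

def ent(relation, entropy, codomain):
    """Push an entropy function forward along a finite relation.

    relation: set of pairs (a, p); entropy: dict mapping a to a number.
    Returns a dict mapping each p in codomain to the sup of entropy over
    all a related to p (or -infinity if no a is related to p).
    """
    return {
        p: max((entropy[a] for (a, q) in relation if q == p), default=-math.inf)
        for p in codomain
    }

def compose(r2, r1):
    """Relational composite r2 after r1: (a, c) whenever some b links them."""
    return {(a, c) for (a, b) in r1 for (b2, c) in r2 if b == b2}

A, B1, B2 = {"a1", "a2", "a3"}, {"b1", "b2"}, {"c1", "c2"}
S = {"a1": 0.5, "a2": 1.0, "a3": 0.2}
R1 = {("a1", "b1"), ("a2", "b1"), ("a3", "b2")}
R2 = {("b1", "c1"), ("b2", "c1")}

# Compositionality: one step at a time versus all at once.
stepwise = ent(R2, ent(R1, S, B1), B2)
direct = ent(compose(R2, R1), S, B2)

# Identity preservation: the unconstrained system keeps its entropy function.
identity = {(a, a) for a in A}
unchanged = ent(identity, S, A)

print(stepwise, direct, unchanged == S)
```

In this example both routes assign entropy `1.0` to `c1` and $-\infty$ to `c2`, which is not related to any state of `B1`.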

Because they’re everywhere in mathematics, we know a lot about functors in general in category theory. Thus, using functors in the framework also suggests opportunities to apply all our knowledge of category theory to deepen our interpretation of the framework.

Operads

Once again, I have been lying. The places we do thermostatics aren’t $\mathsf{ConvRel}$ and $\mathsf{Set}$ , but rather $Op(\mathsf{ConvRel})$ and $Op(\mathsf{Set})$ — the operads built over $\mathsf{ConvRel}$ and $\mathsf{Set}$ . We simply use operads as a type of structure, like a category, where we can have state spaces and systems made from those state spaces represented by operations (the operad name for morphisms) between them.

The special thing about operads is that operations can take “inputs” from multiple types. For categories, we represent the morphisms from object $A$ to object $B$ as $A\to B$ . For operad $\mathbb{O}$ , we represent the operations from types $A_1, ..., A_n$ to $B$ as $\mathbb{O}(A_1, ..., A_n; B)$ . We can also compose operations, as the construction of $Op(\mathbf{C})$ below makes precise.

Because $\mathsf{ConvRel}$ and $\mathsf{Set}$ are symmetric monoidal categories (with a notion of tensoring objects together and some “obvious” symmetries), there are “obvious” operads we can build from them.

For a symmetric monoidal category $\mathbf{C}$ with tensor product $\otimes$ , we can construct $Op(\mathbf{C})$ as follows:

The types in $Op(\mathbf{C})$ are exactly the objects of $\mathbf{C}$ .
For types $A_1,...,A_n, B$ , operations of $Op(\mathbf{C})(A_1, ..., A_n; B)$ are exactly the morphisms $A_1\otimes ... \otimes A_n \to B$ .
For operations $\begin{aligned} f_1 &\in Op(\mathbf{C})(A_{1,1}, A_{2,1}, ..., A_{n_1,1}; B_1)\ \text{(ie.}\ f_1: A_{1,1}\otimes...\otimes A_{n_1,1}\to B_1\text{)} \\ f_2 &\in Op(\mathbf{C})(A_{1,2}, A_{2,2}, ..., A_{n_2,2}; B_2) \\ \vdots & \\ f_m &\in Op(\mathbf{C})(A_{1,m}, A_{2,m}, ..., A_{n_m,m}; B_m) \\ g &\in Op(\mathbf{C})(B_1,...,B_m; C) \end{aligned}$ the composition of $g$ and $f_1,...,f_m$ is $g(f_1,...,f_m) \in Op(\mathbf{C}) (A_{1,1},...,A_{n_1,1},...,A_{1,m},...,A_{n_m, m}; C)$ , ie. $g(f_1,...,f_m):A_{1,1}\otimes ...\otimes A_{n_m,m}\to C$ , defined as $g(f_1,...,f_m) = g \circ (f_1\otimes ...\otimes f_m).$
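For $\mathbf{C} = \mathsf{Set}$ , where operations are ordinary multi-argument functions, the composition rule $g \circ (f_1\otimes ...\otimes f_m)$ can be sketched in a few lines. The helper name `operadic_compose` and the explicit list of arities are our own devices for illustration; Python functions do not carry their arity in a form we want to rely on here.

```python
def operadic_compose(g, fs, arities):
    """Return g(f_1, ..., f_m) as a function of the flattened argument list.

    fs: the operations f_1, ..., f_m; arities: how many inputs each f_j takes.
    The composite splits its arguments into consecutive blocks, applies each
    f_j to its own block, and feeds the m results to g.
    """
    def composite(*args):
        if len(args) != sum(arities):
            raise TypeError("wrong number of inputs for the composite operation")
        outputs, start = [], 0
        for f, n in zip(fs, arities):
            outputs.append(f(*args[start:start + n]))
            start += n
        return g(*outputs)
    return composite

# Hypothetical operations: f1 takes two inputs, f2 takes one, g takes two.
f1 = lambda x, y: x + y
f2 = lambda z: 2 * z
g = lambda u, v: u * v

h = operadic_compose(g, [f1, f2], [2, 1])
print(h(1, 2, 3))  # (1 + 2) * (2 * 3) = 18
```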

We now demonstrate how this is applied to $\mathsf{ConvRel}$ and $\mathsf{Set}$ .

The tensor product of $\mathsf{ConvRel}$ is the convex product $\times$ defined as follows: For convex spaces $A_1$ , $A_2$ (formed from sets also called $A_1$ and $A_2$ ) and convex structures defined by $c^1(\cdot,\cdot)$ and $c^2(\cdot, \cdot)$ respectively, $A_1\times A_2$ is given by the set $\{(a_1,a_2)\ |\ a_1\in A_1, a_2\in A_2\}$ and the convex structure $c^{1,2}(\cdot, \cdot)$ defined componentwise by $c^{1,2}_\lambda ((a_1, a_2),(b_1,b_2)) = (c^1_\lambda(a_1,b_1), c^2_\lambda(a_2,b_2))$ . The tensor product of $\mathsf{Set}$ is just the set product, which we will also call $\times$ for simplicity. The meaning of $\times$ will be clear from context.

Thus, the types of $Op(\mathsf{ConvRel})$ are convex spaces and elements of $Op(\mathsf{ConvRel})(A_1, ..., A_n;B)$ are convex relations, in particular the convex subsets of $(A_1\times ...\times A_n)\times B$ . A convex relation $R\subset (A_1\times ...\times A_n)\times B$ represents a way of coupling together the systems represented by state spaces $A_1,...,A_n$ to form the system represented by state space $B$ . The types in operad $Op(\mathsf{Set})$ are sets and elements of $Op(\mathsf{Set})(A_1, ..., A_n; B)$ are the functions from $A_1\times ...\times A_n$ to $B$ .

The symmetric monoidal structure of $\mathsf{ConvRel}$ has certain symmetries that can be interpreted to state that

Coupling thermodynamic system $X$ to thermodynamic system $Y$ is equivalent to coupling system $Y$ to system $X$ : the “braiding” isomorphism $\beta_{A,B} : A\times B \to B \times A$ implies that $Op(\mathsf{ConvRel})(A,B;C)$ and $Op(\mathsf{ConvRel})(B,A;C)$ can represent the same couplings of systems.
Coupling thermodynamic system $X$ to system $Y$ and then coupling that new system to $Z$ is equivalent to coupling $Y$ to system $Z$ and then coupling $X$ to that new system: the “associator” isomorphism $\alpha_{A,B,C} : (A\times B)\times C \to A\times (B\times C)$ lets $Op(\mathsf{ConvRel})(A,B,C;D)$ be written unambiguously.

Suppose we have a system with state space $C_1$ built from $A_1,...,\ A_n$ , a system with state space $C_2$ built from $B_1,...,\ B_m$ , and a system with state space $D$ built from $C_1$ and $C_2$ .

By itself, the category $\mathsf{ConvRel}$ does not have the right structure to make our framework functional for coupling multiple systems in an elegant way. In $\mathsf{ConvRel}$ , we could build $R_A : A_1\times ...\times A_n \to C_1$ , $R_B : B_1\times ...\times B_m\to C_2$ , and $R_C : C_1\times C_2 \to D$ , but there is no prescription in the framework for how to compose $R_C$ with both $R_A$ and $R_B$ . We would have to build the system all at once “by hand”, with relation $R : A_1 \times ... \times A_n \times B_1 \times ... \times B_m \to D$ calculated from $R_A$ , $R_B$ , and $R_C$ .

The operadic approach, on the other hand, has the process of getting $R$ from $R_A$ , $R_B$ , and $R_C$ built into it through how composition is defined. Using $Op(\mathsf{ConvRel})$ , the process of coupling $A_1, ..., A_n$ to form $C_1$ and $B_1,...,B_m$ to form $C_2$ , and then coupling $C_1$ and $C_2$ to form $D$ would look like this:

a wiring diagram with wires coming from A1 ... An to C1, from B1 ... Bm to C2, and from C1 C2 to D

Figure 6

Here, each group of arrows represents a relation. Much more intuitive and elegant.

The work we did in $\mathsf{ConvRel}$ was still important, because the functor $\mathrm{Ent}$ we defined can be used to define a similar “map of operads” (a functor between operads) from $Op(\mathsf{ConvRel})$ to $Op(\mathsf{Set})$ . We’ll call this map of operads $\mathrm{OpEnt}$ . We define it as follows:

For type $A$ , $\mathrm{OpEnt}(A)$ takes $A$ to $\mathrm{Ent}(A)$ , the set of entropy functions on $A$ , which is a type in $Op(\mathsf{Set})$ .
For operation $R : Op(\mathsf{ConvRel})(A_1, ..., A_n; B)$ , $\mathrm{OpEnt}[R]$ is an operation in $Op(\mathsf{Set})(\mathrm{Ent}(A_1), ..., \mathrm{Ent}(A_n); \mathrm{Ent}(B))$ . For entropy functions $S_1,...,S_n\ \text{ on }\ A_1,...,A_n$ , we define $\begin{aligned}\mathrm{OpEnt}[R](S_1,...,S_n)(p) &= \mathrm{Ent}[R]\circ \epsilon_{A_1,...,A_n} (S_1,..., S_n)(p) \\ &= \sup_{\{a_1,...,a_n\ |\ ((a_1,...,a_n),p)\in R\}} S_1(a_1)+...+S_n (a_n).\end{aligned}$ The details of how $\epsilon$ is defined are not important here, but $\epsilon$ is called the “laxator” and it’s what builds extensivity of entropy into the framework.
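The formula for $\mathrm{OpEnt}[R]$ can be tried out on a discretized version of two systems exchanging energy. Everything specific in this sketch is an assumption made for illustration: the state spaces are the finite sets $\{0,1,2\}$ of energy units rather than convex spaces, the entropy values are made up, and `op_ent` is a hypothetical name.

```python
import math

def op_ent(relation, entropies, codomain):
    """Sup, over all input tuples related to p, of the sum of the entropies.

    relation: set of pairs ((a_1, ..., a_n), p);
    entropies: list of dicts S_1, ..., S_n, one per input system.
    An output state with no related inputs gets -infinity.
    """
    return {
        p: max(
            (sum(S[a] for S, a in zip(entropies, inputs))
             for (inputs, q) in relation if q == p),
            default=-math.inf,
        )
        for p in codomain
    }

# Made-up concave entropy values on energy levels 0, 1, 2.
S1 = {0: 0.0, 1: 1.0, 2: 1.5}
S2 = {0: 0.0, 1: 0.75, 2: 1.25}

# Coupling by energy conservation: (u1, u2) is related to u1 + u2.
R = {((u1, u2), u1 + u2) for u1 in S1 for u2 in S2}

coupled = op_ent(R, [S1, S2], range(5))
print(coupled)
```

For total energy 2, for instance, the candidates are $0 + 1.25$ , $1 + 0.75$ and $1.5 + 0$ , and the supremum picks $1.75$ : the sum is extensivity at work, and the supremum is the maximum entropy principle.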

One could say in summary that the operadic structure allows systems to be coupled together, $\mathrm{Ent}$ adds the maximum entropy principle to the framework, and $\epsilon$ adds extensivity.

Putting it all together

The final operad-based framework for doing thermostatics looks like this when applied to the example of the tanks of incompressible fluid:

To reiterate, we can describe tanks $X_1$ and $X_2$ using the state space $\mathbb{R}_{\gt 0}$ . We can couple these tanks to get a system also described by $\mathbb{R}_{\gt 0}$ . This new system formed from coupling the two tanks is described as a relation

$R = \{\left((u_1, u_2), u\right)\ |\ u\in \mathbb{R}_{\gt 0},\ u_1\in \mathbb{R}_{\gt 0},\ u_2\in \mathbb{R}_{\gt 0},\ u_1+u_2 = u\}\subset (\mathbb{R}_{\gt 0}\times \mathbb{R}_{\gt 0})\times \mathbb{R}_{\gt 0}.$

This is an operation $R : Op(\mathsf{ConvRel})(\mathbb{R}_{\gt 0}, \mathbb{R}_{\gt 0} ; \mathbb{R}_{\gt 0})$ .

We have

$\mathrm{OpEnt}[R] : Op(\mathsf{Set})(\mathrm{OpEnt}(\mathbb{R}_{\gt 0}), \mathrm{OpEnt}(\mathbb{R}_{\gt 0}); \mathrm{OpEnt}(\mathbb{R}_{\gt 0}))$

given by

$\begin{aligned}\mathrm{OpEnt}[R](S_1, S_2)(u) &=\mathrm{Ent}[R]\circ \epsilon_{\mathbb{R}_{\gt 0},\mathbb{R}_{\gt 0}}(S_1, S_2)(u)\\ &= \sup_{((u_1, u_2), u)\in R} S_1(u_1) + S_2(u_2).\end{aligned}$

A physicist tells us that the entropy functions $S_1$ and $S_2$ for $X_1$ and $X_2$ respectively are given by $S_1(x) = C_1\log(x)$ and $S_2(x) = C_2\log(x)$ . Using the method of Lagrange multipliers, we can then calculate that $S_1(u_1) + S_2(u_2)$ is maximized when $u_1 = \frac{C_1}{C_1+C_2} u$ and $u_2 = \frac{C_2}{C_1+C_2} u$ .

Thus, we can calculate that the entropy function on the new coupled system is

$\begin{aligned}\mathrm{OpEnt}[R](C_1\log(x), C_2\log(x))(u) &= C_1\log\left(\frac{C_1}{C_1+C_2} u\right) + C_2\log\left(\frac{C_2}{C_1+C_2} u\right) \\ &= (C_1+C_2)\log(u) + C_1\log\left(\frac{C_1}{C_1+C_2}\right)\\ & + C_2\log\left(\frac{C_2}{C_1+C_2}\right). \end{aligned}$
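The location of the maximum, and the value $C_1\log\left(\frac{C_1}{C_1+C_2} u\right) + C_2\log\left(\frac{C_2}{C_1+C_2} u\right)$ attained there, can be checked numerically with a brute-force search over the constraint $u_1 + u_2 = u$ . The constants $C_1 = 1$ , $C_2 = 3$ , $u = 2$ and the grid resolution are arbitrary choices for this sketch.

```python
import math

def coupled_entropy(u, c1, c2, steps=100_000):
    """Approximate the sup of c1*log(u1) + c2*log(u2) over u1 + u2 = u on a grid.

    Returns the best value found and the u1 at which it was found.
    """
    best_value, best_u1 = -math.inf, None
    for k in range(1, steps):          # skip the endpoints, where log diverges
        u1 = u * k / steps
        value = c1 * math.log(u1) + c2 * math.log(u - u1)
        if value > best_value:
            best_value, best_u1 = value, u1
    return best_value, best_u1

c1, c2, u = 1.0, 3.0, 2.0
value, u1_star = coupled_entropy(u, c1, c2)

# Prediction from the method of Lagrange multipliers.
predicted_u1 = c1 / (c1 + c2) * u
predicted_value = (c1 * math.log(c1 / (c1 + c2) * u)
                   + c2 * math.log(c2 / (c1 + c2) * u))

print(u1_star, predicted_u1)
print(value, predicted_value)
```

With these constants the predicted split is $u_1 = 0.5$ , $u_2 = 1.5$ , so the tank with the larger constant ends up holding proportionally more of the energy.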

Having introduced the framework with some simple, expository examples, we will now explore how this framework can be applied to quantum systems.

Application: Quantum Systems And A Thermal Bath

With a picture of the category theoretic framework in hand we move to an application. We will compose two thermostatic systems: a heat bath and a system from the realm of quantum mechanics. However, before composing the two systems we will give a brief primer on some of the relevant concepts in quantum mechanics.

Quantum Systems

In quantum mechanics, physicists call a complex-valued function $f$ defined on $\mathbb{R}^1$ (if an electron is confined to a line, for example) or on $\mathbb{R}^2$ (if the electron is confined to a plane) with $\int|f|^2=1$ a wave function.

Some examples of wave functions and their applications are as follows:

For a normalized position wave function $\psi(x)$ in the configuration space of – for example – an electron, $|\psi(x)|^2$ is the probability density for making measurements of that configuration on an ensemble of such systems.

Similarly, for a wave function $E(x,t)$ of a light wave, where $x$ is position, $t$ is time, and $E$ is the electric field intensity, $|E|^2$ is proportional to the energy density.

Given a wave function in position space $\psi(x)$ , one can take a Fourier transform to get the associated wave function in momentum space, $\phi(p)$ : $\phi(p) = {1\over{\sqrt{2\pi\hbar}}}\int e^{-ipx/\hbar} \psi(x) dx$ . This is true in the discrete or continuous case, though in the discrete case one would use the discrete Fourier transform.
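In the discrete case, the property that matters for wave functions is that a suitably normalized discrete Fourier transform preserves $\sum |\psi|^2 = 1$ . The sketch below uses the unitary normalization $1/\sqrt{N}$ and a negative sign in the exponent, both of which are conventions we chose; the sample vector is arbitrary.

```python
import cmath
import math

def unitary_dft(psi):
    """Discrete Fourier transform with 1/sqrt(N) normalization."""
    n = len(psi)
    return [
        sum(psi[x] * cmath.exp(-2j * math.pi * k * x / n) for x in range(n))
        / math.sqrt(n)
        for k in range(n)
    ]

def norm_squared(v):
    """Sum of |v_k|^2, the discrete analogue of the integral of |f|^2."""
    return sum(abs(z) ** 2 for z in v)

# An arbitrary normalized "position-space" vector on four sites.
raw = [1, 1j, 0, -1]
scale = math.sqrt(norm_squared(raw))
psi = [z / scale for z in raw]
phi = unitary_dft(psi)

# A state localized on one site spreads evenly over all momenta.
localized = unitary_dft([1, 0, 0, 0])

print(norm_squared(psi), norm_squared(phi))
print([abs(z) ** 2 for z in localized])
```

Both norms come out as 1 up to rounding, so `phi` is again a valid wave function, now in the discrete momentum variable.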

Physicists will refer to the space of complex normed functions (wave functions) as a Hilbert space. Our focus will be finite-dimensional spaces which can still be called Hilbert spaces because completeness comes automatically in finite dimensions.

Hilbert Spaces

An inner product $\langle \cdot , \cdot\rangle$ over a vector space $H$ gives rise to a norm

$\lVert x\rVert =\sqrt{\langle x, x\rangle }$

If every Cauchy sequence in $H$ with respect to this norm converges in $H$ , we say that $H$ is complete and call it a Hilbert space. The first of the six postulates of quantum mechanics (pg. 18) states that every physical system is associated with a Hilbert space.

Completeness is important because it guarantees that we can approximate smooth functions and allows us to prove the Riesz Representation Theorem.

The Riesz Representation Theorem states that if $T$ is a bounded linear functional on a Hilbert space $H$ , then there exists some $g \in H$ such that for every $f \in H$ we have $T(f) =\langle f, g \rangle$ . Moreover, $\lVert T\rVert = \lVert g\rVert$ .

This result identifies $H$ with its dual space, which is why we can use bracket notation $\langle\psi|\psi \rangle$ , where a “ket” $\left\vert \psi\right\rangle$ is a state vector representing some state of the system and a “bra” $\left\langle \psi\right\vert$ is the corresponding linear functional.

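Hilbert spaces obey the Parallelogram Law and are therefore uniformly convex Banach spaces. More relevant for this framework, every Hilbert space is a vector space and hence a convex space in the sense used above, with $c_\lambda(a,b) = \lambda a + (1-\lambda) b$ .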

The Density Matrix (Density Operator)

The Second Postulate of Quantum Mechanics: Every state of a physical system is associated with a density operator $ρ$ acting on Hilbert space, which is a Hermitian, nonnegative definite operator of unit trace, $tr(ρ) = 1$ .

An operator $\hat{A}$ is an “object” that maps one state vector, $\left\vert \psi\right\rangle$ , into another, $\left\vert \phi\right\rangle$ , so $\hat{A}\left\vert \psi\right\rangle = \left\vert \phi \right\rangle$ .

We are concerned primarily with projection operators: An operator $P$ is a projection if it is an observable that satisfies $P=P^2.$

Density Operators

It turns out that state vector/wave function representations of physical states are subject to phase conventions and can only represent pure states in a Hilbert space. The interested reader can learn more about pure states here, but they can be thought of as elements of a Hilbert space with norm 1.

When a system is not in a pure state – maybe you are analyzing an ensemble of electrons that aren’t polarized in any particular direction – you cannot describe the system with a state vector. However, such mixed states can be described as convex combinations of projection operators.

In this case, a more general formalism is useful: associate our physical state with a positive semi-definite Hermitian operator of trace one acting on the Hilbert space of the system. This is called a density operator. It provides a useful way to characterize the state of the ensemble of quantum systems.

$\rho=\sum_{i}p_{i}\left\vert \psi_i\right\rangle \left\langle\psi_i\right\vert \ \text{ (discrete)}$

$\rho=\int f(\lambda)\left\vert \psi(\lambda)\right\rangle \left\langle \psi(\lambda)\right\vert \ d\lambda\ \text{ (continuous)}$

When $\rho$ represents a pure state,

$\begin{aligned} \rho^2&=\rho \\ tr(\rho^2)&=1 \ \text{ (It has a purity of one)} \end{aligned}$

For any density operator, the eigenvalues of its matrix representation are nonnegative and sum to one. For pure states, the principal eigenvalue will be 1 and the others will be 0. Pure states cannot be written as nontrivial convex combinations of other states, but mixed states can be written as convex combinations of pure states.

In an interpretation of quantum mechanics, the density operator $\rho$ , which is measurable on an ensemble of identically prepared systems, allows us to predict the expectation value of any experiment performed on that system: $\langle A\rangle=tr(\rho A)$ for an observable $A$ .
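These properties are easy to check by hand for a two-level system. The sketch below uses plain nested lists for $2\times 2$ matrices; the helper names, the choice of states, and the observable (the Pauli matrix $\sigma_z$ ) are our own illustrative assumptions. Purity here means $tr(\rho^2)$ .

```python
import math

def outer(psi):
    """The projector |psi><psi| as a nested list."""
    return [[a * b.conjugate() for b in psi] for a in psi]

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(x):
    return sum(x[i][i] for i in range(len(x)))

def mix(weights, states):
    """Convex combination sum_i p_i rho_i of density matrices."""
    n = len(states[0])
    return [[sum(w * s[i][j] for w, s in zip(weights, states)) for j in range(n)]
            for i in range(n)]

zero = [complex(1), complex(0)]
one = [complex(0), complex(1)]
plus = [complex(1 / math.sqrt(2)), complex(1 / math.sqrt(2))]

rho_pure = outer(plus)                                   # a pure state
rho_mixed = mix([0.75, 0.25], [outer(zero), outer(one)])  # a mixed state

purity_pure = trace(matmul(rho_pure, rho_pure)).real
purity_mixed = trace(matmul(rho_mixed, rho_mixed)).real

sigma_z = [[1, 0], [0, -1]]
expect_pure = trace(matmul(rho_pure, sigma_z)).real      # <A> = tr(rho A)
expect_mixed = trace(matmul(rho_mixed, sigma_z)).real

print(purity_pure, purity_mixed)
print(expect_pure, expect_mixed)
```

The pure state has purity 1 and the mixture has purity $0.75^2 + 0.25^2 = 0.625$ , while both have trace one.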

Entropy

For a brief introduction to entropy see the blog post written by Manoja Namuduri and Lia Yeh.

The entropy of a density matrix looks analogous to the Shannon entropy of a probability distribution:

$S(\rho_A)=-Tr\left[\rho_A \log(\rho_A)\right]$

It’s shown here that this expression – called the von Neumann Entropy – is concave, ie.

$t S(\rho_1) + (1 - t)S(\rho_2) \leq S(t\rho_1 + (1 - t)\rho_2) = S(\rho(t))$

More generally, for $\rho = \sum_i p_i \rho_i$ ,

$\sum_i p_i S(\rho_i)\leq S(\rho)$

Therefore we can see that when we mix systems, the entropy will only increase.
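The concavity inequality can be observed directly for $2\times 2$ density matrices, whose eigenvalues have a closed form. This is a sketch under our own conventions: natural logarithms, the convention $0\log 0 = 0$ implemented by skipping tiny eigenvalues, and two arbitrarily chosen pure states.

```python
import math

def eigenvalues_2x2(rho):
    """Eigenvalues of a 2x2 Hermitian matrix [[a, b], [conj(b), d]]."""
    a, b, d = rho[0][0].real, rho[0][1], rho[1][1].real
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    return [mean + radius, mean - radius]

def von_neumann_entropy(rho):
    """-tr(rho log rho), computed from the eigenvalues (0 log 0 taken as 0)."""
    return sum(-lam * math.log(lam) for lam in eigenvalues_2x2(rho) if lam > 1e-12)

def mix(t, rho1, rho2):
    """The convex combination t*rho1 + (1 - t)*rho2."""
    return [[t * rho1[i][j] + (1 - t) * rho2[i][j] for j in range(2)]
            for i in range(2)]

rho_zero = [[1.0, 0.0], [0.0, 0.0]]   # pure state |0><0|
rho_plus = [[0.5, 0.5], [0.5, 0.5]]   # pure state |+><+|

ts = [k / 10 for k in range(11)]
concave_on_samples = all(
    von_neumann_entropy(mix(t, rho_zero, rho_plus))
    >= t * von_neumann_entropy(rho_zero)
    + (1 - t) * von_neumann_entropy(rho_plus) - 1e-12
    for t in ts
)
entropy_half_mix = von_neumann_entropy(mix(0.5, rho_zero, rho_plus))
entropy_max_mixed = von_neumann_entropy([[0.5, 0.0], [0.0, 0.5]])

print(concave_on_samples, entropy_half_mix, entropy_max_mixed)
```

Both pure states have zero entropy, yet their equal mixture has strictly positive entropy, and the maximally mixed state reaches $\log 2$ .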

Composition…

…of two different thermostatic systems

For a quantum system, take an observable with some value on a mixed state. Couple that observable with a heat bath and we should get a new system that’s analogous to the canonical distribution.

Following exactly the work done in Example 36, we consider two thermostatic systems:

1. A heat bath with $\begin{aligned} &\text{State space:}\ \mathbb{R} \\ &\text{Entropy function:}\ S(U_{bath})={U_{bath}\over{T}} \end{aligned}$
2. A quantum system with $\begin{aligned} &\text{State space:}\ X \\ &\text{Entropy function:}\ S_{VN}=-Tr\left[\rho_A \log(\rho_A)\right] \end{aligned}$

Where $X$ is the set of density matrices and $\rho$ is the density operator for a discrete ensemble of pure states $\left\vert ψ_i\right\rangle$ with statistical weights $p_i$ .

We construct the convex relation by demanding that the expectation value of the Hamiltonian operator, $\mathbf{H}$ , of our quantum system is equal to the heat lost by the heat bath: $\langle \mathbf{H}\rangle=-\Delta U$

With our earlier definition of the expectation value and the bath entropy $S(U_{bath})={U_{bath}\over{T}}$ , the above equation implies ${U_{bath}\over{T}}=-{1\over{T}}Tr\left[\sum_{i}p_{i}\left\vert \psi_i\right\rangle \left\langle \psi_i\right\vert \mathbf{H}\right]$

so that the new entropy is given by $\sup_{\rho}\left(S_{VN}-{1\over{T}}Tr\left[\sum_{i}p_{i}\left\vert \psi_i\right\rangle \left\langle\psi_i\right\vert \mathbf{H}\right]\right)$
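To see the connection with the canonical distribution, read the expression above as the supremum of $S_{VN}(\rho) - \langle\mathbf{H}\rangle/T$ , where the factor $1/T$ comes from the bath entropy $U_{bath}/T$ (this reading is our interpretation). The sketch below restricts attention to states that are diagonal in the energy eigenbasis, so that $\rho$ is just a probability vector; the energy levels, temperature, and function names are hypothetical, and we set $k_B = 1$ . It checks that the Gibbs distribution $p_i \propto e^{-E_i/T}$ attains the value $\log Z$ and that randomly sampled distributions never exceed it.

```python
import math
import random

def free_entropy(p, energies, temperature):
    """S(p) - <H>/T for a state diagonal in the energy eigenbasis."""
    entropy = -sum(q * math.log(q) for q in p if q > 0)
    mean_energy = sum(q * e for q, e in zip(p, energies))
    return entropy - mean_energy / temperature

def gibbs(energies, temperature):
    """Canonical distribution and the logarithm of its partition function."""
    weights = [math.exp(-e / temperature) for e in energies]
    z = sum(weights)
    return [w / z for w in weights], math.log(z)

def random_distribution(rng, n):
    """A random probability vector with strictly positive entries."""
    w = [rng.random() + 1e-9 for _ in range(n)]
    total = sum(w)
    return [x / total for x in w]

energies = [0.0, 1.0, 2.5]   # made-up energy levels
temperature = 0.7            # made-up temperature

p_gibbs, log_z = gibbs(energies, temperature)
value_at_gibbs = free_entropy(p_gibbs, energies, temperature)

rng = random.Random(1)       # fixed seed so the check is repeatable
never_exceeds = all(
    free_entropy(random_distribution(rng, 3), energies, temperature)
    <= log_z + 1e-12
    for _ in range(2000)
)

print(value_at_gibbs, log_z, never_exceeds)
```

A random search is of course not a proof, but the equality at the Gibbs distribution follows from substituting $\log p_i = -E_i/T - \log Z$ into the entropy.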

Posted at July 25, 2022 9:54 PM UTC


Re: How to apply category theory to thermodynamics

Thanks for the post.

There’s a broken link here:

The link destination is empty.

Posted by: Tom Leinster on July 25, 2022 11:56 PM

Hi Dr. Leinster, thanks for the reminder. The post we’d like to link to does not exist yet. I’ll try to update the link once it is uploaded to n-Category Café.

Posted by: Nandan K on August 3, 2022 7:11 AM

Could nonequilibrium thermodynamics be made to fit within this categorical framework?

Posted by: Madeleine Birchfield on July 26, 2022 1:36 AM

Sorry for taking so long to respond!

I think the nature of a particular framework for non-equilibrium thermodynamics may depend on the particular physical systems it is designed to study. What characteristics of these systems when out of equilibrium should we try to capture in the categorical framework? The time-evolution of a deterministic system? The possible steady/cyclic states of a system? How should the environment of a system be considered? Since I haven’t really worked closely with a variety of non-equilibrium thermodynamical systems, I hesitate to say.

However, in the context of the framework discussed in this blog post, my understanding of non-equilibrium thermodynamics is that we would have to sacrifice the maximum entropy principle, since the interpretation of that principle only applies to systems at equilibrium. Our construction of $Op(\mathsf{ConvRel})$ only deals with state spaces and how they can be coupled together, so I would expect that to still make sense in the context of non-equilibrium thermodynamics. Of course, the $\mathrm{Ent}$ functor (or map of operads) would no longer apply, and I think we might need more information about the systems than just their state spaces to describe how they behave out of equilibrium.

I tried to generalize this framework a bit during research week, and I settled on characterizing thermodynamic systems by their states plus information about which states are accessible from which other states. Maybe including information about the environment of the system and the evolution of the system as its environment changes over time can help treat non-equilibrium thermodynamics with a similar framework.

In conclusion, a more capable framework for thermodynamics in general would require a lot more machinery than this thermostatics framework as it is. I would love to hear more perspectives of non-equilibrium thermodynamics from people.

Posted by: Nandan K on September 3, 2022 9:19 PM

Re: How to Apply Category Theory to Thermodynamics

There’s a typo in the section “Putting it all together”. The entropy of the coupled system of tanks should read

$\begin{aligned}\mathrm{OpEnt}[R](C_1\log(x), C_2\log(x)) &= C_1\log\left(\frac{C_1}{C_1+C_2} u\right) + C_2\log\left(\frac{C_2}{C_1+C_2} u\right) \\ &= (C_1+C_2)\log(u) + C_1\log\left(\frac{C_1}{C_1+C_2}\right)\\ & + C_2\log\left(\frac{C_2}{C_1+C_2}\right). \end{aligned}$

Posted by: Nandan K on October 12, 2022 5:57 PM

The n-Category Café
