
January 12, 2016

A Compositional Framework for Markov Processes

Posted by John Baez

Last summer my students Brendan Fong and Blake Pollard visited me at the Centre for Quantum Technologies, and we figured out how to understand open continuous-time Markov chains! I think this is a nice step towards understanding the math of living systems.

Admittedly, it’s just a small first step. But I’m excited by this step, since Blake and I have been trying to get this stuff to work for a couple years, and it finally fell into place. And we think we know what to do next. Here’s our paper:

And here’s the basic idea….

Open detailed balanced Markov processes

A continuous-time Markov chain is a way to specify the dynamics of a population which is spread across some finite set of states. Population can flow between the states. The larger the population of a state, the more rapidly population flows out of the state. Because of this property, under certain conditions the populations of the states tend toward an equilibrium where at any state the inflow of population is balanced by its outflow.

In applications to statistical mechanics, we are often interested in equilibria such that for any two states connected by an edge, say $i$ and $j$, the flow from $i$ to $j$ equals the flow from $j$ to $i$. A continuous-time Markov chain with a chosen equilibrium having this property is called ‘detailed balanced’.

I’m getting tired of saying ‘continuous-time Markov chain’, so from now on I’ll just say ‘Markov process’, because it’s shorter. Okay? That will let me say the next sentence without running out of breath:

Our paper is about open detailed balanced Markov processes.

Here’s an example:

The detailed balanced Markov process itself consists of a finite set of states together with a finite set of edges between them, with each state $i$ labelled by an equilibrium population $q_i > 0$ and each edge $e$ labelled by a rate constant $r_e > 0$.

These populations and rate constants are required to obey an equation called the ‘detailed balance condition’. This equation means that in equilibrium, the flow from $i$ to $j$ equals the flow from $j$ to $i$. Do you see how it works in this example?
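Concretely, detailed balance says the matrix of equilibrium flows is symmetric: each flow is matched by the reverse flow. Here is a minimal numerical sketch in Python; the rate matrix `H` and populations `q` are made-up numbers for illustration, not the ones in our example, and writing `H[i, j]` for the rate constant on the edge from state `j` to state `i` is a convention introduced just for this sketch.

```python
import numpy as np

# Hypothetical rate constants: H[i, j] is the rate on the edge from
# state j to state i.  These numbers are made up for illustration.
H = np.array([
    [0.0, 2.0, 0.0],
    [1.0, 0.0, 4.0],
    [0.0, 1.0, 0.0],
])

# A candidate equilibrium population for each state.
q = np.array([2.0, 1.0, 0.25])

# In equilibrium the flow from state j to state i is H[i, j] * q[j].
# Detailed balance says this matrix of flows is symmetric.
flows = H * q                       # column j is scaled by q[j]
print(np.allclose(flows, flows.T))  # → True
```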

To get an ‘open’ detailed balanced Markov process, some states are designated as inputs or outputs. In general, a state may be designated as both an input and an output, and may be designated as an input or output more than once. See how that’s happening in this example? It may seem weird, but it makes things work better.

People usually say Markov processes are all about how probabilities flow from one state to another. But we work with un-normalized probabilities, which we call ‘populations’, rather than probabilities that must sum to 1. The reason is that in an open Markov process, probability is not conserved: it can flow in or out at the inputs and outputs. We allow it to flow both in and out at both the input states and the output states.

Our most fundamental result is that there’s a category $\mathrm{DetBalMark}$ where a morphism is an open detailed balanced Markov process. We think of it as a morphism from its inputs to its outputs.

We compose morphisms in $\mathrm{DetBalMark}$ by identifying the output states of one open detailed balanced Markov process with the input states of another. The populations of identified states must match. For example, we may compose this morphism $N$:

with the previously shown morphism $M$ to get this morphism $M \circ N$:

And here’s our second most fundamental result: the category $\mathrm{DetBalMark}$ is actually a dagger compact category. This lets us do other stuff with open Markov processes. An important one is ‘tensoring’, which lets us take two open Markov processes like $M$ and $N$ above and set them side by side, giving $M \otimes N$:

The compactness is also important. This means we can take some inputs of an open Markov process and turn them into outputs, or vice versa. For example, using the compactness of $\mathrm{DetBalMark}$ we can get this open Markov process from $M$:

In fact all the categories in our paper are dagger compact categories, and all our functors preserve this structure. Dagger compact categories are a well-known framework for describing systems with inputs and outputs, so this is good.

The analogy to electrical circuits

In a detailed balanced Markov process, population can flow along edges. In the detailed balanced equilibrium, without any flow of population from outside, the flow from state $i$ to state $j$ will be matched by the flow back from $j$ to $i$. The populations need to take specific values for this to occur.

In an electrical circuit made of linear resistors, charge can flow along wires. In equilibrium, without any driving voltage from outside, the current along each wire will be zero. The potentials will be equal at every node.

This sets up an analogy between detailed balanced continuous-time Markov chains and electrical circuits made of linear resistors! I love analogy charts, so this makes me very happy:

    Circuits       Detailed balanced Markov processes
    potential      population
    current        flow
    conductance    rate constant
    power          dissipation

This analogy is already well known. Schnakenberg used it in his book Thermodynamic Network Analysis of Biological Systems. So, our main goal is to formalize and exploit it. This analogy extends from systems in equilibrium to the more interesting case of nonequilibrium steady states, which are the main topic of our paper.

Earlier, Brendan and I introduced a way to ‘black box’ a circuit and define the relation it determines between potential-current pairs at the input and output terminals. This relation describes the circuit’s external behavior as seen by an observer who can only perform measurements at the terminals.

An important fact is that black boxing is ‘compositional’: if one builds a circuit from smaller pieces, the external behavior of the whole circuit can be determined from the external behaviors of the pieces. For category theorists, this means that black boxing is a functor!

Our new paper with Blake develops a similar ‘black box functor’ for detailed balanced Markov processes, and relates it to the earlier one for circuits.

When you black box a detailed balanced Markov process, you get the relation between population–flow pairs at the terminals. (By the ‘flow at a terminal’, we more precisely mean the net population outflow.) This relation holds not only in equilibrium, but also in any nonequilibrium steady state. Thus, black boxing an open detailed balanced Markov process gives its steady state dynamics as seen by an observer who can only measure populations and flows at the terminals.

The principle of minimum dissipation

At least since the work of Prigogine, it’s been widely accepted that a large class of systems minimize entropy production in a nonequilibrium steady state. But people still fight about the precise boundary of this class of systems, and even the meaning of this ‘principle of minimum entropy production’.

For detailed balanced open Markov processes, we show that a quantity we call the ‘dissipation’ is minimized in any steady state. This is a quadratic function of the populations and flows, analogous to the power dissipation of a circuit made of resistors. We make no claim that this quadratic function actually deserves to be called ‘entropy production’. Indeed, Schnakenberg has convincingly argued that they are only approximately equal.

But still, the ‘dissipation’ function is very natural and useful—and Prigogine’s so-called ‘entropy production’ is also a quadratic function.

Black boxing

I’ve already mentioned the category $\mathrm{DetBalMark}$, where a morphism is an open detailed balanced Markov process. But our paper needs two more categories to tell its story! There’s the category of circuits, and the category of linear relations.

A morphism in the category $\mathrm{Circ}$ is an open electrical circuit made of resistors: that is, a graph with each edge labelled by a ‘conductance’ $c_e > 0$, together with specified input and output nodes:

A morphism in the category $\mathrm{LinRel}$ is a linear relation $L \colon U \rightsquigarrow V$ between finite-dimensional real vector spaces $U$ and $V$. This is nothing but a linear subspace $L \subseteq U \oplus V$. Just as relations generalize functions, linear relations generalize linear functions!
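To see how linear relations compose like relations, here is a small sketch in Python with NumPy. Encoding a relation as a matrix whose columns span the subspace, and the names `null_space` and `compose`, are conventions introduced just for this sketch, not notation from the paper.

```python
import numpy as np

def null_space(A, tol=1e-10):
    """Columns spanning the kernel of A, computed via the SVD."""
    _, s, vt = np.linalg.svd(A)
    rank = int(np.sum(s > tol))
    return vt[rank:].T

def compose(L, M, dim_u, dim_v):
    """Compose linear relations L : U ⇝ V and M : V ⇝ W.

    L's columns span a subspace of R^(dim_u + dim_v); M's columns span
    a subspace of R^(dim_v + dim_w).  The composite relates u to w when
    some middle vector v has (u, v) in L and (v, w) in M.
    """
    A_u, A_v = L[:dim_u], L[dim_u:]
    B_v, B_w = M[:dim_v], M[dim_v:]
    # Coefficient pairs (x, y) whose middle parts agree: A_v x = B_v y.
    K = null_space(np.hstack([A_v, -B_v]))
    x, y = K[:L.shape[1]], K[L.shape[1]:]
    return np.vstack([A_u @ x, B_w @ y])

# The graph of u ↦ 2u composed with the graph of v ↦ 3v is the
# graph of u ↦ 6u:
f = np.array([[1.0], [2.0]])
g = np.array([[1.0], [3.0]])
h = compose(f, g, dim_u=1, dim_v=1)
print(np.isclose(h[1, 0] / h[0, 0], 6.0))   # → True
```

When the relations are graphs of linear maps, this reduces to ordinary composition of maps, which is the sanity check above.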

In our previous paper, Brendan and I introduced these two categories and a functor between them, the ‘black box functor’:

$\blacksquare \colon \mathrm{Circ} \to \mathrm{LinRel}$

The idea is that any circuit determines a linear relation between the potentials and net current flows at the inputs and outputs. This relation describes the behavior of a circuit of resistors as seen from outside.

Our new paper introduces a black box functor for detailed balanced Markov processes:

$\square \colon \mathrm{DetBalMark} \to \mathrm{LinRel}$

We draw this functor as a white box merely to distinguish it from the other black box functor. The functor $\square$ maps any detailed balanced Markov process to the linear relation obeyed by populations and flows at the inputs and outputs in a steady state. In short, it describes the steady state behavior of the Markov process ‘as seen from outside’.

How do we manage to black box detailed balanced Markov processes? We do it using the analogy with circuits!

The analogy becomes a functor

Every analogy wants to be a functor. So, we make the analogy between detailed balanced Markov processes and circuits precise by turning it into a functor:

$K \colon \mathrm{DetBalMark} \to \mathrm{Circ}$

This functor converts any open detailed balanced Markov process into an open electrical circuit made of resistors. This circuit is carefully chosen to reflect the steady-state behavior of the Markov process. Its underlying graph is the same as that of the Markov process. So, the ‘states’ of the Markov process are the same as the ‘nodes’ of the circuit.

Both the equilibrium populations at states of the Markov process and the rate constants labelling edges of the Markov process are used to compute the conductances of edges of this circuit. In the simple case where the Markov process has exactly one edge from any state $i$ to any state $j$, the rule is this:

$C_{i j} = H_{i j} q_j$

where:

  • $q_j$ is the equilibrium population of the $j$th state of the Markov process,
  • $H_{i j}$ is the rate constant for the edge from the $j$th state to the $i$th state of the Markov process, and
  • $C_{i j}$ is the conductance (that is, the reciprocal of the resistance) of the wire from the $j$th node to the $i$th node of the resulting circuit.

The detailed balance condition for Markov processes says precisely that the matrix $C_{i j}$ is symmetric! This is just right for an electrical circuit made of resistors, since it means that the resistance of the wire from node $i$ to node $j$ equals the resistance of the same wire in the reverse direction, from node $j$ to node $i$.
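Numerically, the rule $C_{i j} = H_{i j} q_j$ is a one-liner. Here is a sketch with made-up rates and populations (writing `H[i, j]` for the rate constant from state `j` to state `i`). The quadratic form at the end is the standard power dissipated by a resistor network at given node potentials, included only to echo the analogy chart; the precise ‘dissipation’ functional in the paper may differ in conventions.

```python
import numpy as np

# Hypothetical detailed balanced Markov process (made-up numbers).
H = np.array([[0.0, 2.0, 0.0],
              [1.0, 0.0, 4.0],
              [0.0, 1.0, 0.0]])
q = np.array([2.0, 1.0, 0.25])

C = H * q                     # conductances: C[i, j] = H[i, j] * q[j]
print(np.allclose(C, C.T))    # detailed balance <=> C is symmetric

# Power dissipated by the corresponding resistor network when nodes
# are held at potentials phi (the 1/2 counts each edge once):
phi = np.array([1.0, 0.5, 0.0])
power = 0.5 * np.sum(C * (phi[:, None] - phi[None, :])**2)
print(power)   # → 0.75
```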

A triangle of functors

If you’ve been paying careful attention, you’ll have noticed that I’ve described a triangle of functors:

And if you’ve got the tao of category theory flowing in your veins, you’ll be wondering if this diagram commutes.

In fact, this triangle of functors does not commute! However, a general lesson of category theory is that we should only expect diagrams of functors to commute up to natural isomorphism, and this is what happens here:

The natural transformation $\alpha$ ‘corrects’ the black box functor for resistors to give the one for detailed balanced Markov processes.

The functors $\square$ and $\blacksquare \circ K$ are actually equal on objects. An object in $\mathrm{DetBalMark}$ is a finite set $X$ with each element $i \in X$ labelled by a positive population $q_i$. Both functors map this object to the vector space $\mathbb{R}^X \oplus \mathbb{R}^X$. For the functor $\square$, we think of this as a space of population–flow pairs. For the functor $\blacksquare \circ K$, we think of it as a space of potential–current pairs. The natural transformation $\alpha$ then gives a linear relation

$\alpha_{X,q} \colon \mathbb{R}^X \oplus \mathbb{R}^X \rightsquigarrow \mathbb{R}^X \oplus \mathbb{R}^X$

in fact an isomorphism of vector spaces, which converts potential–current pairs into population–flow pairs in a manner that depends on the $q_i$. I’ll skip the formula; it’s in the paper.

But here’s the key point. The naturality of $\alpha$ actually allows us to reduce the problem of computing the functor $\square$ to the problem of computing $\blacksquare$. Suppose

$M \colon (X,q) \to (Y,r)$

is any morphism in $\mathrm{DetBalMark}$. The object $(X,q)$ is some finite set $X$ labelled by populations $q$, and $(Y,r)$ is some finite set $Y$ labelled by populations $r$. Then the naturality of $\alpha$ means that this square commutes:

Since $\alpha_{X,q}$ and $\alpha_{Y,r}$ are isomorphisms, we can solve for the functor $\square$ as follows:

$\square(M) = \alpha_{Y,r} \circ \blacksquare K(M) \circ \alpha_{X,q}^{-1}$

This equation has a clear intuitive meaning! It says that to compute the behavior of a detailed balanced Markov process, namely $\square(M)$, we convert it into a circuit made of resistors and compute the behavior of that, namely $\blacksquare K(M)$. This is not equal to the behavior of the Markov process, but we can compute that behavior by converting the input populations and flows into potentials and currents, feeding them into our circuit, and then converting the outputs back into populations and flows.
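If one encodes a linear relation as a matrix whose columns span it (a convention introduced just for this sketch, not notation from the paper), conjugating by the isomorphisms is a block matrix multiplication. The function name `transport` and the matrices `a_in`, `a_out`, standing in for $\alpha_{X,q}$ and $\alpha_{Y,r}$, are hypothetical:

```python
import numpy as np

def transport(R, dim_in, a_in, a_out):
    """Transport a linear relation along isomorphisms.

    R's columns span a relation inside R^dim_in ⊕ R^dim_out.  Returns a
    spanning matrix for a_out ∘ R ∘ a_in⁻¹: the pair (u, v) is related
    exactly when (a_in⁻¹ u, a_out⁻¹ v) was related by R.
    """
    R_in, R_out = R[:dim_in], R[dim_in:]
    return np.vstack([a_in @ R_in, a_out @ R_out])

# Example: the identity relation on R, pushed along scalings by 2 and 3,
# becomes the graph of u ↦ 1.5 u.
R = np.array([[1.0], [1.0]])
S = transport(R, dim_in=1,
              a_in=np.array([[2.0]]), a_out=np.array([[3.0]]))
print(S[1, 0] / S[0, 0])   # → 1.5
```

Note that no inverse is actually computed: pre-composing with $\alpha_{X,q}^{-1}$ is the same as applying $\alpha_{X,q}$ to the input half of a spanning set.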

What we really did

So that’s a sketch of what we did, and I hope you ask questions if it’s not clear. But I also hope you read our paper! Here’s what we actually do in there. After an introduction and summary of results:

  • Section 3 defines open Markov processes and the open master equation.
  • Section 4 introduces detailed balance for open Markov processes.
  • Section 5 recalls the principle of minimum power for open circuits made of linear resistors, and explains how to black box them.
  • Section 6 introduces the principle of minimum dissipation for open detailed balanced Markov processes, and describes how to black box these.
  • Section 7 states the analogy between circuits and detailed balanced Markov processes in a formal way.
  • Section 8 describes how to compose open Markov processes, making them into the morphisms of a category.
  • Section 9 does the same for detailed balanced Markov processes.
  • Section 10 describes the ‘black box functor’ that sends any open detailed balanced Markov process to the linear relation describing its external behavior, and recalls the similar functor for circuits.
  • Section 11 makes the analogy between open detailed balanced Markov processes and open circuits even more formal, by making it into a functor. We prove that together with the two black box functors, this forms a triangle that commutes up to natural isomorphism.
  • Section 12 is about geometric aspects of this theory. We show that linear relations in the image of these black box functors are Lagrangian relations between symplectic vector spaces. We also show that the master equation can be seen as a gradient flow equation.
  • Section 13 is a summary of what we have learned.

Finally, Appendix A is a quick tutorial on decorated cospans. This is a key mathematical tool in our work, developed by Brendan in an earlier paper.

Posted at January 12, 2016 12:28 PM UTC


2 Comments & 0 Trackbacks

Re: A Compositional Framework for Markov Processes

in an open Markov process, probability is not conserved: it can flow in or out at the inputs and outputs.

You lost me here. I’m looking at your example and I think I see how the detailed balance condition holds. On the right, for instance, we have $1 \cdot 4$ flowing to the right and $2 \cdot 2$ flowing to the left, which are equal.

Is it correct that the detailed balance condition implies that the populations are in equilibrium? Since at node $i$ there is the same amount flowing out to $j$ as flowing in from $j$, for every $j$ individually.

Now none of this seems to have anything to do with choosing some states to call “input” and some to call “output”. Okay, you can choose some nodes and give them labels. But what is this “probability” thing? The only thing I can think of for it to mean is to sum up the populations and divide each of them by the total, so that in your example the total is $6+1+2+\frac{1}{2} = \frac{19}{2}$, and the populations would get normalized to probabilities of $\frac{12}{19}, \frac{2}{19}, \frac{4}{19}, \frac{1}{19}$. But dividing all the populations by the same number doesn’t affect the detailed balance condition or the fact that they are in equilibrium, does it? So what does it mean to say that probability is not conserved?

(By the way, \leadsto is apparently not supported in iTeX, but I think you can use \rightsquigarrow.)

Posted by: Mike Shulman on January 13, 2016 5:32 AM

Re: A Compositional Framework for Markov Processes

Mike wrote:

John wrote:

in an open Markov process, probability is not conserved: it can flow in or out at the inputs and outputs.

You lost me here.

Sorry. I’ll say some stuff that may help.

Is it correct that the detailed balance condition implies that the populations are in equilibrium?

Yes. Detailed balance is a stronger condition, which turns out to be very commonly obeyed in physics, and has enormous consequences.

Since at node $i$ there is the same amount flowing out to $j$ as flowing in from $j$, for every $j$ individually.

Right.

Now none of this seems to have anything to do with choosing some states to call “input” and some to call “output”.

No, none of this depends on that.

But what is this “probability” thing?

I only mentioned the word ‘probability’ to explain why, unlike most people working on Markov processes, we don’t talk about probability distributions. I was trying to say that in an open Markov process, the total population is not conserved: it can flow in or out at the inputs and outputs. So, even if we normalize it to 1 at some time, it won’t stay normalized.

To clarify that, I think I should tell you what an open Markov process is. My introduction was not really trying to explain that in any detail.

An open Markov process consists of a finite set $X$ of states, a subset $B \subseteq X$ of boundary states, and an infinitesimal stochastic linear operator $H \colon \mathbb{R}^X \to \mathbb{R}^X$, meaning one whose matrix obeys

$H_{i j} \geq 0$

for all $i \neq j$ and

$\sum_{i \in X} H_{i j} = 0$

for all $j \in X$.

I’ll explain these two conditions in a minute.

For each $i \in X$ we introduce a population $p_i \in [0,\infty)$, which may also depend on time, $t \in \mathbb{R}$. We call the resulting time-dependent function $p \colon X \to [0,\infty)$ the population distribution.

We demand that the population distribution evolves in time according to the open master equation. This says that the master equation

$\displaystyle{ \frac{d p_i(t)}{d t} = \sum_{j \in X} H_{i j} p_j }$

holds for all $i \in X - B$, while

$p_i(t) = b_i(t)$

for all $i \in B$.

So, the populations $p_i$ obey a linear differential equation at states $i$ that are not in the boundary, but they are specified ‘by the user’ to be chosen functions of time $b_i$ at the boundary states.
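Here’s a minimal numerical sketch of the open master equation, not taken from our paper: the rate matrix, the choice of boundary state, and the boundary function are all made up for illustration. A forward Euler step evolves the non-boundary populations, while the boundary population is imposed at each step rather than evolved.

```python
import numpy as np

# Hypothetical infinitesimal stochastic operator (columns sum to zero,
# off-diagonal entries nonnegative); state 2 is the only boundary state.
H = np.array([[-1.0,  2.0,  0.0],
              [ 1.0, -3.0,  4.0],
              [ 0.0,  1.0, -4.0]])
boundary = [2]

def b(t):
    """User-chosen population at the boundary state, as a function of time."""
    return 1.0 + 0.5 * np.sin(t)

def euler_step(p, t, dt):
    """One step of the open master equation: evolve, then impose b(t)."""
    p = p + dt * (H @ p)
    for i in boundary:
        p[i] = b(t + dt)     # boundary populations are imposed, not evolved
    return p

p = np.array([1.0, 1.0, b(0.0)])
t, dt = 0.0, 0.01
for _ in range(1000):
    p = euler_step(p, t, dt)
    t += dt

# The total population is generally NOT conserved: it flows in and
# out at the boundary state.
print(p.sum())
```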

The off-diagonal entries, $H_{i j}$ with $i \neq j$, are the rates at which population hops from the $j$th to the $i$th state. This lets us understand the definition of an infinitesimal stochastic operator. The first condition,

$H_{i j} \geq 0$

for all $i \neq j$, says that the rate for population to transition from one state to another is non-negative. The second:

$\sum_{i \in X} H_{i j} = 0$

for all $j$, says that population is conserved, at least if there are no boundary states. Population can flow in or out at boundary states, since the master equation doesn’t hold there!
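Both conditions are easy to test mechanically, and the conservation claim can be checked directly: each column of $H$ summing to zero forces $\sum_i \frac{d p_i}{d t} = \sum_i (H p)_i = 0$. A small sketch with a made-up matrix (the helper name is mine):

```python
import numpy as np

def is_infinitesimal_stochastic(H, tol=1e-12):
    """Check: off-diagonal entries >= 0 and every column sums to zero."""
    off_diag = H - np.diag(np.diag(H))
    return bool(np.all(off_diag >= -tol)
                and np.allclose(H.sum(axis=0), 0.0, atol=tol))

H = np.array([[-1.0,  2.0],
              [ 1.0, -2.0]])
print(is_infinitesimal_stochastic(H))   # → True

# With no boundary states, total population is conserved:
# d/dt sum_i p_i = sum_i (H p)_i = 0 for any population vector p.
p = np.array([3.0, 1.0])
print(np.isclose(np.sum(H @ p), 0.0))   # → True
```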

A steady state is a solution of the open master equation that does not change with time. A steady state is an equilibrium if it obeys the master equation

$\displaystyle{ \frac{d p_i(t)}{d t} = \sum_j H_{i j} p_j }$

for all states $i$, even boundary states.

So, an equilibrium obeys the master equation at all states, while for a steady state this may not be true at the boundary states. In the former, the total population is conserved! In the latter, we imagine that population can flow in or out at the boundary.

We say an equilibrium $q \colon X \to [0,\infty)$ of a Markov process is detailed balanced if the rate at which population flows from the $i$th state to the $j$th state is equal to the rate at which it flows from the $j$th state to the $i$th:

$H_{j i} q_i = H_{i j} q_j$

for all $i, j \in X$.

A detailed balanced open Markov process is an open Markov process equipped with a specific detailed balanced equilibrium $q$.

Now what was that business about ‘inputs’ and ‘outputs’? Well, if we go slightly further and choose a cospan of finite sets

$I \stackrel{s}{\rightarrow} X \stackrel{t}{\leftarrow} O$

with

$\mathrm{im}(s) \cup \mathrm{im}(t) = B$

then we make our open detailed balanced Markov process into a morphism from $I$ to $O$. This is a morphism in the category we call $\mathrm{DetBalMark}$.

Posted by: John Baez on January 17, 2016 3:59 AM
