November 29, 2010

State-Observable Duality (Part 3)

Posted by John Baez

This is the third and final episode of a little story about the foundations of quantum mechanics.

In the first episode, I reminded you of some basic facts about the real numbers $\mathbb{R}$, the complex numbers $\mathbb{C}$, and the quaternions $\mathbb{H}$.

In the second episode, I told you how Jordan, von Neumann and Wigner classified ‘formally real Jordan algebras’, which can serve as algebras of observables in quantum theory. Apart from the ‘spin factors’ $\mathbb{R}^n \oplus \mathbb{R}$ and the Jordan algebra of $3 \times 3$ self-adjoint octonionic matrices, $\mathrm{h}_3(\mathbb{O})$, these come in three kinds:

  • The algebra $\mathrm{h}_n(\mathbb{R})$ of $n \times n$ self-adjoint real matrices with the product $a \circ b = \frac{1}{2}(a b + b a)$.
  • The algebra $\mathrm{h}_n(\mathbb{C})$ of $n \times n$ self-adjoint complex matrices with the product $a \circ b = \frac{1}{2}(a b + b a)$.
  • The algebra $\mathrm{h}_n(\mathbb{H})$ of $n \times n$ self-adjoint quaternionic matrices with the product $a \circ b = \frac{1}{2}(a b + b a)$.
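To make the Jordan product concrete, here is a minimal sketch in plain Python for 2×2 complex matrices stored as nested lists. The helper names (`matmul`, `jordan`, `is_self_adjoint`) and the two sample matrices are illustrative choices, not something taken from the post. It checks that the ordinary product of two self-adjoint matrices need not be self-adjoint, while their Jordan product is.

```python
def matmul(a, b):
    """Ordinary matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def jordan(a, b):
    """Jordan product a o b = (ab + ba) / 2."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[(ab[i][j] + ba[i][j]) / 2 for j in range(n)] for i in range(n)]


def is_self_adjoint(a, tol=1e-12):
    """True if a equals its conjugate transpose, up to rounding."""
    n = len(a)
    return all(abs(a[i][j] - complex(a[j][i]).conjugate()) <= tol
               for i in range(n) for j in range(n))


# Two sample self-adjoint matrices (hypothetical values).
a = [[1, 2 - 1j], [2 + 1j, 3]]
b = [[0, 1j], [-1j, 2]]

print(is_self_adjoint(matmul(a, b)))   # ordinary product: not self-adjoint here
print(is_self_adjoint(jordan(a, b)))   # Jordan product: self-adjoint
print(jordan(a, b) == jordan(b, a))    # the Jordan product is commutative
```

The same functions work unchanged for real symmetric matrices; quaternionic entries would need a quaternion type, which this sketch does not attempt.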

In every case, even the curious exceptional cases, there is a concept of what it means for an element to be ‘positive’, and the positive elements form a cone. In this episode we’ll explore that further: we’ll meet the Koecher–Vinberg classification of convex homogeneous self-dual cones, and see how it’s really all about state-observable duality.

Last time we talked about Jordan algebras and their role in the foundations of quantum mechanics. But the formalism of Jordan algebras seems rather removed from the actual practice of physics. After all, physicists hardly ever take two observables $a$ and $b$ and form their Jordan product $\frac{1}{2}(a b + b a)$. As I hinted last time, it is better to think of this operation as derived from the process of squaring an observable, which is something physicists actually do. But still, I can’t help wondering: can we see the classification of finite-dimensional formally real Jordan algebras, and thus the special role of normed division algebras, as arising from some axioms more closely tied to quantum theory as physicists usually practice it?

One answer involves the duality between states and observables. To understand this, you need to know about ‘mixed states’. A ‘state’ describes your knowledge about a physical system. If you know as much as possible we call it a ‘pure state’, but more generally, you may not know as much as possible, and then we speak of ‘mixed states’. It’s usually best to consider pure states as a special case of mixed states.

How does this play out in ordinary quantum theory? If a quantum system has the Hilbert space $\mathbb{C}^n$, observables are described by self-adjoint $n \times n$ complex matrices: elements of the Jordan algebra $\mathrm{h}_n(\mathbb{C})$. But matrices of this form that are nonnegative and have trace 1 also play another role. They are called density matrices, and they describe mixed states of our quantum system. The idea is that any density matrix $\rho \in \mathrm{h}_n(\mathbb{C})$ lets us define expectation values of observables $a \in \mathrm{h}_n(\mathbb{C})$ via

$$\langle a \rangle = \mathrm{tr}(\rho a) .$$

The map sending observables to their expectation values is real-linear. The fact that $\rho$ is nonnegative is equivalent to

$$a \ge 0 \; \implies \; \langle a \rangle \ge 0$$

and the fact that $\rho$ has trace 1 is equivalent to

$$\langle 1 \rangle = 1 .$$
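As a small numerical illustration of the expectation-value formula, the following sketch evaluates tr(ρa) for one 2×2 density matrix. The particular matrices and the function name `expectation` are assumptions made for the example; only the formula itself comes from the text.

```python
def matmul(a, b):
    """Ordinary matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    """Sum of the diagonal entries."""
    return sum(a[i][i] for i in range(len(a)))


def expectation(rho, a):
    """Expectation value <a> = tr(rho a); real when rho and a are self-adjoint."""
    return trace(matmul(rho, a)).real


rho = [[0.75, 0.25], [0.25, 0.25]]   # sample density matrix: nonnegative, trace 1
identity = [[1, 0], [0, 1]]
sigma_z = [[1, 0], [0, -1]]          # spin along the z axis
a_pos = [[2, 1], [1, 1]]             # a sample nonnegative observable

print(expectation(rho, identity))    # normalization: <1> = 1
print(expectation(rho, sigma_z))
print(expectation(rho, a_pos))       # nonnegative, since rho >= 0 and a_pos >= 0
```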

We might call this relationship between states and observables ‘state-observable duality’. By this, we are not merely referring to the fact that states live in the dual of the vector space of observables: that much is obvious, given that the expectation value of an observable should depend linearly on that observable. The nontrivial thing is that we can identify the vector space of observables with its dual:

$$\begin{array}{ccc} \mathrm{h}_n(\mathbb{C}) & \stackrel{\sim}{\longrightarrow} & \mathrm{h}_n(\mathbb{C})^* \\ a & \mapsto & \langle a, \cdot \rangle \end{array}$$

using the trace, which puts a real-valued inner product on the space of observables:

$$\langle a, b \rangle = \mathrm{tr}(a b) .$$

Thus, states can be identified with certain special observables!

All this generalizes to an arbitrary finite-dimensional formally real Jordan algebra $A$. Every such algebra automatically has an identity element. This lets us define a state on $A$ to be a linear functional $\langle \cdot \rangle : A \to \mathbb{R}$ that is nonnegative:

$$a \ge 0 \implies \langle a \rangle \ge 0$$

and normalized:

$$\langle 1 \rangle = 1 .$$

But in fact, there is a one-to-one correspondence between linear functionals on $A$ and elements of $A$. The reason is that every finite-dimensional Jordan algebra has a trace

$$\mathrm{tr} : A \to \mathbb{R}$$

defined so that $\mathrm{tr}(a)$ is the trace of the linear operator ‘multiplication by $a$’. Such a Jordan algebra is then formally real if and only if

$$\langle a, b \rangle = \mathrm{tr}(a \circ b)$$

is a real-valued inner product. So, when $A$ is a finite-dimensional formally real Jordan algebra, any linear functional $\langle \cdot \rangle : A \to \mathbb{R}$ can be written as

$$\langle a \rangle = \mathrm{tr}(\rho \circ a)$$

for a unique element $\rho \in A$. Conversely, every element $\rho \in A$ gives a linear functional by this formula. While not obvious, it is true that the linear functional $\langle \cdot \rangle$ is nonnegative if and only if $\rho \ge 0$ in terms of the ordering on $A$. More obviously, $\langle \cdot \rangle$ is normalized if and only if $\mathrm{tr}(\rho) = 1$. So, states can be identified with certain special observables: namely, those observables $\rho \in A$ with $\rho \ge 0$ and $\mathrm{tr}(\rho) = 1$.

In short: whenever the observables in our theory form a finite-dimensional formally real Jordan algebra, we have state-observable duality. But what is the physical meaning of state-observable duality? Why in the world should states correspond to special observables? A state is a way for your system to be; an observable is something you can measure about it. They seem quite different!

Here is one attempt at an answer. Every finite-dimensional formally real Jordan algebra comes equipped with a distinguished observable, the most boring one of all: the identity, $1 \in A$. This is nonnegative, so if we normalize it, we get an observable

$$\rho_0 = \frac{1}{\mathrm{tr}(1)} \, 1 \in A$$

of the special kind that corresponds to a state. This state, say $\langle \cdot \rangle_0$, is just the normalized trace:

$$\langle a \rangle_0 = \mathrm{tr}(\rho_0 \circ a) = \frac{\mathrm{tr}(a)}{\mathrm{tr}(1)} .$$

And this state has a clear physical meaning: it is the state of maximal ignorance! It is the state where we know as little as possible about our system, or more precisely, at least in the case of ordinary complex quantum theory, the state where entropy is maximized.

For example, take $A = \mathrm{h}_2(\mathbb{C})$, the algebra of observables of a spin-$\frac{1}{2}$ particle. Then the space of states is the so-called Bloch sphere, really a 3-dimensional ball.

The ball is convex, and for a good reason. Suppose I flip a coin, don’t show you the result, and tell you “I made the particle’s state be $a$ if the coin landed heads up, and $b$ if it landed tails up”. Then the mixed state that describes your knowledge is halfway between $a$ and $b$. More generally any convex linear combination $p a + (1-p) b$ of mixed states is another mixed state, where the probability $p$ is between $0$ and $1$. That’s what we mean by saying the ball is convex. Indeed, for any formally real Jordan algebra, the space of states is convex!
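The coin-flip mixture can be checked numerically for a spin-1/2 particle. The sketch below is an illustration under stated assumptions: the two pure states (spin up along z and along x) are chosen arbitrarily, and `bloch_radius` relies on the standard identity det(ρ) = (1 − r²)/4 for a 2×2 density matrix whose Bloch vector has length r.

```python
import math


def mix(p, rho_a, rho_b):
    """Convex combination p*rho_a + (1-p)*rho_b of two 2x2 density matrices."""
    return [[p * rho_a[i][j] + (1 - p) * rho_b[i][j] for j in range(2)]
            for i in range(2)]


def trace(m):
    return (m[0][0] + m[1][1]).real


def det(m):
    return (m[0][0] * m[1][1] - m[0][1] * m[1][0]).real


def bloch_radius(rho):
    """Distance from the centre of the Bloch ball, via det(rho) = (1 - r^2)/4."""
    return math.sqrt(max(0.0, 1 - 4 * det(rho)))


up_z = [[1, 0], [0, 0]]            # pure state: spin up along z
up_x = [[0.5, 0.5], [0.5, 0.5]]    # pure state: spin up along x

half = mix(0.5, up_z, up_x)        # the coin-flip mixture
print(trace(half), det(half) >= 0)  # still trace 1 and nonnegative
print(bloch_radius(up_z), bloch_radius(up_x), bloch_radius(half))
```

The two pure states sit on the surface (radius 1), while their even mixture lands strictly inside the ball.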

So, on the surface of this ball are the pure states: the states where you know as much as possible. For any point on this surface, there’s a state where you know the electron’s spin points that way. At the center of the ball is the state of maximum ignorance. This corresponds to the density matrix

$$\rho_0 = \left(\begin{array}{cc} \frac{1}{2} & 0 \\ 0 & \frac{1}{2} \end{array}\right)$$

In this state, when I ask you about the particle’s spin along any axis, all you can say is that there’s a $\frac{1}{2}$ chance that it’s pointing one way, and a $\frac{1}{2}$ chance of it pointing the other way.
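The claim about the centre of the ball is easy to test. This sketch assumes the usual projector (1 + n·σ)/2 for ‘spin up along the unit vector n’, which the post does not write out, and compares the state of maximum ignorance with one pure state. The axes chosen are arbitrary.

```python
import math


def projector(nx, ny, nz):
    """Projector (1 + n.sigma)/2 onto 'spin up along the unit vector n'."""
    return [[(1 + nz) / 2, (nx - 1j * ny) / 2],
            [(nx + 1j * ny) / 2, (1 - nz) / 2]]


def probability(rho, proj):
    """Probability tr(rho proj) of finding the spin up along that axis."""
    return sum(rho[i][k] * proj[k][i] for i in range(2) for k in range(2)).real


rho_0 = [[0.5, 0], [0, 0.5]]   # state of maximum ignorance
up_z = [[1, 0], [0, 0]]        # pure state, for contrast

s = 1 / math.sqrt(3)
axes = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (s, s, s)]
for n in axes:
    # first number: maximum ignorance; second: spin known to be up along z
    print(n, probability(rho_0, projector(*n)), probability(up_z, projector(*n)))
```

For `rho_0` every axis gives one half up to rounding, whereas the pure state gives certainty along its own axis.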

Now, back to the general theory:

$A$ acts on its dual $A^*$: given $a \in A$ and a linear functional $\langle \cdot \rangle$, we get a new linear functional $\langle a \circ \cdot \rangle$. This captures the idea, familiar in quantum theory, that observables are also ‘operators’: they act on states. And state-observable duality means we can get any state from the state of complete ignorance by acting on it with a suitable observable. After all, any state corresponds to some observable $\rho$, as follows:

$$\langle a \rangle = \mathrm{tr}(\rho \circ a)$$

So, we can get this state by acting on the state of maximal ignorance, $\langle \cdot \rangle_0$, by the observable $\mathrm{tr}(1) \rho$:

$$\langle \mathrm{tr}(1)\rho \circ a \rangle_0 = \frac{\mathrm{tr}(1)}{\mathrm{tr}(1)} \mathrm{tr}(\rho \circ a) = \langle a \rangle .$$

So, we see that the correspondence between states and special observables springs from two causes. First, there is a distinguished state, the state of maximal ignorance. Second, any other state can be obtained from the state of maximal ignorance by acting on it with a suitable observable.
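The computation above can be replayed numerically in $\mathrm{h}_2(\mathbb{C})$. This is a sketch under one loudly stated assumption: it uses the ordinary matrix trace rather than the trace of the ‘multiplication by a’ operator. The two differ by a constant factor, and the identity being checked does not depend on how the trace is normalized. The names `state_from` and `act`, and the sample matrices, are invented for the example.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def jordan(a, b):
    """Jordan product a o b = (ab + ba) / 2."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[(ab[i][j] + ba[i][j]) / 2 for j in range(n)] for i in range(n)]


def trace(a):
    """Ordinary matrix trace (an assumption of this sketch), as a real number."""
    return sum(a[i][i] for i in range(len(a))).real


def scale(c, a):
    return [[c * x for x in row] for row in a]


def state_from(rho):
    """The linear functional a |-> tr(rho o a)."""
    return lambda a: trace(jordan(rho, a))


def act(obs, state):
    """Act on a functional with an observable: a |-> state(obs o a)."""
    return lambda a: state(jordan(obs, a))


identity = [[1, 0], [0, 1]]
tr_1 = trace(identity)
ignorance = state_from(scale(1 / tr_1, identity))   # the normalized trace

rho = [[0.75, 0.25], [0.25, 0.25]]     # sample state
a = [[1, 2 - 1j], [2 + 1j, 3]]         # sample observable

direct = state_from(rho)(a)
via_ignorance = act(scale(tr_1, rho), ignorance)(a)
print(direct, via_ignorance)           # the two values agree
```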

While these ideas raise a host of questions, they also help motivate an important theorem of Koecher and Vinberg. The idea is to axiomatize the situation we have just described, in a way that does not mention the Jordan product in $A$, but instead emphasizes:

  • state-observable duality,
  • the fact that ‘positive’ observables, namely those whose observed values are always positive, form a cone.

To find appropriate axioms, suppose $A$ is a finite-dimensional formally real Jordan algebra. Then seven facts are always true.

First, the set of positive observables

$$C = \{a \in A \colon a > 0\}$$

is a cone: that is, $a \in C$ implies that every positive multiple of $a$ is also in $C$. Second, this cone is convex: if $a, b \in C$ then any linear combination $p a + (1-p) b$ with $0 \le p \le 1$ also lies in $C$. Third, it is an open set. Fourth, it is regular, meaning that if $a$ and $-a$ are both in the closure $\overline{C}$, then $a = 0$. This last condition may seem obscure, but if we note that

$$\overline{C} = \{ a \in A \colon a \ge 0 \}$$

we see that $C$ being regular simply means

$$a \ge 0 \; \text{and} \; -a \ge 0 \quad \implies \quad a = 0 ,$$

a perfectly plausible assumption.

Next recall that $A$ has an inner product; this is what lets us identify linear functionals on $A$ with elements of $A$. This also lets us define the dual cone

$$C^* = \{ a \in A \colon \forall b \in C \; \; \langle a, b \rangle > 0 \}$$

which one can check is indeed a cone. The fifth fact about $C$ is that it is self-dual, meaning $C = C^*$. This formalizes the notion of state-observable duality!
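Self-duality can at least be spot-checked in $\mathrm{h}_2(\mathbb{C})$ with the inner product tr(ab). The sketch below is a finite sample and proves nothing. The sample matrices are hypothetical, and positivity of a 2×2 self-adjoint matrix is tested through its trace and determinant.

```python
def inner(a, b):
    """Inner product <a, b> = tr(ab) on 2x2 self-adjoint matrices."""
    return sum(a[i][k] * b[k][i] for i in range(2) for k in range(2)).real


def is_positive(a):
    """Positive definite test for a 2x2 self-adjoint matrix:
    positive trace and positive determinant."""
    tr = (a[0][0] + a[1][1]).real
    det = (a[0][0] * a[1][1] - a[0][1] * a[1][0]).real
    return tr > 0 and det > 0


# Sample elements of the cone C (hypothetical values).
cone_samples = [
    [[1, 0], [0, 1]],
    [[2, 1], [1, 1]],
    [[1, 1j], [-1j, 2]],
    [[0.1, 0], [0, 1]],
]
outside = [[1, 0], [0, -0.5]]   # self-adjoint but not positive

print(all(is_positive(a) for a in cone_samples))
# Elements of C pair positively with each other ...
print(all(inner(a, b) > 0 for a in cone_samples for b in cone_samples))
# ... while the non-positive element pairs negatively with the last sample.
print([inner(outside, b) for b in cone_samples])
```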

The sixth fact is that $C$ is also homogeneous: given any two points $a, b \in C$, there is a real-linear transformation $T : A \to A$ mapping $C$ to itself in a one-to-one and onto way, with the property that $T a = b$. This says that the cone $C$ is highly symmetrical: no point of $C$ is any ‘better’ than any other, at least if we only consider the linear structure of the space $A$, ignoring the Jordan product and the trace.
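For the cone of positive definite matrices in $\mathrm{h}_2(\mathbb{C})$, one standard way to exhibit such a transformation is the map x ↦ g x g† for an invertible matrix g. The sketch below builds g from Cholesky factors so that a is sent to b. This is an illustrative construction chosen for the example, not necessarily the one used in the references cited in this post, and all function names are hypothetical.

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Conjugate transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def cholesky(a):
    """Lower-triangular L with a = L L^dagger, for positive definite 2x2 a."""
    l00 = math.sqrt(a[0][0].real)
    l10 = a[1][0] / l00
    l11 = math.sqrt(a[1][1].real - abs(l10) ** 2)
    return [[l00, 0], [l10, l11]]


def inverse_lower(l):
    """Inverse of a lower-triangular 2x2 matrix."""
    return [[1 / l[0][0], 0],
            [-l[1][0] / (l[0][0] * l[1][1]), 1 / l[1][1]]]


def transport(a, b):
    """Return the linear map T(x) = g x g^dagger, built so that T(a) = b."""
    g = matmul(cholesky(b), inverse_lower(cholesky(a)))
    return lambda x: matmul(matmul(g, x), dagger(g))


def is_positive(a):
    """Positive trace and positive determinant (2x2 self-adjoint case)."""
    tr = (a[0][0] + a[1][1]).real
    det = (a[0][0] * a[1][1] - a[0][1] * a[1][0]).real
    return tr > 0 and det > 0


a = [[2, 1], [1, 1]]          # two sample points of the cone
b = [[1, 1j], [-1j, 2]]
T = transport(a, b)
print(T(a))                                # approximately b
print(is_positive(T([[1, 0], [0, 1]])))    # another point of C stays in C
```

Since g is invertible, the map has the inverse x ↦ g⁻¹ x (g⁻¹)†, which also preserves positivity, so T restricts to a bijection of the cone.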

From another viewpoint, however, there is a very special point of $C$, namely the identity $1$ of our Jordan algebra. And this brings us to our seventh and final fact: the cone $C$ is pointed, meaning that it is equipped with a distinguished element (in this case $1 \in C$). As we have seen, this element corresponds to the ‘state of complete ignorance’, at least after we normalize it.

In short: when $A$ is a finite-dimensional formally real Jordan algebra, $C$ is a pointed homogeneous self-dual regular open convex cone. All the elements $a \in C$ are positive observables, but certain special ones, namely those with $\langle a, 1 \rangle = 1$, can also be viewed as states.

In fact, there is a category of pointed homogeneous self-dual regular open convex cones, where:

  • An object is a finite-dimensional real inner product space $V$ equipped with a pointed homogeneous self-dual regular open convex cone $C \subset V$.
  • A morphism from one object, say $(V,C)$, to another, say $(V',C')$, is a linear map $T : V \to V'$ preserving the inner product and mapping $C$ into $C'$.

Now for the payoff. The work of Koecher and Vinberg, nicely explained in Koecher’s Minnesota notes:

  • Max Koecher, The Minnesota Notes on Jordan Algebras and Their Applications, eds. Aloys Krieg and Sebastian Walcher, Lecture Notes in Mathematics 1710, Springer, Berlin, 1999.

shows that:

Theorem: The category of pointed homogeneous self-dual regular open convex cones is equivalent to the category of finite-dimensional formally real Jordan algebras.

This means that the theorem of Jordan, von Neumann and Wigner, which we saw last time, also classifies the pointed homogeneous self-dual regular open convex cones!

Theorem: Every pointed homogeneous self-dual regular open convex cone is isomorphic to a direct sum of those on this list:

  • the cone of positive elements in $\mathrm{h}_n(\mathbb{R})$,
  • the cone of positive elements in $\mathrm{h}_n(\mathbb{C})$,
  • the cone of positive elements in $\mathrm{h}_n(\mathbb{H})$,
  • the cone of positive elements in $\mathrm{h}_3(\mathbb{O})$,
  • the future lightcone in $\mathbb{R}^n \oplus \mathbb{R}$.

Some of this deserves a bit of explanation. For $\mathbb{K} = \mathbb{R}, \mathbb{C}, \mathbb{H}$, an element $T \in \mathrm{h}_n(\mathbb{K})$ is positive if and only if the corresponding operator $T : \mathbb{K}^n \to \mathbb{K}^n$ has

$$\langle v, T v \rangle > 0$$

for all nonzero $v \in \mathbb{K}^n$. A similar trick works for defining positive elements of $\mathrm{h}_3(\mathbb{O})$, but we do not need the details here. We say an element $(x,t) \in \mathbb{R}^n \oplus \mathbb{R}$ lies in the future lightcone if $t > 0$ and $t^2 - x \cdot x > 0$. This of course fits in nicely with the idea that the spin factors are Minkowski spacetimes. Finally, there is an obvious notion of direct sum for Euclidean spaces with cones, where the direct sum of $(V,C)$ and $(V',C')$ is $V \oplus V'$ equipped with the cone

$$C \oplus C' = \{(v,v') \in V \oplus V' \colon \; v \in C, \; v' \in C' \} .$$
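Two entries on the list can be compared directly in a small case. The sketch below uses the well-known identification of $\mathbb{R}^3 \oplus \mathbb{R}$ with $\mathrm{h}_2(\mathbb{C})$ via the Pauli matrices, which is not spelled out in the text above and is brought in here as an assumption. Under it, the determinant of the matrix is $t^2 - x \cdot x$, so the future lightcone should match the positive definite matrices. The sample points are arbitrary.

```python
def in_future_lightcone(x, y, z, t):
    """(x, t) in R^3 + R lies in the future lightcone: t > 0 and t^2 - x.x > 0."""
    return t > 0 and t * t - (x * x + y * y + z * z) > 0


def to_hermitian(x, y, z, t):
    """The 2x2 self-adjoint matrix t*1 + x*sigma_x + y*sigma_y + z*sigma_z."""
    return [[t + z, x - 1j * y], [x + 1j * y, t - z]]


def is_positive(m):
    """Positive definite: positive trace and positive determinant."""
    tr = (m[0][0] + m[1][1]).real
    det = (m[0][0] * m[1][1] - m[0][1] * m[1][0]).real
    return tr > 0 and det > 0


points = [(0, 0, 0, 1), (0.5, 0.5, 0.5, 1), (1, 0, 0, 1),
          (2, 0, 0, 1), (0, 0, 0, -1), (0.25, 0.5, 0, -1)]
for p in points:
    # the two booleans agree for every sample point
    print(p, in_future_lightcone(*p), is_positive(to_hermitian(*p)))
```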

In short: self-adjoint operators on real, complex and quaternionic Hilbert spaces arise fairly naturally as observables starting from a formalism where nonnegative observables form a cone, and we insist on state-observable duality.

There is a well-developed approach to probabilistic theories that works for cones that are neither self-dual nor homogeneous: see for example the work of Howard Barnum and his coauthors. This has already allowed Barnum, Gaebler and Wilce to shed new light on the physical significance of self-duality. But perhaps if we think more about state-observable duality we can better understand its meaning… and thus the appearance of normed division algebras in quantum physics!

Finally, as Urs Schreiber pointed out, it is worth comparing state-observable duality to the ‘state-operator correspondence’. This was made popular in the context of string theory, but it really applies whenever we have a $C^*$-algebra of observables, say $A$, equipped with a state $\langle \cdot \rangle : A \to \mathbb{C}$. Then the Gelfand-Naimark-Segal construction lets us build a Hilbert space $H$ on which $A$ acts, together with a distinguished unit vector $v \in H$ called the ‘vacuum state’. The Hilbert space $H$ is built by completing a quotient of $A$, so a dense set of vectors in $H$ come from elements of $A$. Thus again, some observables give us states. In particular, the vacuum state $v$ comes from the element $1 \in A$.

This is reminiscent of how in the Jordan algebra framework, the state of maximal ignorance comes from the element $1$ in the Jordan algebra of observables. But there are also some differences: for example, the Gelfand-Naimark-Segal construction requires choosing a state, and it works for infinite-dimensional $C^*$-algebras, while our construction works for finite-dimensional formally real Jordan algebras, which have a canonical state: the state of maximum ignorance. Presumably both constructions are special cases of something more general.

Posted at November 29, 2010 11:27 AM UTC

46 Comments & 0 Trackbacks

Re: State-Observable Duality (Part 3)

I’d say if anyone is interested in exploring the state of the art of the classification of state spaces of operator algebras, a good place to start would be

Posted by: Tim van Beek on November 29, 2010 1:05 PM

Re: State-Observable Duality (Part 3)

Dear John,

Is there an observer-state-duality triality?

Best,

Daniel.

Posted by: Daniel de França MTd2 on November 29, 2010 1:56 PM

Re: State-Observable Duality (Part 3)

Louis Kaufman, Seth Lloyd are collaborating around this idea.

As an aid to try to find the SM and everything else. The funny thing is that Cohl has to use the O^2 to find bosons of the SM.

Posted by: Daniel de França MTd2 on November 29, 2010 2:24 PM

Re: State-Observable Duality (Part 3)

I see that my second guess as to what you were going to mean by state-observable duality was effectively correct. Just for emphasis: could you highlight which part you mean you have “never heard anyone talk about” (as you said here)?

Also (but probably that has the same answer) could you help me and state again explicitly what the characterization of formally real Jordan algebras by traces is? Something like “A Jordan algebra is formally real if and only if it has a non-degenerate trace?” (Hm, this can’t be quite true, but maybe something like this?)

Concerning the difference between equilibrium state and vacuum state, you write:

But there are also some differences: for example, the Gelfand-Naimark-Segal construction requires choosing a state, and it works for infinite-dimensional $C^*$-algebras, while our construction works for finite-dimensional formally real Jordan algebras, which have a canonical state: the state of maximum ignorance.

On the other hand (as you know) the choice of vacuum state is not random in a theory that describes not just the kinematics (states) but also dynamics: the vacuum state in a QFT is unique under some reasonable axioms and assumptions.

Posted by: Urs Schreiber on November 29, 2010 4:51 PM

Re: State-Observable Duality (Part 3)

Urs wrote:

A Jordan algebra is formally real if and only if it has a non-degenerate trace?

As you note, that’s not correct. And I don’t know how to state the theorem you’re after. I feel such a theorem would need to mention ‘positivity’ somehow. But a Jordan algebra is formally real precisely when the obvious kind of ‘positive elements’, namely elements of the form $a^2$, are actually closed under addition. So it’s hard for me to make sense of the term ‘positive trace’ unless my Jordan algebra is already formally real!

But it then turns out that any finite-dimensional formally real Jordan algebra has a positive trace: a trace that’s positive on positive elements.

Just for emphasis: could you highlight which part you mean you have “never heard anyone talk about” ?

Of course

  • everyone uses density matrices (nonnegative self-adjoint operators with trace 1) to describe ‘mixed states’,

and

  • everyone knows that self-adjoint operators correspond to ‘observables’,

but I’ve never heard anyone seriously ponder the idea that a density matrix is an observable: that is, ask what it would mean to measure this observable, and how this measurement process would be related to the corresponding mixed state.

I don’t actually claim to have an answer to those questions! Instead, I change viewpoint a bit and treat our Jordan algebra elements not as “observables” but as “operations”. And then I give a kind of answer.

All the math here is trivial except for the Koecher-Vinberg theorem; the part I’m trying to emphasize is the conceptual/philosophical meaning of this theorem. In ordinary complex quantum theory we use operators in at least 3 very distinct but related ways:

  1. to describe observables (things we can measure about a system)
  2. to describe operations (things we can do to a system)
  3. to describe mixed states (ways the system can be)

We tend to take this for granted, but it’s pretty weird and interesting if you think about it. For example: why in the world should something you can measure about a system also be a way it can be?

Or why should something you can do to a system also be a way it can be?

I’m actually thinking about the latter question here.

I consider certain situations where there’s a canonical mixed state corresponding (up to normalization) to the element “1” — namely, the state of maximum ignorance.

Then, the action of our Jordan algebra on mixed states establishes a canonical one-to-one correspondence between nonnegative Jordan algebra elements and mixed states, in the manner I describe.

And this has a nice conceptual meaning: you take an operation, something you can do, and you do that to the state of maximum ignorance, and you get a mixed state.

Or even more tersely: things we can do to the system turn into ways it can be.

And in the cases I’m talking about, this sets up a 1-1 correspondence between nonnegative observables and mixed states.

Of course, for this correspondence to be canonical, there has to be a canonical state for the system before you do anything to it. And the ‘state of maximum ignorance’ plays that role quite nicely.

The state of maximum ignorance is a bit like the ‘vacuum’ in quantum field theory, but it’s even more nothingy than that. It’s not that you know nothing is there; it’s that you know nothing about what’s there.

Three technical notes:

1. In the previous paragraphs, when I say “mixed state”, I often mean “not-necessarily-normalized mixed state”. I don’t have a good word for these things, but they deserve a name. For example, in statistical mechanics $\exp(-\beta H)$ is used to describe such a thing. The whole ‘convex cone’ formalism, and the Koecher-Vinberg theorem, make it nice to emphasize these things.

2. While we often focus on using unitary operators to describe operations, there’s another class of operations described by nonnegative self-adjoint operators. The most famous of these are the ‘projections’. But more generally, a large class of nonnegative self-adjoint operators can be thought of as ‘imperfect filters’ (not sure that’s the usual term), and these are the sort of operations that we can apply to the state of maximum ignorance to get mixed states.

3. People often treat mixed states epistemically rather than ontically. In other words, instead of calling them ‘ways for the system to be’, they call them ‘assumptions we may have about how the system is’. But above, I’m sort of vacillating between epistemic and ontic ways of talking about mixed states. For example, I call them ‘ways for the system to be’, but then I talk about ‘the state of maximum ignorance’. I don’t really care, here.

Posted by: John Baez on November 30, 2010 8:25 AM

Re: State-Observable Duality (Part 3)

John wrote:

Or why should something you can do to a system also be a way it can be?

It feels like I’m missing the main point of the whole discussion, because I simply think about polarized photons, polarized in the x or y direction of a cartesian coordinate system and propagation in z-direction, flying through a polarization filter that filters in the x-y-diagonal direction. Then, of course, what you can do to a system (a measurement) corresponds to a way it can be (polarized in the x-y-diagonal direction).

Actually “measurement”, “observable” and “preparation of a state” very much sound like synonyms to me, from QM and QFT…

Posted by: Tim van Beek on November 30, 2010 10:02 AM

Re: State-Observable Duality (Part 3)

Tim wrote:

Actually “measurement”, “observable” and “preparation of a state” very much sound like synonyms to me, from QM and QFT…

So any observable corresponds somehow to the preparation of some state?

It’s pretty easy to understand how projection operators are related to pure states, but let’s try some more general puzzles:

For example: take an observable like the Hamiltonian of a free particle in a box, and tell me what state this corresponds to.

Or: take a state like the equilibrium state of this particle at the temperature of 140 kelvin, and tell me what observable this corresponds to.

(How would you measure this observable? Would measuring this observable resemble preparing the system in that state?)

This is the kind of puzzle I’m concerned with here… though I specialize it by limiting myself to quantum systems with finitely many degrees of freedom… and then generalize it to include the real and quaternionic versions of quantum theory as well as the complex version.

(Why generalize it? In part because the Koecher–Vinberg theorem sets up a nice correspondence between formally real Jordan algebras and well-behaved cones, which we can think of either as cones of non-negative observables, or as cones of un-normalized mixed states.)

Posted by: John Baez on November 30, 2010 10:16 AM

Re: State-Observable Duality (Part 3)

So any observable corresponds somehow to the preparation of some state?

Yes. Here is a pure state:

$$\mathbb{C} \stackrel{\Psi}{\to} H$$

here is an observable

$$H \stackrel{A}{\to} H$$

here is the preparation of a new pure state from the old one by that observable

$$\mathbb{C} \stackrel{\Psi}{\to} H \stackrel{A}{\to} H \,.$$

Here I am using a bit of the language of quantum mechanics in terms of dagger-compact categories.

More generally, here is a mixed state

$$\mathbb{C} \stackrel{\rho}{\to} H \otimes H^*$$

and here is an observable

$$\bar A : \mathbb{C} \to H \otimes H^*$$

and here is the new mixed state

$$\begin{array}{ccccccc} & \mathbb{C} &&&& \mathbb{C} & \\ & \downarrow^{\rho} &&&& \downarrow^{\bar A} & \\ H &\otimes& H^* &\otimes& H &\otimes& H^* \\ \downarrow &&& \nwarrow \swarrow &&& \uparrow \end{array} \,.$$

Now:

take an observable like the Hamiltonian of a free particle in a box, and tell me what state this corresponds to.

If we are talking pure states the answer is: the ground state!

Or: take a state like the equilibrium state of this particle at the temperature of 140 kelvin, and tell me what observable this corresponds to.

Every pure state $\mathbb{C} \stackrel{\Psi}{\to} H$ corresponds to the observable

$$H \stackrel{\Psi^\dagger}{\to} \mathbb{C} \stackrel{\Psi}{\to} H$$

that measures: “Is the system in this state?”

Similarly every mixed state corresponds to an observable that measures

“To which extent is the system in one of an ensemble of pure states?”

Of course you know all this. But since you asked… ;-)

Posted by: Urs Schreiber on November 30, 2010 12:48 PM

Re: State-Observable Duality (Part 3)

Those puzzles were for Tim, not you. But in fact you gave a somewhat different answer to one of the questions than I would have, so Tim can still have the fun of reading what I wrote, and what you wrote, and trying to understand it all, and figuring out which question I’m talking about.

Posted by: John Baez on November 30, 2010 12:54 PM

Re: State-Observable Duality (Part 3)

John said:

Those puzzles were for Tim, not you.

The charming aspect of having a chat in a café is of course that others can join it :-)

Tim can still have the fun of reading what I wrote, and what you wrote, and trying to understand it all, and figuring out which question I’m talking about.

I’m not sure, unfortunately, but the questions

How would you measure this observable? Would measuring this observable resemble preparing the system in that state?

…seem to indicate that I should think about the mapping from elements of Jordan algebras or C-star-algebras and their states to a setup in a laboratory? In AQFT there is no perfect way to “calculate” an observable that corresponds to a detector, for example. Or, to be more precise, e.g. the Reeh-Schlieder theorem says that there cannot be an observable that is strictly localized and has zero expectation value in the vacuum state, which would be the first try to define what a detector would be like, when we assemble it today and disassemble it tomorrow in outer space.

In the concrete case at hand we would need an ensemble of particles in equilibrium with a heat bath at the temperature of 140 kelvin, and the observable would be the average energy per particle, correct? I don’t know how I would have to measure this (I slept through all the lab experiments involving cryostats). But if I could measure how much energy I have to put into the system to get it from 0 kelvin to 140 kelvin (at least I can do the Gedankenexperiment), I would have both prepared the state and measured the observable, wouldn’t I?

Posted by: Tim van Beek on November 30, 2010 9:17 PM

Re: State-Observable Duality (Part 3)

Tim wrote:

The charming aspect of having a chat in a café is of course that others can join it :-)

I like to play games where people pose each other puzzles, as a way of learning things. Jim Dolan constantly does this to me, and I’ve done it to Oz, and David Corfield, and various other victims. If you don’t like these games, that’s fine. But if you do, it can sort of spoil things when someone else steps in and takes over. My puzzles, which were supposed to be hard for you, were too easy for Urs.

Luckily you seem to be proceeding as if he hadn’t said what he said.

I’m not sure, unfortunately, but the questions… seem to indicate that I should think about the mapping from elements of Jordan algebras or C*-algebras and their states to a setup in a laboratory?

Some map from observables to states, or states to observables: that’s what I’m after. That’s what this blog entry is about. And you seemed to say that for you, it was sort of obvious that observables and states are almost the same thing:

Actually “measurement”, “observable” and “preparation of a state” very much sound like synonyms to me…

So, being the nasty sort of guy I am, I wanted to test you with a puzzle or two!

So I take a quantum system with a well-defined Hilbert space, like: a particle in a box. And I give you an observable, like: the usual Hamiltonian for a free particle in a box. And I ask you: what’s the corresponding state?

Of course, you’re allowed to say “Oh, but I can’t do that one, because the Frickhoff-Schmozguvskii isomorphism only applies when…”

Or, I take a state and ask you to hand me the corresponding observable.

I don’t think sleeping through lab classes is relevant here…

Posted by: John Baez on December 1, 2010 12:45 AM | Permalink | Reply to this

Re: State-Observable Duality (Part 3)

It might be wise to remember that this same correspondence arises in discrete *classical* probability theory: every probability weight is also a random variable. In that context, the story about the measurement of *any* random variable is the same: perform the underlying experiment, obtain an outcome, and read off/compute the corresponding value.

This same story works for finite-dimensional QM, of course – make the measurement corresponding to the PVM that diagonalizes the density matrix.

(I don’t mean – at all! — to trivialize these questions; but I do think that the classical case bears thinking about.)
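To make the classical story concrete, here is a minimal Python sketch. The particular weights and the helper names `expectation` and `measure` are invented for this illustration; only the standard `random` and `fractions` modules are used.

```python
from fractions import Fraction
import random

# A hypothetical probability weight on three outcomes (values chosen for illustration).
weight = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}

def expectation(random_variable, state):
    """Expected value of a random variable (outcome -> number) in a state (outcome -> probability)."""
    return sum(state[outcome] * random_variable[outcome] for outcome in state)

def measure(random_variable, state, rng):
    """Perform the underlying experiment once, then read off the value of the random variable."""
    outcomes = list(state)
    outcome = rng.choices(outcomes, weights=[float(state[o]) for o in outcomes])[0]
    return random_variable[outcome]

# The same dictionary plays both roles: it is the state and the random variable.
mean_of_weight = expectation(weight, weight)      # the sum of the squared probabilities
sample = measure(weight, weight, random.Random(0))
```

In the finite-dimensional quantum case the analogous number would be the trace of the squared density matrix, which is again the sum of the squared eigenvalues.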

Posted by: Alex Wilce on November 30, 2010 11:33 PM | Permalink | Reply to this

Re: State-Observable Duality (Part 3)

Urs wrote:

Also (but probably that has the same answer) could you help me and state again explicitly what the characterization of formally real Jordan algebras by traces is? Something like “A Jordan algebra is formally real if and only if it has a non-degenerate trace?”

I’ve never seen such a characterization, or even a general definition of a trace on a Jordan algebra, but here’s what I’ve got so far. It’s not about traces, it’s about bilinear forms.

Given a Jordan algebra $A$ over a field $\mathbb{F}$, a bilinear form

$$\langle \cdot, \cdot \rangle : A \times A \to \mathbb{F}$$

is associative if

$$\langle a \circ b, c \rangle = \langle a, b \circ c \rangle$$

for all $a, b, c \in A$.

Theorem: A finite-dimensional Jordan algebra over $\mathbb{R}$ is formally real iff it admits a positive definite associative bilinear form.

There’s also a similar result:

Theorem: A finite-dimensional Jordan algebra over $\mathbb{C}$ is semisimple (a direct sum of simple ones) iff it admits a nondegenerate associative bilinear form.

You can get these by combining Cor. VIII.2.2, Prop. VIII.4.2 and Theorem VIII.5.2 of Faraut and Korányi’s Analysis on Symmetric Cones. The theorem here says:

Theorem: Every semisimple finite-dimensional Jordan algebra over $\mathbb{C}$ is the complexification of a formally real Jordan algebra over $\mathbb{R}$.

Now it should be possible to re-express these theorems in terms of ‘traces’, but I’m confused about what the general definition of a ‘trace’ on a Jordan algebra should be. Obviously for starters it should be a linear functional

$$\mathrm{tr} : A \to \mathbb{F}$$

but what conditions, if any, should it satisfy? We automatically have

$$\mathrm{tr}(a \circ b) = \mathrm{tr}(b \circ a)$$

since our Jordan algebra is commutative. However, I don’t think we want every linear functional

$$\mathrm{tr} : \mathrm{h}_n(\mathbb{C}) \to \mathbb{C}$$

to count as a trace. Do we?

Hmm, we can take any linear functional

$$\mathrm{tr} : A \to \mathbb{F}$$

and define

$$\langle a, b \rangle = \mathrm{tr}(a \circ b)$$

and ask when this gives an associative bilinear form.

$$\langle a \circ b, c \rangle \stackrel{?}{=} \langle a, b \circ c \rangle$$

I guess we need

$$\mathrm{tr}((a \circ b) \circ c) = \mathrm{tr}(a \circ (b \circ c))$$

so I guess this should be the defining equation of a trace on a Jordan algebra! Remember, our Jordan algebra isn’t usually associative.
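Here is a small numerical check of that equation, as a sketch rather than a proof. It assumes the concrete case of 2×2 self-adjoint complex matrices with the Jordan product $a \circ b = \frac{1}{2}(a b + b a)$; the helper names (`matmul`, `jordan`, `trace`, `corner`) and the sample matrices are made up for the illustration. It suggests that the ordinary matrix trace satisfies the equation even though the product is not associative, while another linear functional (the top-left entry) does not.

```python
def matmul(x, y):
    """Ordinary matrix product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def jordan(x, y):
    """Jordan product x o y = (xy + yx) / 2."""
    xy, yx = matmul(x, y), matmul(y, x)
    n = len(x)
    return [[(xy[i][j] + yx[i][j]) / 2 for j in range(n)] for i in range(n)]

def trace(x):
    """The usual matrix trace."""
    return sum(x[i][i] for i in range(len(x)))

def corner(x):
    """A linear functional that is not the trace: the top-left entry."""
    return x[0][0]

# Two Pauli matrices show that the Jordan product is not associative.
sigma_x = [[0, 1], [1, 0]]
sigma_z = [[1, 0], [0, -1]]
left = jordan(jordan(sigma_x, sigma_x), sigma_z)    # (x o x) o z, which equals sigma_z
right = jordan(sigma_x, jordan(sigma_x, sigma_z))   # x o (x o z), which equals zero

# Some arbitrary self-adjoint matrices with small integer entries.
a = [[1, 2 + 1j], [2 - 1j, 3]]
b = [[0, 1], [1, 2]]
c = [[2, -1j], [1j, -1]]
tr_left = trace(jordan(jordan(a, b), c))
tr_right = trace(jordan(a, jordan(b, c)))
```

For the matrix trace the equality follows from cyclicity, since $\mathrm{tr}(b a c) = \mathrm{tr}(a c b)$ and $\mathrm{tr}(c a b) = \mathrm{tr}(b c a)$. The `corner` functional gives 1 on `left` and 0 on `right`, so at least in this example not every linear functional yields an associative form.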

It should be possible to reformulate the first two theorems up there in terms of traces…

Posted by: John Baez on December 1, 2010 12:59 PM | Permalink | Reply to this

Re: State-Observable Duality (Part 3)

but here’s what I’ve got so far

Thanks!

Have no time right now to say more, but somebody should now say something involving both the words “Jordan algebra” and “Frobenius algebra”.

Posted by: Urs Schreiber on December 1, 2010 2:22 PM | Permalink | Reply to this

Re: State-Observable Duality (Part 3)

So does this do away with the Heisenberg uncertainty principle? See

Uncertainty principle, Wikipedia.

“The most striking property of Heisenberg’s infinite matrices for the position and momentum is that they do not commute. Heisenberg’s canonical commutation relation indicates by how much:

$$[x, p] = x p - p x = i \hbar$$

Heisenberg realized that the non-commutativity implies the uncertainty principle. … Heisenberg showed that the commutation relation implies an uncertainty, or in Bohr’s language a complementarity. Any two variables that do not commute cannot be measured simultaneously—the more precisely one is known, the less precisely the other can be known.”

Posted by: streamfortyseven on November 29, 2010 11:11 PM | Permalink | Reply to this

Re: State-Observable Duality (Part 3)

streamfortyseven wrote:

So does this do away with the Heisenberg uncertainty principle?

No. I’m not doing anything revolutionary here, just analyzing the usual stuff.

As you note, the Heisenberg uncertainty principle has a lot to do with $x p - p x$. But the Jordan algebra business has a lot to do with $x p + p x$. It’s a somewhat less well-studied aspect of the overall story, but they’re both part of the same story.
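To see how the two pieces fit together, here is the standard splitting of a product into its symmetric and antisymmetric parts (added for illustration):

$$x p = \tfrac{1}{2}(x p + p x) + \tfrac{1}{2}(x p - p x).$$

If $x$ and $p$ are self-adjoint, the first term is self-adjoint and the second is anti-self-adjoint. With the canonical commutation relation quoted earlier, the second term is just $\tfrac{1}{2} i \hbar$, so all the operator content of $x p$ beyond that constant sits in the Jordan-type part.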

Mathematicians say the phrase ‘Lie algebra’ when talking about operations like $a b - b a$, and ‘Jordan algebra’ when talking about operations like $a b + b a$. There are about fifty times as many people studying Lie algebras, and there’s a good reason why, but still, that means it’s time to study Jordan algebras if you want to come up with something new without being a genius.

If you’re lagging behind the pack, just turn around and start running the other way, and all of a sudden you’ll be ahead of everyone.

It might be dumb to turn around like this — there might be a really good reason everyone is running the other way — but if there’s anything interesting in that other direction, you might be the first to meet it.

In fact, Jordan algebras have been pretty intensively studied by mathematicians. But the physical implications of their results, like the Koecher-Vinberg theorem, have not been fully explored.

Posted by: John Baez on November 30, 2010 8:29 AM | Permalink | Reply to this

Re: State-Observable Duality (Part 3)

I’d only ever really seen the state-operator correspondence invoked in conformal field theory, so it’s interesting to think about it being more general (and the more general idea being, in turn, part of something larger and not yet known).

Posted by: Blake Stacey on November 30, 2010 12:17 AM | Permalink | Reply to this

Re: State-Observable Duality (Part 3)

I’d only ever really seen the state-operator correspondence invoked in conformal field theory,

It is true that the term is mentioned in that context more than in other contexts, but it is not exclusive to CFT.

In AQFT people usually say “separating vacuum vector” or Reeh-Schlieder theorem for “operator-state correspondence”.

But sometimes it is explicitly mentioned these are different terms for the same phenomenon. For instance in the footnote on the bottom of page 34 in

Bert Schroer, Particle physics and QFT at the turn of the century (pdf)

To name other random examples, for instance

Szabo, Quantum Field Theory on Noncommutative Spaces (arXiv)

mentions the “operator-state correspondence in local quantum field theory” on p. 12 but does not talk about conformal QFT.

Same on p. 9 of

Jan Ambjørn, Yuri M. Makeenko, Jun Nishimura and Richard J. Szabo, Lattice gauge fields and discrete non-commutative Yang-Mills theory (pdf)

Wikipedia mentions it on its page for Bogoliubov transformation. Etc.

It’s also in standard QFT textbooks, but I won’t try to dig out page numbers for that now.

Posted by: Urs Schreiber on November 30, 2010 6:53 AM | Permalink | Reply to this

Re: State-Observable Duality (Part 3)

Urs wrote:

In AQFT people usually say “separating vacuum vector” or Reeh-Schlieder theorem for “operator-state correspondence”.

It may look a little bit strange at first that in AQFT there is an operator-state correspondence, because the operators aka observables are defined to form a C-star-algebra and the states are the (normalized) states of this algebra, and the state space is usually considerably larger than the algebra. For example, the state space does not need to be separable, even if the C-star-algebra is.

But the AQFT people cheat and invoke “physical intuition” to argue that of course not all states make sense physically, and that in order to describe a concrete physical system, only a subset of all states - or certain equivalence classes - can and may be used.

An example of this kind of reasoning is on the page Fell’s theorem on the nLab.

Posted by: Tim van Beek on November 30, 2010 10:14 AM | Permalink | Reply to this

Re: State-Observable Duality (Part 3)

It may look a little bit strange at first that in AQFT there is an operator-state correspondence, because the operators aka observables are defined to form a C-star-algebra and the states are the (normalized) states of this algebra, and the state space is usually considerably larger than the algebra.

“Correspondence” here does not want to mean “bijection”.

But the AQFT people cheat and invoke “physical intuition”.

I am not sure what you mean by this. The operator-state correspondence in the form “there is a cyclic vacuum vector” says that every pure state can be obtained by applying some bounded operator to the vacuum state.

Posted by: Urs Schreiber on November 30, 2010 11:46 AM | Permalink | Reply to this

Re: State-Observable Duality (Part 3)

“Correspondence” here does not want to mean “bijection”.

Okay, but John’s statement

…states can be identified with certain special observables!

about finite-dimensional formally real Jordan algebras does not generalize to infinite dimensions, at least not without further assumptions - that’s a (trivial?) observation.

I am not sure what you mean by this. The operator-state correspondence in the form “there is a cyclic vacuum vector” says that every pure state can be obtained by applying some bounded operator to the vacuum state.

All I wanted to point out is that there is usually more to the story than “the observables form a C-star-algebra and its states are the states” :-)

Because people usually make further assumptions about what kind of states actually count as physically admissible, and on the other hand there are a lot of “observables” that are not observable in the sense that one could build a detector that observes (one of) them. But I’m not sure how this relates to the main point of the main blog post.

(Nitpicking: Strictly speaking we can only approximate every pure state by applying some bounded operator to the vacuum state.)

Posted by: Tim van Beek on November 30, 2010 9:31 PM | Permalink | Reply to this

Re: State-Observable Duality (Part 3)

Tim wrote:

Because people usually make further assumptions about what kind of states actually count as physically admissible, and on the other hand there are a lot of “observables” that are not observable in the sense that one could build a detector that observes (one of) them. But I’m not sure how this relates to the main point of the main blog post.

(Nitpicking: Strictly speaking we can only approximate every pure state by applying some bounded operator to the vacuum state.)

All these issues are interesting. The mathematical subtleties that show up when we go to infinite-dimensional algebras of observables are enough to keep people busy for lifetimes, both for C*-algebras and the less famous Jordan operator algebras. The question of which mathematical ‘observables’ are actually observable in the lab is also very deep: experimentalists sometimes do things that you wouldn’t have thought possible.

But I was trying to side-step all these issues here. You could say my main goal was to get people interested in developing a physical/philosophical interpretation of the Koecher–Vinberg theorem. In a nutshell:

Formally real Jordan algebras were developed to formalize quantum theory, and in the finite-dimensional case there’s a nice classification. But the commutative Jordan product is not something physicists tend to think about. The convex cone formalism, where you have a convex cone of nonnegative observables and a dual convex cone of (un-normalized) states, is a lot easier to understand. The Koecher–Vinberg theorem says that these cones are secretly the same as formally real Jordan algebras, under some conditions. But a key condition is that the cone of observables is canonically isomorphic to the cone of states. So, to apply the Koecher–Vinberg theorem to physics, we need to understand state-observable duality.

Does this make sense, in a rough sort of way? Did my original writeup fail to explain this line of reasoning? This is part of a paper I’m writing. Maybe I need to rewrite it.
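For readers who want to poke at the self-duality condition, here is a rough Python sketch. It assumes the special case of real symmetric 3×3 matrices, with observables and states identified through the pairing $\langle a, b \rangle = \mathrm{tr}(a b)$; the function names and test matrices are invented for the example. It checks that nonnegative elements pair nonnegatively with each other, and that an element outside the cone is detected by some element inside it.

```python
import random

def gram(m):
    """Return the transpose of m times m, which is always positive semidefinite."""
    n = len(m)
    return [[sum(m[k][i] * m[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def pairing(a, b):
    """tr(ab) for symmetric matrices, computed as the sum of entrywise products."""
    n = len(a)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n))

rng = random.Random(1)

def random_psd(n=3):
    """A random positive semidefinite matrix with integer entries."""
    return gram([[rng.randint(-3, 3) for _ in range(n)] for _ in range(n)])

samples = [random_psd() for _ in range(20)]
all_nonnegative = all(pairing(a, b) >= 0 for a in samples for b in samples)

# An element outside the cone pairs negatively with some element of the cone.
not_positive = [[1, 0, 0], [0, -1, 0], [0, 0, 0]]
witness = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
detected = pairing(not_positive, witness)
```

This only probes the definition on samples; it says nothing about homogeneity, which is the other hypothesis of the theorem.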

Posted by: John Baez on December 1, 2010 12:11 AM | Permalink | Reply to this

Re: State-Observable Duality (Part 3)

For what it’s worth (little, as I’m a member of the choir), I think the idea was explained very clearly. Maybe it would help to stress that, while the state-cone of discrete classical probability theory is homogeneous and self-dual for trivial reasons, those reasons are not very portable: generically, finite-dimensional probabilistic models have neither property. In fact, this is one of a large number of ways in which QM seems “more classical” than it might have been, for reasons we (or at least, I) don’t yet understand very well.

(A difficulty one meets, in trying to discuss questions of this kind, is that not everyone is willing to be puzzled about a successful formalism. Many people seem perfectly happy to regard the formal, mathematical apparatus of a physical theory as being a more-or-less contingent product of a sort of evolutionary, trial-and-error process.)

Posted by: Alex Wilce on December 1, 2010 2:05 AM | Permalink | Reply to this

Re: State-Observable Duality (Part 3)

Alex wrote:

For what it’s worth (little, as I’m a member of the choir), I think the idea was explained very clearly.

Thanks! Sorry I haven’t gotten around to answering your email yet — I was busy trying to finish the paper of which this blog entry was a snippet, and also fighting my way through galley proofs of a long paper. My response was due yesterday.

Reading what editors have done to one’s writing is always stressful… but sometimes amusing. For example, this sentence:

In general relativity, people had been using index-ridden expressions for a long time.

was “corrected” to:

In general, relativity people had been using index-ridden expressions for a long time.

Anyway, now I’m done.

I’m glad you find this section of the paper clear, but unfortunately you know too much about the convex cone approach to be a suitable test case! I’d really like to hear what some “ordinary” mathematicians and physicists think.

Maybe it would help to stress that, while the state-cone of discrete classical probability theory is homogeneous and self-dual for trivial reasons, those reasons are not very portable — generically, finite-dimensional probabilistic models have neither property.

Hmm. I guess the classical case corresponds to the formally real Jordan algebra $\mathbb{R} \oplus \cdots \oplus \mathbb{R}$, a direct sum of 1-dimensional simple formally real Jordan algebras.
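Here is a quick sketch of why the classical cone is homogeneous and self-dual "for trivial reasons". It assumes the nonnegative elements of this algebra are the vectors with no negative entry, paired by the ordinary dot product; all function names and sample vectors are hypothetical.

```python
from fractions import Fraction

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def in_cone(x):
    """Nonnegative elements: vectors with no negative entry."""
    return all(xi >= 0 for xi in x)

def basis(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def in_dual_cone(y):
    """y pairs nonnegatively with the whole cone iff it does so with each basis vector."""
    return all(dot(e, y) >= 0 for e in basis(len(y)))

def scaling(u, v):
    """Diagonal map sending the interior point u to the interior point v; it preserves the cone."""
    factors = [Fraction(vi) / Fraction(ui) for ui, vi in zip(u, v)]
    return lambda x: [f * xi for f, xi in zip(factors, x)]

u, v = [1, 2, 4], [3, 1, 2]
move = scaling(u, v)
```

Self-duality reduces to checking one coordinate at a time, and homogeneity to rescaling each coordinate separately. Neither trick is available once the basis is no longer preferred.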

A difficulty one meets, in trying to discuss questions of this kind, is that not everyone is willing to be puzzled about a successful formalism.

Yes, that’s true. I feel that quantum theory with Jordan algebras other than the usual $\mathrm{h}_n(\mathbb{C})$ is actually important for understanding nature: that’s the main thrust of the paper that this blog entry is a snippet of. It may be hard to convince people of that, but at least it may get their attention… while puzzling over an already successful formalism might not.

Posted by: John Baez on December 1, 2010 2:42 AM | Permalink | Reply to this

Re: State-Observable Duality (Part 3)

John wrote: “Reading what editors have done to one’s writing is always stressful… but sometimes amusing…” (etc.)

Thanks for sharing – that made my day!

Also: “Hmm. I guess the classical case corresponds to … a direct sum of 1-dimensional simple formally real Jordan algebras.”

Right. I just meant that it’s worth thinking about why state-observable duality holds in this case (and in QM, where one can use the spectral theorem to mimic the same construction), and that the exercise helps one to appreciate just how special this situation is.

Posted by: Alex Wilce on December 1, 2010 9:42 PM | Permalink | Reply to this

Re: State-Observable Duality (Part 3)

John said:

Does this make sense, in a rough sort of way?

Yes, the rough part is still “understand” in

So, to apply the Koecher–Vinberg theorem to physics, we need to understand state-observable duality.

“Understand” in the sense that we should find similar statements in more general settings?

Did my original writeup fail to explain this line of reasoning? This is part of a paper I’m writing. Maybe I need to rewrite it.

As a rule of thumb one should think about rewriting a paragraph when at least two test readers have the same problem understanding it.

Posted by: Tim van Beek on December 1, 2010 7:11 AM | Permalink | Reply to this

Re: State-Observable Duality (Part 3)

John wrote:

So, to apply the Koecher–Vinberg theorem to physics, we need to understand state-observable duality.

Tim wrote:

“Understand” in the sense that we should find similar statements in more general settings?

No, I guess I’m an incredible optimist, in that I think it’s possible to understand the universe, at least a little. So I just meant “understand” in the sense of making any sort of progress on any of these questions:

Why should we ever expect a canonical map from nonnegative observables to unnormalized mixed states? And why should it ever be a bijection? What’s the physical intuition here? Why should the world, or at least small pieces of it, act like this? Why should numerical quantities we can measure about something act like ways for that thing to be?

What I wrote was supposed to make some progress on these questions: I was showing that we get such a canonical map when we have a canonical “state of maximum ignorance” and observables correspond to operations that act on that to give states where we know more.
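One way to write such a map down, assuming the algebra is $\mathrm{h}_n(\mathbb{C})$ with its usual trace (an illustrative special case, not the general construction), is

$$a \mapsto \omega_a, \qquad \omega_a(b) = \mathrm{tr}(a \circ b) = \mathrm{tr}(a b).$$

Here the observable $1$ goes to $\omega_1(b) = \mathrm{tr}(b)$, which after dividing by $n$ is the maximally mixed state, and a rank-one projection $p$ goes to the pure state $\omega_p(b) = \mathrm{tr}(p b)$.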

But in fact, I’ll be content if you (and everyone) understood why I was claiming the Koecher–Vinberg theorem is related to these questions — regardless of whether you thought I made any progress!

Posted by: John Baez on December 1, 2010 8:16 AM | Permalink | Reply to this

Re: State-Observable Duality (Part 3)

JB: I was showing that we get such a canonical map when we have a canonical “state of maximum ignorance” and observables correspond to operations that act on that to give states where we know more.

Unfortunately, your maximally ignorant state 1 makes sense only in finite-dimensional Hilbert spaces since a state must have trace 1 (and hence is bounded), while the most basic physical observables are unbounded operators and hence live in infinite dimensions.

Posted by: Arnold Neumaier on December 1, 2010 8:02 PM | Permalink | Reply to this

Re: State-Observable Duality (Part 3)

Arnold wrote:

Unfortunately, your maximally ignorant state 1 makes sense only in finite-dimensional Hilbert spaces since a state must have trace 1 (and hence is bounded), while the most basic physical observables are unbounded operators and hence live in infinite dimensions.

Good point! I was wondering when someone would mention that!

In the infinite-dimensional case the observable 1 corresponds not to a state, but to a ‘weight’, namely the usual trace functional. Weights are a widely used generalization of states in the study of infinite-dimensional von Neumann algebras. The most familiar example is Lebesgue measure on the real line, which can’t be normalized to give a probability measure. This is a ‘weight’ on the commutative von Neumann algebra of $L^\infty$ functions, but not a state.

Probability theorists are familiar with how Lebesgue measure is useful even though it can’t be normalized; they call such positive measures improper priors, and they can be useful, though one must treat them very carefully.

For example, if we take a Bayesian approach, the entropy of a probability measure should be defined relative to a prior probability measure. This gives the notion of relative entropy. And the usual entropy of a probability distribution on the real line,

$$S = - \int p(x) \ln(p(x)) \, d x$$

can be seen as the relative entropy of $p(x) \, d x$ relative to $d x$. But $d x$ isn’t a probability measure: it can’t be normalized! It’s an ‘improper prior’.

There is also a notion of relative entropy in the quantum case. And the usual entropy of a density matrix in infinite dimensions:

$$S = - \mathrm{tr}(\rho \ln(\rho))$$

is the entropy of $\rho$ relative to the weight ‘$\mathrm{tr}$’, which corresponds to the ‘non-normalizable density matrix’ 1.

So there’s not a state of maximal ignorance, but merely a weight. Issues of this sort would really be distracting in the paper I’m writing, so I focused on the finite-dimensional case. But they’re interesting in another way: they let us grapple with situations where ‘our ignorance can be infinite’.
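A finite-dimensional toy version may help fix the conventions. The sketch below treats a diagonal density matrix as a probability vector and compares its entropy against a normalized reference (the uniform state) and an unnormalized one (the counting weight, playing the role of ‘tr’ or $d x$). The sign convention for `relative_entropy` is one common choice and is an assumption of this example, as are the function names and numbers.

```python
import math

def entropy(p):
    """Shannon entropy: minus the sum of p_i ln p_i, skipping zero entries."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def relative_entropy(p, weights):
    """Sum of p_i ln(p_i / w_i); the reference weights w need not sum to 1."""
    return sum(pi * math.log(pi / wi) for pi, wi in zip(p, weights) if pi > 0)

p = [0.5, 0.25, 0.25]
n = len(p)
uniform = [1 / n] * n     # normalized state of maximum ignorance
counting = [1.0] * n      # unnormalized reference weight

vs_uniform = relative_entropy(p, uniform)      # equals ln(n) - S(p)
vs_counting = relative_entropy(p, counting)    # equals -S(p)
```

With this convention the usual entropy is minus the relative entropy with respect to the counting weight. The version relative to the uniform state differs by $\ln n$, which grows without bound as $n$ does, and that is one way to see why only the unnormalized reference survives in infinite dimensions.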

Posted by: John Baez on December 2, 2010 1:12 AM | Permalink | Reply to this

maximum entropy production?

As an official representative of states of maximum ignorance, may I ask a couple of questions that are slightly off topic and may not make sense. Is there a sense in which quantum systems evolve in the direction of maximum ignorance (maximum entropy)? If so, does this happen at the highest rate possible subject to the constraints imposed (maximum entropy production principle)?

Posted by: Robert Smart on November 30, 2010 5:46 PM | Permalink | Reply to this

Re: maximum entropy production?

Robert,

I think there is not, since quantum unitary evolution preserves information, that is, it leaves von Neumann entropy unaltered (you also have Poincaré recurrences just like for classical systems, so that there is no steady state). To get an entropy increase you need to open up the system to interaction with the environment, supposing that the environment/measurement apparatus is infinite or that information is lost somewhere in the process. For example, you can postulate a Markovian evolution (which is the simplest nontrivial positive trace-preserving memoryless evolution) and yes, for a special class of systems (those which obey detailed balance, that is, equilibrium systems), they go to a maximum entropy steady state satisfying a maximum entropy production principle.
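A classical toy model shows the contrast. In the sketch below, which is only an analogy and uses invented numbers, a permutation of outcomes stands in for reversible evolution, and a symmetric stochastic matrix stands in for a Markovian evolution obeying detailed balance with respect to the uniform distribution. It illustrates relaxation to the maximum entropy state, not the maximum entropy production principle itself.

```python
import math

def entropy(p):
    """Shannon entropy of a probability vector."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def step(p, transition):
    """One step of a Markov chain: new_j is the sum over i of p_i * transition[i][j]."""
    n = len(p)
    return [sum(p[i] * transition[i][j] for i in range(n)) for j in range(n)]

# Reversible dynamics: a permutation of the outcomes leaves the entropy unchanged.
permutation = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]

# A symmetric stochastic matrix satisfies detailed balance with respect to
# the uniform distribution, which is its stationary state.
symmetric = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]

p0 = [0.7, 0.2, 0.1]
permuted = step(p0, permutation)

entropies = [entropy(p0)]
p = p0
for _ in range(50):
    p = step(p, symmetric)
    entropies.append(entropy(p))
```

The list `entropies` never decreases and approaches $\ln 3$, the entropy of the uniform distribution on three outcomes.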

Posted by: tomate on November 30, 2010 6:35 PM | Permalink | Reply to this

Re: maximum entropy production?

Tomate wrote:

… they go to a maximum entropy steady state satisfying a maximum entropy production principle.

Is there a nice place to learn more about this? Robert isn’t the only one obsessed with the principle of maximum entropy production.

Posted by: John Baez on December 2, 2010 1:38 AM | Permalink | Reply to this

Re: maximum entropy production?

Most of the posters here study beautiful subjects such as the quantum theory of closed systems, which has time reversal symmetry. I have a little bit of experience with open dissipative systems, which are not so pretty but may interest some of you for a moment. My advisor has experimentally explored entropy production in driven systems. Although I haven’t been personally involved in most of the experiments, we’ve had many interesting discussions on the topic.

There is a very simple maximum entropy production principle in systems with a linear response. In this case the system evolves towards its maximum entropy state along the gradient, which is the direction of maximum change in entropy. This principle applies in practice to systems which are perturbed a small amount away from equilibrium and then allowed to relax back to equilibrium. As tomate said, if you take a closed system, open it briefly to do something gentle to it, and then wait for it to relax before closing it again, you’ll see this kind of response.

However, the story is very different in open systems. When a flux through a system becomes large, (e.g. close to the Eddington Limit for radiating celestial bodies, when heat flow follows Cattaneo’s Law, etc), the response no longer follows simple gradient dynamics and there is no maximum entropy production principle. There have been many claims of maximum or minimum entropy production principles by various authors and many attempts to derive theories based on these principles, but these principles are not universal and any theories based on them will have limited applicability.

In high voltage experiments involving conducting spheres able to roll in a highly resistive viscous fluid, there is a force on the spheres which always acts to reduce the resistance R of the system. This is true whether the boundary condition is constant current I or constant voltage V. Since power dissipation is I^2 * R in the first case and V^2 / R in the second case, one can readily see that entropy production is minimized for constant current and maximized for constant voltage.
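Spelling out that last step, under the simplifying assumption of a single resistance $R$ at a uniform temperature $T$, so that the entropy production rate is the dissipated power divided by $T$:

$$P_I(R) = I^2 R, \quad \frac{d P_I}{d R} = I^2 > 0; \qquad P_V(R) = \frac{V^2}{R}, \quad \frac{d P_V}{d R} = -\frac{V^2}{R^2} < 0.$$

So a force that always lowers $R$ lowers the dissipation at fixed current and raises it at fixed voltage.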

In experiments involving heat flow through a fluid, convection cells (a.k.a. Benard Cells) form at high rates of flow. For a constant temperature difference, these cells act to maximize the heat flow and thus the entropy production in the system. For a constant heat flow, these cells minimize the temperature difference and thus minimize the entropy production in the system.

If one were to carefully read “This Week’s Finds in Mathematical Physics (Week 296)” one would be able to find several more analogous examples where the response of open systems to high flows will either maximize or minimize the entropy production for pure boundary conditions or do neither for mixed boundary conditions.

Posted by: David Lyon on December 2, 2010 4:56 AM | Permalink | Reply to this

Re: maximum entropy production?

As David points out, many variational principles for nonequilibrium systems have been proposed. They only hold in the so-called “linear regime”, where the system is slightly perturbed from its equilibrium steady state. We are very far from understanding general non-equilibrium systems, one major result being the “fluctuation theorem”, from which all kinds of peculiar results descend; in particular, the Onsager-Machlup variational principle for trajectories. For the mathematically-minded, I think the works by Christian Maes et al. might appeal to your tastes.

Funnily enough, there exists a “minimum entropy production principle” and a “maximum entropy production principle”. The apparent clash is due to the fact that while minimum entropy production is an ensemble property, that is, it holds on a macroscopic scale, the maximum entropy production principle is believed to hold for single trajectories, single “histories”. I think the first is well-established, indeed a classical result due to Prigogine, while the second is still speculative and sloppy; it is believed to have important ecological applications. [A similar confusion arises when one defines entropy as an ensemble property (Gibbs entropy) or else as a microstate property (Boltzmann entropy)]

Unfortunately, as far as I know, there is no single simple and comprehensive review on the topic of variational principles in nonequilibrium statistical mechanics.

Posted by: tomate on December 2, 2010 9:51 AM | Permalink | Reply to this

Re: maximum entropy production?

David wrote:

In experiments involving heat flow through a fluid, convection cells (a.k.a. Benard Cells) form at high rates of flow. For a constant temperature difference, these cells act to maximize the heat flow and thus the entropy production in the system. For a constant heat flow, these cells minimize the temperature difference and thus minimize the entropy production in the system.

Ah! Prigogine loves to talk about Benard cells. I hadn’t known they either minimize or maximize entropy production, depending on the boundary conditions. That’s very helpful.

For what it’s worth, here’s a little story, with references, of my feeble attempts so far to understand principles involving maximization or minimization of entropy production. So far I’ve mainly been collecting references! Someday I’ll sit down, read them, and start thinking hard about this subject.

First I heard about these three papers:

I never got around to reading Dewar’s paper… and I was very confused, because Ilya Prigogine has a quite successful principle of least entropy production that applies to certain linear systems. But Martyushev and Seleznev write:

1.2.6. The relation of Ziegler’s maximum entropy production principle and Prigogine’s minimum entropy production principle

If one casts a glance at the heading, he may think that the two principles are absolutely contradictory. This is not the case. It follows from the above discussion that both linear and nonlinear thermodynamics can be constructed deductively using Ziegler’s principle. This principle yields, as a particular case (Section 1.2.3), Onsager’s variational principle, which holds only for linear nonequilibrium thermodynamics. Prigogine’s minimum entropy production principle (see Section 1.1) follows already from Onsager–Gyarmati’s principle as a particular statement, which is valid for stationary processes in the presence of free forces. Thus, applicability of Prigogine’s principle is much narrower than applicability of Ziegler’s principle.

Then David Corfield got me really excited by noting that Dewar’s paper relies on some work by the great E. T. Jaynes, where he proposes something called the ‘Maximum Caliber Principle’:

  • E. T. Jaynes, Macroscopic prediction, in H. Haken (ed.) Complex systems – operational approaches in neurobiology, Springer, Berlin, 1985, pp. 254–269.

And I read this paper and got really excited… but then I got distracted by other things.

But then, on the blog Azimuth, John F tried to convince me that Jaynes’ ‘Maximum Entropy Method’ for statistical reasoning is not distinct from his Maximum Caliber Principle. In pondering that, I bumped into this:

Abstract: Jaynes’ maximum entropy (MaxEnt) principle was recently used to give a conditional, local derivation of the “maximum entropy production” (MEP) principle, which states that a flow system with fixed flow(s) or gradient(s) will converge to a steady state of maximum production of thermodynamic entropy (R.K. Niven, Phys. Rev. E, in press). The analysis provides a steady state analog of the MaxEnt formulation of equilibrium thermodynamics, applicable to many complex flow systems at steady state. The present study examines the classification of physical systems, with emphasis on the choice of constraints in MaxEnt. The discussion clarifies the distinction between equilibrium, fluid flow, source/sink, flow/reactive and other systems, leading into an appraisal of the application of MaxEnt to steady state flow and reactive systems.

… which even cites some papers applying these ideas to climate change!

And then, David Corfield pointed me towards this:

This article outlines the place of the constructal law as a self-standing law in physics, which covers all the ad hoc (and contradictory) statements of optimality such as minimum entropy generation, maximum entropy generation, minimum flow resistance, maximum flow resistance, minimum time, minimum weight, uniform maximum stresses and characteristic organ sizes.

Any other important references I’m missing?

Tomate wrote:

We are very far from understanding general non-equilibrium systems, one major result being the “fluctuation theorem”, from which all kinds of peculiar results descend; in particular, the Onsager-Machlup variational principle for trajectories. For the mathematically-minded, I think the works by Christian Maes et al. might appeal to your tastes.

Okay, great. Here are some that I’ll add to my list:

We explain the (non-)validity of close-to-equilibrium entropy production principles in the context of linear electrical circuits. Both the minimum and the maximum entropy production principles are understood within dynamical fluctuation theory. The starting point are Langevin equations obtained by combining Kirchhoff’s laws with a Johnson-Nyquist noise at each dissipative element in the circuit. The main observation is that the fluctuation functional for time averages, that can be read off from the path-space action, is in first order around equilibrium given by an entropy production rate. That allows to understand beyond the schemes of irreversible thermodynamics (1) the validity of the least dissipation, the minimum entropy production, and the maximum entropy production principles close to equilibrium; (2) the role of the observables’ parity under time-reversal and, in particular, the origin of Landauer’s counterexample (1975) from the fact that the fluctuating observable there is odd under time-reversal; (3) the critical remark of Jaynes (1980) concerning the apparent inappropriateness of entropy production principles in temperature-inhomogeneous circuits.

  • C. Maes and K. Netočný: Minimum entropy production principle from a dynamical fluctuation law, J. Math. Phys. 48, 053306 (2007). Also available as arXiv:math-ph/0612063.

The minimum entropy production principle provides an approximative variational characterization of close-to-equilibrium stationary states, both for macroscopic systems and for stochastic models. Analyzing the fluctuations of the empirical distribution of occupation times for a class of Markov processes, we identify the entropy production as the large deviation rate function, up to leading order when expanding around a detailed balance dynamics. In that way, the minimum entropy production principle is recognized as a consequence of the structure of dynamical fluctuations, and its approximate character gets an explanation. We also discuss the subtlety emerging when applying the principle to systems whose degrees of freedom change sign under kinematical time-reversal.

Hmm, here’s something I bumped into en route:

  • Gregory L. Eyink, Action principle in nonequilibrium statistical dynamics, Phys. Rev. E 54 (1996), 3419–3435.

Posted by: John Baez on December 2, 2010 11:13 AM

Re: maximum entropy production?

I went through Dewar’s paper some time ago. While I think most of his arguments are correct, I still don’t regard them as a full proof of the principle he has in mind. Unfortunately, he doesn’t explain the analogies, differences and misunderstandings around minimum entropy production and maximum entropy production. In fact, nowhere in his articles does a clear-cut definition of MEP appear.

Unlike Martyushev and Seleznev, I don’t think that it is just a problem of boundary conditions, and the excerpt you quote does not explain why these two principles are not in conflict in the regime where they both are supposed to hold.

Let me explain my own take on the minEP vs. maxEP problem and on similar problems (such as Boltzmann vs. Gibbs entropy increase). It might help sorting out ideas.

By “state” we mean very different things in NESM, among which: 1) the (micro)state which a single history of a system occupies at a given time; 2) the trajectory itself; 3) the density of microstates which an ensemble of a large number of trajectories occupies at a given time (a macrostate). One can define entropy production at all levels of discussion (for the mathematically inclined, Markovian master equation systems offer the best setup, where everything is nicely defined). So, for example, the famous “fluctuation theorem” is a statement about microscopic entropy production along a trajectory, while Onsager’s reciprocity relations are a statement about macroscopic entropy production. By “steady state”, we mean a stationary macrostate.
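To make the macroscopic level concrete, here is a minimal sketch of entropy production in a Markovian master equation. It assumes a finite state space, strictly positive off-diagonal rates, and the standard Schnakenberg-type formula for the entropy production rate; the names `steady_state`, `entropy_production_rate` and the example rate matrices are illustrative, not taken from any of the papers cited here.

```python
import numpy as np

def steady_state(W):
    """Stationary distribution of dp/dt = L p, where W[i, j] is the
    transition rate from state j to state i (diagonal of W is zero)."""
    n = W.shape[0]
    L = W - np.diag(W.sum(axis=0))      # generator: every column sums to zero
    A = np.vstack([L, np.ones(n)])      # append the constraint sum(p) = 1
    b = np.zeros(n + 1)
    b[-1] = 1.0
    p, *_ = np.linalg.lstsq(A, b, rcond=None)
    return p

def entropy_production_rate(W, p):
    """sigma = 1/2 * sum_{i != j} (W_ij p_j - W_ji p_i) ln(W_ij p_j / W_ji p_i),
    in units of Boltzmann's constant per unit time."""
    flux = W * p[np.newaxis, :]         # flux[i, j] = W_ij p_j (flow j -> i)
    off_diagonal = ~np.eye(len(p), dtype=bool)
    forward = flux[off_diagonal]        # W_ij p_j
    backward = flux.T[off_diagonal]     # W_ji p_i, in the same (i, j) order
    return 0.5 * np.sum((forward - backward) * np.log(forward / backward))

# Hypothetical 3-state ring: rate 2 for 0 -> 1 -> 2 -> 0, rate 1 the other way round.
W_ring = np.array([[0.0, 1.0, 2.0],
                   [2.0, 0.0, 1.0],
                   [1.0, 2.0, 0.0]])
p_ring = steady_state(W_ring)
sigma_ring = entropy_production_rate(W_ring, p_ring)

# Symmetric rates satisfy detailed balance, so the steady state is an equilibrium.
W_sym = np.ones((3, 3)) - np.eye(3)
sigma_sym = entropy_production_rate(W_sym, steady_state(W_sym))
```

In the ring example the stationary macrostate is uniform but carries a net probability current, so its entropy production rate is strictly positive; with symmetric rates it vanishes.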

The minEP principle asserts that the distribution of macroscopic currents at a nonequilibrium steady state minimizes entropy production consistently with the constraints which prevent the system from reaching equilibrium.

As I understand it, maxEP is instead a property of single trajectories: most probable trajectories are those which have a maximum entropy production rate, consistently with constraints.

As a climate scientist, you should be interested in the second as we have not an ensemble of planets among which to maximize entropy or minimize entropy production. We have one single realization of the process, and we’d better make good use of it.

Posted by: tomate on December 2, 2010 12:44 PM

Re: maximum entropy production?

Thanks very much for your thoughtful comments! I’m a bit distracted by other issues now, but I’ll save the clues you’ve provided and think about them hard someday.

Even without understanding any details yet, I’m already a bit puzzled. You say, very roughly, that minimum entropy production applies to macrostates, while maximum entropy production to microstates. David Lyon, on the other hand, emphasizes how minimum entropy production applies to boundary conditions involving constant flow (e.g. electrical current, or heat flow), while maximum entropy production applies to boundary conditions involving constant effort (e.g. voltage, or temperature difference).
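A minimal worked example of the constant-flow case, under assumptions that are not taken from the comments above: two resistors $R_1$ and $R_2$ in parallel, held at a uniform temperature $T$, with the total current $I$ fixed by the boundary condition. If a current $I_1$ flows through the first resistor, the entropy production rate is the dissipated power divided by $T$:

$$ \sigma(I_1) = \frac{1}{T}\left( R_1 I_1^2 + R_2 (I - I_1)^2 \right) $$

Setting the derivative to zero gives

$$ \frac{d\sigma}{dI_1} = \frac{2}{T}\left( R_1 I_1 - R_2 (I - I_1) \right) = 0 \quad\Longrightarrow\quad R_1 I_1 = R_2 (I - I_1), \qquad I_1 = \frac{R_2}{R_1 + R_2}\, I $$

Since $d^2\sigma/dI_1^2 = 2(R_1 + R_2)/T \gt 0$, this is a minimum, and the condition $R_1 I_1 = R_2 (I - I_1)$ is just Kirchhoff’s voltage law: both branches have the same voltage drop. So in this simple linear circuit with fixed total flow, the physical current distribution is the one that minimizes entropy production. This sketch says nothing about the constant-effort case.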

I would like to understand both these ideas, and their range of validity… but I’d be even happier if they were related in some comprehensible way.

> As a climate scientist, you should be interested in the second as we have not an ensemble of planets among which to maximize entropy or minimize entropy production. We have one single realization of the process, and we’d better make good use of it.

Of course I’m not a climate scientist; I’m just trying to learn about climate science and get more physicists and mathematicians to work on it. But even so, I think I understand this point.

Posted by: John Baez on December 3, 2010 5:58 AM

Re: maximum entropy production?

John Baez mentioned this paper, which comes from a series of several papers which had been cited as a proof of the MEP principle:

R. C. Dewar, Maximum entropy production and the fluctuation theorem.

The proof which forms the basis of these papers was refuted in 2007:

Grinstein, G., Linsker, R. (2007). Comments on a derivation and application of the ‘maximum entropy production’ principle. Journal of Physics A: Mathematical and Theoretical, 40(31), 9717-9720.

The proof implicitly assumes linearity while claiming to cover nonlinear relations. The Onsager reciprocal relations require the same assumptions. I feel that while linear irreversible thermodynamics is understood as well as equilibrium thermodynamics, nonlinear irreversible thermodynamics remains more uncharted than known. Many astonishing discoveries in NIT await us in the 21st century, perhaps even 7 cities of irreversible gold! I suppose I’d settle for the Earth not turning into a scorched desert while billions starve to death…

Posted by: David Lyon on December 6, 2010 2:31 AM

Re: maximum entropy production?

Since I’m getting more and more interested in the subject, I read this paper:

which is simple and clear, and also very critical of Dewar’s derivations.

> David Lyon, on the other hand, emphasizes how minimum entropy production applies to boundary conditions involving constant flow (e.g. electrical current, or heat flow), while maximum entropy production applies to boundary conditions involving constant effort (e.g. voltage, or temperature difference).

This resonates with an idea of Jaynes (from The minimum entropy production principle):

> But reversing the direction of the logic ought to reverse the principle. If the conservation laws represent the approximate condition of minimum entropy production for prescribed approximate phenomenological laws, then perhaps the exact phenomenology is the one that has maximum entropy production for prescribed exact conservation laws.

I don’t know if this reversed principle has been formalized. David, do you know about that?

Posted by: tomate on December 11, 2010 4:28 PM

Re: maximum entropy production?

For convenience’s sake, here is a fuller bibliography for the Bruers paper:

Posted by: Blake Stacey on December 12, 2010 5:19 PM

Re: maximum entropy production?

Here’s a paper that discusses maximum entropy production:

An intriguing quote:

As mentioned by Ozawa et al., Lorenz suspected that the Earth’s atmosphere operates in such a manner as to generate available potential energy at a possible maximum rate. The available potential energy is defined as the amount of potential energy that can be converted into kinetic energy. Independently, Paltridge suggested that the mean state of the present climate is reproducible as a state with a maximum rate of entropy production due to horizontal heat transport in the atmosphere and oceans. Figure 2 shows such an example. Without considering the detailed dynamics of the system, the predicted distributions (air temperature, cloud amount, and meridional heat transport) show remarkable agreement with observations. Later on, several researchers investigated Paltridge’s work and obtained essentially the same result.

There are lots of references provided. The figure looks impressive but I don’t know what it means yet.

Posted by: John Baez on February 8, 2011 9:56 AM

duality in quantum information theory

If I understand correctly, the duality you have in mind is known as Jamiołkowski duality in quantum information theory. I’d like to know if this is the case. You can read about it in

I. Bengtsson and K. Życzkowski, Geometry of Quantum States, Cambridge University Press, chapter 11, “Duality: maps versus states”.

Posted by: tomate on November 30, 2010 6:41 PM

Re: duality in quantum information theory

The always-infallible information resource known as Wikipedia has a brief note on the Jamiołkowski isomorphism under the name channel-state duality. A quantum channel is a map between density matrices (of, possibly, different quantum systems) which is linear, trace-preserving and completely positive. Feeding a matrix of maximal ignorance into a channel gives a density matrix “dual” to that channel.
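For concreteness, here is a minimal NumPy sketch of channel-state duality. It assumes the standard construction, in which the channel acts on one half of a maximally entangled state; the untouched half, taken on its own, is then in the maximally mixed state of “maximal ignorance”. The function names and the dephasing channel with parameter `p` are hypothetical examples, not taken from the Wikipedia article.

```python
import numpy as np

def choi_state(channel, d):
    """Normalized Choi matrix (1/d) * sum_{i,j} |i><j| (x) channel(|i><j|)
    of a linear map on d x d matrices."""
    J = np.zeros((d * d, d * d), dtype=complex)
    for i in range(d):
        for j in range(d):
            E = np.zeros((d, d), dtype=complex)
            E[i, j] = 1.0                    # matrix unit |i><j|
            J += np.kron(E, channel(E))
    return J / d

def dephasing(rho, p=0.25):
    """Hypothetical example channel: apply Pauli Z with probability p."""
    Z = np.diag([1.0, -1.0]).astype(complex)
    return (1 - p) * rho + p * Z @ rho @ Z

J = choi_state(dephasing, 2)
eigenvalues = np.linalg.eigvalsh(J)
```

Here `J` is Hermitian with trace 1, and its eigenvalues are nonnegative because the channel is completely positive, so `J` is a density matrix of the joint system.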

Posted by: Blake Stacey on November 30, 2010 7:29 PM

Re: duality in quantum information theory

> The always-infallible information resource known as Wikipedia has a brief note on the Jamiołkowski isomorphism under the name channel-state duality.

The just as infallible (but nonetheless currently undercover) nLab has this at quantum operation.

That page could do with some improvement. It had a hard youth and is still not quite grown up. Hopefully in the course of this discussion here somebody will take care of it.

Posted by: Urs Schreiber on November 30, 2010 7:48 PM

Re: duality in quantum information theory

I’m occasionally infallible, and after thinking about this a bit, I believe the relation between operators and states I’m discussing is different from the ‘Choi-Jamiołkowski isomorphism’.

As Blake mentions, that isomorphism lets you take a suitable map from mixed states of one system to mixed states of another, and reinterpret it as a state of the joint system that has these two systems as parts. Stripping the Wikipedia entry to something like its category-theoretic essence, it says we can turn a map

$$ H_1 \otimes H_1^* \to H_2 \otimes H_2^* $$

into a map

$$ I \to (H_1 \otimes H_2)^* \otimes (H_1 \otimes H_2) $$

where $H_1, H_2$ are objects in any compact closed category (e.g., finite-dimensional Hilbert spaces) and $I$ is the unit object for that category (e.g., $\mathbb{C}$).

Of course we can strip this idea down a bit more: in any compact closed category, there’s a 1-1 correspondence between morphisms

$$ X \to Y $$

and morphisms

$$ I \to Y \otimes X^* $$
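In the category of finite-dimensional Hilbert spaces this correspondence is nothing more than reshaping a matrix into a vector. Here is a minimal sketch, assuming standard bases and NumPy’s row-major ordering; the names `name_of` and `map_of` are made up for illustration.

```python
import numpy as np

def name_of(f):
    """Turn a linear map f: X -> Y (an m x n matrix) into a vector in Y (x) X*,
    by composing the coevaluation I -> X (x) X* with f (x) id."""
    m, n = f.shape
    coev = np.eye(n).reshape(n * n)           # sum_j e_j (x) e^j
    return np.kron(f, np.eye(n)) @ coev       # sum_j f(e_j) (x) e^j

def map_of(v, m, n):
    """Inverse direction: read a vector in Y (x) X* as an m x n matrix."""
    return v.reshape(m, n)

f = np.arange(6.0).reshape(2, 3)              # an arbitrary map from a 3d space to a 2d space
v = name_of(f)
```

The round trip `map_of(name_of(f), 2, 3)` returns `f`, and `v` has the same entries as `f.reshape(-1)`: the ‘state’ is just the matrix entries of the ‘gate’ laid out in a row.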

I’ve seen papers by Bob Coecke where he calls this something like ‘gate-state duality’, turning quantum gates into quantum states.

(I don’t think he used such a catchy rhyme, though. Someone should!)

But anyway, I’m talking about something slightly different. In its simplest guise, it’s just the fact that we use density matrices to describe states, but they can also be thought of as observables. We could try to formulate this mathematically as a relation between morphisms

$$ X \to X $$

and morphisms

$$ I \to X \otimes X^* $$

but I think this somehow misses the conceptual point. Okay, morphisms $X \to X$ are operators, and we think of some of these as ‘observables’, but why should certain morphisms $I \to X \otimes X^*$ correspond to ‘mixed states’ of the system whose pure states are $X$?

Well, they do: in ordinary quantum mechanics, any mixed state of a system $X$ can be gotten by tracing out a pure state of a more complicated system built by combining $X$ with a ghostly ‘double’ or ‘mirror image’, $X^*$. This fact is pretty well-known in the context of ordinary quantum theory, and it probably has its own famous name (I’m forgetting now). But this seems to take us further away from the original puzzle of why some observables are also mixed states.
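Here is a minimal NumPy sketch of that fact, assuming a finite-dimensional system and one standard choice of pure state, namely the square root of the density matrix reshaped into a vector. The function names are illustrative.

```python
import numpy as np

def purify(rho):
    """Return a unit vector psi in X (x) X* whose partial trace over the
    second factor is the density matrix rho."""
    w, U = np.linalg.eigh(rho)
    w = np.clip(w, 0.0, None)                 # guard against tiny negative round-off
    sqrt_rho = (U * np.sqrt(w)) @ U.conj().T  # U diag(sqrt(w)) U^dagger
    return sqrt_rho.reshape(-1)               # psi[i*d + j] = sqrt_rho[i, j]

def trace_out_second(psi, d):
    """Partial trace of |psi><psi| over the second tensor factor."""
    M = psi.reshape(d, d)
    return M @ M.conj().T

rho = np.array([[0.7, 0.2],
                [0.2, 0.3]])                  # an arbitrary mixed state of a qubit
psi = purify(rho)
rho_back = trace_out_second(psi, 2)
```

Here `psi` has norm 1 because `rho` has trace 1, and `rho_back` agrees with `rho` up to round-off.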

Anyway, that’s my feeling. And as a clue that I might be right, note that the form of observable-state duality I described in this blog entry holds even in cases when there’s no compact closed category in sight! For example, there’s no monoidal category that the exceptional Jordan algebra lives in, and there are also big problems with trying to tensor quaternionic Hilbert spaces (but more on that later).

Posted by: John Baez on December 3, 2010 6:25 AM

Re: duality in quantum information theory

John Baez wrote:

> in ordinary quantum mechanics, any mixed state of a system $X$ can be gotten by tracing out a pure state of a more complicated system built by combining $X$ with a ghostly ‘double’ or ‘mirror image’, $X^*$. This fact is pretty well-known in the context of ordinary quantum theory, and it probably has its own famous name (I’m forgetting now).

In the game of inventing axiom systems from which one can rederive QM, this is known as the “purification postulate”. See

which is a summary of/editorial on

The Perimeter Institute recently hosted a conference on this general topic, from which video recordings are available on PIRSA.

Posted by: Blake Stacey on August 2, 2011 2:53 PM
