
December 1, 2010

Solèr’s Theorem

Posted by John Baez

David Corfield likes theorems that say what’s special about the real numbers — especially theorems where $\mathbb{R}$ emerges unexpectedly at the end, like a rabbit from a magician’s hat. He enjoys guessing how the rabbit got into the hat!

So, his ears perked up when I mentioned my favorite theorem of this type: Solèr’s Theorem. In 1995, Maria Pia Solèr proved a result with powerful implications for the foundations of quantum mechanics. She starts with what sounds like a vast generalization of the concept of infinite-dimensional Hilbert space: a generalization that replaces the complex numbers with an arbitrary ‘division $\ast$-ring’. But then, she shocks us by showing that this ring must be one of these three:

  • the real numbers $\mathbb{R}$,
  • the complex numbers $\mathbb{C}$,
  • the quaternions $\mathbb{H}$!

These are our old friends, the famous trio: the associative normed division algebras!

Let me tell you what Solèr’s Theorem actually says. I’ll do little more than give the necessary definitions and state the result. I won’t say anything about the proof, and only a bit about the implications for quantum theory. For that, you can read this wonderful paper:

  • Samuel S. Holland, Jr., Orthomodularity in infinite dimensions; a theorem of M. Solèr, Bull. Amer. Math. Soc. 32 (1995), 205–234.

and then:

  • Maria Pia Solèr, Characterization of Hilbert spaces by orthomodular spaces, Comm. Algebra 23 (1995), 219–243.

To state Solèr’s theorem, let’s start by saying what sort of ring her generalized Hilbert spaces will be vector spaces over. We want rings where we can divide, and we want them to have an operation like complex conjugation. A ring is called a division ring if every nonzero element $x$ has an element $x^{-1}$ for which $x x^{-1} = x^{-1} x = 1$. A ring is called a $\ast$-ring if it is equipped with a function $x \mapsto x^*$ for which
$$(x + y)^* = x^* + y^*, \qquad (x y)^* = y^* x^*, \qquad (x^*)^* = x.$$
A division $\ast$-ring is a $\ast$-ring that is also a division ring.

Of course $\mathbb{R}$, $\mathbb{C}$ and $\mathbb{H}$ are examples of division $\ast$-rings, but there are many more — I’ll list a few later.
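Since the quaternions are the one noncommutative member of that trio, here is a quick brute-force sanity check of the division $\ast$-ring axioms for $\mathbb{H}$. The class and 4-tuple representation below are my own illustrative sketch, not anything from the post:

```python
from fractions import Fraction

class Quaternion:
    """a + b i + c j + d k, stored as a 4-tuple."""
    def __init__(self, a, b, c, d):
        self.q = (a, b, c, d)

    def __mul__(self, other):
        # Hamilton's multiplication rules: i^2 = j^2 = k^2 = ijk = -1
        a1, b1, c1, d1 = self.q
        a2, b2, c2, d2 = other.q
        return Quaternion(
            a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2,
        )

    def conj(self):
        # the *-operation: negate the imaginary parts
        a, b, c, d = self.q
        return Quaternion(a, -b, -c, -d)

    def inverse(self):
        # every nonzero quaternion is invertible: x^{-1} = x* / (x x*)
        a, b, c, d = self.q
        n = a*a + b*b + c*c + d*d
        return Quaternion(Fraction(a, n), Fraction(-b, n),
                          Fraction(-c, n), Fraction(-d, n))

x = Quaternion(1, 2, 0, -1)
y = Quaternion(0, 1, 3, 2)

# (xy)* = y* x* -- the reversal matters, since H is noncommutative
assert (x * y).conj().q == (y.conj() * x.conj()).q
# x x^{-1} = 1
assert (x * x.inverse()).q == (1, 0, 0, 0)
```

Using `Fraction` keeps the inverse exact, so the division-ring axiom can be tested with equality rather than floating-point tolerance.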

Now, suppose $\mathbb{K}$ is a division $\ast$-ring. As before, let’s define a $\mathbb{K}$-vector space to be a right $\mathbb{K}$-module. You can use left modules too — it doesn’t really matter. And as before, let’s say a function $T \colon V \to V'$ between $\mathbb{K}$-vector spaces is $\mathbb{K}$-linear if
$$T(v x + w y) = T(v) x + T(w) y$$
for all $v, w \in V$ and $x, y \in \mathbb{K}$.

We define a hermitian form on a $\mathbb{K}$-vector space $V$ to be a map
$$\begin{aligned} V \times V &\to \mathbb{K} \\ (u,v) &\mapsto \langle u, v \rangle \end{aligned}$$
which is $\mathbb{K}$-linear in the second argument and obeys the equation
$$\langle v, u \rangle = \langle u, v \rangle^*$$
for all $u, v \in V$. We say a hermitian form is nondegenerate if
$$\langle u, v \rangle = 0 \;\text{ for all }\; u \in V \quad \iff \quad v = 0.$$

Suppose $H$ is a $\mathbb{K}$-vector space with a nondegenerate hermitian form. Then we can define various concepts familiar from the theory of Hilbert spaces. For example:

  • We say a sequence $e_i \in H$ is orthonormal if $\langle e_i, e_j \rangle = \delta_{ij}$.
  • For any vector subspace $S \subseteq H$, we define the orthogonal complement $S^\perp$ by $S^\perp = \{ v \in H \colon \langle u, v \rangle = 0 \text{ for all } u \in S \}$.
  • We say $S$ is closed if $S^{\perp\perp} = S$. (If $H$ is a real, complex, or quaternionic Hilbert space this is true iff $S$ is closed in the usual topological sense, but here we have no topology!)
  • And finally, we say $H$ is orthomodular if $S + S^\perp = H$ for every closed subspace $S$. (This is automatically true if $H$ is a real, complex, or quaternionic Hilbert space.)
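All four of these definitions become finite checks when $\mathbb{K}$ is a small finite field. The sketch below is my own illustration, not from the post: it takes $\mathbb{K} = \mathbb{F}_3$ with $x^* = x$ and verifies that the span of $(1,1)$ in $\mathbb{F}_3^2$ is closed and satisfies the orthomodularity condition, in contrast to the $\mathbb{F}_2$ example discussed later in the post.

```python
from itertools import product

p, n = 3, 2
vectors = list(product(range(p), repeat=n))   # all of H = F_3^2

def form(u, v):
    # the hermitian form <u,v> = sum_i u_i v_i, with x* = x
    return sum(a * b for a, b in zip(u, v)) % p

def complement(S):
    # S^perp = { v in H : <u,v> = 0 for all u in S }
    return {v for v in vectors if all(form(u, v) == 0 for u in S)}

def sumset(S, T):
    # S + T = { u + v : u in S, v in T }
    return {tuple((a + b) % p for a, b in zip(u, v)) for u in S for v in T}

S = {(0, 0), (1, 1), (2, 2)}          # the span of (1,1) in F_3^2
Sp = complement(S)

assert complement(Sp) == S            # S^{perp perp} = S, so S is closed
assert sumset(S, Sp) == set(vectors)  # S + S^perp = H: orthomodularity holds here
```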

We are now ready to state Solèr’s theorem!

Theorem: Let $\mathbb{K}$ be a division $\ast$-ring, and let $H$ be a $\mathbb{K}$-vector space equipped with an orthomodular hermitian form for which there exists an infinite orthonormal sequence. Then $\mathbb{K}$ is isomorphic to $\mathbb{R}$, $\mathbb{C}$ or $\mathbb{H}$, and $H$ is an infinite-dimensional $\mathbb{K}$-Hilbert space.

Nothing in the assumptions mentions the continuum: the hypotheses are purely algebraic. It therefore seems quite magical that $\mathbb{K}$ is forced to be the real numbers, complex numbers or quaternions.

The orthomodularity condition is the key: this is how the rabbit gets into the hat! It has a long history. In quantum logic, unlike classical logic, ‘and’ may fail to distribute over ‘or’. You can see this by pondering subspaces of Hilbert spaces, or more vividly, by pondering a particle on a line. If you think about it a while, you’ll see that the distributive law fails because of the uncertainty principle.

I’m not sure, but I believe it was John von Neumann who proposed the modular law as a weakened substitute. The distributive law says:

$$p \text{ and } (q \text{ or } r) \;=\; (p \text{ and } q) \text{ or } (p \text{ and } r)$$

while the modular law says “at least this is true if $(p \text{ and } q) = q$”. In other words, the modular law says the distributive law holds when $q$ implies $p$.

You can check that the modular law holds when our propositions are subspaces of a vector space, ‘and’ is the intersection of subspaces (usually called $\cap$), and ‘or’ is the linear span of subspaces (usually called $+$).
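For a small example, the check really is finite. The sketch below (my own illustration, not from the post) enumerates all subspaces of $\mathbb{F}_2^3$, takes ‘and’ to be intersection and ‘or’ to be the span of the union, and confirms the distributive law holds whenever $q \subseteq p$:

```python
from itertools import combinations, product

vectors = list(product((0, 1), repeat=3))

def add(u, v):
    return tuple((a + b) % 2 for a, b in zip(u, v))

def span(gens):
    # close a generating set under addition (the only scalars in F_2 are 0, 1)
    S = {(0, 0, 0)}
    changed = True
    while changed:
        changed = False
        for u in list(S):
            for g in gens:
                w = add(u, g)
                if w not in S:
                    S.add(w)
                    changed = True
    return frozenset(S)

# every subspace of F_2^3 is the span of at most 3 vectors
subspaces = {span(gens) for r in range(4) for gens in combinations(vectors, r)}

def join(A, B):
    return span(list(A | B))      # 'or' = linear span of the union

violations = 0
for P in subspaces:
    for Q in subspaces:
        if not Q <= P:            # the modular law only claims this when Q is inside P
            continue
        for R in subspaces:
            # p and (q or r)  vs  (p and q) or (p and r)
            if P & join(Q, R) != join(P & Q, P & R):
                violations += 1

assert violations == 0
print(len(subspaces), "subspaces checked; modular law never fails")
```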

But in quantum theory, our propositions are closed subspaces of a Hilbert space, ‘and’ is the intersection, and ‘or’ is the closure of the linear span. For finite-dimensional Hilbert spaces, this fine print — the words ‘closed’ and ‘closure’ — doesn’t make any difference. But for infinite-dimensional Hilbert spaces the fine print matters, and the modular law fails.

Puzzle: find a counterexample.

So, we need a further fallback position, and this is called the orthomodular law. This involves the logical operation “not” as well. The orthomodular law says that the distributive law holds:

$$p \text{ and } (q \text{ or } r) = (p \text{ and } q) \text{ or } (p \text{ and } r)$$

if $r$ implies $p$ and also $q = \text{not}(r)$. Unraveling this a bit, it says:

$$p = (p \text{ and } q) \text{ or } (p \text{ and not}(q))$$

When our propositions are closed subspaces of a Hilbert space, “not” is the orthogonal complement. This is also true when our propositions are closed subspaces in a $\mathbb{K}$-vector space $H$ equipped with a hermitian form, where now $\mathbb{K}$ can be any division $\ast$-ring. And I’m pretty sure that in this case, the lattice of closed subspaces is orthomodular if and only if $H$ is orthomodular in Solèr’s sense — namely, for every closed subspace $S \subseteq H$ we have $S + S^\perp = H$.

That would explain her use of the word ‘orthomodular’. But there’s a lot more history behind these ideas. In 1964, Piron had attempted to prove that any orthomodular complex inner product space must be a Hilbert space. His proof had a flaw which Amemiya and Araki fixed a few years later. Their proof also handles the real and quaternionic cases. Eventually these ideas led Solèr to realize that orthomodularity picks out real, complex, and quaternionic Hilbert spaces as special. And her results have implications for other axiomatic approaches to quantum theory: for example, approaches using orthomodular lattices and convex cones. For these, read Holland’s paper.

Some of you were asking about examples of division $\ast$-rings. For starters, you can take any field (a commutative division ring) and set $x^* = x$. So, for example: take the field of rational functions of 7 variables, or the field with 7 elements.

But it’s more fun to find examples where $x^* \ne x$. So, for example: take the field with 7 elements, which does not have a square root of $-1$, then formally adjoin a square root of $-1$, and get a field with 49 elements. This new field has an obvious concept of ‘complex conjugation’, so it becomes a division $\ast$-ring. Or take the field of rational functions on some algebraic variety $X$ that has an ‘involution’, a map $f \colon X \to X$ with $f^2 = 1$, and do the obvious thing.
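Here is that 49-element example made concrete. The representation of the field as pairs $(a, b) = a + b i$ over $\mathbb{F}_7$, with $i^2 = -1$, is my own sketch of convenience; the code exhaustively checks the $\ast$-ring axioms and that every nonzero element is invertible.

```python
from itertools import product

P = 7
# -1 is not a square mod 7, so adjoining i with i^2 = -1 gives a field F_49
assert all(x * x % P != P - 1 for x in range(P))

def mul(x, y):
    a, b = x
    c, d = y
    return ((a * c - b * d) % P, (a * d + b * c) % P)

def add(x, y):
    return ((x[0] + y[0]) % P, (x[1] + y[1]) % P)

def conj(x):
    # 'complex conjugation': (a + b i)* = a - b i
    return (x[0], (-x[1]) % P)

elements = list(product(range(P), repeat=2))

# *-ring axioms: (x+y)* = x* + y*, (xy)* = y* x*, x** = x
for x in elements:
    assert conj(conj(x)) == x
    for y in elements:
        assert conj(add(x, y)) == add(conj(x), conj(y))
        assert conj(mul(x, y)) == mul(conj(y), conj(x))

# division: every nonzero element has an inverse, namely x* / (x x*)
for x in elements:
    if x == (0, 0):
        continue
    norm = mul(x, conj(x))[0]       # x x* = a^2 + b^2, a nonzero scalar in F_7
    inv = tuple(c * pow(norm, P - 2, P) % P for c in conj(x))
    assert mul(x, inv) == (1, 0)
```

The nonzero norm uses the same fact the construction does: if $a^2 + b^2 = 0$ with $b \ne 0$, then $(a/b)^2 = -1$, contradicting that $-1$ is a nonsquare mod 7.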

The examples I’ve given so far are all commutative.

Puzzle: what are some noncommutative division $\ast$-rings?

Of course finite division rings are all commutative, and the only finite-dimensional associative division algebras over the real numbers are our three friends. But I don’t know much about infinite-dimensional noncommutative (but associative) division algebras over the real numbers!

Also, some of you were asking about how the orthomodularity condition rules out finite fields. Let’s look at the simplest example. Remember that $\mathbb{F}_2$, the field with two elements, is what normal mortals call the integers mod two. We can make this field into a division $\ast$-ring with $x^* = x$. Then we can put a hermitian form on the 2-dimensional vector space $H = \mathbb{F}_2^2$ in the obvious way:
$$\langle u, v \rangle = u_1 v_1 + u_2 v_2$$
The vector $(1,1)$ is orthogonal to itself, since $1^2 + 1^2 = 0$. So, if we let $S$ be the subspace of $H$ spanned by this vector, we have
$$S \subseteq S^\perp$$
which seems weird if you’re used to Hilbert spaces. Indeed it’s easy to see
$$S = S^\perp,$$
so $S$ is closed, but we certainly don’t have
$$S + S^\perp = H.$$
So, Solèr’s orthomodularity condition fails!
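The whole counterexample fits in a few lines of brute force (an illustrative sketch, not from the post):

```python
from itertools import product

vectors = list(product((0, 1), repeat=2))   # the four vectors of H = F_2^2

def form(u, v):
    return (u[0] * v[0] + u[1] * v[1]) % 2

S = [(0, 0), (1, 1)]                        # the span of (1,1)
S_perp = [v for v in vectors if all(form(u, v) == 0 for u in S)]

assert set(S_perp) == set(S)                # S is its own orthogonal complement
sum_space = {tuple((a + b) % 2 for a, b in zip(u, v))
             for u in S for v in S_perp}
assert sum_space == set(S)                  # S + S^perp is just S ...
assert sum_space != set(vectors)            # ... not all of H: orthomodularity fails
```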

This example is finite-dimensional, but it’s easy to see that the same problem shows up in some infinite-dimensional cases too.

It would also be fun to look at some other examples, for example infinite-dimensional vector spaces over the rational numbers…

Puzzle: Pick an infinite-dimensional rational vector space with a hermitian form on it, and show it isn’t orthomodular.

Posted at December 1, 2010 2:55 AM UTC


Re: Solèr’s Theorem

A couple of typos:

In the definition of *-ring, “y*x*” should have the second “*” as superscript.

After “And as before, let’s say a function” – “maps” is missing its slash

“And results have” should be “And these results have”?

Posted by: Stuart on December 1, 2010 8:52 AM | Permalink | Reply to this

Re: Solèr’s Theorem

Thanks again! Fixed!

I’m glad you got to it before everyone else…

Posted by: John Baez on December 1, 2010 9:55 AM | Permalink | Reply to this

Re: Solèr’s Theorem

Another typo: I think the definition of the orthogonal complement should end “…for all u in S}”?

Posted by: Michael Baker on December 1, 2010 11:58 AM | Permalink | Reply to this

Re: Solèr’s Theorem

Thanks!

Posted by: John Baez on December 1, 2010 12:14 PM | Permalink | Reply to this

Re: Solèr’s Theorem

Though my maths knowledge has decayed in the past few years, pretty sure you need to change a “v” to a “u” in the non-degeneracy definition ;)

Posted by: Sami on December 2, 2010 11:41 PM | Permalink | Reply to this

Re: Solèr’s Theorem

Just for fun, I changed a vv to a uu. Thanks for catching that typo!

Posted by: John Baez on December 2, 2010 11:51 PM | Permalink | Reply to this

Re: Solèr’s Theorem

Presumably in a division ring we cannot divide by 0. I wonder what the intuitionistic status of Solèr’s Theorem might be. If there are counter-examples, then there must be sheaf toposes in which the theorem fails. But then one wonders what these extra structures will be.

Posted by: Andrej Bauer on December 1, 2010 10:27 AM | Permalink | Reply to this

Re: Solèr’s Theorem

Andrej, is it obvious to you how to handle the condition that there is an infinite orthonormal sequence? As I understand it, there are several notions of finiteness in a topos, all meaning the usual thing in “the” category of sets, but not all equivalent in an arbitrary topos. Do you know which one would be appropriate here? If so, how?

Posted by: Tom Leinster on December 1, 2010 1:17 PM | Permalink | Reply to this

Re: Solèr’s Theorem

The word “sequence” suggests to me that one should just ask for a function $a \colon \mathbb{N} \to H$, such that $\langle a(i), a(j) \rangle = \delta_{ij}$. Note that $\delta_{ij}$ makes sense intuitionistically since equality on $\mathbb{N}$ is decidable.

Posted by: Mike Shulman on December 1, 2010 5:42 PM | Permalink | Reply to this

Re: Solèr’s Theorem

Ah, yes. I somehow didn’t notice the word sequence, despite the fact that I typed it myself.

(What you didn’t see, I hope, is an even stupider comment of mine on this thread. I submitted it, realized five seconds later that it was colossally unintelligent, and frantically logged in to the blog software to delete it before the world could read it. I guess John got a copy by email. But don’t tell anyone else.)

Posted by: Tom Leinster on December 1, 2010 10:59 PM | Permalink | Reply to this

Re: Solèr’s Theorem

On the other hand, with my intuitionist hat on, I am naturally suspicious of the orthomodularity condition, since it is essentially the law of excluded middle for quantum logic! (-:

Posted by: Mike Shulman on December 1, 2010 6:09 PM | Permalink | Reply to this

Re: Solèr’s Theorem

We were talking about Solèr’s theorem over here, but let’s continue the conversation here!

Tim wrote:

My understanding is that we need orthomodularity to be able to say what an orthogonal projection is. The orthogonal projections correspond to answers to yes-no-questions one can ask about a quantum system in a certain quantum state, so orthomodularity means we are able to ask yes-no-questions.

John wrote:

You can define an orthogonal projection on any vector space $E$ with a Hermitian form: it’s an operator $p \colon E \to E$ with $p^2 = p$ and $p^\dagger = p$. You don’t need the vector space to be orthomodular.

My understanding is that Solèr needs orthomodularity in her sense ($M + M^\perp = E$) to make the lattice of closed subspaces of $E$ be orthomodular in the usual sense of quantum logic.

And, of course, to prove her theorem!

Tim wrote:

I meant “we need orthomodularity” in the following sense: Let’s say we would like to play “quantum mechanics in infinite dimensions”, what would the space look like that we need? In this sense, what would we need “orthomodularity” for? Can’t we play “quantum mechanics” without it? (It seems to me to be very much harder to question the other assumptions of the Solèr theorem).

If we don’t assume “orthomodularity”, then there will be an orthogonal projection $p$ - which is an observable! - that does not correspond to a yes-no-question, because there is a state that is neither in the range nor in the orthogonal complement of the range of $p$. Seems hard to make sense of that, at least to me :-) Or am I missing something?

I agree that while we don’t need orthomodularity to be able to say what an orthogonal projection is, orthogonal projections will work in a strange way if we don’t have orthomodularity in Solèr’s sense.

The easiest way to see some of this strangeness is to look at the 2-dimensional vector space $H$ over the field with 2 elements, given the obvious hermitian form described in this blog entry. Then there’s a 1-dimensional subspace $S \subset H$ spanned by the vector $(1,1)$.

Here’s the first curious thing: the orthogonal complement of $S$ is $S$ itself!

Secondly, it seems there’s not an orthogonal projection onto this subspace. After all, any projection $p$ onto this subspace would have a 1-dimensional kernel. And this kernel would have to contain either $(1,0)$ or $(0,1)$, since there are only 3 nonzero vectors in $H$. But neither of those vectors is orthogonal to $(1,1)$! So, $p$ can’t be an orthogonal projection.
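This nonexistence claim can also be checked exhaustively, since there are only 16 linear maps on $\mathbb{F}_2^2$ (one per $2 \times 2$ matrix). The sketch below is my own, not from the thread: it keeps only the idempotents with range $S$ and tests whether any has a kernel orthogonal to $S$.

```python
from itertools import product

def apply(M, v):
    # multiply a 2x2 matrix over F_2 by a vector
    return tuple(sum(M[i][j] * v[j] for j in range(2)) % 2 for i in range(2))

vectors = list(product((0, 1), repeat=2))
S = {(0, 0), (1, 1)}                    # the span of (1,1)

found = []
for entries in product((0, 1), repeat=4):
    M = (entries[0:2], entries[2:4])
    if any(apply(M, apply(M, v)) != apply(M, v) for v in vectors):
        continue                        # not idempotent: p^2 != p
    if {apply(M, v) for v in vectors} != S:
        continue                        # range is not S
    kernel = [v for v in vectors if apply(M, v) == (0, 0)]
    if all((u[0] * v[0] + u[1] * v[1]) % 2 == 0 for u in S for v in kernel):
        found.append(M)                 # would be an orthogonal projection onto S

assert found == []                      # no orthogonal projection onto S exists
```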

Perhaps it would also be nice to ponder an example coming from an infinite-dimensional complex inner product space that’s not a Hilbert space, like the space of complex sequences with only finitely many nonzero terms, thought of as a subspace of $\ell^2$.

Or, an infinite-dimensional inner product space over $\mathbb{Q}$, like the space of rational sequences with only finitely many nonzero terms, thought of as a subspace of $\ell^2$, but now with a $\mathbb{Q}$-valued inner product.

In both these cases we can ask: why is the space not orthomodular, and what funny things happen with orthogonal projections?

I haven’t studied these questions, so if someone wants to, I hope they tell me what they find!

Posted by: John Baez on December 1, 2010 12:07 PM | Permalink | Reply to this

Re: Solèr’s Theorem

A snatched moment. What’s it like not to teach?

So what are the closed subspaces of your last example, the direct sum of countably many copies of $\mathbb{Q}$ with Euclidean inner product? Presumably I’m allowed to pick any subset of $\mathbb{N}$ and require the (finitely many) nonzero coordinates to lie in that subset.

Posted by: David Corfield on December 1, 2010 1:24 PM | Permalink | Reply to this

Re: Solèr’s Theorem

David wrote:

What’s it like not to teach?

Not teaching regular classes was great at first… but now I’m getting sort of lonely, in part because teaching is a big part of my life, in part because my grad students are also far away (and finishing up), and in part because I don’t know many people in Singapore. This is probably why I’m blogging so much these days, and spending so much time interacting with people on the Azimuth Forum.

The main good thing about having so many empty spaces in my life is that I’m free to think about new things… and change my direction in ways that were impossible when I was ‘pinned down’ by a tighter network of people and responsibilities. So I expect to be rather different by the time I get back from Singapore. It’s a bit unnerving, but basically good.

So what are the closed subspaces of your last example, the direct sum of countably many copies of $\mathbb{Q}$ with Euclidean inner product?

I don’t know yet. I think there must be some ‘pathological’ subspaces that are closed in Solèr’s sense, but somehow strange, in ways that take advantage of the infinite-dimensionality.

Posted by: John Baez on December 1, 2010 2:41 PM | Permalink | Reply to this

Re: Solèr’s Theorem

I’ve been thinking about this puzzle on and off for a while now: find a vector subspace
$$S \subset H = \bigoplus_{n=1}^\infty \mathbb{Q}$$
with
$$S^{\perp\perp} = S$$
with respect to the obvious inner product on $H \subset \ell^2$, but
$$S + S^\perp \ne H.$$
I haven’t made much progress; I just want to say I haven’t given up. There are probably clues in Holland’s paper, but I bet I can crack this just by thinking about it. I don’t mind if someone else here does it, though.

If you’re into analysis, you probably like the reals better than the rationals, so it might be good to start with this related question: find a vector subspace
$$S \subset H = \bigoplus_{n=1}^\infty \mathbb{R}$$
with
$$S^{\perp\perp} = S$$
with respect to the obvious inner product on $H \subset \ell^2$, but
$$S + S^\perp \ne H.$$

Posted by: John Baez on December 3, 2010 1:59 AM | Permalink | Reply to this

Re: Solèr’s Theorem

You’d think we’d need some condition on the components of vectors in $S$ which would allow there to be more orthogonal vectors if we were allowed real values than when restricted to rationals. $\sqrt{2}$ is our favourite irrational, so can we think of an $S$ for which an orthogonal vector would exist involving $\sqrt{2}$, were we allowed real components?

Posted by: David Corfield on December 3, 2010 12:28 PM | Permalink | Reply to this

Re: Solèr’s Theorem

On second thoughts, from the second part of your comment, the trouble is already there for the direct sum of countably many copies of the reals. So then the trouble with the rationals as a ground division *-ring is that the hermitian form won’t extend from countable direct sum to countable direct product.

So perhaps the best strategy is to exploit the fact that the direct sum of countably many copies of the reals is not a Hilbert space. And that’s because it’s not a complete metric space.

Posted by: David Corfield on December 3, 2010 1:00 PM | Permalink | Reply to this

Re: Solèr’s Theorem

Looks like Morash has the answer, and earlier

I. Amemiya and H. Araki, ‘A remark on Piron’s paper’, Publ. Res. Inst. Math. Sci. Ser. A 2 (1966/67), 423–427.

Morash constructs suitable subspaces of the direct sum of countably many copies of any division subring of the quaternions.

Posted by: David Corfield on December 3, 2010 2:30 PM | Permalink | Reply to this

Re: Solèr’s Theorem

Now that we’ve seen what goes wrong for the direct product of $\mathbb{Q}$, and for the direct sum version of $\mathbb{R}$ and other division subrings of the quaternions, what’s wrong with Hilbert spaces over $p$-adic fields?

It seems they are not orthomodular according to Sandra Farrier in ‘Convergence of the Cesàro-like Means on p-adic Hilbert Spaces’, Communications in Mathematical Analysis, Volume 1, Number 2, pp. 121–136, 2006. At least orthomodularity “does not always hold in the $p$-adic setting”.

Posted by: David Corfield on December 12, 2010 5:34 PM | Permalink | Reply to this

Re: Soler’s Theorem

Can this approach incorporate Finkelstein’s yet more fundamental ideas, in which the imaginary unit i is an emergent quantity?

Or, maybe a different question, does state-operator duality extend to the conjugation operation?

Posted by: John F on December 1, 2010 3:36 PM | Permalink | Reply to this

Re: Soler’s Theorem

I find Finkelstein’s ideas appealingly radical (he sent me a copy of his book Quantum Relativity, and that’s probably the best place to learn what he’s up to), but they’re sort of beyond the brink of what I can do much with.

Antilinear operators (those with $T(c\psi) = \overline{c}\, T\psi$) play a big, big role in the paper that these blog entries are coming from, but I haven’t thought about ‘operator-state duality’ for these; it’s not clear how that would work.

Posted by: John Baez on December 2, 2010 1:47 AM | Permalink | Reply to this

Re: Solèr’s Theorem

The GNS-construction usually constructs a pre-Hilbert space (and a Hilbert space by completion in the pre-Hilbert space norm) from a normed state over a *-algebra of observables over the complex field. The normed state tells us a unique expected value associated with every operator, and the normed positivity of the state allows us to construct a probability interpretation. There’s enough structure, for example, to construct a characteristic function $\omega(e^{i\lambda \hat{O}})$ associated with an operator $\hat{O}$ and hence to derive the corresponding probability distribution.

If we were attempting to reconstruct the premises of Solèr’s Theorem in this context I suppose we would introduce a *-algebra over a division *-ring, but for a probability interpretation we would also need to introduce at least an ordering of the division *-ring, to allow a concept similar enough to that of positivity of the state to allow a construction of something like a pre-Hilbert space (though perhaps some additional premise would be needed to get so far). The concept of a norm, and hence of completion in the norm, also presumably requires an ordering of the real part of the division *-ring, the elements for which $x = x^*$.

If we introduce a division *-ring $D$ instead of the complex numbers, however, we are effectively saying that the results of experiments are elements of the real part of the division *-ring, not of the real numbers. Because this is a division *-ring, we can add, subtract, multiply, and divide entries in a list of measurement results to produce various “summary statistics” in $D$. This might give us a workable concept of theoretical objects like “expected values”, which could be compared with experimental statistics, but does not give us a concept of probability.

Tim claims that “orthomodularity means we are able to ask yes-no-questions”, but I would say that orthomodularity is connected to whether we can construct a probability interpretation (at least, in the normal context, when operators mutually commute), not whether we can ask yes-no questions. If we have to construct summary statistics only in $D$, we can ask yes-no questions, but we may not be able to construct summary statistics that tell us, for example, what “proportion” of answers were “yes” or “no” (reported as 1 or 0 in $D$, presumably), insofar as there may be no concept of proportion in $D$. I suppose that a concept of proportion, which ordinarily gets us from the integers to the rationals, might be strong enough to get us to part of something like Solèr’s Theorem, and that a judicious introduction of ordering and of a state might be enough to get all the way.

The premises of Solèr’s Theorem seem rather removed from operational Physics, insofar as experimental results are generally taken to be elements of the signed integers, integers, or rationals, and embedded in the reals. For information theory it may be useful to take experimental results to be elements of a division *-ring that is not embeddable in the reals, in which case a probability interpretation may not be possible even though we might be able to construct useful summary statistics. A probability interpretation is not a sine qua non for usefulness, but perhaps we can’t call it quantum theory.

Posted by: Peter Morgan on December 1, 2010 3:42 PM | Permalink | Reply to this

Re: Solèr’s Theorem

An historical aside here is that von Neumann never really cared about orthomodularity. For him the modular law was the real deal, despite the fact that it fails for the lattice of closed subspaces in infinite-dimensional Hilbert space. You can read about this in Redei’s paper Why John von Neumann did not Like the Hilbert Space formalism of quantum mechanics (and what he liked instead). In a dual stance, John Harding has shown that orthomodularity is a naturally occurring structure in many different situations. He put all of this also in a category-theoretic setting.

Posted by: bob on December 2, 2010 2:26 PM | Permalink | Reply to this

Re: Solèr’s Theorem

Interesting! I prefer the orthomodular law to the modular law simply because I can remember it a lot more easily. As Mike Shulman hints, orthomodularity is a ‘relativized version’ of the principle of excluded middle

$$p = (p \text{ and } q) \text{ or } (p \text{ and not}(q))$$

because if you believe $p$ is true, it reduces to “$q$ or $\text{not}(q)$ is true”.

There must be some way of stating modularity that’s more conceptual than the ways I know. Anyone here know one?

From Gian-Carlo Rota’s The many lives of lattice theory:

Modular lattices are lattices that satisfy the following identity, discovered by Dedekind:

$$(c \text{ and } (a \text{ or } b)) \text{ or } b = (c \text{ or } b) \text{ and } (a \text{ or } b).$$

This identity is customarily recast in user-friendlier ways. Examples of modular lattices are lattices of subspaces of vector spaces, lattices of ideals of a ring, lattices of submodules of a module over a ring, and lattices of normal subgroups of a group. For example, in the lattice of subspaces of a vector space the meet of two subspaces is their set-theoretic intersection, and the join of two subspaces is the subspace spanned by the two subspaces. Join and meet of linear varieties in projective space are algebraic renderings of projection and section of synthetic projective geometry. Synthetic projective geometry, relying as it does on axioms of incidence and refusing any appeal to coordinates, is best understood in the language of modular lattices.

But synthetic geometry acquired a bad name after algebraic geometers declared themselves unable to prove their theorems by synthetic methods. The synthetic presentation of geometry has become in the latter half of this century a curiosity, cultivated by Italians and by Professor Coxeter. Modular lattices were dismissed without a hearing as a curious outgrowth of a curiosity.

Garrett [Birkhoff] once described to me his first meeting with von Neumann. After exchanging a few words they quickly got down to their common interest in lattice theory, and von Neumann asked Garrett, “Do you know how many subspaces you get when you take all joins and meets of three subspaces of a vector space in general position?” Garrett immediately answered, “Twenty-eight!”, and their collaboration began at that moment. The free modular lattice with three generators, which indeed has twenty-eight elements, is a beautiful construct that is presently exiled from textbooks in linear algebra. Too bad, because the elements of this lattice explicitly describe all projective invariants of three subspaces.

This “28” should be related to the 12 indecomposable representations of the $D_4$ quiver, but I forget exactly how. Surely it’s related to the fact that the $D_4$ group is $SO(8)$, which has dimension $8 \times 7/2 = 28$.

Rota doesn’t give me a definition of modular lattice that I can remember, though he makes it clear that the examples are important. In fact, he says:

Having argued for modular lattices, let me now argue against them. It turns out that all modular lattices that occur in algebra are endowed with a richer structure. They are lattices of commuting equivalence relations. What are commuting equivalence relations?

Two equivalence relations on a set are said to be independent when every equivalence class of the first meets every equivalence class of the second. This notion of independence originated in information theory and has the following intuitive interpretation. In the problem of searching for an unknown element, an equivalence relation can be viewed as a question whose answer will tell to which equivalence class the unknown element belongs. Two equivalence relations are independent when the answer to either question gives no information on the possible answer to the other question.

Philosophers have gone wild over the mathematical definition of independence. Unfortunately, in mathematics philosophy is permanently condemned to play second fiddle to algebra. The pairs of equivalence relations that occur in algebra are seldom independent; instead, they satisfy a sophisticated variant of independence that has yet to be philosophically understood—they commute.

Two equivalence relations are said to commute when the underlying set may be partitioned into disjoint blocks and the restriction of the pair of equivalence relations to each of these blocks is a pair of independent equivalence relations. In other words, two equivalence relations commute when they are isomorphic to disjoint sums of independent equivalence relations on disjoint sets.

Mme. Dubreil found in her 1939 thesis an elegant characterization of commuting equivalence relations. Two equivalence relations on the same set commute whenever they commute in the sense of composition of relations, hence the name. The lattice of subspaces of a vector space is an example of a lattice that is naturally isomorphic to a lattice of commuting equivalence relations on the underlying vector space viewed as a mere set. Indeed, if $W$ is a subspace of a vector space $V$, one defines an equivalence relation on the set of vectors in $V$ by setting $x \sim y$ whenever $x - y \in W$: Meet and join of subspaces are isomorphic to meet and join of the corresponding equivalence relations in the lattice of all equivalence relations on the set $V$. The lattice of subspaces of a vector space $V$ is isomorphic to a sublattice of the lattice of all equivalence relations on the set $V$, in which any two equivalence relations commute.

Similar mappings into lattices of commuting equivalence relations exist for the lattice of all ideals of a ring and the lattice of all submodules of a module. Mark Haiman has proposed the term “linear lattice” for lattices of commuting equivalence relations.

Schützenberger found an identity satisfied in certain modular lattices that is equivalent to Desargues's theorem. Not long afterwards, Bjarni Jónsson proved that every linear lattice satisfies Schützenberger's identity. At that time the problem arose of characterizing linear lattices by identities. This brings us to two notable theorems Garrett Birkhoff proved in universal algebra.

The first of Birkhoff’s theorems characterizes categories of algebraic systems which can be defined by identities. These are precisely those categories of algebraic systems that are closed under the three operations of products, subalgebras, and homomorphic images. For example, groups and rings can be characterized by identities, but fields cannot, because the product of two fields is not a field. There are algebraic systems which are known to be definable by identities because they have been shown to satisfy the three Birkhoff conditions but for which the actual identities are not known.

The second of Birkhoff's theorems states that a category of algebraic systems is endowed with "free algebras" if and only if it is closed under products and subalgebras. The category of linear lattices is closed under products and sublattices, so that the free linear lattice on any set of generators exists. A thorough study of free linear lattices, revealing their rich structure, was carried out by Gelfand and Ponomarev in a remarkable series of papers. Their results are so stated as to apply both to modular and to linear lattices. The free linear lattice in n generators is intimately related to the ring of invariants of a set of n subspaces in general position in projective space. Gelfand has conjectured that the free linear lattice in four generators is decidable. Recently an explicit set of generators for the ring of invariants of a set of four subspaces in projective space has been given by Howe and Huang; Gelfand's conjecture is the lattice-theoretic analog and is thus probably true.

It is not known whether linear lattices may be characterized by identities. Haiman has proved that linear lattices satisfy most of the classical theorems of projective geometry, such as various generalizations of Desargues’s theorem, and he proved that not even these generalized Desarguian conditions suffice to characterize linear lattices.

The deepest results to date on linear lattices are due to Haiman, who in his thesis developed a proof theory for linear lattices. What does such a proof theory consist of? It is an iterative algorithm performed on a lattice inequality that splits the inequality into subinequalities by a tree-like procedure and eventually establishes that the inequality is true in all linear lattices, or else it automatically provides a counterexample. A proof theoretic algorithm is at least as significant as a decision procedure, since a decision procedure is merely an assurance that the proof theoretic algorithm will eventually stop.

Haiman’s proof theory for linear lattices brings to fruition the program that was set forth in the celebrated paper “The logic of quantum mechanics”, by Birkhoff and von Neumann. This paper argues that modular lattices provide a new logic suited to quantum mechanics. The authors did not know that the modular lattices of quantum mechanics are linear lattices. In light of Haiman’s proof theory, we may now confidently assert that Birkhoff and von Neumann’s logic of quantum mechanics is indeed the long-awaited new “logic” where meet and join are endowed with a logical meaning that is a direct descendant of “and” and “or” of propositional logic.

Of course Birkhoff’s theorems on algebraic structures characterized by identities have now been subsumed by Lawvere’s work on ‘algebraic theories’.

Is it still not known whether linear lattices can be characterized by identities? It would seem easy to decide this using Birkhoff’s theorem. Am I confused?

Posted by: John Baez on December 3, 2010 12:38 AM | Permalink | Reply to this

Re: Solèr’s Theorem

A few remarks in response to John’s comment:

  • The only way I can ever remember the modular law is this: it says that for any two elements a, b, the interval [a ∧ b, a] is isomorphic to [b, a ∨ b]. (Draw a parallelogram: opposite sides are congruent.) To say this in a more precise, structured way: in any lattice, there is an adjunction

    (b ∨ -) : [a ∧ b, a] → [b, a ∨ b]   ⊣   (a ∧ -) : [b, a ∨ b] → [a ∧ b, a]

    and if this adjunction is an equivalence for any a, b, the lattice is by definition modular. This equivalence makes lots of sense in the classical examples like lattices of vector subspaces, lattices of normal subgroups, etc.

  • As someone said on a Math Overflow thread, one gets the impression from Rota that Lattice Theory is the Rodney Dangerfield of mathematical disciplines. There was a fair amount of gossip on that thread and some complaints about Rota's attitude from yours truly. Throughout essays like the one John was quoting, and in Indiscrete Thoughts, I find him exasperatingly coy, sibylline, as if he is the possessor of esoteric secrets. It makes me want to pull the veil off. (Café patron Yemon Choi has a decidedly more charitable take on this epigrammatic quality of Rota's writings.)
  • I wound up doing some mini-research about this linear lattice business for that same Math Overflow query; part of what I found is sketched in this answer. As Rota says, linear lattices are essentially lattices of commuting equivalence relations. When I first saw that phrase, it didn't really resonate with me, but I came to see that the typical sorts of modular lattices that crop up in universal algebra are exactly what he's talking about. It made more sense to me when I began substituting "kernel pair" (a certain type of equivalence relation) for "kernel" (such as a subspace, or normal subgroup).

    So: instead of considering a subspace V of a vector space W, consider the equivalence relation where x ∼_V y if x - y ∈ V. If you have two subspaces V, V′, then we have

    ∃y : (x - y ∈ V) and (y - z ∈ V′)   ⇔   x - z ∈ V + V′

    and this says that the relational composite of equivalence relations is an equivalence relation:

    ∼_V ∘ ∼_{V′} = ∼_{V+V′}

    Since the join of subspaces is a symmetric operation, we see from this that the equivalence relations commute:

    ∼_V ∘ ∼_{V′} = ∼_{V′} ∘ ∼_V

    Similarly for lattices of normal subgroups. In these situations of commuting equivalence relations, the join of two equivalence relations is their relational composite.

  • But then what I really came to see is that all of these algebraic examples of commuting equivalence relations seem to arise as kernel pairs when the equational variety in question is a Mal'cev variety. The definition is deceptively simple: an algebraic theory (say in the sense of Lawvere) is Mal'cev if there is a definable ternary operation t such that t(x, x, y) = y = t(y, x, x). The archetypal example is the theory of vector spaces, where t(x, y, z) = x - y + z, but there are many others, for example groups, and also Heyting algebras, where t(x, y, z) = ((x ⇒ y) ⇒ z) ∧ ((z ⇒ y) ⇒ x). Anyway, here is the relevant result:

    Theorem: If T is a Mal'cev theory and X is a model of T (or a T-algebra), then the lattice of equivalence relations R ⊆ X × X in the category of T-algebras forms a linear lattice (or lattice of commuting equivalence relations).

    The calculations are simple and can be found at the nLab at Mal’cev variety.

  • As Rota remarks, there is a whole slew of identities satisfied by linear lattices or lattices of commuting equivalence relations; the modular law is just the tip of the iceberg. (You can see a proof of the modular law in the Mal’cev variety case at the nLab page just cited.) There is also the famous Desarguesian law which has a lattice-theoretic formulation and which is satisfied in these linear lattices, and again see that nLab page for Freyd’s curious proof. Rota and his collaborators had begun a kind of graphical logic used to establish identities in linear lattices (cited in my Math Overflow answer), but I get the sense that he was skeptical that they would ever completely plumb the depths. He says it is an open problem whether the equational theory of linear lattices is decidable.
  • At the same time, I also got the impression that they hadn't been talking much to categorical logicians. There are some very interesting remarks in Categories, Allegories which strike me as maybe being pertinent to the study of linear lattices, although I am far from sure. The idea is that linear lattices arise as sublattices of Equiv(X) in the allegory of sets, and that the linear lattice identities I know of all seem to have an allegorical formulation (replacing joins R ∨ S by composites R ∘ S) that holds more generally in the allegory of sets. So the set of conditions on allegories which ensure a faithful representation in the allegory of sets would contain all the linear lattice equations.

    As it happens, Freyd and Scedrov also outline a graphical calculus and in fact a decision procedure which would account for all allegorical equations that are true in the allegory of sets. If my guess is right, that would in principle give a decision procedure that accounts for identities in linear lattices. I am wondering whether there is some connection between the graphical calculus of Freyd and Scedrov and the graphical logic for linear lattices devised by Rota et al.
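Coming back to the Mal'cev identities t(x, x, y) = y = t(y, x, x) from a few bullets up: they are easy to check mechanically. A throwaway Python sketch (my own code; the three-element chain Heyting algebra is just a convenient small example):

```python
# Check the Mal'cev identities t(x, x, y) = y = t(y, x, x) in two examples:
# a Z/7-module with t(x, y, z) = x - y + z, and the Heyting algebra on the
# chain 0 < 1 < 2 with t(x, y, z) = ((x => y) => z) /\ ((z => y) => x).
from itertools import product

# Module-style Mal'cev operation
t_vec = lambda x, y, z: (x - y + z) % 7
assert all(t_vec(x, x, y) == y == t_vec(y, x, x)
           for x, y in product(range(7), repeat=2))

# Heyting algebra on the chain 0 < 1 < 2: meet is min, (a => b) = top if a <= b else b
top = 2
imp = lambda a, b: top if a <= b else b
t_heyt = lambda x, y, z: min(imp(imp(x, y), z), imp(imp(z, y), x))
assert all(t_heyt(x, x, y) == y == t_heyt(y, x, x)
           for x, y in product(range(3), repeat=2))
print("Mal'cev identities hold in both examples")
```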

Posted by: Todd Trimble on December 3, 2010 4:00 AM | Permalink | Reply to this

Re: Solèr’s Theorem

What a great, meaty comment!

Todd wrote:

The only way I can ever remember that modular law is this: that for any two elements a, b, the interval [a ∧ b, a] is isomorphic to [b, a ∨ b].

Thanks, that may do the job! If I can remember this, my mathematical instincts should kick in and tell me that ‘is isomorphic to’ should be replaced by ‘has the property that some canonical map is an isomorphism’… and then I can sit down and remember the canonical map.

(In fact, maps going both ways!)

(Draw a parallelogram: opposite sides are congruent.)

Right. Now I’m having a strange idea. There seems to be a mystical relationship between this parallelogram — which lives in a lattice, e.g. the lattice of subspaces of a vector space — and the parallelogram that shows up in my mind when I think about “commuting equivalence relations”.

When I think about commuting equivalence relations, I imagine a vector space V equipped with two subspaces A and B, which give two equivalence relations on V:

v R w   iff   v - w ∈ A

and

v S w   iff   v - w ∈ B

Clearly these equivalence relations commute:

RS = SR

and the way I 'see' this is to imagine the case where V is a plane and A and B are lines through the origin. Then R and S look like foliations of the plane by parallel lines. Checking RS = SR in this example amounts to drawing a parallelogram: if you can get somewhere by going parallel to A and then going parallel to B, you can also get there by going parallel to B and then parallel to A.

(Of course the same idea works in higher dimensions, and it's sort of silly to focus on the 2d example, since then RS and SR are generically the 'always true' relation, but visualizing the 2d case makes it easier to visualize all the higher-dimensional cases.)

So when you said ‘parallelogram’, I instantly thought of this. And there seems to be some relationship between my parallelogram and yours. But it’s sort of misty and vague, which is why I called it ‘mystical’.

Hmm. I guess my parallelogram is all about 0, a ∈ A, b ∈ B, and a + b = b + a ∈ A + B. Yours is all about A ∧ B, A, B, and A ∨ B. So mine is like a 'microcosmic version' of yours: elements instead of sets.

It’s still pretty vague, but I thought I’d mention it.

Throughout essays like the one John was quoting, and in Indiscrete Thoughts, I find [Rota] exasperatingly coy, sibylline, as if he is the possessor of esoteric secrets. It makes me want to pull the veil off.

We’ve already discussed this privately, but now I feel the urge to repeat myself.

I love Rota’s writing because it’s interesting. It shows the glimmering highlights of a subject without all the details. To me, the main problem with most mathematical writing is that it contains so much detail that it makes me struggle to sift through the noise to find the signal: the ‘essence’. Once I see the essence, I can learn the details to the extent that I need to know them. But spotting the essence amid the clutter of details takes a lot of time. It can be fun in a detective-story sort of way — but like actual detective work, it also requires a lot of slogging, the equivalent of going to the local court office and sifting through hundreds of deeds.

So what could be better than a mathematical essay that makes you desperately eager to ‘rip the veil off’? But I think I understand your frustration. Instead of whetting your appetite with the highlights and then satisfying you with the details, he whets your appetite and then sends you packing to fill in the details yourself.

I must admit that I enjoy Rota’s writing because I knew him pretty well. I never took a math course from him when I was a grad student at MIT, but I took his course on Heidegger and Husserl, and it was quite an experience.

It met from 7 to 10 pm on Tuesday nights. He would take a break halfway through and go outside to smoke. I soon hooked up with a group of people who liked philosophy, and afterwards he would go to dinner with all of us, typically to Legal Seafood, where he would have champagne and oysters for dinner. When he sensed that this restaurant was straining my limited budget, he paid the bill himself. “Don’t worry,” he joked, “It’s on my credit card,” as if that made it free.

He was intimidatingly suave for a young math geek like me — dressing with Italian taste, making big bucks doing secret work on the side for Los Alamos, etc. — but he was very friendly, with a great sense of humor. He was in love with ‘style’ and ‘stylishness’, but it was all a game for him: he would often break out laughing after making some nice epigrammatic remark. So when I read his essays, I hear them in his voice, and see the gleam of humor in them.

If you dislike his essays, you’d hate the brief book reviews he used to write for the journal he ran, Advances in Mathematics. They are often just a couple of sentences long. By writing these reviews, he got publishing companies worldwide to send him review copies of math books. Near the end of his life, he rented two apartments in Central Square: one to live in, and one that was full of books in ‘stacks’, just like an actual library. Even the one he lived in was full of books, piled up in stacks of the other sort. Apparently this mania for books was one reason his wife left him.

There are a lot more Rota stories to tell, and a lot more to say about the math in your comment, but I think I’ll quit here for now, and say more later.

Posted by: John Baez on December 4, 2010 2:45 AM | Permalink | Reply to this

Re: Solèr’s Theorem

Looking forward to hearing more reactions!

Regarding Rota:

I love Rota’s writing because it’s interesting.

It is interesting.

It shows the glimmering highlights of a subject without all the details.

If we are talking about his actual mathematics papers and books, then I think it is universally agreed that Rota is a fantastic, masterly writer. A few pages of Klain and Rota’s book should be enough to convince anyone. So there I agree with you, absolutely.

But it’s more the book review or “gossip” type of thing that I’m talking about when I say I find him exasperating. That gossip about Grothendieck that we were talking about over at Math Overflow is a fairly typical example of what I’m referring to.

When I say “pull the veil off”, that’s just a polite way of saying, “call him on his BS”, because in some of these cases I have a strong suspicion that he was way overplaying his hand. This thing about Grothendieck and distributive lattices just seems so thin to me, that I suspect the real reason he cloaks his remarks there in mystery is that there’s actually not much there to begin with!

(If there are any Rota disciples reading this who know what the heck Rota is talking about in this gossip, I would love to hear it. I was hoping to hear from such people at Math Overflow, but didn’t. I did do a little research into the connection between distributive lattices and the Chinese remainder theorem, but it hardly justifies whatever fuss he’s making about algebraic geometers and lattice theory.)

I won’t repeat the story of what I witnessed at a public lecture by Rota, but the upshot is that during the questions period, a philosophy professor did call him on his BS… and Rota had no response! (He did respond, but in the lamest possible way.) I’ll add that it was a very witty and stylish lecture, but when the chips were really down…

I must admit that I enjoy Rota’s writing because I knew him pretty well. I never took a math course from him when I was a grad student at MIT, but I took his course on Heidegger and Husserl, and it was quite an experience.

Well, there I am rather envious of you. I do find his ruminations on phenomenology and mathematics in Indiscrete Thoughts utterly fascinating. In fact, it drove me to check out a book by his pal Robert Sokolowski, Introduction to Phenomenology, which I strongly recommend to anyone who wants to learn what phenomenology is really about. It’s written just the way a mathematician would like: both principled and chock full of really well-chosen illustrative examples. In fact, I think I remember reading in the introduction or foreword that the book was written at Rota’s strong urging.

Posted by: Todd Trimble on December 4, 2010 10:39 AM | Permalink | Reply to this

Re: Solèr’s Theorem

Rota used to review books on philosophy as well as maths. One was called “Contemporary philosophers.” Rota: “When pygmies cast such long shadows, it must be very late in the day.”

Posted by: GMHurley on December 4, 2010 12:56 PM | Permalink | Reply to this

Re: Solèr’s Theorem

Yeah, that one was reprinted in Indiscrete Thoughts. It’s awfully nasty, but I admit it made me laugh.

Posted by: Todd Trimble on December 4, 2010 1:54 PM | Permalink | Reply to this

Re: Solèr’s Theorem

Another nice one (and I am quoting from memory): “If one writes as well as Quine one may forgive him everything, even the fact that he is an analytical philosopher”

Posted by: Gonçalo Marques on December 4, 2010 2:06 PM | Permalink | Reply to this

Re: Solèr’s Theorem

Rota wrote:

When pygmies cast such long shadows, it must be very late in the day.

This is reminiscent of Gell-Mann’s variant on a famous remark of Newton. Newton wrote to Hooke:

If I have seen a little further it is by standing on the shoulders of Giants.

The particle physicist Murray Gell-Mann wrote:

If I have seen farther than others, it is because I am surrounded by dwarves.

Posted by: John Baez on December 7, 2010 12:33 PM | Permalink | Reply to this

Re: Solèr’s Theorem

This reminds me of Karl Kraus’s “even dwarves cast a shadow when the sun of culture is low” (“Wenn die Sonne der Kultur niedrig steht, werfen selbst Zwerge einen Schatten”)

Posted by: Martin Ouwehand on December 7, 2010 2:00 PM | Permalink | Reply to this

Re: Solèr’s Theorem

Ha! That’s probably where Rota got that from. Thanks, Martin.

Posted by: Todd Trimble on December 7, 2010 3:07 PM | Permalink | Reply to this

Re: Solèr’s Theorem

On page xvi of Mark Kac’s autobiography Enigmas of Chance, Kac quotes Erwin Chargaff’s article “Preface to a Grammar of Biology”: “That Pygmies cast giant shadows is proof how late in the day it has become.”

Rota reviewed this book in Advances in Mathematics back in the eighties, so I am almost certain this is where the quote came from, unless he read Chargaff independently of Kac. Every child of math should read this book.

Posted by: john e gray on December 10, 2010 5:44 AM | Permalink | Reply to this

Re: Solèr’s Theorem

Thanks, John! (There is a well-known category theorist with your first and last name, but I’m not sure about the middle initial.)

Posted by: Todd Trimble on December 10, 2010 11:37 AM | Permalink | Reply to this

Re: Solèr’s Theorem

I checked to make sure it wasn’t the same guy. I was sort of hoping it was… that would be pretty interesting.

Posted by: John Baez on December 11, 2010 11:45 AM | Permalink | Reply to this

Re: Solèr’s Theorem

Not me; the only category theorist I know is Andrée Ehresmann, whom I told about your website after it opened. CT is a useful way to think about some aspects of symmetry.

Posted by: john e gray on December 15, 2010 4:44 AM | Permalink | Reply to this

Re: Solèr’s Theorem

I had forgotten that Freyd and Scedrov give a graphical calculus for allegories. How is it related to your string diagram calculus for predicate logic?

Posted by: Mike Shulman on December 6, 2010 4:05 AM | Permalink | Reply to this

Re: Solèr’s Theorem

Mike, I hadn’t thought about it before you brought it up! Having written out most of this comment and now returning here, I see there is definitely a connection to the string diagram calculus, except that the string diagram calculus is richer (more complicated) because it is meant to handle discrete cartesian (locally posetal) bicategories = unitary pretabular allegories, whereas the graphs used by Freyd and Scedrov do not need to account for unitarity or pretabularity. Thus, the calculus they deal with is probably simpler. I need to think about this some more.

In case anyone wants to follow along, the Freyd-Scedrov graphs I was referring to are described starting on page 207 of Categories, Allegories. I may as well run through this now; maybe something will occur to me as I do so. F-S start by considering the category of bipointed directed graphs (directed graphs equipped with distinguished nodes s and t, possibly the same) whose edges are labeled by letters of some alphabet A. The underlying graphs can also be considered cospans

1 --s--> G <--t-- 1

in the finitely cocomplete category of graphs. There are two ways of “composing” graphs: the “series” composition by cospan composition, and “parallel” composition by taking the coproduct in the category of cospans from 1 to 1. Note that the coproduct means: take the disjoint sum, and identify the two ss’s and identify the two tt’s.

(These operations do correspond to string diagram operations; the series composition corresponds to the ordinary vertical composition of string diagrams.)

Next, F-S take the posetal collapse of the monoidal category Cospan(1, 1)^op, so that they get a one-object allegory, where now the parallel composites become products = intersections, and the series composition is the “relational composition”. There is also a reciprocation operation obtained by interchanging s and t.

Finally, they cut down by looking at the suballegory G_A generated by A-labeled graphs with just one edge between distinct nodes s and t. (At this point, I’ll throw out a little exercise: show that it is possible to have s = t in the allegory of such graphs.)

They think of such graphs as those circuits between termini ss and tt that admit a parallel-series decomposition, and this is what gave me the idea that there might be some connection with the graphs considered by Rota and his collaborators for expressing the logic of linear lattices – they too conceive of these graphs in terms of circuits and the two types of composition. That plus the fact that linear lattices form allegories of equivalence relations that embed faithfully in the allegory of sets.
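To make the series/parallel operations concrete, here is a toy Python rendering (my own code, not Freyd and Scedrov's notation) of labeled bipointed graphs as circuits: series composition glues t of the first graph to s of the second, and parallel composition takes a disjoint sum and identifies the two s's and the two t's:

```python
# A bipointed labeled graph is a tuple (nodes, edges, s, t), where edges
# are triples (u, label, v).

def relabel(graph, tag):
    """Tag every node so two copies become disjoint, tracking s and t."""
    nodes, edges, s, t = graph
    f = lambda n: (tag, n)
    return ({f(n) for n in nodes}, {(f(u), a, f(v)) for (u, a, v) in edges}, f(s), f(t))

def series(g1, g2):
    """Cospan composition: identify t of g1 with s of g2."""
    n1, e1, s1, t1 = relabel(g1, 1)
    n2, e2, s2, t2 = relabel(g2, 2)
    glue = lambda n: t1 if n == s2 else n
    return (n1 | {glue(n) for n in n2},
            e1 | {(glue(u), a, glue(v)) for (u, a, v) in e2}, s1, t2)

def parallel(g1, g2):
    """Coproduct of cospans: disjoint sum, then identify the s's and the t's."""
    n1, e1, s1, t1 = relabel(g1, 1)
    n2, e2, s2, t2 = relabel(g2, 2)
    glue = lambda n: s1 if n == s2 else (t1 if n == t2 else n)
    return (n1 | {glue(n) for n in n2},
            e1 | {(glue(u), a, glue(v)) for (u, a, v) in e2}, s1, t1)

edge = lambda a: ({'s', 't'}, {('s', a, 't')}, 's', 't')   # one labeled edge s -> t
circuit = parallel(series(edge('a'), edge('b')), edge('c'))
print(len(circuit[0]), len(circuit[1]))   # 3 nodes, 3 edges
```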

Here is what Freyd and Scedrov say:

  • Define an allegory to be representable if it admits a faithful representation in the allegory of sets. Then G_A is the free one-object representable allegory generated by the alphabet A.

That seems pretty believable to me; there are similar freeness results in string diagram calculus.

The next critical observation they make is this:

  • Equality is decidable in G_A.

(That is, given two ‘circuits’, there is a decision algorithm for detecting whether there are graph morphisms between them running each way. However, the equational theory is not finitely axiomatizable.)

I haven’t gone through all of their arguments yet; they are terse and dense, as usual in that book.

Posted by: Todd Trimble on December 7, 2010 6:30 PM | Permalink | Reply to this

Re: Solèr’s Theorem

Interesting! “Representability” seems at first sight like an unreasonably strong condition; any idea how accurate that reaction is? If E is a regular category, like a topos, then what does “representability” of Rel(E) say about E?

Also, what does “A-labeled graphs” mean? Are we labeling the edges? (I don’t have a copy of C,A with me right now.)

Posted by: Mike Shulman on December 8, 2010 6:36 AM | Permalink | Reply to this

Re: Solèr’s Theorem

Yes, it’s the edges that are being labeled by the alphabet. The alphabet itself is supposed to index a set of relations R_a on a set X, and the nodes can be thought of as generic elements of X, where labeling an edge from node x to node y by a amounts to an assumption x R_a y.

“Representability” seems at first sight like an unreasonably strong condition; any idea how accurate that reaction is? If E is a regular category, like a topos, then what does “representability” of Rel(E) say about E?

To address a possible confusion which would have been my fault: I didn’t mean to assert that the string diagrams for predicate calculus address representability issues; I was only thinking, at the time of writing, that the construction of G_A seems reasonable in light of my experience with string diagrams for predicate calculus.

The string diagram calculus really works best for unitary pretabular allegories (discrete cartesian locally posetal bicategories). But this stops short of full-fledged tabulations, for which one has representation theorems (e.g., any small unitary tabular allegory embeds in a power of Rel(Set)).

While I’m at it, let me try to link to this Google book result for The Logic of Commuting Equivalence Relations by Finberg, Mainetti, and Rota. Now that I’ve had a chance to think on it more, I think I more clearly see the fundamental similarity between what they’re doing and what Freyd and Scedrov are doing.

Some remarks on their paper. They write:

The system of natural deduction for linear lattices that is presented in the present work follows the lines of other natural deduction systems that were developed for the predicate calculus. In its simplest form, it provides a series of mechanical steps whereby, given an inequality P ≤ Q between two lattice polynomials P and Q, either the inequality is proved true for every linear lattice, or else a linear lattice in which the inequality does not hold is automatically constructed.

I am a little confused by this passage, because the way it is written, it seems to suggest that the theory of equations valid in linear lattices is decidable. That is a little bit at odds with what Rota writes in The Many Lives of Lattice Theory (page 5), which leaves the issue more open; maybe what they mean is that a proof-generating program can be written, but it is not known whether it always halts in finitely many steps. (I didn’t see any decidability results in the pages I viewed; from what I saw, it was mostly a series of examples that illustrate the graphical natural deductions.)

On the other hand, Freyd and Scedrov claim that they have a decision procedure for one-object representable allegories, so I think there is cause for optimism here: it looks like a linear lattice is the same thing as a one-object representable allegory in which each element R is reflexive, symmetric, and transitive.

Posted by: Todd Trimble on December 8, 2010 6:33 PM | Permalink | Reply to this

Re: Solèr’s Theorem

Todd wrote:

But then what I really came to see is that all of these algebraic examples of commuting equivalence relations seem to arise as kernel pairs when the equational variety in question is a Mal’cev variety. The definition is deceptively simple: an algebraic theory (say in the sense of Lawvere) is Mal’cev if there is a definable ternary operation t such that t(x, x, y) = y = t(y, x, x). The archetypal example is the theory of vector spaces, where t(x, y, z) = x - y + z, but there are many others, for example groups, and also Heyting algebras, where t(x, y, z) = ((x ⇒ y) ⇒ z) ∧ ((z ⇒ y) ⇒ x). Anyway, here is the relevant result:

Theorem: If T is a Mal’cev theory and X is a model of T (or a T-algebra), then the lattice of equivalence relations R ⊆ X × X in the category of T-algebras forms a linear lattice (or lattice of commuting equivalence relations).

Hey, that’s great! Is this theorem your own? If so, congratulations: you snatched a really notable truth from the jaws of the unknown.

I had heard about Mal’cev varieties but flinched from understanding them, in part because nobody explained them as well as you just did, but also because ternary operations are a bit scary. I’ve only recently started to like them.

Somewhere Rota said something like: the mathematics of the future will explore ternary operations with the thoroughness that present-day mathematics has explored binary operations.

Lately I’ve been learning more about symmetric spaces and their relation to Jordan triple systems and Lie triple systems. Have you thought about those, Todd? They’re both vector spaces equipped with trilinear operations obeying axioms that initially seem obscure and terrifying.

But you get a Lie triple system whenever you have an ordinary Lie algebra with a ℤ/2-grading L = L_0 ⊕ L_1, just by considering the bracket [[-, -], -] : L_1 × L_1 × L_1 → L_1. Conversely, there’s a universal construction that coughs up a Lie algebra with a ℤ/2-grading starting from a Lie triple system. We don’t get an equivalence of categories, just an adjunction… but it’s very nice.
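A quick numerical sanity check of the closure claim (my own sketch, using the standard ℤ/2-grading of gl(2) = gl(1+1) by block shape: L_0 the diagonal matrices, L_1 the off-diagonal ones):

```python
# For off-diagonal 2x2 matrices x, y, z, the bracket [x, y] is diagonal
# (lands in L_0), and [[x, y], z] is off-diagonal again (lands in L_1),
# so L_1 closes under the triple bracket [[.,.],.].
import numpy as np

rng = np.random.default_rng(0)

def off_diag():                        # random element of L_1 (off-diagonal 2x2)
    m = rng.normal(size=(2, 2))
    np.fill_diagonal(m, 0.0)
    return m

bracket = lambda a, b: a @ b - b @ a

x, y, z = off_diag(), off_diag(), off_diag()
triple = bracket(bracket(x, y), z)
print(np.allclose(np.diag(triple), 0.0))   # True: [[x, y], z] lies in L_1
```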

Similarly, you get a Jordan triple system whenever you have a semisimple Lie algebra equipped with a ℤ-grading in which only the three ‘middle’ grades are nonzero: L = L_{-1} ⊕ L_0 ⊕ L_1. I won’t bother explaining how, but since such a thing can be seen as a ℤ/2-graded Lie algebra with extra structure, it’s not surprising that a Jordan triple system can be seen as a Lie triple system with extra structure.

Now, this might seem like an idle digression on the theme of ‘algebraic structures with ternary operations’, and maybe it is… but is it? I’m not sure.

On the one hand, Lie triple systems appear to be related to Mal’cev algebras, and there are some books that mention both Mal’cev algebras and Mal’cev varieties.

But is there any relation between Mal’cev algebras and Mal’cev varieties besides the guy?

It could be just a will-o’-the-wisp rather than a real connection! Does anyone here know?

Posted by: John Baez on December 7, 2010 1:24 PM | Permalink | Reply to this

Re: Solèr’s Theorem

John wrote:

Hey, that’s great! Is this theorem your own? If so, congratulations: you snatched a notable truth from the jaws of the unknown.

If my notes from a universal algebra course given by Peter Johnstone are correct (and I assume they are), this theorem is due to Mal’cev. I guess it’s the reason he invented Mal’cev theories. Actually, the theorem can be improved from an “if” to an “iff”: a theory T is Mal’cev iff for any congruences R and S on any T-algebra X, we have R S = S R.

There are a bunch more equivalent conditions; Peter lists seven in total. For example, if R and S are congruences on a model of T then so is R S. A congruence is what Todd calls an equivalence relation in the category of T-algebras. In other words, a congruence on a T-algebra X is an equivalence relation R on X that, regarded as a subset of X \times X, is a subalgebra. In the theory of groups, congruences correspond to normal subgroups; in the theory of rings, they correspond to ideals.
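For groups, the Mal’cev operation is m(x, y, z) = x y^{-1} z, and it’s easy to check the Mal’cev identities m(x, x, z) = z and m(x, z, z) = x by machine. Here’s a little Python sketch using S_3 (my choice of test group — any group would do):

```python
from itertools import permutations

# The symmetric group S_3 as tuples p, where p[i] is the image of i.
S3 = list(permutations(range(3)))

def compose(p, q):
    """(p o q)[i] = p[q[i]]."""
    return tuple(p[q[i]] for i in range(3))

def inverse(p):
    inv = [0] * 3
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

def m(x, y, z):
    """The Mal'cev operation for groups: m(x, y, z) = x * y^{-1} * z."""
    return compose(compose(x, inverse(y)), z)

# The Mal'cev identities hold for every pair of elements:
assert all(m(x, x, z) == z for x in S3 for z in S3)
assert all(m(x, z, z) == x for x in S3 for z in S3)
print("Mal'cev identities hold in S_3")
```

This is the operation lurking behind the fact that congruences on groups permute: R S = S R is exactly what lets you rewrite a normal subgroup product H K as K H.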

I think this is one of those instances where a mathematical concept has several equivalent definitions, one of which is extremely short and extremely hard to interpret. Since mathematicians love brevity, they tend to quote the short one, to the general mystification of the uninitiated. So even though Todd doesn’t necessarily need congratulating for proving the theorem, he can be congratulated for stating it.

Posted by: Tom Leinster on December 7, 2010 2:01 PM | Permalink | Reply to this

Re: Solèr’s Theorem

Tom already said what I was about to say. Experts in universal algebra know well these facts about Mal’cev theories. One text where you can read more about this (besides the ever-lovin’ nLab), and which you probably still have somewhere, is that same set of summer school notes where you (John) gave lectures in Coimbra on categorification. The other summer schools were by Vaughan Pratt on Chu spaces, and by Maria M. Clementino on algebraic theories – she has some stuff on Mal’cev theories there, including the facts I mentioned.

(I have very fond memories of that 1999 conference.)

I haven’t thought about Jordan triple systems and their relations to symmetric spaces, unfortunately. It sounds like fun, though!

I don’t know of any connection between Mal’cev algebras and Mal’cev varieties, except for the fact that Mal’cev algebras form a Mal’cev variety (which is obvious – it’s just because Mal’cev algebras have underlying vector spaces). Perhaps there’s a more exciting connection.

Posted by: Todd Trimble on December 7, 2010 3:28 PM | Permalink | Reply to this

Re: Solèr’s Theorem

Characterizing the lattice of equivalence relations in terms of the existence of particular terms is one of the big triumphs of universal algebra. There are characterizations for most properties you can think of, such as being distributive or modular. Here’s a book chapter on the subject.

You can think of the ternary operation as being the parallelogram definition of vector addition in Euclidean space. If x, y, and z are points in space, then the four points x, y, z, and M(x,y,z) make up a parallelogram. Vector addition is “really” a ternary operation in this sense, except we normally fix a single point 0 to be the origin. If you fix an element to be 0, then you can reconstruct vector addition and subtraction as x + y = M(x, 0, y) and -y = M(0,y,0). (Commutativity and associativity won’t hold in general, of course.)

There is a general notion of what it means for an algebra to be “abelian” in universal algebra. If you specialize it to Mal’cev varieties, then you get a nice result: abelian Mal’cev varieties are exactly modules over a ring (without the choice of distinguished zero) – addition is defined as above.

Posted by: walt on December 8, 2010 9:29 AM | Permalink | Reply to this

Re: Solèr’s Theorem

I’m surprised no one’s mentioned the M_3–N_5 Theorem, which characterizes both modular and distributive lattices.

N_5 is the smallest example of a non-modular lattice: it’s the ‘pentagon’, with elements 0 &lt; a &lt; b &lt; 1 together with a fifth element c comparable only to 0 and 1.

Any sublattice of a modular lattice is modular. Hence if a lattice is modular, it has no sublattice isomorphic to N_5.

The modular half of the M_3–N_5 Theorem states the converse. In other words, N_5 is essentially the only non-modular lattice:

Theorem  A lattice is modular iff it has no sublattice isomorphic to N_5.

You could take that as a definition of modularity.

I might as well state the other half of the theorem, concerning distributivity. Distributivity is a stronger condition than modularity. In particular, N_5 isn’t distributive. There’s also another five-element non-distributive lattice, called M_3: the ‘diamond’, with a bottom element, a top element, and three pairwise incomparable elements in between.

Every sublattice of a distributive lattice is distributive, so no distributive lattice contains either M_3 or N_5 as a sublattice. The other half of the M_3–N_5 Theorem says that this is the only obstruction to distributivity.

Theorem  A lattice is distributive iff it has no sublattice isomorphic to M_3 or N_5.
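Both failures are easy to exhibit by brute force. Here’s a Python sketch that computes meets and joins in N_5 and M_3 directly from the order relation (the element names 0, a, b, c, 1 and p, q, r are my own labels):

```python
def lattice_ops(elems, leq):
    """Meet and join by brute force: greatest lower / least upper bound."""
    def meet(x, y):
        lbs = [z for z in elems if leq(z, x) and leq(z, y)]
        return next(z for z in lbs if all(leq(w, z) for w in lbs))
    def join(x, y):
        ubs = [z for z in elems if leq(x, z) and leq(y, z)]
        return next(z for z in ubs if all(leq(z, w) for w in ubs))
    return meet, join

# N_5, the pentagon: 0 < a < b < 1, and 0 < c < 1 with c incomparable to a, b.
N5 = ['0', 'a', 'b', 'c', '1']
n5 = ({(x, x) for x in N5} | {('0', x) for x in N5}
      | {(x, '1') for x in N5} | {('a', 'b')})
meet, join = lattice_ops(N5, lambda x, y: (x, y) in n5)

# Modular law: a <= b should force a v (c ^ b) = (a v c) ^ b. It fails:
print(join('a', meet('c', 'b')))   # a v (c ^ b) = a v 0 = a
print(meet(join('a', 'c'), 'b'))   # (a v c) ^ b = 1 ^ b = b

# M_3, the diamond: 0 below three incomparable atoms p, q, r, all below 1.
M3 = ['0', 'p', 'q', 'r', '1']
m3 = {(x, x) for x in M3} | {('0', x) for x in M3} | {(x, '1') for x in M3}
meet3, join3 = lattice_ops(M3, lambda x, y: (x, y) in m3)

# Distributivity fails: p ^ (q v r) = p, but (p ^ q) v (p ^ r) = 0.
print(meet3('p', join3('q', 'r')))              # p
print(join3(meet3('p', 'q'), meet3('p', 'r')))  # 0
```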

Posted by: Tom Leinster on December 4, 2010 4:39 AM | Permalink | Reply to this
