Skip to the Main Content

Note:These pages make extensive use of the latest XHTML and CSS Standards. They ought to look great in any standards-compliant modern browser. Unfortunately, they will probably look horrible in older browsers, like Netscape 4.x and IE 4.x. Moreover, many posts use MathML, which is, currently only supported in Mozilla. My best suggestion (and you will thank me when surfing an ever-increasing number of sites on the web which have been crafted to use the new standards) is to upgrade to the latest version of your browser. If that's not possible, consider moving to the Standards-compliant and open-source Mozilla browser.

October 26, 2012

The Zorn Identity

Posted by Tom Leinster

(Note: the Café went down for a few days in early November 2012, and when Jacques got it back up again, some of the comments had been lost. I’ve tried to recreate them manually from my records, but I might have got some of the threading wrong.)

My aim here is to make one simple point:

Zorn’s lemma has almost nothing to do with the axiom of choice.

It hasn’t got much to do with set theory, either.

People often describe Zorn’s lemma as a form of the axiom of choice. If called on to justify that statement, they would probably answer: in the presence of the Zermelo–Frankel axioms for set theory, the axiom of choice implies Zorn’s lemma and vice versa. This is, indeed, a fact.

However, I want to argue that the emphasis is misplaced. Zorn’s lemma is provable entirely without the axiom of choice, until the very end, where choice is used in a simple and transparent way.

Let me make that precise. Here are three theorems about partially ordered sets (posets).

Preliminary definitions: for a poset XX,

  • a chain in XX is a subset of XX that, with the order inherited from XX, is totally ordered
  • an upper bound for a subset CXC \subseteq X is an element xXx \in X such that cxc \leq x for all cCc \in C
  • a maximal element of XX is an element mXm \in X such that the only element of XX greater than or equal to mm is mm itself.

The first of the three theorems is usually called “Zorn’s lemma”.

1. Zorn’s maximal theorem  Let XX be a poset in which every chain has an upper bound. Then XX has a maximal element.

For the second, we need another definition: an inflationary operator on a poset XX is a map of sets s:XXs\colon X \to X such that s(x)xs(x) \geq x for all xXx \in X. Note that ss need not be order-preserving.

2. Zorn’s fixed point theorem  Let XX be a poset in which every chain has an upper bound. Then every inflationary operator on XX has a fixed point.

In the definition of upper bound for a subset CXC \subseteq X, it is not required that the upper bound belongs to CC. However, our third theorem says that if we’ve got an upper bound for every chain in a poset, then at least one of them must in fact belong to that chain.

3. Zorn’s chain theorem  Let XX be a poset. Let u:{chains in  X}Xu\colon \{\text{chains in }  X\} \to X be a function assigning to each chain in XX an upper bound. Then u(C)Cu(C) \in C for some chain CC.

I confess that these names are completely without historical justification. I have no idea which of these statements Zorn knew, or proved. For all I know, Zorn didn’t even prove Zorn’s lemma — that wouldn’t be surprising, given how inaccurate mathematical attributions often are.

My central point, that Zorn’s lemma has almost nothing to do with the axiom of choice, is justified as follows:

  • Zorn’s chain theorem is provable without the axiom of choice
  • Zorn’s chain theorem trivially implies Zorn’s maximal theorem (a.k.a. Zorn’s lemma).

I won’t show you the choice-free proof of Zorn’s chain theorem, or at least, not today. But I will show you how Zorn’s chain theorem (3) trivially implies Zorn’s maximal theorem (1). In fact, I’ll show you how each of the three theorems trivially implies the others. There’s a direction to it: both implications 3 \Rightarrow 2 \Rightarrow 1 seem to require choice, but the reverse implications don’t.

If you’re feeling too lazy to read the following proofs, it doesn’t really matter. All you need to observe is: they’re very short!

  • 1 \Rightarrow 2  An inflationary operator fixes every maximal element.
  • 2 \Rightarrow 1  There is an inflationary operator ss on XX defined by taking s(x)=xs(x) = x whenever xXx \in X is maximal, and choosing s(x)>xs(x) \gt x for all other xXx \in X. Then ss has a fixed point, so XX has a maximal element.
  • 2 \Rightarrow 3  The restriction of uu to one-element chains is an inflationary operator, so has a fixed point.
  • 3 \Rightarrow 2  Choose an upper bound v(C)v(C) for each chain, and let ss be an inflationary operator on XX. Then whenever CC is a chain, s(v(C))s(v(C)) is also an upper bound for CC. Hence s(v(C))Cs(v(C)) \in C for some chain CC, so s(v(C))v(C)s(v(C)) \leq v(C), so v(C)v(C) is a fixed point of ss.

So, the three theorems are trivially deducible from each other. In contrast, the proof of any one of them is a relatively serious endeavour.

Describing Zorn’s lemma as a form of the axiom of choice is a bit like describing a person in terms of their hairstyle. Occasionally hair matters: wear a mohican and snooty restaurants will refuse you entry, but part your hair at the side and they’ll let you in. Nevertheless, for almost all purposes, you’re the same person. The three theorems above are really one theorem with three different haircuts.

I tried to come up with a similar example from elsewhere in mathematics, but I couldn’t think of one. (Maybe you can.) What we need is a nontrivial theorem that exists in two variants, one of which requires the axiom of choice for its proof and one of which doesn’t. Deducing either variant from the other should be trivial.

Why does any of this matter? That question can only have a personal answer. I’ll give you mine, but it’s quite likely you’ll disagree, and that’s OK — it is, after all, personal.

What it’s not about, for me, is the axiom of choice. I’m not actually enormously interested in figuring out which parts of mathematics use the axiom of choice and which don’t. (I don’t think that’s a pointless endeavour; I’m only stating that it’s not a particular interest of my own at the moment, just as heart surgery and firefighting aren’t.)

What it is about is a continuing effort to understand the shape and character of set theory. Like many readers, I suspect, I took a ZFC-based course on set theory when I was a student. At the time I very much enjoyed it. Later, I came to realize that there was a qualitative difference between theorems I’d learned such as “for any sets XX and YY, either XX is not an element of YY or YY is not an element of XX” (which is only meaningful in certain formal systems) and those such as “for any sets XX and YY, either there is an injection from XX into YY or there is an injection from YY into XX” (which just about every mathematician would agree is meaningful). Now, I find myself wanting to understand which of the results typically called “set theory” — like Zorn’s lemma — are really results about order that intrinsically have little to do with sets.

So let’s treat Zorn’s lemma in exactly the same way as we treat, say, the classification theorem for finite abelian groups — no mystique, just a straightforward, run-of-the-mill, result. Let’s not treat it as a mysterious creature poking its head up from the dark set-theoretic cellar of mathematics. Let’s call it what it is: quite simply, a useful piece of order theory.

Posted at October 26, 2012 11:33 AM UTC

TrackBack URL for this Entry:   https://golem.ph.utexas.edu/cgi-bin/MT-3.0/dxy-tb.fcgi/2570

38 Comments & 0 Trackbacks

Re: The Zorn Identity

Zorn’s original paper is easy to find, freely available, short, easy to read, and worth looking at, if only to save yourself from having to say things like “For all I know, Zorn didn’t even prove Zorn’s lemma.” You also get an idea of how he himself thought of it, which is worth something.

Posted by: Graham on October 26, 2012 1:28 PM | Permalink | Reply to this

Re: The Zorn Identity

Thanks for the reference, Graham. That seems like a nice paper, and I agree, it’s easy to read.

Two things about it strike me, after just a quick look.

First, what he calls the “maximum principle” isn’t as general as what we now call Zorn’s lemma. It states that given any “closed” set of sets, there is at least one that is maximal with respect to the inclusion ordering. Here “closed” means closed under taking unions of chains. This seems to amount, essentially, to the following statement:

Let XX be a poset in which every chain has a least upper bound. Then XX has a maximal element.

That “least” makes it a weaker statement than what we now call Zorn’s lemma.

Second, he doesn’t prove the maximum principle in this paper. Rather, he takes it as a principle or axiom that one might simply assume. At the bottom of page 669, he writes:

In another paper I shall discuss the relations between MP, the axiom of choice, and the well-ordering theorem. I shall show that they are equivalent if the axiom yielding the set of all subsets of a set is available.

So unless I’m overlooking something, Zorn neither states nor proves Zorn’s lemma in this paper. But maybe he did so elsewhere — do you know?

Posted by: Tom Leinster on October 26, 2012 1:44 PM | Permalink | Reply to this

Re: The Zorn Identity

I’m surprised to learn that apparently the other paper never appeared. The wikipedia page:

Essentially the same formulation (weakened by using arbitrary chains, not just well-ordered) was independently given by Max Zorn in 1935,[2] who proposed it as a new axiom of set theory replacing the well-ordering theorem, exhibited some of its applications in algebra, and promised to show its equivalence with the axiom of choice in another paper, which never appeared.

Huh, you learn something every day.

Posted by: Graham on October 26, 2012 2:53 PM | Permalink | Reply to this

Re: The Zorn Identity

When Zorn was old he was at Indiana University at Bloomington. As some people at Indiana still remember he used to protest when in his presence some seminar speaker would conspicuously mention “Zorn lemma” in his presence, which he claimed was known before him and he did not want to hear it quoted as his result.

Posted by: Zoran Skoda on October 26, 2012 6:21 PM | Permalink | Reply to this

Re: The Zorn Identity

Let ZL be the statement that any poset in which any chain has a least upper bound contains a maximal element, and let ZL+ be the statement that any poset in which any chain has some upper bound contains a maximal element.

Then trivially ZL+ ⇒ ZL. The converse also holds (even constructively so):

Let X be a poset in which any chain has some upper bound. In the poset of chains of X, every chain has a least upper bound (the union of a given chain of chains). Hence there is a maximal chain X₀ by ZL.

By assumption, the chain X₀ has some upper bound x₀. Since X₀ ∪ {x₀} is a chain, maximality of X₀ implies x₀ ∈ X₀.

Now let x be an arbitrary element of X such that x₀ ≤ x. Since X₀ ∪ {x} is a chain, maximality of X₀ again implies x ∈ X₀; hence x ≤ x₀.

(This argument is an adaption of D4.5.14 in the Elephant.)

Posted by: Ingo Blechschmidt on January 5, 2022 3:52 PM | Permalink | Reply to this

Re: The Zorn Identity

If you can read it, there is This handily-titled paper, which relies on some private communication with Zorn himself.

Posted by: Jesse C. McKeown on October 26, 2012 2:50 PM | Permalink | Reply to this

Re: The Zorn Identity

Ah, excellent! If anyone wants to know who proved Zorn’s lemma first and whether it should be named after Zorn, this is surely the best place to look.

Posted by: Tom Leinster on October 26, 2012 2:56 PM | Permalink | Reply to this

Re: The Zorn Identity

Yes, that’s a nice find!

Posted by: Graham on October 26, 2012 4:01 PM | Permalink | Reply to this

Re: The Zorn Identity

How constructive is the Zorn chain theorem? I.e., is excluded middle used in your proof?

Of course, it would be nice to link this, if possible, with Pataraia’s theorem (which is constructive), as we were beginning to discuss in the other thread. Perhaps such a link is apparent in your proof (that you are withholding from us, darn it!)?

Ah, excellent! If anyone wants to know who proved Zorn’s lemma first and whether it should be named after Zorn, this is surely the best place to look.

Since that article is behind a paywall, maybe someone could tell us the answer to both questions for free?

This looks really nice, Tom. It should be archived somewhere (and you probably know a place I have in mind).

Posted by: Todd Trimble on October 26, 2012 3:25 PM | Permalink | Reply to this

Re: The Zorn Identity

Jumped the gun. It’s not behind a paywall. Fine, I’ll read it then!

Posted by: Todd Trimble on October 26, 2012 3:29 PM | Permalink | Reply to this

Re: The Zorn Identity

Again, I wasn’t teasing! It’s just that from a quickish look at the paper, I couldn’t see a straight answer to either question. My overall impression was “it’s complicated”.

Posted by: Tom Leinster on October 26, 2012 5:23 PM | Permalink | Reply to this

Re: The Zorn Identity

I’d like to say that I’m holding back something exciting, but I’m really not. In particular, despite some effort, I can’t find a way to use a Pataraia-like argument to get a nice proof of Zorn’s lemma.

Here’s a sketch of how one can prove the Zorn chain theorem.

First prove the Bourbaki–Witt fixed point theorem: for a poset YY in which every chain has a least upper bound, every inflationary operator on YY has at least one fixed point.

Now take a poset XX and an operation uu assigning to each chain CC in XX an upper bound u(C)Xu(C) \in X. Let Ch(X)Ch(X) be the set of chains in XX, ordered by inclusion. Note that the union of a chain of chains is again a chain, from which it follows that every chain in Ch(X)Ch(X) has a least upper bound. Also, s:CC{u(C)}s\colon C \mapsto C \cup \{u(C)\} is an inflationary operator on Ch(X)Ch(X). By the Bourbaki–Witt theorem, s(C)=Cs(C) = C for some chain CC, and then u(C)Cu(C) \in C.

The second part is certainly constructive (and is a trick I’m sure you’re familiar with). So Zorn’s chain theorem can be proved constructively if the BW theorem can. Can it? I don’t know. Let’s see. Given an inflationary operator ss on a chain-complete poset YY, we can define an element s α(0)s^\alpha(0) for each ordinal α\alpha by iterating ss and taking joins at limit ordinals. Hartog’s lemma tells us that s α(0)=s β(0)s^\alpha(0) = s^\beta(0) for some ordinals α<β\alpha \lt \beta (though maybe it doesn’t construct such an α\alpha and β\beta for us). Since the sequence is increasing, we have s α(0)=s α+1(0)=s(s α(0))s^\alpha(0) = s^{\alpha + 1}(0) = s(s^\alpha(0)), and we’re done.

So, I don’t know.

Posted by: Tom Leinster on October 26, 2012 5:21 PM | Permalink | Reply to this

Re: The Zorn Identity

Thank you very much! The last page of this paper informs me that the Bourbaki-Witt theorem has no constructive proof, as shown by Andrej Bauer in his paper

  • On the Failure of Fixed Point Theorems for Chain-complete Lattices in the Effective Topos, Electronic Notes in Theoretical Computer Science 249 (2009), p. 157–167

This looks as though it might put the kibosh on any dreams I might have had…

Posted by: Todd Trimble on October 26, 2012 6:03 PM | Permalink | Reply to this

Re: The Zorn Identity

ArXiv link to that paper of Andrej Bauer’s: arXiv:0911.0068.

Posted by: Tom Leinster on October 26, 2012 8:23 PM | Permalink | Reply to this

Re: The Zorn Identity

Bad that it spoils your nice idea, but good that you found a definitive answer. Yay Andrej!

(I’m sure I saw Andrej’s theorem mentioned somewhere else on the web recently. It was probably here, MathOverflow or the nforum.)

I’m still trying to see which part of my sketched proof of Bourbaki-Witt isn’t constructive. In my previous comment I wrote that maybe Hartog’s lemma “doesn’t construct such an α\alpha and β\beta”. But we could always just choose the least such α\alpha, then the least such β\beta. Unless, that is, basic facts about ordinals go wrong in the absence of excluded middle.

Maybe the culprit is “ordinal comparability”, i.e. that given two well-ordered sets, either the first embeds as a downwards-closed subset of the second or vice versa. That “or” could lead to trouble. (It may also be that my grip on the meaning of “constructive” is shakier than I’d like.)

Posted by: Tom Leinster on October 26, 2012 8:21 PM | Permalink | Reply to this

Re: The Zorn Identity

Exactly as you say, the trouble is with the ordinals. As usual, classically equivalent definitions become non-equivalent constructively, and each version suffers some problems. In particular, once you take a definition of ordinal strong enough that you can iterate along them as in the proof of Bourbaki-Witt (trichotomous ordinals), it turns out there may not be very many of them. Andrej and I have written a successor paper to the one linked above, On the Bourbaki-Witt principle in toposes, where we show (among other things) that Bourbaki-Witt fails exactly when the trichotomous ordinals form a set.

Paul Taylor’s paper Intuitionistic Sets and Ordinals gives a good analysis of the range of possible definitions.

Posted by: Peter LeFanu Lumsdaine on October 27, 2012 8:52 PM | Permalink | Reply to this

Re: The Zorn Identity

What a lovely paper, you two!

Posted by: Todd Trimble on October 27, 2012 9:50 PM | Permalink | Reply to this

Re: The Zorn Identity

This post seems like a call to return to the dark old days before formalized set theory. Back to the time when results in analysis would be uncertain because different people were using different background principles to reason about sets.

I think reasonably we can say that accepting different formal systems as a notion of set theory involves meaning a different thing by set. If you accept sets that involve ur-elements or otherwise aren’t ZFC sets as sets you simply mean a different thing by set than I do. I personally happen to like Quine’s NF version of set theory but I don’t make the mistake of trying to apply that version of set theory in proofs because that simply isn’t what the mathematical community has settled on for it’s notion of set.

Given that most mathematicians understand set to ultimately refer to an object that satisfies (at least) ZF and you admit that over ZF choice and Zorn’s lemma are equivalent I don’t see what else you could want. You know quite well there is no choice free proof of Zorn’s lemma in ZF so unless you are talking about a different notion of set than the rest of us I don’t see what possible truth there could be in your claim. If you are talking about some other intuitive notion of set, well that’s a great fact about that other notion but it’s false in mathematics as customarily understood.

Posted by: Peter Gerdes on October 29, 2012 9:24 PM | Permalink | Reply to this

Re: The Zorn Identity

Hi Peter.

This post seems like a call to return to the dark old days before formalized set theory.

That wasn’t what was intended. I agree that having formal rules for manipulating sets is highly important. On the other hand, I’m assuming nothing more than naive set theory in these posts, partly because I don’t want to get sucked into a big argument about axiomatics. I said a bit about this at the beginning of my last post.

If you accept sets that involve ur-elements or otherwise aren’t ZFC sets as sets you simply mean a different thing by set than I do.

Could you clarify your meaning here, simply at the grammatical level? I’m afraid I can’t parse this.

Given that most mathematicians understand set to ultimately refer to an object that satisfies (at least) ZF

I don’t agree that they do, but I don’t feel like arguing about this right now.

you admit that over ZF choice and Zorn’s lemma are equivalent I don’t see what else you could want

It’s not a case of “wanting” something. As my post explains, it’s about emphasis, and it’s about the arrangement of the facts.

For example, I think it’s noteworthy that one can prove Zorn’s lemma by first proving a result that doesn’t need choice (theorem 3 above, “Zorn’s chain theorem”) and then performing just two trivial further steps (which, inevitably, do use choice).

I don’t see what possible truth there could be in your claim. If you are talking about some other intuitive notion of set, well that’s a great fact about that other notion but it’s false in mathematics as customarily understood.

Which “claim” are you referring to here? If you mean the one at the beginning of my post, well, most of the rest of the post is devoted to explaining the sense in which it’s true. I’m not sure what you mean by saying it’s “false in mathematics as customarily understood”.

I’m not talking about a different notion of set. As I said at the beginning of the previous post, I’m steering clear of committing myself to a particular axiomatic system (though I am assuming choice). Everything I’ve written in this post and the last can be understood in terms of ordinary naive set theory.

Posted by: Tom Leinster on October 29, 2012 10:37 PM | Permalink | Reply to this

Re: The Zorn Identity

As I understand him, Peter is saying that “set” should be thought of as a technical term defined by mathematical axioms, just as “group” or “vector space” is. We may have an intuitive or everyday notion of “set”, but for mathematical discussion we should leave that aside and just stick to the axiomatic definition. In particular, the definition most commonly accepted amongst mathematicians is that given by ZF, so we should read “set” as “entity defined by the ZF axioms”. If you use “set” to mean something else, you’re just talking about a different thing than most mathematicians. (Likewise, if you kept using the word “group” but allowed that some elements might not have an inverse, you wouldn’t be talking about a group as most mathematicians understand the term.)

Thus, according to Peter, “sets that involve ur-elements” are not really sets, and “[sets that] otherwise aren’t ZFC sets” are not really sets. If you use the term “set” in a way that includes such things, you’re not talking about “sets” but rather talking about a different kind of thing.

Hence: If you accept (sets that involve ur-elements) or ([sets that] otherwise aren’t ZFC sets) as sets [then] you simply mean a different thing by set than I do.

Posted by: Stuart Presnell on October 30, 2012 2:16 PM | Permalink | Reply to this

Re: The Zorn Identity

Got it! Thanks very much for the explanation, and for the parse tree.

My response is as before — I don’t entirely agree with this point of view, but for the purposes of this post, it doesn’t much matter which axioms for sets one uses (within reason). For example, ZFC would do fine.

Posted by: Tom Leinster on October 30, 2012 3:57 PM | Permalink | Reply to this

Re: The Zorn Identity

I would argue that most mathematicians do not actually mean “entity defined by the ZF axioms” when they say “set”, even if they’ve been brainwashed into thinking that they do.

(Originally posted at 12:29:27 on 3 Nov 2012)

Posted by: Mike Shulman on November 11, 2012 11:55 PM | Permalink | Reply to this

Re: The Zorn Identity

Given that most mathematicians understand set to ultimately refer to an object that satisfies (at least) ZF..

Some people disagree with this (including me), as most of mathematics can be formalised in Bounded Zermelo set theory with Choice. By this I mean everything except set theory (pretty much: the study of ZFC and its extensions) and some of the more esoteric stuff cooked up to need the stronger of the ZFC axioms, such as in point-set topology.

And in fact, when people study, for example, low-dimensional manifolds, or number theory, foundations are pretty much irrelevant to their work. What matters are the ‘local axioms’. Otherwise it’s like wondering if the n thn^{th} prime p np_n in my set theory is really the same as the the n thn^{th} prime in your set theory, or whether the fundamental group of this manifold I constructed via a given surgery algorithm on the sphere using my foundations is the same as the fundamental group of the manifold you construct from your sphere using the same algorithm.

And since naive set theory, or BZC if you want to be formal, is a sub-theory of ZFC, mathematicians aren’t proving things which are not provable in ZFC…

But as Tom said, this was not the point of his post.

What I find interesting, to return to topic, is the shifting of the ‘heavy machinery’ of the proof of Zorn’s Lemma away from AC. I know Tim Gowers or Terry Tao (I can’t quite remember which, or where) has written about needing strong machinery to prove strong results, or needing to put in a respectable amount of effort somewhere in the proof (if not relying on an existing edifice) - even if it is just mental effort to arrive at a clever construction.

As far as examples of what Tom is asking for, I’m thinking of statements in constructive topology, where we restate theorems, quite trivially, using things like excluded middle or AC, and then the restated theorem has a proof without AC and/or excluded middle. It is the reformulation that needs AC, much like the case of Zorn’s chain theorem being equivalent to Zorn’s lemma. Usually it is the either the definitions of properties (e.g. compactness) or objects (e.g. locales vs spaces) or even of logical connectives. In Coq \exists means that there is a function as in the hypothesis of Zorn’s chain theorem. So stated in the language of Coq, Zorn’s lemma is actually just the chain theorem!

Posted by: David Roberts on October 30, 2012 12:27 AM | Permalink | Reply to this

Re: The Zorn Identity

Re your last paragraph, David: interesting!

I did also wonder whether Tychonoff’s theorem was a case in point. Tychonoff’s theorem for topological spaces — that a product of a family of compact spaces is compact — does require the axiom of choice, but as I understand it, Tychonoff’s theorem for locales doesn’t. (I think this is a theorem of Peter Johnstone.) What I don’t know is whether Tychonoff for spaces and Tychonoff for locales are trivially inter-deducible, given choice.

Posted by: Tom Leinster on October 30, 2012 12:52 AM | Permalink | Reply to this

Re: The Zorn Identity

Probably the case of compact Hausdorff spaces works (more on that in a moment). At this MO answer a proof is mentioned that uses ultrafilters. The equivalence of the definition of compactness using ultrafilters and the usual (open cover) definition is shown using the boolean prime ideal theorem (which is a weak form of Choice). For Hausdorff spaces the ultrafilter version of Tychonoff then uses no Choice. (This is the substance of a comment by Michael Greinecker.)

You are correct about Johnstone’s proof of Tychonoff for locales, it is mentioned as one of the other answers (by Andrej Bauer) to the MO question I linked to. In particular see Mike Shulman’s comment, which lets us know that we can’t just recover the sober topological space version from spatial locales. We need to look at compact regular locales/spaces (or poss. just Hausdorff, I’m not sure on this).

There’s some relevant discussion on the nLab page.

Posted by: David Roberts on October 30, 2012 2:01 AM | Permalink | Reply to this

Re: The Zorn Identity

In Coq ∃ means that there is a function as in the hypothesis of Zorn’s chain theorem.

I would agree with that if you said “in propositions as types logic”. But Coq’s built-in “Prop” (which is what its standard “exists” refers to) is not quite propositions-as-types.

(In general, it’s probably better to make statements like this about formal systems, like the calculus of constructions, dependent type theory, or PAT logic, rather than about particular computer programs which implement them.)

(Originally posted at 12:46:03 on 3 Nov 2012)

Posted by: Mike Shulman on November 11, 2012 11:56 PM | Permalink | Reply to this

Re: The Zorn Identity

(In general, it’s probably better to make statements like this about formal systems, like the calculus of constructions, dependent type theory, or PAT logic, rather than about particular computer programs which implement them.)

Ah, yes - I agree. I was probably thinking of something like the simplicial set model of univalent foundations or something anyway. Thanks for clarifying.

(Originally posted at 23:21:15 on 4 Nov 2012)

Posted by: David Roberts on November 12, 2012 12:02 AM | Permalink | Reply to this

Re: The Zorn Identity

Thinking about this some more, prompted partly by David’s comment about PAT logic, leads me to the following observations.

If we beta-reduce your proof of “Zorn’s lemma” from “Zorn’s chain theorem”, there are two uses of the axiom of choice. First, you use choice to choose an upper bound v(C)v(C) for each chain. Then you use choice to chose an element strictly greater than every non-maximal element, defining an inflationary operator ss (with s(x)=xs(x)=x for maximal xx). Then you apply the chain theorem to the function Cs(v(C))C\mapsto s(v(C)).

It seems to me that the first use of choice is not as important, because in many applications of Zorn’s lemma we do in fact have such a function. Very often it seems like our poset is some subposet of a power set which is closed under unions of chains (e.g. ideals in a ring), in which case we have the function “take the union”. So not much would be lost if we stated Zorn’s lemma instead as “if XX is a poset together with a function assigning an upper bound to each chain, then XX has a maximal element”. (Indeed, I believe this version would still be equivalent to the axiom of choice.)

Now as David pointed out, in PAT (“propoitions as types”) logic, a statement like “there exists an xXx\in X such that …” is interpreted by the type of all xx equipped with a proof (= inhabitant) of the ellipsis, while a statement like “for all yy, …” is interpreted by a function assigning to each yy a proof (= inhabitant) of the ellipsis. (The latter is always the case in type-theoretic logics, while in general the former might instead have a bracket type or some other modality applied to it.) Thus, if we stated Zorn’s lemma in PAT logic in the usual way as “if XX is a poset in which every chain has an upper bound, then XX has a maximal element”, then it would carry exactly the same content as my alternative version above. (*)

Of course this version would still not be provable in the (constructive) PAT logic. If we assume Bourbaki-Witt (which is at least not ridiculous, constructively, since as Andrej and Peter showed, it holds in all Grothendieck toposes), then the chain theorem would be true, but we would still have that second lingering use of choice. In fact, that use of choice involves also a use of excluded middle: we need to know that every element is either maximal or not, and in the non-maximal case we have to choose an element strictly greater than it.

The assumption that this is possible, for a given XX, means that we have a function assigning to each xXx\in X either an element yy strictly greater than xx, or a proof that xx is maximal. In PAT logic, this is just the interpretation of the statement “for all xx, either there exists a y>xy\gt x, or xx is maximal”. (The PAT interpretation of “or” is simply a disjoint union; in other logic we would have to bracket that too.)

So it sounds like in PAT logic, we have the following true theorem: “If Bourbaki-Witt holds, and if XX is a poset in which chains have upper bounds and in which every element is either increasable or maximal, then XX has a maximal element.”

Of course, this is interesting to compare with the fact that one classical version of the axiom of choice is a true theorem in PAT logic: “if for all xx there exists a yy such that P(x,y)P(x,y), then there is a function ff such that for all xx, P(x,f(x))P(x,f(x)).” I wonder what other classical equivalents of the axiom of choice have variants which are provable in PAT logic? Presumably someone has thought about this…

(*) Actually, it would say a little more: that we have a function assigning to any such XX a maximal element of it. This extra bit of constructivity would hold as long as the chain theorem behaves the same way, which in turn I guess would be true as long as we assume Bourbaki-Witt in the corresponding strong form.

(Originally posted at 12:02:11 on 4 Nov 2012)

Posted by: Mike Shulman on November 11, 2012 11:59 PM | Permalink | Reply to this

Valeria

hi Mike, I’m coming back to this old conversation to ask whether you found out anything more about your question: “I wonder what other classical equivalents of the axiom of choice have variants which are provable in PAT logic? Presumably someone has thought about this…” any news on that?

Posted by: Valeria de Paiva on January 5, 2021 2:27 PM | Permalink | Reply to this

Re: The Zorn Identity

No, I haven’t thought about this any more or heard of any work on it.

Posted by: Mike Shulman on January 5, 2021 8:22 PM | Permalink | Reply to this

Re: The Zorn Identity

Something I think is odd about the analysis is that Zorn’s lemma is quite trivial as an order-theoretic result.

Take the following extremely easy result: Every nonempty finite poset has a maximal element. One can simply prove this by recursively constructing a strictly increasing sequence in the poset if there were no maximal element, which contradicts it being finite.

If one tries to use the same proof to show that every nonempty poset has a maximal element, it fails precisely because there might be some infinite chain. But if we can pick an upper bound of this chain, we can increase the transfinite sequence to a strictly increasing sequence of length omega+1. If there is no maximal element, one continuous this way until one gets a strictly increasing transfinite sequence in the poset that has cardinality larger than the poset, which is impossible.

So Zorn’s lemma is straightforward if one is familiar with transfinite recursion.

Posted by: Michael Greinecker on June 27, 2013 9:19 AM | Permalink | Reply to this

Re: The Zorn Identity

Yes, I agree (though “quite trivial” is of course a value judgement). I guess that’s the most famous proof of Zorn. But I don’t see how it justifies your opening statement that there’s something odd about my analysis. Can you explain what you mean?

Posted by: Tom Leinster on June 27, 2013 1:47 PM | Permalink | Reply to this

Re: The Zorn Identity

What I mean by “odd” (I hope it didn’t come over as offensive) was that I think the essential gap between Zorn’s chain theorem and Zorn’s lemma is not much smaller than the classical proof of Zorn’s lemma.

This is of course an entirely subjective statement. But I think it is still defensible to think of Zorn’s lemma as a form of the axiom of choice, because of the relative size of these gaps.

Posted by: Michael Greinecker on June 27, 2013 3:11 PM | Permalink | Reply to this

Re: The Zorn Identity

Ah, I see what you mean now. (And no worries, no offence taken.)

Evidently we see the relative sizes of the gaps the opposite way round from each other. I think this is explained by what we’re willing to take for granted. I don’t want to take for granted any order theory, including the theory of ordinals (some of which is needed to justify the transfinite argument in the proof you gave). So when measuring the size of a proof of Zorn’s lemma, I would include in that measurement the cost of setting up the necessary theory of ordinals. That certainly makes it much longer than the few lines (included in my post) that it takes to deduce Zorn’s lemma from Zorn’s chain theorem.

If you are willing to take for granted the theory of ordinals/transfinite induction, then yes, I agree that the proof of Zorn’s lemma is not much longer than the deduction of Zorn’s lemma from Zorn’s chain theorem.

Posted by: Tom Leinster on June 28, 2013 9:16 AM | Permalink | Reply to this

Re: The Zorn Identity

By the way, we had a big discussion about the relationship between set theory and order theory here. Maybe that clarifies where I’m coming from.

Posted by: Tom Leinster on June 28, 2013 9:28 AM | Permalink | Reply to this

Re: The Zorn Identity

The really important question is, why are you calling a mohawk a “mohican”?

Is that what they call it in England? In the US, I only hear the “mohican” used in the phrase “the last of the mohicans”, meaning the last of something of its kind—the title of a famous novel. But now I’m guessing Mohican and Mohawk are two names for the same tribe.

For various reasons, “mohican” sounds much more dainty to me than “mohawk”.

Posted by: John Baez on August 12, 2013 4:26 AM | Permalink | Reply to this

Re: The Zorn Identity

I think that the Mohicans and the Mohawks are separate tribes. (This is after a glance at the Wikipedia pages.) The Mohawks were Iroquois, the Mohicans or Mahicans, were Algonquians and so were not part of the Iroquois Confederacy. Apparently they fought with the Mohawks at various times.

Posted by: Tim Porter on August 12, 2013 6:43 AM | Permalink | Reply to this

Re: The Zorn Identity

Re names for the haircut: it’s just a difference of dialect, as you guessed.

Posted by: Tom Leinster on August 12, 2013 8:04 AM | Permalink | Reply to this

Post a New Comment