The Zorn Identity
Posted by Tom Leinster
(Note: the Café went down for a few days in early November 2012, and when Jacques got it back up again, some of the comments had been lost. I’ve tried to recreate them manually from my records, but I might have got some of the threading wrong.)
My aim here is to make one simple point:
Zorn’s lemma has almost nothing to do with the axiom of choice.
It hasn’t got much to do with set theory, either.
People often describe Zorn’s lemma as a form of the axiom of choice. If called on to justify that statement, they would probably answer: in the presence of the Zermelo–Frankel axioms for set theory, the axiom of choice implies Zorn’s lemma and vice versa. This is, indeed, a fact.
However, I want to argue that the emphasis is misplaced. Zorn’s lemma is provable entirely without the axiom of choice, until the very end, where choice is used in a simple and transparent way.
Let me make that precise. Here are three theorems about partially ordered sets (posets).
Preliminary definitions: for a poset $X$,
- a chain in $X$ is a subset of $X$ that, with the order inherited from $X$, is totally ordered
- an upper bound for a subset $C \subseteq X$ is an element $x \in X$ such that $c \leq x$ for all $c \in C$
- a maximal element of $X$ is an element $m \in X$ such that the only element of $X$ greater than or equal to $m$ is $m$ itself.
The first of the three theorems is usually called “Zorn’s lemma”.
1. Zorn’s maximal theorem Let $X$ be a poset in which every chain has an upper bound. Then $X$ has a maximal element.
For the second, we need another definition: an inflationary operator on a poset $X$ is a map of sets $s\colon X \to X$ such that $s(x) \geq x$ for all $x \in X$. Note that $s$ need not be order-preserving.
2. Zorn’s fixed point theorem Let $X$ be a poset in which every chain has an upper bound. Then every inflationary operator on $X$ has a fixed point.
In the definition of upper bound for a subset $C \subseteq X$, it is not required that the upper bound belongs to $C$. However, our third theorem says that if we’ve got an upper bound for every chain in a poset, then at least one of them must in fact belong to that chain.
3. Zorn’s chain theorem Let $X$ be a poset. Let $u\colon \{\text{chains in } X\} \to X$ be a function assigning to each chain in $X$ an upper bound. Then $u(C) \in C$ for some chain $C$.
I confess that these names are completely without historical justification. I have no idea which of these statements Zorn knew, or proved. For all I know, Zorn didn’t even prove Zorn’s lemma — that wouldn’t be surprising, given how inaccurate mathematical attributions often are.
My central point, that Zorn’s lemma has almost nothing to do with the axiom of choice, is justified as follows:
- Zorn’s chain theorem is provable without the axiom of choice
- Zorn’s chain theorem trivially implies Zorn’s maximal theorem (a.k.a. Zorn’s lemma).
I won’t show you the choice-free proof of Zorn’s chain theorem, or at least, not today. But I will show you how Zorn’s chain theorem (3) trivially implies Zorn’s maximal theorem (1). In fact, I’ll show you how each of the three theorems trivially implies the others. There’s a direction to it: both implications 3 $\Rightarrow$ 2 $\Rightarrow$ 1 seem to require choice, but the reverse implications don’t.
If you’re feeling too lazy to read the following proofs, it doesn’t really matter. All you need to observe is: they’re very short!
- 1 $\Rightarrow$ 2 An inflationary operator fixes every maximal element.
- 2 $\Rightarrow$ 1 There is an inflationary operator $s$ on $X$ defined by taking $s(x) = x$ whenever $x \in X$ is maximal, and choosing $s(x) \gt x$ for all other $x \in X$. Then $s$ has a fixed point, so $X$ has a maximal element.
- 2 $\Rightarrow$ 3 The restriction of $u$ to one-element chains is an inflationary operator, so has a fixed point.
- 3 $\Rightarrow$ 2 Choose an upper bound $v(C)$ for each chain, and let $s$ be an inflationary operator on $X$. Then whenever $C$ is a chain, $s(v(C))$ is also an upper bound for $C$. Hence $s(v(C)) \in C$ for some chain $C$, so $s(v(C)) \leq v(C)$, so $v(C)$ is a fixed point of $s$.
So, the three theorems are trivially deducible from each other. In contrast, the proof of any one of them is a relatively serious endeavour.
Describing Zorn’s lemma as a form of the axiom of choice is a bit like describing a person in terms of their hairstyle. Occasionally hair matters: wear a mohican and snooty restaurants will refuse you entry, but part your hair at the side and they’ll let you in. Nevertheless, for almost all purposes, you’re the same person. The three theorems above are really one theorem with three different haircuts.
I tried to come up with a similar example from elsewhere in mathematics, but I couldn’t think of one. (Maybe you can.) What we need is a nontrivial theorem that exists in two variants, one of which requires the axiom of choice for its proof and one of which doesn’t. Deducing either variant from the other should be trivial.
Why does any of this matter? That question can only have a personal answer. I’ll give you mine, but it’s quite likely you’ll disagree, and that’s OK — it is, after all, personal.
What it’s not about, for me, is the axiom of choice. I’m not actually enormously interested in figuring out which parts of mathematics use the axiom of choice and which don’t. (I don’t think that’s a pointless endeavour; I’m only stating that it’s not a particular interest of my own at the moment, just as heart surgery and firefighting aren’t.)
What it is about is a continuing effort to understand the shape and character of set theory. Like many readers, I suspect, I took a ZFC-based course on set theory when I was a student. At the time I very much enjoyed it. Later, I came to realize that there was a qualitative difference between theorems I’d learned such as “for any sets $X$ and $Y$, either $X$ is not an element of $Y$ or $Y$ is not an element of $X$” (which is only meaningful in certain formal systems) and those such as “for any sets $X$ and $Y$, either there is an injection from $X$ into $Y$ or there is an injection from $Y$ into $X$” (which just about every mathematician would agree is meaningful). Now, I find myself wanting to understand which of the results typically called “set theory” — like Zorn’s lemma — are really results about order that intrinsically have little to do with sets.
So let’s treat Zorn’s lemma in exactly the same way as we treat, say, the classification theorem for finite abelian groups — no mystique, just a straightforward, run-of-the-mill, result. Let’s not treat it as a mysterious creature poking its head up from the dark set-theoretic cellar of mathematics. Let’s call it what it is: quite simply, a useful piece of order theory.
Re: The Zorn Identity
Zorn’s original paper is easy to find, freely available, short, easy to read, and worth looking at, if only to save yourself from having to say things like “For all I know, Zorn didn’t even prove Zorn’s lemma.” You also get an idea of how he himself thought of it, which is worth something.