Skip to the Main Content

Note:These pages make extensive use of the latest XHTML and CSS Standards. They ought to look great in any standards-compliant modern browser. Unfortunately, they will probably look horrible in older browsers, like Netscape 4.x and IE 4.x. Moreover, many posts use MathML, which is, currently only supported in Mozilla. My best suggestion (and you will thank me when surfing an ever-increasing number of sites on the web which have been crafted to use the new standards) is to upgrade to the latest version of your browser. If that's not possible, consider moving to the Standards-compliant and open-source Mozilla browser.

October 26, 2024

Axiomatic Set Theory 6: Gluing

Posted by Tom Leinster

Previously: Part 5. Next: Part 7.

A category theorist might imagine that a chapter with this title would be about constructing colimits, and they’d be half right.

We did indeed construct the quotient of a set by an equivalence relation, and prove its universal property and the first isomorphism theorem for sets (which is the core of its more famous algebraic cousins). And we did indeed construct coproducts, using a technique that looks much more like the ZFC trick of “{0}×X{1}×Y\{0\} \times X \cup \{1\} \times Y” than it looks like Paré’s sublime construction of colimits in a topos.

But that’s not all! We also did an isomorphism-invariant version of “there is no set of all sets”, proving that for any set-indexed family of sets (X i) iI(X_i)_{i \in I}, there is some set not isomorphic to any of its members X iX_i. And, most excitingly, we used coproducts to prove the Cantor–Bernstein theorem: if you’ve got sets XX and YY and know that there exist injections XYX \leftrightarrows Y, then XX and YY must, in fact, be isomorphic.

Posted at October 26, 2024 2:54 PM UTC

TrackBack URL for this Entry:   https://golem.ph.utexas.edu/cgi-bin/MT-3.0/dxy-tb.fcgi/3572

19 Comments & 0 Trackbacks

Re: Axiomatic Set Theory 6: Gluing

Incidentally, at some point I spent a while reading about the history of the Cantor–Bernstein theorem. It has often been called the “Schröder–Bernstein theorem” (at least in English language sources). But, as I observe in the notes, Schröder’s only contribution was a proof attempt which he himself conceded was wrong.

If I understand and remember correctly, the contributions of the various players were something like this.

  • Cantor proved the result, but used the axiom of choice to do it. He set it as a challenge to prove it without the axiom of choice.

  • Schröder thought he’d proved it without choice, and published the purported proof. The mistake was eventually spotted, and Schröder agreed that his proof was wrong.

  • Dedekind proved it without choice in a letter to Cantor, but didn’t publish it. I think I remember reading that there’s no evidence that Cantor read or understood Dedekind’s proof, and that the relationship between the two of them was sometimes strained.

  • Bernstein was the first to publish a correct proof without choice.

There’s more on the Wikipedia page, which I’m too lazy to read again now.

The notes use my preferred proof, which is probably the less well known of the two I’m aware of. It depends on the Knaster–Tarski fixed point theorem, which states that every order endomorphism of a complete lattice has a fixed point. In the notes, I only state and prove this for the case we need, where the lattice is a power set. This saves me from having to define “complete lattice” (or indeed “lattice”).

The other proof I have in mind is where you trace each element back along the given injections for as long as you can, and split into cases according to whether/how that process terminates. I think that’s the best known proof, but for some reason I find it less appealing.

Although the two proofs feel quite different, if you work through them, you discover that both actually give the same algorithm. What I mean is that given sets and injections i:XYi: X \to Y and j:YXj: Y \to X, both proofs not only prove the existence of a bijection between XX and YY, but construct a bijection—and it turns out that they construct the same bijection.

Posted by: Tom Leinster on October 27, 2024 6:16 PM | Permalink | Reply to this

Re: Axiomatic Set Theory 6: Gluing

Regarding Dedekind, he wrote down a proof of the result without choice while in the later stages of drafting Was sind und was sollen die Zahlen? (in July 1887), but didn’t include it in that monograph. However, he did include a related result which essentially gives it to you, using a very similar method of proof - and it is (I think) the only statement not explicitly proved in that text, and left to the reader. As you say, this was in the middle of the period with a strained relationship with Cantor.

For those who are interested, Arie Hinkis’ book Proofs of the Cantor-Bernstein Theorem digs into the history.

Posted by: David Roberts on October 29, 2024 1:23 AM | Permalink | Reply to this

Re: Axiomatic Set Theory 6: Gluing

Wow, David, that’s an amazingly comprehensive book! What a labour of love. Thanks for linking to it; I had no idea it existed.

It’s also free through the link David gives.

Posted by: Tom Leinster on October 29, 2024 1:32 AM | Permalink | Reply to this

Re: Axiomatic Set Theory 6: Gluing

Ooh the first isomorphism theorem for sets, that any f:XYf:X\to Y factors uniquely as XX/YYX \twoheadrightarrow X\!/\!\sim \xrightarrow{\sim} Y' \hookrightarrow Y. I always teach people this when I need to teach them equivalence relations. Generally I have X,YX,Y send their envoys to parley, who then manage to find common cause.

Posted by: Allen Knutson on October 28, 2024 4:35 AM | Permalink | Reply to this

Re: Axiomatic Set Theory 6: Gluing

I’d like to sit in your class and hear you tell that story about the envoys!

Posted by: Tom Leinster on October 28, 2024 9:14 AM | Permalink | Reply to this

Re: Axiomatic Set Theory 6: Gluing

The more I read of your notes, the more I wonder if it is really a course in category theory or in set theory. You are certainly teaching a lot of category-theoretic techniques and ideas, even if the subject matter is sets.

I suspect that the “Sets and Categories” course I will teach next semester will have rather less category theory than yours (even though it will have the definition of a category!)

I also wonder if the amount of category theory is actually necessary for ETCS and I suspect it mostly is. I realised that while ETCS and ZFC^- are mutually interpretable, they are not biinterpretable and that the difference really shows up. ETCS is more of a categorification of ZFC than a different approach to the same theory. Anyway, it’s very interesting to see this play out!

Posted by: Jonathan Kirby on October 28, 2024 1:13 PM | Permalink | Reply to this

Re: Axiomatic Set Theory 6: Gluing

I don’t deny that there’s a lot of categorical thinking going on in this course on sets (even though I don’t mention categories), just as there may be a lot of ring-theoretic thinking going on in a course on number theory (even if they never mention rings).

I hope the students will find that a good deal of what they learn is useful for other subjects too. It’s not my primary purpose to teach them generally useful stuff, but I do think it’s a substantial fringe benefit of a categorical approach (even a secretly categorical approach) that you learn more broadly applicable concepts than you do in a ZFC-based course.

However, with each week that passes, the course does feel more particular to sets and less about general categorical notions. For example, last week’s proof of Cantor’s theorem on X<P(X)X \lt P(X) and the Cantor–Bernstein theorem are highly specific to sets. General categorical thinking won’t help you much there.

What we’re doing this week (in notes I’ll release soon) is the definition and structure of \mathbb{N}, \mathbb{Z}, \mathbb{Q} and \mathbb{R}. The material on \mathbb{N} inevitably starts off by using the axiom that there exists a natural numbers object. But pretty soon in the development, everything starts looking just about the same as in any other axiomatization of set theory.

The topics for the remaining weeks are well ordered sets, the axiom of choice, and cardinal arithmetic. This will be really very close to the development that you’d see in a ZFC-based course. The main difference is that everything will be isomorphism-invariant — so, no transitive sets etc. But the results are essentially all the same.

I also wonder if the amount of category theory is actually necessary for ETCS and I suspect it mostly is.

I’ve tried to keep the generalities to a minimum. That doesn’t mean I’ve succeeded: I’m not naturally a concise writer, and these notes are a first iteration written under great time pressure. Plus, while I did detailed advance planning of the whole course, I’m not one hundred percent sure which results I have to include in earlier chapters in order to be able to use them in later chapters. So there’s probably some fat that could be trimmed, despite my best efforts.

Posted by: Tom Leinster on October 28, 2024 5:25 PM | Permalink | Reply to this

Re: Axiomatic Set Theory 6: Gluing

Dr. Kirby

You write:

“I also wonder if the amount of category theory is actually necessary for ETCS and I suspect it mostly is.”

There is a reason for this which is obscured in “a battle of set theories” (Dr. Shulman’s material/structural distinction). The debates which had occurred on the FOM mailing list between Dr. Friedman and category theorists had been about advocacy over the first-order paradigm — not “sets.” When one reads Lawvere and Rosebrugh, the significant element of category diagrams for reconstruction of “set intuition” are inclusions. Historically, one may compare this with more modern accounts of the mereological part relations.

Where Lawvere and Rosebrugh elevate Dedekind and criticize philosophers, they ignore Russell’s critique of Dedekind relative to Peano. Russell’s observations have been restated in “Set Theory and Its Philosophy” by Michael Potter.

In effect, inclusions approach “individuation” via singleton sets. This is how Zermelo had approached the idea in his 1908 paper (If I recall correctly, it also appears in Padoa’s paper where models for identifying axiom independence is introduced.). For Zermelo, this use of singleton sets appears motivated by a need to accommodate urelements. However, the logicist view of set theory had changed to an algebraic view with Skolem’s criticism of “definite properties” in Zermelo’s paper. First-order advocates will always object to conflation of “an individual” with the singleton having an individual as an element. And, this goes back to Russell’s comparison of Dedekind and Peano.

With Skolem, support for urelements becomes an axiom for a “set” of urelements (everything is a set) unnecessary for mathematics (a theory of pure sets suffices).

If one understands Lawvere’s contribution as a reverse engineering of symbolic logic, then one can have a deep appreciation for how “element” and “member” are interpreted by Lawvere. Along similar lines, where the status of “names” is central to the linguistic paraphrases of symbolic logic, the explanation of “names” in McClarty is equally impressive.

Once one acknowledges that the debate involves individuation and interpretation for the sign of equality, the two papers to compare are Lawvere’s paper on the unity of opposites

https://github.com/mattearnshaw/lawvere/blob/master/pdfs/1996-unity-and-identity-of-opposites-in-calculus-and-physics.pdf

and Russell’s rejection of “differences” on PDF page 7 of his paper “On Denoting,”

https://www.philosophy-index.com/russell/on-denoting/Russell-On_Denoting.pdf

For what this is worth, Russell’s view is useful to the verificationist perspective of homotopy type theory. It is implicit to the sequent axiomatization found in the guest post on identity types sponsored by Dr. Riehl. And there is a particularly unfortunate circumstance that makes this so difficult to sort out — logicism, first-order logic, and Herbrand logic share the same inference rules. So, when one portrays the situation as “a battle of set theories,” it is difficult to figure out who is arguing against what.

Posted by: mls on October 28, 2024 9:19 PM | Permalink | Reply to this

Re: Axiomatic Set Theory 6: Gluing

“The more I read of your notes, the more I wonder if it is really a course in category theory or in set theory.”

It’s a course in set theory. A course on category theory will inevitably talk about functors and adjunctions and natural transformations and general limits/colimits and the Yoneda lemma and those are completely missing from this course.

Posted by: Madeleine Birchfield on October 30, 2024 4:30 PM | Permalink | Reply to this

Re: Axiomatic Set Theory 6: Gluing

A friend and colleague of mine wrote a nice paper following “F. William Lawvere. Diagonal arguments and cartesian closed categories” on diagonalization arguments: https://arxiv.org/abs/math/0305282

Posted by: Ana N Mouse on October 30, 2024 7:18 PM | Permalink | Reply to this

Re: Axiomatic Set Theory 6: Gluing

I like how

A+B{(a,0):aA}{(b,1):bB}A + B \coloneqq \{(a,0) : a \in A\} \cup \{(b,1) : b \in B\}

is a ‘trick’ but

A+B{({a},):aA}{(,{b}):bB}A + B \coloneqq \{(\{a\},\varnothing) : a \in A\} \cup \{(\varnothing,\{b\}) : b \in B\}

is somehow ‘sublime.’ You have the rhetorical subtlety of an elephant, Tom.

Posted by: James Hanson on December 14, 2024 5:10 AM | Permalink | Reply to this

Re: Axiomatic Set Theory 6: Gluing

Well, it’s “Paré’s … construction of colimits in a topos.” that Tom described as sublime, not the construction of a disjoint sum.

And why are you assuming that the word “trick” here is being used in a deliberately disparaging way? (Tom is not Lawvere, for instance) Do you think ‘Scott’s Trick’ is a belittling term? Getting the construction of disjoint sum as a special case from the monadicity of contravariant powerset is a rather nice thing to know, regardless of one’s preference for foundations.

Posted by: David Roberts on December 14, 2024 2:06 PM | Permalink | Reply to this

Re: Axiomatic Set Theory 6: Gluing

The method I called a trick is more or less the one I chose to use in my course. I chose to use it because it’s the argument I liked best in the context. That doesn’t stop me from also finding Paré’s construction beautiful.

Not everything has to be a war.

I can’t resist summarizing Paré’s argument for those who haven’t seen it before — in which case, you’re in for a treat. The theorem is that every (elementary) topos has finite colimits. Paré wasn’t the first to prove this, but he found the following wonderful proof.

For any topos \mathcal{E}, the contravariant power set functor P: opP: \mathcal{E}^{op} \to \mathcal{E} is self-adjoint on the right. That is, maps XP(Y)X \to P(Y) correspond to maps YP(X)Y \to P(X). This is because P=Ω ()P = \Omega^{(-)}, and so a map XP(Y)X \to P(Y) is the same thing as a map X×YΩX \times Y \to \Omega, which is the same as a map Y×XΩY \times X \to \Omega, which in turn is the same as a map YP(X)Y \to P(X).

Better still, this adjunction is monadic. This can be proved using the monadicity theorem. Hence the functor P: opP: \mathcal{E}^{op} \to \mathcal{E} is monadic. But monadic functors create limits! And since \mathcal{E} has finite limits, op\mathcal{E}^{op} has finite limits too — that is, \mathcal{E} has finite colimits. Isn’t that wonderful?

If you unwind this general proof and specialize to the case of binary coproducts in the topos of sets, then you eventually end up with the following: P(X+Y)P(X + Y) is constructed as P(X)×P(Y)P(X) \times P(Y), and X+YX + Y is constructed as the set of singletons in P(X+Y)P(X + Y). Putting these together gives a construction of X+YX + Y as a subset of P(X)×P(Y)P(X) \times P(Y), namely, the subset in the second display of James’s comment (where XX and YY are called AA and BB). Of course, what people find so elegant about Paré’s argument is the general proof by monadicity that all toposes have all finite colimits, not the particular formula that can be extracted in this special case.

Posted by: Tom Leinster on December 14, 2024 4:18 PM | Permalink | Reply to this

Re: Axiomatic Set Theory 6: Gluing

Sure, I understand that the method you used in your course is closer to the standard coding in ZFC, but the point is that you only ever use these kinds of negatively-tinged words when talking about material set theory. What is it about the ZFC technique that makes it a ‘trick’ exactly? What’s tricky about it? How is it crafty? Who is it fooling? Why is it a ‘trick’ but the thing you used in your class is a ‘technique’? You could have just as easily chosen to write “And we did indeed construct coproducts, using a trick that looks much more like the ZFC technique of {0}×X{1}×Y\{0\}\times X \cup \{1\}\times Y than it looks like Paré’s sublime construction of colimits in a topos.” but you didn’t. I feel like the only thing making it a ‘trick’ is the fact that it’s what people use when they’re doing ZFC. I understand that this really is splitting hairs, that one little piece of colorful wording isn’t really that important, but your rhetoric regarding ZFC has this unsubtle consistent low-level disdain that I find irritating and unacademic.

Of the three constructions of coproducts we’re talking about here, the standard material one is the only one that’s both constructive and predicative. The one you used in your class relies on case splitting that doesn’t work constructively and Paré’s construction relies on the existence of power objects. Obviously this is subjective, but I personally find something a little bit ugly about needing something as big as a power set just to build a coproduct. The material coding is also just more conceptually straightforward. The proof that it works is literally one or two sentences long if you’ve already established basic properties of ordered pairs. Honestly, I would argue it’s the least ‘tricky’ of the three.

Posted by: James Hanson on December 15, 2024 3:02 AM | Permalink | Reply to this

Re: Axiomatic Set Theory 6: Gluing

I really don’t want to spend time arguing about a single word I used in an off-the-cuff blog post two months ago. “Trick” isn’t necessarily pejorative; I’m thinking here of the Tricki project of Tim Gowers and collaborators (a compendium of their favourite tricks), and the critical role played by the tensor power trick in my last book, and this MO question on “every mathematician has only a few tricks”, and, as David mentioned, Scott’s trick. There’s a famous quote attributed to Pólya along the lines of “an idea used once is a trick; an idea used twice is a method”; whether one sees something as a trick depends on whether one sees it as fitting into a larger framework, which is quite subjective.

Back to the mathematics. The three constructions we’re talking about do three different things:

  • Paré’s construction gives arbitrary finite colimits in an arbitrary topos.

  • The construction in my notes gives disjoint unions in an ETCS category.

  • The usual ZFC construction gives disjoint unions in a model of ZFC (where, in contrast to the other two settings, taking the union of an arbitrary pair of sets is a well-defined operation).

For anyone reading this who hasn’t looked at the relevant part of my notes (i.e. almost anyone reading this :-) ), what I did was the following.

A disjoint union of sets XX and YY is defined to be a set DD together with injections XDYX \to D \leftarrow Y such that every element of DD is mapped to by either an element of XX or an element of YY, but not both. I showed that XX and YY always have a disjoint union in three steps:

  • First, there’s some set ZZ into which XX and YY both embed, but perhaps not disjointly. (If XX or YY is empty, you can take ZZ to be YY or XX; assuming otherwise, you can take Z=X×YZ = X \times Y.)

  • From this, it follows that XX and YY embed disjointly into 2×Z2 \times Z. (Embed XX into the first copy of ZZ and YY into the second.)

  • Then take the union of the images of these two disjoint embeddings to get a disjoint union DD.

And it’s then proved that disjoint unions are coproducts. Concretely, this means that casewise definitions of functions are legitimate (e.g. define f(x)f(x) in one way if x0x \geq 0 and in another if x<0x \lt 0).

The splitting off of the trivial cases in the first step isn’t ideal, for sure. I could have avoided that by replacing the first two steps by the Paré-style move of embedding XX and YY disjointly into P(X×Y)P(X \times Y). I forget exactly why I preferred not to, but surely for pedagogical reasons, perhaps the reason you mention: using something as big as a power set just to build a coproduct seems like overkill.

One other aspect of Paré’s construction that I enjoy is what happens if you specialize it to coequalizers in SetSet, and in fact specialize even further to quotienting a set XX by an equivalence relation \sim. The recipe for X/X/{\sim} that pops out of Paré’s proof in this case is precisely the same as the standard first-year undergraduate one: it’s the subset of P(X)P(X) consisting of the equivalence classes. So what might seem like an isolated idea, that you can construct the quotient X/X/{\sim} as a set of subsets of XX, appears here as part of a bigger story.

(But again, in my course I didn’t tell that story: I just constructed X/X/{\sim} directly as a set of equivalence classes.)

Having said how much I appreciate Paré’s argument, I do think there’s something a bit odd about the theorem that toposes have finite colimits. It’s a bit like starting with multiplication and exponentiation on \mathbb{N} and constructing addition from them, when the usual approach is the other way round. I guess the Paré-style argument would be like this: given natural numbers xx and yy, somehow show that there’s a unique number x+yx + y such that 2 x+y=2 x2 y2^{x + y} = 2^x \cdot 2^y.

If my memory doesn’t betray me, someone — maybe Todd Trimble — told me that Lawvere and/or Tierney and/or Schanuel contemplated finding an approach to toposes where colimits rather than limits were given the central role. I guess this is related to Lawvere and Schanuel’s work on extensive categories, but as far as I know the full idea has never been brought to fruition.

Posted by: Tom Leinster on December 16, 2024 11:55 PM | Permalink | Reply to this

Re: Axiomatic Set Theory 6: Gluing

I deleted a comment here that used words like “moronic” and “obnoxious”. Disagreement about mathematics is allowed; insults are not.

Online discussions are rarely productive when the temperature is raised. Any further comments should stick to the mathematics.

Posted by: Tom Leinster on December 21, 2024 10:55 AM | Permalink | Reply to this

Re: Axiomatic Set Theory 6: Gluing

In any case, I had a mathematical point, although I am making the point because I am criticizing your rhetoric.

The standard coding of coproducts does fit into a broader framework. It fits into the framework of set-theoretic coding used by people working in material set theories like ZF(C) and KP. When you go deeper into inner model theory, fine structural things like primitive recursive set functions and the Jensen hierarchy become relevant. Calling the standard construction of coproducts a ‘trick’ is like calling computability theorists’ use of pairing functions a ‘trick.’ It’s not a clever thing; it’s a basic technique of the field.

This, in combination with your history of criticism of ZFC and the historical context of ETCS, is why I don’t really find your use of the word ‘trick’ to be affectionate in the same manner as the examples you cited.

Posted by: James Hanson on December 22, 2024 1:44 AM | Permalink | Reply to this

Re: Axiomatic Set Theory 6: Gluing

OK. You entered this conversation to object to the word “trick”. You’ve now made your point loud and clear. That discussion ends here.

Also, I said yesterday that any further comments should stick to the mathematics, and I’ve deleted another comment that didn’t.

People value the nn-Category Café as a place for calm and reasoned discussion. I get that people disagree over set theory in a way that they don’t over, say, ring theory. And I get that people can feel defensive. But we have a history here of hosting productive discussions of different approaches to set theory, involving commenters with a range of disagreeing views, yet maintaining a positive atmosphere and avoiding a combative tone. That’s been really great: all of us can learn from each other when the atmosphere is right.

To keep that productive atmosphere, any further comments that cross the line will simply be deleted without comment.

Posted by: Tom Leinster on December 22, 2024 5:02 PM | Permalink | Reply to this

Post a New Comment