Large Sets 12
Posted by Tom Leinster
Previously: Part 11. Next: Part 12.5
Today’s topic is replacement. Replacement is not directly about large sets, but it does imply that certain large sets exist.
Even among those who are familiar with and sympathetic to categorical set theory, I think there’s a lingering impression that replacement is somehow borrowed from ZFC. If categorical set theory is supposed to stand on its own two feet, without having to lean on membership-based set theory for conceptual motivation, then perhaps there are those who believe that to supplement ETCS with replacement would be an embarrassing admission of defeat.
I’ll explain why this is a misconception, stating replacement in a way that’s entirely natural from a structural/categorical perspective. The form of replacement I’ll use is due to Colin McLarty, who wrote of it “Our axiom is not a translation from ZF. It is a plain categorical version of Cantor’s idea.”
Let’s begin by thinking about families. In categorical set theory, a family of sets indexed by a set is usually defined as a set together with a function . Think of the fibre as the th member of the family. In brief:
A family over is a function into .
In what follows, it won’t be of the slightest importance whether a family over literally is a function into — whether that’s the actual definition of family — as long as there’s an equivalence between families over and functions into .
Now, there’s another, genuinely different, conception of what a family is. Consider sequences, the case . Since we can form the power set of any set , we might intuitively feel that there should be a sequence of sets
After all, if I give you any natural number (7, say) then you can give me the corresponding member of this sequence (namely, ). So we’ve got a sequence of sets, right? Right?
Well, no. At least, not on the basis of ETCS alone. In order for these sets to form a sequence — meaning an -indexed family — we have to have a set and a function whose th fibre is the -fold iterated power set of . In the notation explained in Part 6, the th fibre is the beth , and itself is . And as shown in Part 6, it is consistent with ETCS that does not exist. In other words, it cannot be proved from ETCS that there is any function into whose fibres are the iterated power sets of (unless, of course, ETCS is inconsistent).
If we don’t like this — if we feel that it’s a shortcoming of ETCS — then we might consider adding to ETCS some kind of principle saying that whenever we have a set and a procedure or algorithm or formula specifying a set for each , then the sets form an -indexed family in the official sense. And that principle is exactly replacement.
The version of replacement I’m going to state here is due to Colin McLarty, from Section 8 of his wonderful paper Exploring categorical structuralism (2004).
In a brief attempt to be a conscientious scholar, let me mention some historical context. McLarty was drawing on a replacement-like reflection principle formulated by Gerhard Osius in 1973. Further back in the history are a reflection principle stated in the long version of Lawvere’s ETCS paper, as well as work of one J. C. Cole that I’ve never managed to lay my hands on. And there’s also at least one more recent categorical version of replacement, given in Section 5 of Mike Shulman’s 2010 paper on stack semantics.
Anyway, it’s McLarty’s version I know best, so it’s McLarty’s version I’m going to explain here. I think it has strong intuitive appeal.
Loosely, replacement says the following.
Axiom scheme of replacement (informal) Suppose we have a set and a first-order formula specifying a set up to isomorphism for each . Then there exists a function into with fibres .
To state it more carefully, I need to recall the way I set up ETCS back in Part 1:
we have some things called “sets”, some things called “functions from to ” for each set and set , and an operation of “composition” on functions, satisfying the following axioms: […]
We’re going to consider first-order formulas in the language of ETCS. Such formulas are built up in the usual way from variable symbols (with specified types), logical connectives such as “or” and “not”, brackets, and so on. Two kinds of quantification are allowed:
quantification over sets;
for given sets and , quantification over functions from to .
One kind of equality is allowed:
- for given sets and , equality between functions from to .
In particular, we can’t express equality between sets. But we can express isomorphism. Let’s recite the definition: “ and are isomorphic if there exist and such that and .” That definition uses only the kinds of quantification and equality that we’ve allowed. So, it’s expressible.
Notice that given a set , we can quantify over the elements of , because of the correspondence between elements of and functions .
(A technical aside: the typing I’ve used to state ETCS is a bit different from McLarty’s, which means that replacement has to look a bit different too. But the difference is purely cosmetic.)
Now we want to formalize the idea of a “function” that assigns a set (determined up to isomorphism) to each element of a set . We’ll do this by considering the relation given by .
So, take a formula whose only free variables are (of type “element of ”) and (of type “set”). Let’s say that is functional on if for every , there exists a set with the following property:
whenever is a set, .
Replacement says this:
Axiom scheme of replacement Let be a set and a functional formula on . Then there exist a set and a function such that for all .
And that’s it. If we have a “family” of sets specified by some formula in the language of ETCS, then it arises as the family of fibres of some function.
Aside I’ve said “axiom scheme” rather than “axiom” only so that logicians won’t think I’m a total heathen. In ordinary mathematical language, replacement is an axiom. But if we’re being careful about the logical structure, it’s an infinite collection of axioms, one for each . We can’t combine them into a single statement, because quantifying over formulas — “” — is not allowed in our formal language.
(Joachim Kock once half-jokingly suggested that we should have a new quantifier, “for each”, for this kind of purpose. “For each ” is different from “for all ”.)
In the introduction to this post, I quoted from the paper by Colin McLarty that introduced replacement in this form. Here he is at more length, from p.48-49 of “Exploring categorical structuralism”, talking about his ETCS version of the axiom scheme of replacement.
Probably few readers think this looks like the ZF axiom scheme. That scheme has no homophonic translation into ETCS as it uses membership so heavily. It has a tree-theoretic translation, since every ZF formula does, but you can see that our axiom does not deal in trees. Our axiom is not a translation from ZF. It is a plain categorical version of Cantor’s idea.
Hellman finds the motivation for the axiom scheme “remains characteristically set theoretic” […] Indeed. Categorical set theory is a theory of sets. Its motives are set theoretic.
We can ask: are there motives for the axiom scheme of replacement apart from the ZF version of the scheme, and more generally apart from Zermelo’s form of membership-based set theory? But that is obvious. Cantor did not have Zermelo’s membership-based conception. […] And replacement is Cantor’s idea. […] Hellman rightly suggests we have no “new motivation” for the axiom. […] Neither did Fraenkel or Skolem. We all have Cantor’s original motive which predates the membership-based ZF version by decades.
The rest of this post is mostly about two things: what replacement can do, and what it can’t do.
What can replacement do?
A first, easy, consequence is the principle of separation or specification (or comprehension, as some people call it for reasons I don’t comprehend). Loosely, this says that given a set and a reasonable property that an element of may or may not have, we can form a subset
of . Interpret “reasonable property ” to mean “formula with one free variable of type ”. Then there is indeed such a subset. For given such a formula , define a formula by
By replacement, there exist a set and a function whose th fibre has one element if is true and none if is false. In other words, is the subset of consisting of the elements satisfying . And that’s what we wanted.
(That proof of separation is from Section 9 of McLarty’s paper.)
Now let’s go back to the motivating example from the start of the post: the wannabe sequence
Assuming replacement, we’ll show that this really is a sequence in the ETCS sense: there is a set with a map whose fibres are the beths .
There’s a first-order formula in natural numbers and sets , such that
One way to see this is to write
then define to mean “there exist a set and a function such that , and for all , and .”
The plan now is to use replacement to conclude that there exist a set and a function such that for all . In order to do so, we have to show that is functional — that is, for all , there is a unique-up-to-iso set satisfying . And at this point we meet a subtlety.
Intuitively, we’d like to say “it’s true by induction”. After all, suppose we have some and we can construct the unique satisfying , namely, . Then we can construct the unique satisfying : it’s . But in order to use induction, we have to know that
is actually a subset of . In other words, there need to exist a set and an injection such that for ,
That’s not obvious! But it’s true, assuming replacement: it’s an instance of separation. And once we have this set , we can use induction to prove that , then use replacement to construct our set over with fibres . So everything goes through fine.
In summary, the subtlety was that to construct the sequence , we had to use replacement twice.
The “total space” of the family — I mean, the domain of the map whose fibres are — is . So, we’ve used replacement to construct . A similar argument shows that assuming replacement, we can construct for every well-ordered set . So all beths exist.
Not only does replacement imply that all beths exist, it implies that beth fixed points exist — unboundedly many of them, in fact. Recall from Part 7 that denotes the initial well-order on a set , and that is said to be a beth fixed point if .
Also in Part 7, I described a procedure for constructing a beth fixed point bigger than any given set . Roughly speaking, what we do is to define a sequence of sets by
and then take its sup, which can be shown to be a beth fixed point.
I say “roughly speaking” because this depends on this sequence actually existing, meaning that there exist a set and a map with for all . (If it does exist then there’s no problem with taking the sup: the sets have as an upper bound, so they must have a least upper bound.) That’s where replacement is needed.
To apply replacement to this problem, we write down a first-order formula in natural numbers and sets such that
using the same kind of manoeuvre with sets over as we used for the sequence . And then a similar argument, using replacement twice, gives us the sequence , hence a beth fixed point larger than .
Once you’ve done a few proofs like this, it becomes clear that they should all be instances of a single theorem. That theorem is the principle of transfinite recursion.
There are several versions of transfinite recursion. To explain the one I’m talking about here, I’ll warm up with recursion over .
Suppose we have some kind of machine that takes as its input a finite sequence of sets, and produces as its output a new set . Then there should be a unique sequence of sets such that
for all . No base step is needed, as the case gives .
“Should” means that it seems intuitively appealing. It doesn’t follow from ETCS. But it does follow from ETCS with replacement.
The two examples above, and , have the special feature that for , the th member of the sequence depends only on the th member — not on all the previous members.
A different kind of special case is where doesn’t depend on the sets themselves, but only on . In that case, what does it to assign a set to each . So transfinite recursion reduces in this case to replacement over the indexing set .
Now that we’ve warmed up with recursion over , let’s step up to an arbitrary well-ordered set . Suppose we have a first-order formula that takes as its input a well-ordered set and a family of sets, and produces as its output a set (determined up to isomorphism). Then there should be a family of sets, unique up to isomorphism, such that
for each .
That phrasing is a bit informal, but I hope it’s clear by now how you’d formalize it. Really is a first-order formula
where the variables denote a well-ordered set , a set , a function from to the underlying set of , and a set . To understand the relationship with the previous paragraph, think of as the family and as . We ask that for each , and , there is a unique-up-to-iso set such that holds.
Stated formally, the principle of transfinite recursion is about such formulas . But I’m not going to write out the formal statement of the theorem, because it would just look like a load of notation and you’d only want to convert it back into the informal version above, the one involving .
Replacement implies the principle of transfinite recursion, by an inductive argument on .
Conversely, the principle of transfinite recursion implies replacement. We already saw this in the case , and it’s true for a general indexing set by a similar argument (after choosing a well-ordering on ). Replacement is essentially the special case of transfinite recursion where depends only on .
(I learned about the converse from a blog post by Joel David Hamkins, based on work of Benjamin Rin.)
So that’s what replacement can do. In summary: it gives us separation and transfinite recursion, and in turn, transfinite recursion gives us the existence of all the beths and of unboundedly many beth fixed points.
What can’t replacement do?
The most important thing it can’t do is prove the existence of inaccessibles:
It is consistent with ETCS + replacement that there are no inaccessible sets.
The proof follows a similar pattern to the proofs of all the similar statements in previous posts, so I’ll skip it.
In the converse direction, I’m pretty sure replacement isn’t implied (over ETCS) by the existence of an inaccessible set, or even by the much stronger condition that there are unboundedly many measurable sets. But I don’t know a proof. Can anyone clarify this point, or at least give a reference? Are there, for instance, theorems on determinacy that can be proved using replacement but aren’t guaranteed by the existence of unboundedly many measurable sets?
Is assuming replacement a good idea?
Sometimes we want to perform operations that are beyond what ETCS provides. At that point, it’s tempting to supplement ETCS with replacement. I don’t have a strong argument against doing so, but it’s a matter of fact that replacement is often way more than necessary.
For example, the first thing that people tend to want beyond ETCS is to be able to form, for a given set , another set containing all of
(For instance, this lets us do an infinitely iterated dual vector space construction.) As I mentioned in Part 6, such a set exists if we assume the existence of all beths. This is a far weaker assumption than replacement.
But occasionally, the existence of all beths might still not be enough. In such cases, it may be enough to assume the existence of a beth fixed point, or, stronger still, unboundedly many beth fixed points. All of these assumptions are still weaker than replacement.
In the hierarchy of assumptions that we’ve been going through in the posts, the next one after “unboundedly many beth fixed points” is the existence of an inaccessible set. Assuming the existence of an inaccessible is (if I’m not mistaken) neither stronger nor weaker than replacement, and it’s something that people often do — for example, in the context of Grothendieck universes. Sometimes, one even wants to assume that unboundedly many inaccessibles exist.
Now although such assumptions about inaccessibles are (I think) incomparable with replacement — neither implies the other — they seem to me to be somehow more specific, more limited, than replacement. I can’t give a precise meaning to that, except to point out again that replacement is an axiom scheme: it gives one axiom for each suitable first-order formula , whereas the existence of inaccessibles is a single axiom.
The lesson I’d draw is that if we’re looking at supplementing ETCS with something like replacement, it’s at least worth asking whether it might be better to use an axiom such as “all beths exist” or “there are unboundedly many inaccessibles” instead.
ETCS, ZFC, and the spectre of inconsistency
I’ve somehow managed to get this far without mentioning the equivalence between ETCS + replacement and ZFC. It’s not just that they have the same consistency strength (one is consistent if and only if the other is). They’re actually biiinterpretable, which is equivalence in the strongest sense. As I wrote in Part 1, abbreviating replacement to “R”:
[Biinterpretability] means that there’s a way of translating statements in the language of ETCS+R into statements in the language of ZFC, and vice versa, such that the two translation processes are mutually inverse and make theorems of ETCS+R match up with theorems of ZFC. In particular, ETCS+R is a categorical set theory exactly as strong as ZFC.
And this brings me to one reason why it might be wise not to assume replacement unless it’s truly needed: the spectre of inconsistency.
I find it barely conceivable that ETCS could be inconsistent. Its axioms are all principles that mathematicians — not just category theorists, but all mathematicians — use day in, day out. If it’s inconsistent, that means there’s some problem with constructing a sequence from a function and an element , or with taking the characteristic function of a subset of , etc. The only axiom in ETCS that raises the slightest suspicion is choice, but that alone can’t cause inconsistency: it’s a theorem that if ETCS is inconsistent then so is ETCS without choice.
If ETCS is shown to be inconsistent then it will have truly cataclysmic implications. Vast amounts of mathematics — perhaps the majority of all published work, ever — will be invalidated.
What if ZFC is inconsistent? There was a lot of discussion of this question around the time that Vladimir Voevodsky was speaking on the topic (e.g. see this MathOverflow question), and obviously he’s not the only one to have considered it.
For myself, I’ll be pretty unflustered if, one day, ZFC is shown to be inconsistent, as long as ETCS remains intact. An equivalent statement: a proof of the inconsistency of ETCS+R wouldn’t bother me much if it used replacement in some essential way.
You might be thinking to yourself “ah, well, Tom’s just saying that because ETCS is the system he’s most familiar with”. But I don’t think this is just about personal feelings. ETCS alone provides enough foundation for a vast amount of mathematics. McLarty has argued that the entirety of EGA and SGA need nothing more. Assuming McLarty is correct, that is just an objective truth.
A proof of the inconsistency of ZFC (or equivalently, ETCS+R) that needed the full strength of replacement would certainly make a huge impact in the world of foundations, but I think its impact beyond would be limited. If it also proved the inconsistency of ETCS + “all beths exist”, or of ETCS plus some similarly mild additional axiom, that would cause upheaval across much more of mathematics. But still, probably many uses of principles beyond ETCS are out of convenience rather than true necessity. So the impact would be nothing compared to the global cataclysm that would be the inconsistency of ETCS.
Thanks, Todd!
Before I finish this series, I want to make sure to thank Todd Trimble. I’m doing it now because Todd helped me understand some points about replacement in conversations at Category Theory 2019 in Edinburgh. But my gratitude extends much further back. Todd’s 2008 series of posts on categorical set theory at the Topological Musings blog (1, 2, 3) had a big influence on me, both whetting my appetite to learn more categorical set theory and showing me how it’s done.
(Todd’s posts are also archived at the nLab, starting here.)
Let me quote from the third of Todd’s posts, where he nicely describes the relationship between the elementary approach to categorical set theory and the topos-theoretic approach, as well as the influence of the late Myles Tierney:
Most textbook treatments of the formal development of topos theory (as for example Mac Lane–Moerdijk) are efficient but highly technical, involving for instance the slice theorem for toposes and, in the construction of colimits, recourse to Beck’s theorem in monad theory applied to the double power-set monad [following the elegant construction of Paré]. The very abstract nature of this style of argumentation (which in the application of Beck’s theorem expresses ideas of fourth-order set theory and higher) is no doubt partly responsible for the somewhat fearsome reputation of topos theory.
In these notes I take a much less efficient but much more elementary approach, based on an arrangement of ideas which I hope can be seen as “natural” from the point of view of naive set theory. I learned of this approach from Myles Tierney, who was my PhD supervisor, and who with Bill Lawvere co-founded elementary topos theory, but I am not aware of any place where the details of this approach have been written up before now.
This series of posts is intended to continue in that same Tierney–Trimble spirit.
Next time
The final post in this series will be a summary of everything, including a table of contents, diagrams, and (because I won’t be able to help myself) some reflections on the whole enterprise.
Added later: but first, Mike Shulman will talk about different versions of replacement.
Re: Large Sets 12
I recently learned, far later than I should have, that Borel Determinacy (BD), the poster child for the use of Replacement in ZFC (there is a model of ZC where BD fails!), only requires Replacement where the domain is a countable ordinal. Sometimes people get a bit lazy and say it needs Replacement for , but people (mostly Don Martin) have done really serious analysis of the proof, and (apart from some minor coding issues around forming ordinals irrelevant in the structural setting) pinned it down precisely. There are even more fine-grained results, saying how Replacement for specific countable ordinals gets this-and-that fragment of BD. Even more, it’s not countable Replacement as such, but only the existence of for all countable ordinals ! So asking that all beths exist is overkill even for the most famous theorem that is advertised as requiring Replacement.
It’s also curious that Fraenkel, after the work where he (apparently tentatively!) suggested Replacement as a strengthening of Zermelo’s axioms, also proposed an alternative axiom equivalent to countable Replacement. It was really von Neumann and his version of ordinals that tipped the scales towards arbitrary Replacement.