## July 18, 2021

### Large Sets 12

#### Posted by Tom Leinster

Previously: Part 11. Next: Part 12.5

Today’s topic is replacement. Replacement is not directly about large sets, but it does imply that certain large sets exist.

Even among those who are familiar with and sympathetic to categorical set theory, I think there’s a lingering impression that replacement is somehow borrowed from ZFC. If categorical set theory is supposed to stand on its own two feet, without having to lean on membership-based set theory for conceptual motivation, then perhaps there are those who believe that to supplement ETCS with replacement would be an embarrassing admission of defeat.

I’ll explain why this is a misconception, stating replacement in a way that’s entirely natural from a structural/categorical perspective. The form of replacement I’ll use is due to Colin McLarty, who wrote of it “Our axiom is not a translation from ZF. It is a plain categorical version of Cantor’s idea.”

Let’s begin by thinking about families. In categorical set theory, a family of sets indexed by a set $I$ is usually defined as a set $X$ together with a function $p: X \to I$. Think of the fibre $p^{-1}(i)$ as the $i$th member of the family. In brief:

A family over $I$ is a function into $I$.

In what follows, it won’t be of the slightest importance whether a family over $I$ literally is a function into $I$ — whether that’s the actual definition of family — as long as there’s an equivalence between families over $I$ and functions into $I$.

Now, there’s another, genuinely different, conception of what a family is. Consider sequences, the case $I = \mathbb{N}$. Since we can form the power set $\mathcal{P}(X) = 2^X$ of any set $X$, we might intuitively feel that there should be a sequence of sets

$\mathbb{N}, \mathcal{P}(\mathbb{N}), \mathcal{P}\mathcal{P}(\mathbb{N}), \ldots.$

After all, if I give you any natural number (7, say) then you can give me the corresponding member of this sequence (namely, $\mathcal{P}\mathcal{P}\mathcal{P}\mathcal{P}\mathcal{P}\mathcal{P}\mathcal{P}(\mathbb{N})$). So we’ve got a sequence of sets, right? Right?

Well, no. At least, not on the basis of ETCS alone. In order for these sets to form a sequence — meaning an $\mathbb{N}$-indexed family — we have to have a set $X$ and a function $p: X \to \mathbb{N}$ whose $n$th fibre is the $n$-fold iterated power set of $\mathbb{N}$. In the notation explained in Part 6, the $n$th fibre is the beth $\beth_n$, and $X$ itself is $\beth_\omega$. And as shown in Part 6, it is consistent with ETCS that $\beth_\omega$ does not exist. In other words, it cannot be proved from ETCS that there is any function into $\mathbb{N}$ whose fibres are the iterated power sets of $\mathbb{N}$ (unless, of course, ETCS is inconsistent).

If we don’t like this — if we feel that it’s a shortcoming of ETCS — then we might consider adding to ETCS some kind of principle saying that whenever we have a set $I$ and a procedure or algorithm or formula specifying a set $X_i$ for each $i \in I$, then the sets $X_i$ form an $I$-indexed family in the official sense. And that principle is exactly replacement.

The version of replacement I’m going to state here is due to Colin McLarty, from Section 8 of his wonderful paper Exploring categorical structuralism (2004).

In a brief attempt to be a conscientious scholar, let me mention some historical context. McLarty was drawing on a replacement-like reflection principle formulated by Gerhard Osius in 1973. Further back in the history are a reflection principle stated in the long version of Lawvere’s ETCS paper, as well as work of one J. C. Cole that I’ve never managed to lay my hands on. And there’s also at least one more recent categorical version of replacement, given in Section 5 of Mike Shulman’s 2010 paper on stack semantics.

Anyway, it’s McLarty’s version I know best, so it’s McLarty’s version I’m going to explain here. I think it has strong intuitive appeal.

Loosely, replacement says the following.

Axiom scheme of replacement (informal)   Suppose we have a set $I$ and a first-order formula specifying a set $F(i)$ up to isomorphism for each $i \in I$. Then there exists a function into $I$ with fibres $F(i)$.

To state it more carefully, I need to recall the way I set up ETCS back in Part 1:

we have some things called “sets”, some things called “functions from $X$ to $Y$” for each set $X$ and set $Y$, and an operation of “composition” on functions, satisfying the following axioms: […]

We’re going to consider first-order formulas in the language of ETCS. Such formulas are built up in the usual way from variable symbols (with specified types), logical connectives such as “or” and “not”, brackets, and so on. Two kinds of quantification are allowed:

• quantification over sets;

• for given sets $X$ and $Y$, quantification over functions from $X$ to $Y$.

One kind of equality is allowed:

• for given sets $X$ and $Y$, equality between functions from $X$ to $Y$.

In particular, we can’t express equality between sets. But we can express isomorphism. Let’s recite the definition: “$X$ and $Y$ are isomorphic if there exist $f: X \to Y$ and $g: Y \to X$ such that $g \circ f = 1_X$ and $f \circ g = 1_Y$.” That definition uses only the kinds of quantification and equality that we’ve allowed. So, it’s expressible.

Notice that given a set $X$, we can quantify over the elements of $X$, because of the correspondence between elements of $X$ and functions $1 \to X$.

(A technical aside: the typing I’ve used to state ETCS is a bit different from McLarty’s, which means that replacement has to look a bit different too. But the difference is purely cosmetic.)

Now we want to formalize the idea of a “function” that assigns a set $F(i)$ (determined up to isomorphism) to each element $i$ of a set $I$. We’ll do this by considering the relation $\Phi(i, S)$ given by $\Phi(i, S) \iff F(i) \cong S$.

So, take a formula $\Phi = \Phi(i, S)$ whose only free variables are $i$ (of type “element of $I$”) and $S$ (of type “set”). Let’s say that $\Phi$ is functional on $I$ if for every $i \in I$, there exists a set $S$ with the following property:

whenever $S'$ is a set, $\Phi(i, S') \iff S' \cong S$.

Replacement says this:

Axiom scheme of replacement   Let $I$ be a set and $\Phi$ a functional formula on $I$. Then there exist a set $X$ and a function $p: X \to I$ such that $\Phi(i, p^{-1}(i))$ for all $i \in I$.

And that’s it. If we have a “family” of sets specified by some formula in the language of ETCS, then it arises as the family of fibres of some function.

Aside   I’ve said “axiom scheme” rather than “axiom” only so that logicians won’t think I’m a total heathen. In ordinary mathematical language, replacement is an axiom. But if we’re being careful about the logical structure, it’s an infinite collection of axioms, one for each $\Phi$. We can’t combine them into a single statement, because quantifying over formulas — “$\forall\Phi$” — is not allowed in our formal language.

(Joachim Kock once half-jokingly suggested that we should have a new quantifier, “for each”, for this kind of purpose. “For each $\Phi$” is different from “for all $\Phi$”.)

In the introduction to this post, I quoted from the paper by Colin McLarty that introduced replacement in this form. Here he is at more length, from p.48-49 of “Exploring categorical structuralism”, talking about his ETCS version of the axiom scheme of replacement.

Probably few readers think this looks like the ZF axiom scheme. That scheme has no homophonic translation into ETCS as it uses membership so heavily. It has a tree-theoretic translation, since every ZF formula does, but you can see that our axiom does not deal in trees. Our axiom is not a translation from ZF. It is a plain categorical version of Cantor’s idea.

Hellman finds the motivation for the axiom scheme “remains characteristically set theoretic” […] Indeed. Categorical set theory is a theory of sets. Its motives are set theoretic.

We can ask: are there motives for the axiom scheme of replacement apart from the ZF version of the scheme, and more generally apart from Zermelo’s form of membership-based set theory? But that is obvious. Cantor did not have Zermelo’s membership-based conception. […] And replacement is Cantor’s idea. […] Hellman rightly suggests we have no “new motivation” for the axiom. […] Neither did Fraenkel or Skolem. We all have Cantor’s original motive which predates the membership-based ZF version by decades.

The rest of this post is mostly about two things: what replacement can do, and what it can’t do.

#### What can replacement do?

A first, easy, consequence is the principle of separation or specification (or comprehension, as some people call it for reasons I don’t comprehend). Loosely, this says that given a set $I$ and a reasonable property $\phi$ that an element of $I$ may or may not have, we can form a subset

$\{ i \in I : \phi(i) \ \text{is true} \}$

of $I$. Interpret “reasonable property $\phi$” to mean “formula $\phi(i)$ with one free variable $i$ of type $I$”. Then there is indeed such a subset. For given such a formula $\phi = \phi(i)$, define a formula $\Phi = \Phi(i, S)$ by

$\Phi(i, S) \ \iff\ (\phi(i) \ \text{is true and}\ S \cong 1) \ \text{or}\ (\phi(i) \ \text{is false and}\ S \cong \emptyset).$

By replacement, there exist a set $X$ and a function $X \to I$ whose $i$th fibre has one element if $\phi(i)$ is true and none if $\phi(i)$ is false. In other words, $X$ is the subset of $I$ consisting of the elements satisfying $\phi$. And that’s what we wanted.

(That proof of separation is from Section 9 of McLarty’s paper.)

Now let’s go back to the motivating example from the start of the post: the wannabe sequence

$\beth_0 = \mathbb{N}, \ \beth_1 = \mathcal{P}(\mathbb{N}), \ \beth_2 = \mathcal{P}\mathcal{P}(\mathbb{N}), \ \ldots.$

Assuming replacement, we’ll show that this really is a sequence in the ETCS sense: there is a set $X$ with a map $X \to \mathbb{N}$ whose fibres are the beths $\beth_n$.

There’s a first-order formula $\Phi = \Phi(n, S)$ in natural numbers $n$ and sets $S$, such that

$\Phi(n, S) \ \iff\ \beth_n \cong S.$

One way to see this is to write

$[n] = \{ i \in \mathbb{N}: i \leq n \} = \{0, \ldots, n\},$

then define $\Phi(n, S)$ to mean “there exist a set $Y$ and a function $q: Y \to [n]$ such that $q^{-1}(0) \cong \mathbb{N}$, and $q^{-1}(i + 1) \cong 2^{q^{-1}(i)}$ for all $0 \leq i \lt n$, and $q^{-1}(n) \cong S$.”

The plan now is to use replacement to conclude that there exist a set $X$ and a function $p: X \to \mathbb{N}$ such that $p^{-1}(n) \cong \beth_n$ for all $n \in \mathbb{N}$. In order to do so, we have to show that $\Phi$ is functional — that is, for all $n \in \mathbb{N}$, there is a unique-up-to-iso set $S$ satisfying $\Phi(n, S)$. And at this point we meet a subtlety.

Intuitively, we’d like to say “it’s true by induction”. After all, suppose we have some $n$ and we can construct the unique $S$ satisfying $\Phi(n, S)$, namely, $\beth_n$. Then we can construct the unique $S'$ satisfying $\Phi(n + 1, S')$: it’s $2^{\beth_n}$. But in order to use induction, we have to know that

$\{ n \in \mathbb{N}: \text{there is a unique-up-to-iso set } \ S\ \text{such that}\ \Phi(n, S)\}$

is actually a subset of $\mathbb{N}$. In other words, there need to exist a set $M$ and an injection $M \to \mathbb{N}$ such that for $n \in \mathbb{N}$,

$n \in M \ \iff\ \text{there is a unique-up-to-iso set } \ S\ \text{such that}\ \Phi(n, S).$

That’s not obvious! But it’s true, assuming replacement: it’s an instance of separation. And once we have this set $M$, we can use induction to prove that $M = \mathbb{N}$, then use replacement to construct our set over $\mathbb{N}$ with fibres $\beth_0, \beth_1, \ldots$. So everything goes through fine.

In summary, the subtlety was that to construct the sequence $(\beth_n)_{n \in \mathbb{N}}$, we had to use replacement twice.

The “total space” of the family $(\beth_n)$ — I mean, the domain $X$ of the map $X \to \mathbb{N}$ whose fibres are $\beth_0, \beth_1, \ldots$ — is $\beth_\omega$. So, we’ve used replacement to construct $\beth_\omega$. A similar argument shows that assuming replacement, we can construct $\beth_W$ for every well-ordered set $W$. So all beths exist.

Not only does replacement imply that all beths exist, it implies that beth fixed points exist — unboundedly many of them, in fact. Recall from Part 7 that $I(X)$ denotes the initial well-order on a set $X$, and that $X$ is said to be a beth fixed point if $\beth_{I(X)} \cong X$.

Also in Part 7, I described a procedure for constructing a beth fixed point bigger than any given set $A$. Roughly speaking, what we do is to define a sequence of sets $(B_n)_{n \in \mathbb{N}}$ by

$B_0 = A, \ B_{n + 1} = \beth_{I(B_n)},$

and then take its sup, which can be shown to be a beth fixed point.

I say “roughly speaking” because this depends on this sequence actually existing, meaning that there exist a set $X$ and a map $p: X \to \mathbb{N}$ with $p^{-1}(n) \cong B_n$ for all $n$. (If it does exist then there’s no problem with taking the sup: the sets $B_n$ have $X$ as an upper bound, so they must have a least upper bound.) That’s where replacement is needed.

To apply replacement to this problem, we write down a first-order formula $\Psi(n, S)$ in natural numbers $n$ and sets $S$ such that

$\Psi(n, S) \ \iff\ B_n \cong S,$

using the same kind of manoeuvre with sets over $[n]$ as we used for the sequence $(\beth_n)$. And then a similar argument, using replacement twice, gives us the sequence $(B_n)$, hence a beth fixed point larger than $A$.

Once you’ve done a few proofs like this, it becomes clear that they should all be instances of a single theorem. That theorem is the principle of transfinite recursion.

There are several versions of transfinite recursion. To explain the one I’m talking about here, I’ll warm up with recursion over $\mathbb{N}$.

Suppose we have some kind of machine $G$ that takes as its input a finite sequence $(X_0, \ldots, X_{n - 1})$ of sets, and produces as its output a new set $G(X_0, \ldots, X_{n - 1})$. Then there should be a unique sequence $X_0, X_1, \ldots$ of sets such that

$X_n \cong G(X_0, \ldots, X_{n - 1})$

for all $n \in \mathbb{N}$. No base step is needed, as the case $n = 0$ gives $X_0 = G()$.

“Should” means that it seems intuitively appealing. It doesn’t follow from ETCS. But it does follow from ETCS with replacement.

The two examples above, $(\beth_n)$ and $(B_n)$, have the special feature that for $n \geq 1$, the $n$th member of the sequence depends only on the $(n - 1)$th member — not on all the previous members.

A different kind of special case is where $X_n$ doesn’t depend on the sets $X_0, \ldots, X_{n - 1}$ themselves, but only on $n$. In that case, what $G$ does it to assign a set $X_n$ to each $n \in \mathbb{N}$. So transfinite recursion reduces in this case to replacement over the indexing set $\mathbb{N}$.

Now that we’ve warmed up with recursion over $\mathbb{N}$, let’s step up to an arbitrary well-ordered set $W$. Suppose we have a first-order formula that takes as its input a well-ordered set $V \prec W$ and a family $(X_v)_{v \in V}$ of sets, and produces as its output a set $G((X_v)_{v \in V})$ (determined up to isomorphism). Then there should be a family $(X_w)_{w \in W}$ of sets, unique up to isomorphism, such that

$X_w \cong G\bigl( (X_v)_{v \lt w} \bigr)$

for each $w \in W$.

That phrasing is a bit informal, but I hope it’s clear by now how you’d formalize it. Really $G$ is a first-order formula

$\Theta = \Theta(V, X, q, S)$

where the variables denote a well-ordered set $V \prec W$, a set $X$, a function $q$ from $X$ to the underlying set of $V$, and a set $S$. To understand the relationship with the previous paragraph, think of $q: X \to V$ as the family $(X_v)_{v \in V}$ and $S$ as $G((X_v)_{v \in V})$. We ask that for each $V$, $X$ and $q$, there is a unique-up-to-iso set $S$ such that $\Theta(V, X, q, S)$ holds.

Stated formally, the principle of transfinite recursion is about such formulas $\Theta$. But I’m not going to write out the formal statement of the theorem, because it would just look like a load of notation and you’d only want to convert it back into the informal version above, the one involving $G$.

Replacement implies the principle of transfinite recursion, by an inductive argument on $W$.

Conversely, the principle of transfinite recursion implies replacement. We already saw this in the case $I = \mathbb{N}$, and it’s true for a general indexing set $I$ by a similar argument (after choosing a well-ordering on $I$). Replacement is essentially the special case of transfinite recursion where $G((X_v)_{v \in V})$ depends only on $V$.

(I learned about the converse from a blog post by Joel David Hamkins, based on work of Benjamin Rin.)

So that’s what replacement can do. In summary: it gives us separation and transfinite recursion, and in turn, transfinite recursion gives us the existence of all the beths and of unboundedly many beth fixed points.

#### What can’t replacement do?

The most important thing it can’t do is prove the existence of inaccessibles:

It is consistent with ETCS + replacement that there are no inaccessible sets.

The proof follows a similar pattern to the proofs of all the similar statements in previous posts, so I’ll skip it.

In the converse direction, I’m pretty sure replacement isn’t implied (over ETCS) by the existence of an inaccessible set, or even by the much stronger condition that there are unboundedly many measurable sets. But I don’t know a proof. Can anyone clarify this point, or at least give a reference? Are there, for instance, theorems on determinacy that can be proved using replacement but aren’t guaranteed by the existence of unboundedly many measurable sets?

#### Is assuming replacement a good idea?

Sometimes we want to perform operations that are beyond what ETCS provides. At that point, it’s tempting to supplement ETCS with replacement. I don’t have a strong argument against doing so, but it’s a matter of fact that replacement is often way more than necessary.

For example, the first thing that people tend to want beyond ETCS is to be able to form, for a given set $X$, another set containing all of

$X, \ \mathcal{P}(X), \ \mathcal{P}\mathcal{P}(X), \ \ldots.$

(For instance, this lets us do an infinitely iterated dual vector space construction.) As I mentioned in Part 6, such a set exists if we assume the existence of all beths. This is a far weaker assumption than replacement.

But occasionally, the existence of all beths might still not be enough. In such cases, it may be enough to assume the existence of a beth fixed point, or, stronger still, unboundedly many beth fixed points. All of these assumptions are still weaker than replacement.

In the hierarchy of assumptions that we’ve been going through in the posts, the next one after “unboundedly many beth fixed points” is the existence of an inaccessible set. Assuming the existence of an inaccessible is (if I’m not mistaken) neither stronger nor weaker than replacement, and it’s something that people often do — for example, in the context of Grothendieck universes. Sometimes, one even wants to assume that unboundedly many inaccessibles exist.

Now although such assumptions about inaccessibles are (I think) incomparable with replacement — neither implies the other — they seem to me to be somehow more specific, more limited, than replacement. I can’t give a precise meaning to that, except to point out again that replacement is an axiom scheme: it gives one axiom for each suitable first-order formula $\Phi$, whereas the existence of inaccessibles is a single axiom.

The lesson I’d draw is that if we’re looking at supplementing ETCS with something like replacement, it’s at least worth asking whether it might be better to use an axiom such as “all beths exist” or “there are unboundedly many inaccessibles” instead.

#### ETCS, ZFC, and the spectre of inconsistency

I’ve somehow managed to get this far without mentioning the equivalence between ETCS + replacement and ZFC. It’s not just that they have the same consistency strength (one is consistent if and only if the other is). They’re actually biiinterpretable, which is equivalence in the strongest sense. As I wrote in Part 1, abbreviating replacement to “R”:

[Biinterpretability] means that there’s a way of translating statements in the language of ETCS+R into statements in the language of ZFC, and vice versa, such that the two translation processes are mutually inverse and make theorems of ETCS+R match up with theorems of ZFC. In particular, ETCS+R is a categorical set theory exactly as strong as ZFC.

And this brings me to one reason why it might be wise not to assume replacement unless it’s truly needed: the spectre of inconsistency.

I find it barely conceivable that ETCS could be inconsistent. Its axioms are all principles that mathematicians — not just category theorists, but all mathematicians — use day in, day out. If it’s inconsistent, that means there’s some problem with constructing a sequence $(f^n(x))_{n \geq 0}$ from a function $f: X \to X$ and an element $x \in X$, or with taking the characteristic function $X \to \{0, 1\}$ of a subset of $X$, etc. The only axiom in ETCS that raises the slightest suspicion is choice, but that alone can’t cause inconsistency: it’s a theorem that if ETCS is inconsistent then so is ETCS without choice.

If ETCS is shown to be inconsistent then it will have truly cataclysmic implications. Vast amounts of mathematics — perhaps the majority of all published work, ever — will be invalidated.

What if ZFC is inconsistent? There was a lot of discussion of this question around the time that Vladimir Voevodsky was speaking on the topic (e.g. see this MathOverflow question), and obviously he’s not the only one to have considered it.

For myself, I’ll be pretty unflustered if, one day, ZFC is shown to be inconsistent, as long as ETCS remains intact. An equivalent statement: a proof of the inconsistency of ETCS+R wouldn’t bother me much if it used replacement in some essential way.

You might be thinking to yourself “ah, well, Tom’s just saying that because ETCS is the system he’s most familiar with”. But I don’t think this is just about personal feelings. ETCS alone provides enough foundation for a vast amount of mathematics. McLarty has argued that the entirety of EGA and SGA need nothing more. Assuming McLarty is correct, that is just an objective truth.

A proof of the inconsistency of ZFC (or equivalently, ETCS+R) that needed the full strength of replacement would certainly make a huge impact in the world of foundations, but I think its impact beyond would be limited. If it also proved the inconsistency of ETCS + “all beths exist”, or of ETCS plus some similarly mild additional axiom, that would cause upheaval across much more of mathematics. But still, probably many uses of principles beyond ETCS are out of convenience rather than true necessity. So the impact would be nothing compared to the global cataclysm that would be the inconsistency of ETCS.

#### Thanks, Todd!

Before I finish this series, I want to make sure to thank Todd Trimble. I’m doing it now because Todd helped me understand some points about replacement in conversations at Category Theory 2019 in Edinburgh. But my gratitude extends much further back. Todd’s 2008 series of posts on categorical set theory at the Topological Musings blog (1, 2, 3) had a big influence on me, both whetting my appetite to learn more categorical set theory and showing me how it’s done.

(Todd’s posts are also archived at the nLab, starting here.)

Let me quote from the third of Todd’s posts, where he nicely describes the relationship between the elementary approach to categorical set theory and the topos-theoretic approach, as well as the influence of the late Myles Tierney:

Most textbook treatments of the formal development of topos theory (as for example Mac Lane–Moerdijk) are efficient but highly technical, involving for instance the slice theorem for toposes and, in the construction of colimits, recourse to Beck’s theorem in monad theory applied to the double power-set monad [following the elegant construction of Paré]. The very abstract nature of this style of argumentation (which in the application of Beck’s theorem expresses ideas of fourth-order set theory and higher) is no doubt partly responsible for the somewhat fearsome reputation of topos theory.

In these notes I take a much less efficient but much more elementary approach, based on an arrangement of ideas which I hope can be seen as “natural” from the point of view of naive set theory. I learned of this approach from Myles Tierney, who was my PhD supervisor, and who with Bill Lawvere co-founded elementary topos theory, but I am not aware of any place where the details of this approach have been written up before now.

This series of posts is intended to continue in that same Tierney–Trimble spirit.

#### Next time

The final post in this series will be a summary of everything, including a table of contents, diagrams, and (because I won’t be able to help myself) some reflections on the whole enterprise.

Added later: but first, Mike Shulman will talk about different versions of replacement.

Posted at July 18, 2021 8:04 PM UTC

TrackBack URL for this Entry:   https://golem.ph.utexas.edu/cgi-bin/MT-3.0/dxy-tb.fcgi/3333

### Re: Large Sets 12

I recently learned, far later than I should have, that Borel Determinacy (BD), the poster child for the use of Replacement in ZFC (there is a model of ZC where BD fails!), only requires Replacement where the domain is a countable ordinal. Sometimes people get a bit lazy and say it needs Replacement for $\omega_1$, but people (mostly Don Martin) have done really serious analysis of the proof, and (apart from some minor coding issues around forming ordinals irrelevant in the structural setting) pinned it down precisely. There are even more fine-grained results, saying how Replacement for specific countable ordinals gets this-and-that fragment of BD. Even more, it’s not countable Replacement as such, but only the existence of $\beth_\alpha$ for all countable ordinals $\alpha$! So asking that all beths exist is overkill even for the most famous theorem that is advertised as requiring Replacement.

It’s also curious that Fraenkel, after the work where he (apparently tentatively!) suggested Replacement as a strengthening of Zermelo’s axioms, also proposed an alternative axiom equivalent to countable Replacement. It was really von Neumann and his version of ordinals that tipped the scales towards arbitrary Replacement.

Posted by: David Roberts on July 19, 2021 1:30 AM | Permalink | Reply to this

### Re: Large Sets 12

Wow! I didn’t know that either.

Of course, there’s a formal argument that any particular use of replacement cannot possibly use anything close to the full strength of replacement. Namely, replacement is, as Tom mentioned, an axiom schema consisting of infinitely many axioms; whereas any single, finite, proof can only use finitely many instances of it. But it’s interesting to see exactly which instance(s) of replacement are used in the proof of BD, and that it’s a familiar and easy-to-state one like “$\beth_\alpha$ exists for countable $\alpha$” that looks very weak.

In light of this fact, I’m curious whether anyone can give an intuitive justification for the existence of beths that doesn’t essentially appeal to our intuition for replacement?

Posted by: Mike Shulman on July 19, 2021 3:47 AM | Permalink | Reply to this

### Re: Large Sets 12

David wrote:

[Borel Determinacy requires] only the existence of $\beth_\alpha$ for all countable ordinals $\alpha$!

Wow from me too! Do you have a reference for that?

Mike wrote:

I’m curious whether anyone can give an intuitive justification for the existence of beths that doesn’t essentially appeal to our intuition for replacement?

Interesting question. My only thought here is that some people might find transfinite recursion a more (or differently) intuitive principle than replacement. And the existence of all beths is just one instance of transfinite recursion.

The idea that transfinite recursion may be more fundamental (whatever that means) than replacement was the subject of the paper by Benjamin Rin (“In this paper, we seek intrinsically justified reasons for believing in recursion…”) and the associated blog post by Joel David Hamkins cited in my own post.

(Hamkins proves in his post that over ZC, replacement and transfinite recursion are equivalent, and I mentioned in mine that the same is true over ETCS.)

I’ve only skimmed Rin’s paper, but here are a couple of relevant snippets. For context (p.3):

The first historical use of transfinite recursion is due to Cantor, in the proof of what is now called the Cantor–Bendixson Theorem. The theorem states that every closed uncountable set $X$ of real numbers contains a perfect subset. A perfect set is a set containing no isolated points. Cantor’s proof of the theorem uses a procedure that we now recognize as transfinite recursion [removing the isolated points from $X$, then doing the same to the resulting set, and so on].

Then on p.4, after noting that the standard proof of transfinite recursion uses replacement, Rin writes:

However, though Replacement might be a good reason to think that recursive definitions are justified, it is natural to ask: is this the reason why we in fact think that they are justified? Certainly Cantor did not justify his argument this way. Historically, the Axiom of Replacement was not introduced until 1922, and it was not until Von Neumann later made the connection between Replacement and recursion that the two were formally associated. As a matter of actual practice, definitions by recursion require no understanding of or reference to the Axiom of Replacement at all. A mathematician working in topology can perfectly well understand the definition of Cantor’s function without this formal grounding.

I’m not sure this example of transfinite recursion is well-chosen for the point he’s making, because I don’t think it actually does need replacement. I think Cantor’s theorem on perfect sets is provable in ETCS. For the proof uses the kind of transfinite recursion where the things we’re defining recursively are elements of a fixed set (in this case, $\mathcal{P}(\mathbb{R})$), and that kind of transfinite recursion is doable in ETCS. The kind of transfinite recursion discussed in my post, the one that’s equivalent over ETCS to replacement, is where the things we’re defining recursively are sets. Nevertheless, I think Rin still has a point.

Posted by: Tom Leinster on July 19, 2021 10:49 AM | Permalink | Reply to this

### Re: Large Sets 12

I got it from a draft version of Don Martin’s book Determinacy of infinitely long games. It’s Corollary 2.3.8, though check Remark (i) after it as well. The thing is stated for more general game trees $T$, but I guess that one could take Baire space for concreteness.

Posted by: David Roberts on July 20, 2021 2:31 AM | Permalink | Reply to this

### Re: Large Sets 12

Thanks, David. I’d never have found that.

I’m only just getting to grips with the very basics of games and determinacy, and what the notation is, so I have some elementary questions. Corollary 2.3.8 states:

If $\mathcal{P}^\alpha(T)$ exists for every countable $\alpha$, then all Borel games in $\mathbf{T}$ are determined.

This is a corollary of Theorem 2.3.7, in which it is specified that $\mathbf{T}$ is a “game tree with taboos”. And in the introduction to Section 2.1, a game tree with taboos is defined to be a game tree $T$ together with some extra stuff that I don’t think need concern us here.

Wikipedia’s statement of the Borel determinacy theorem is this:

any Gale–Stewart game whose payoff set is a Borel set is determined.

First question: is this a consequence of Corollary 2.3.8? The definitions in Wikipedia and Martin’s book draft seem to match up, but I wanted to check with someone who actually knows what’s going on.

(I say “a consequence” because Cor 2.3.8 involves games with taboos, a level of generality that doesn’t seem to be in the Wikipedia statement.)

Second question: since $T$ can be arbitrarily large, don’t we need the existence of all beths in order for $\mathcal{P}^\alpha(T)$ to exist for all countable $\alpha$?

Maybe this is related to your comment:

The thing is stated for more general game trees $T$, but I guess that one could take Baire space for concreteness.

I understand that a game tree is a set of finite sequences of elements of a set $A$, satisfying axioms. “Borel” refers to the product topology on $A^\mathbb{N}$, regarding $A$ as discrete. And I understand that a typical case is $A = \mathbb{N}$, in which case $A^\mathbb{N}$ with this topology is called Baire space.

But Cor 2.3.8 and the Wikipedia statement of Borel determinacy are about an arbitrary $A$, right? So if I’m understanding correctly (a big “if”), the existence of $\beth_\alpha$ for countable $\alpha$ isn’t enough to prove Borel determinacy.

Posted by: Tom Leinster on July 20, 2021 10:32 AM | Permalink | Reply to this

### Re: Large Sets 12

As far as I am aware, Borel Determinacy is about Borel sets in Polish spaces, See for instance this answer. This comment also gives some insight.

It might be the case that sets of arbitrary cardinality could be used as input, but I feel this is even further from “generic mathematics”; at least BD for Baire or Cantor space can claim to prove things about the Borel $\sigma$-algebra of the reals.

Posted by: David Roberts on July 20, 2021 2:29 PM | Permalink | Reply to this

### Re: Large Sets 12

I see. Thanks.

Martin’s paper proving the Borel determinacy theorem is only six pages, of which the proof only makes up about four. It looks very self-contained. So I’m tempted to just work through it, translating it into ETCS terms and seeing what further assumptions are needed. For someone like me, that might actually be easier than trying to understand what set theorists have written about the strength of Borel determinacy, which would involve figuring out the intricacies of the different fragments of ZFC and understanding how a ZFC-ist views subtle distinctions such as ordinal vs. well-ordered set. It would probably be more fun too.

From a quick look at Martin’s paper, I don’t see anywhere that replacement is mentioned explicitly. My guess is that replacement — or whatever fraction of it suffices — is used in the proof of Lemma 4, which begins by saying that the idea is to show that a certain “inverse limit” exists.

Posted by: Tom Leinster on July 20, 2021 3:15 PM | Permalink | Reply to this

### Re: Large Sets 12

For future reference, I’m putting here a link to a MathOverflow question by David R about Borel determinacy not needing full replacement. No one gives an actual answer, but there are many helpful comments.

David’s question was actually prompted by a comment by Asaf Karagila on a question of mine, so in principle I should have already known that Borel determinacy only needs the existence of all beths. By “in principle” I mean “if I had a perfect memory and understood every mathematical statement that passed before my eyes.”

Posted by: Tom Leinster on July 21, 2021 2:14 PM | Permalink | Reply to this

### Re: Large Sets 12

Yes, I also had some discussion with Andrés Caicedo on Twitter, and he pointed me to Martin’s book. I should really post an answer to that MO question, giving a (very!) brief outline of Martin’s result.

Posted by: David Roberts on July 21, 2021 11:08 PM | Permalink | Reply to this

### Re: Large Sets 12

Let me have another bash at Mike’s question:

I’m curious whether anyone can give an intuitive justification for the existence of beths that doesn’t essentially appeal to our intuition for replacement?

Here’s a justification that’s perhaps not wholly convincing, but I think it is at least a bit different from the usual intuition for replacement.

It starts from the principle “there should be more sets than elements of any particular set”. Since everything is supposed to be isomorphism-invariant, a formalization of that principle is:

for all sets $I$, there exists an $I$-indexed family of pairwise non-isomorphic sets.

As mentioned in Part 5, this is equivalent to the existence of all alephs.

After some experience with the continuum hypothesis, weak limits vs strong limits, etc., one might decide that not being isomorphic is too weak a condition. Say that sets $X$ and $Y$ are “strongly non-isomorphic” if $2^X \leq Y$ or $2^Y \leq X$. We might think of strengthening the principle above to:

for all sets $I$, there exists an $I$-indexed family of pairwise strongly non-isomorphic sets.

As mentioned in Part 6, this is equivalent to the existence of all beths.

Posted by: Tom Leinster on July 19, 2021 1:56 PM | Permalink | Reply to this

### Re: Large Sets 12

Yes, I can certainly agree that transfinite recursion has a different intuition than replacement. Moreover, it’s possible to give a direct formal description of “the result of a transfinite recursion” that doesn’t require actually iterating over well-founded relations: a (higher) inductive type.

However, since general transfinite recursion is equivalent to replacement, maybe the question I should have asked is whether anyone can give an intuitive justification for the existence of beths that doesn’t also justify full replacement?

Posted by: Mike Shulman on July 19, 2021 1:58 PM | Permalink | Reply to this

### Re: Large Sets 12

Is there a project similar to the ‘Consequences of the Axiom of Choice Project’ but for replacement?

### Re: Large Sets 12

A very nice post! I have a lot that I want to say about replacement, some of which I may make into a post 12.5. But I’ll start by responding to this:

such assumptions about inaccessibles are (I think) incomparable with replacement.

I believe this is true (at least, modulo the usual consistency assumptions). For instance, let $X$ be an inaccessible and let $Y$ be the least strong limit larger than $X$; then I would expect the collection of sets $\lt Y$ to model ETCS + there exists an inaccessible, but not replacement.

However, there’s a certain sense in which replacement is the axiom “the proper class of all sets is inaccessible” (or more precisely, something weaker than inaccessible).

Of course the class of all sets is a “strong limit”, since one of the ETCS axioms says that the powerset of a set is still a set. And the axiom of replacement says, roughly, that given a “family of sets indexed by a set” (in the sense of a functional formula $\Phi$), their sum is again a set (the domain set $X$ in McLarty’s version) — which is to say, smaller than a proper class.

So whereas axioms like “there is an inaccessible” or “there are a proper class of inaccessibles” are about the existence of large sets, the replacement axiom says that a proper class is large. There are other “large cardinal” axioms that act like this too, e.g. Vopenka’s principle is about proper classes rather than the existence of large sets (although it implies that many very large sets exist), and long-time Cafe readers may recall that strong Feferman set theory is equiconsistent with ZFC plus “the proper class of sets is Mahlo”.

However, there’s also a difference between inaccessibility and replacement in that regularity of a small set is a second-order axiom whereas replacement is a first-order one. The first means that the family of sets in the definition of regularity of a set $X$ is an arbitrary family: it doesn’t have to be definable by any logical formula. This makes regularity (and hence inaccessibility) a second-order property. By contrast, the “function” from elements to sets in replacement does have to be definable by a logical formula (and in this case, we don’t have any way to even say what an “arbitrary” such function would be).

In the presence of separation, any functional formula whose codomain is a set defines an actual function. Thus, at least in this case, second-order regularity is stronger than first-order regularity. A strong limit set that satisfies first-order regularity is, I believe, called a worldly set; this is equivalent to saying that the collection of sets $\lt X$ satisfies ETCS+R. Worldly sets are large, but much smaller than inaccessibles: if there is an inaccessible, then the worldly sets are unbounded below it. So in this sense, at least, the existence of an inaccessible is stronger than replacement: it’s a statement about a stronger property, although it asserts it of a set rather than of a proper class.

Posted by: Mike Shulman on July 19, 2021 3:36 AM | Permalink | Reply to this

### Re: Large Sets 12

Thank you for this very interesting series of posts!

Before I raise a few questions about replacement, let me slightly amplify one thing you said:

I find it barely conceivable that ETCS could be inconsistent. Its axioms are all principles that mathematicians — not just category theorists, but all mathematicians — use day in, day out.

For me, it’s just about conceivable that ETCS (and indeed full second-order arithmetic) could be inconsistent, because it incorporates impredicativity in the form of unrestricted power sets. Do we really have a clear conception even of $\mathcal{P}(\mathcal{P}(\mathbb{N}))$, the power set of the power set of the natural numbers, let alone further iterations?

And it seems very little mathematics actually relies on full-blown impredicativity, except as a convenient short-cut.

On the other hand, it’s not quite clear how best to formulate an ergonomic classical predicative foundational system. The problem of course is that it seems from a constructive point of view that function types are perfectly acceptable, and certainly we cannot do without them, but if we add the law of excluded middle (LEM), then we get all power sets from function types. Some possible work-arounds include Feferman’s systems of Variable Finite Types, Explicit Mathematics, and Operational Set Theory, Aczel and Gambino’s logic-enriched type theory, and Maietti and Sambin’s minimalist type theory, all extended with LEM at the level of logic.

But instead of a fully classical system, we may also consider semi-intuitionistic theories, which only postulate excluded middle for a restricted class of propositions, for example first-order arithmetical propositions. This seems to agree with mathematical practice, where LEM is often used in rather concrete situations, while abstract categorical arguments are usually wholly constructive.

Which brings me back to type theory and replacement. In type theory we usually work with an externally indexed hierarchy of universes $\mathcal{U}$. In HoTT with pushouts, we can prove a result that’s reminiscent of replacement: For a map $f : A \to B$ where $A$ is a $\mathcal{U}$-small type and $B$ is a locally $\mathcal{U}$-small type, the image of $f$ is essentially $\mathcal{U}$-small, i.e., equivalent to a type in $\mathcal{U}$. (I think this was first pointed out by Rijke.) If we have impredicativity/propositional resizing, then any large set is locally $\mathcal{U}$-small, and as a corollary, we have that any map from a $\mathcal{U}$-small set to a large set has an essentially $\mathcal{U}$-small image.

So that smells a bit like replacement, and is indeed often called the “replacement principle” in HoTT, but maybe the resemblance is misleading? (Cf. Mike’s comment above on the relationship between inaccessibles (universes) and replacement.)

Should a proper type-theoretic version of replacement allow us to construct the union $\sum_{i:\mathbb{N}}\mathcal{U}_i$ from the external sequence $\mathcal{U}_0,\mathcal{U}_1,\ldots$?

Posted by: Ulrik Buchholtz on July 19, 2021 2:52 PM | Permalink | Reply to this

### Re: Large Sets 12

Do we really have a clear conception even of $\mathcal{P}(\mathcal{P}(\mathbb{N}))$, the power set of the power set of the natural numbers, let alone further iterations?

I take your point. Personally, I’d hesitate to say that I have a clear conception of $\mathcal{P}(\mathcal{P}(\mathbb{N}))$, if that means an accurate mental picture of it. Then again, with the same interpretation, I probably don’t have a clear conception of the number 181.

This kind of question is slippery, I find, and highly subjective. What I find more helpful is the question “what would be the impact on mathematics if such-and-such a system turns out to be inconsistent?”, as in the post.

To be sure, that question has subjective aspects too. If replacement is found to be a source of inconsistency, set theorists will be probably judge this discovery to be much more impactful than differential geometers. But it seems to me that this question comes closer to having a quantifiable answer.

Posted by: Tom Leinster on July 22, 2021 6:13 PM | Permalink | Reply to this

### Re: Large Sets 12

One thing that’s quite objective is the consistency strength hierarchy (– and the remarkable phenomenon of its linearity when restricted to “natural” systems and axioms).

And here it’s a fact that there’s a large jump in strength from (generalized) predicative systems to genuinely impredicative ones.

Clearly, the conceivability of an inconsistency grows the further we move up this hierarchy, but of course it’s subjective where the “threshold of conceivability” lies, if anywhere.

Set theorists would find Z and even ZF at the ridiculously weak end, whereas proof theorists would tend to think of Z as already ridiculously strong.

Posted by: Ulrik Buchholtz on July 22, 2021 8:26 PM | Permalink | Reply to this

### Re: Large Sets 12

A request: If you’re not worn out by this really terrific series of posts, could you say a few words in some future post about how the Algebraic Set Theory of Joyal and Moerdijk might relate to what you have done? In any case, thanks so much for bringing so many set-theoretical issues into the limelight. (I had been blissfully ignorant of most of them!)

Posted by: Keith Harbaugh on July 19, 2021 5:33 PM | Permalink | Reply to this

### Re: Large Sets 12

Thanks for the kind words!

I’m afraid I don’t know much about algebraic set theory. I spent a while with Joyal and Moerdijk’s book some time ago, and retain the impression of it being very neat, but I definitely couldn’t talk about the relationship with these posts. Maybe Mike has thoughts. I see that he has a section on algebraic set theory in his Set theory for category theory, and Steve Awodey also has an introductory paper on the subject.

Posted by: Tom Leinster on July 19, 2021 7:35 PM | Permalink | Reply to this

### Re: Large Sets 12

Thanks for the references. Two quick add-ons:
1) Steve Awodey, Carsten Butz, Alex Simpson, Thomas Streicher, “Relating first-order set theories, toposes and categories of classes”, Annals of Pure and Applied Logic 165 (2014) looks to be an excellent development of the AST project (Mike cited it in his 2018 “Comparing” paper). It’s definitely worth a good look for what AST can do. This paper seems highly relevant to the Marquis/Lawvere/Tarski/mathematics/metamathematics discussion below.
2) For LOTS of references, see the Algebraic Set Theory web site at CMU.

Posted by: Keith Harbaugh on July 21, 2021 11:49 AM | Permalink | Reply to this

### Re: Large Sets 12

BTW, I’d be really interested in what people think about that ABSS paper.

Posted by: Keith Harbaugh on July 21, 2021 12:23 PM | Permalink | Reply to this

### Re: Large Sets 12

The goal of the original Joyal–Moerdijk paper, as I understand it, was to give a universal property to the class of membership-based sets; but subsequent work in algebraic set theory (including ABSS) has focused more on the “categories of classes” they introduced. These are a categorical analogue of a class-set theory like NBG in the same way that an elementary topos is a categorical analogue of a pure set theory like ZFC.

Just like in ZFC we can talk about “proper classes” (at least, definable ones) by way of the formulas that specify them, in ETCS we can talk about proper classes by way of formulas (which Tom indeed did here when discussing replacement). But if we need or want to reify proper classes as primitive objects, then the categories of classes in AST give us a structural analogue of theories like NBG.

There is one important difference that in ETCS, a “proper class” is more of a large category, or at least a large groupoid, rather than a “large set” (as we saw in McLarty’s replacement axiom, where “uniqueness” means “uniqueness up to isomorphism”) — whereas in a category of classes, the classes are indeed like large sets, without any automorphisms of their objects. Of course, one could speculate about using “2-categories of classes” instead, but I don’t know of any published work in that direction.

I don’t know whether anyone has explicitly written down an ETCS-like theory of a “well-pointed category of classes”; most of the work I know of has been focused on more general ones and their internal logic. ABSS, I remember correctly, is (at least partly) about comparing membership-based and categorical class-set theories, analogously to how membership-based pure set theories like ZFC can be compared to categorical ones like ETCS.

Steve Awodey sometimes frequents the cafe, so perhaps he’ll have more to say.

Posted by: Mike Shulman on July 22, 2021 5:43 AM | Permalink | Reply to this

### Re: Large Sets 12

If ETCS+R were inconsistent, it seems likely to me that the “culprit” would not be replacement. Of course, ETCS+R is indeed much stronger than ETCS. However, there’s a “slight” modification of ETCS — let me call it $ETCS'$ — such that if $ETCS'$ is consistent then so is $ETCS'+R$. In other words, replacement doesn’t add any consistency strength to $ETCS'$. Moreover, $ETCS'$ also suffices for enormous amounts of mathematics, so if ETCS+R were inconsistent I would be inclined to instead blame the gap between $ETCS'$ and $ETCS$.

To explain what $ETCS'$ is, recall that ETCS is formulated in classical logic, with the axiom of excluded middle: any statement is either true or false. This implies, in particular, that the category of sets in ETCS is boolean: every subset has a complement.

$ETCS'$ consists of the same axioms of ETCS, but formulated in an ambient intuitionistic logic. This may seem like a big change, but in fact it’s not very big, because in $ETCS'$ the category of sets is still boolean! This follows from the Diaconescu-Goodman-Myhill theorem that the axiom of choice implies the law of excluded middle, because $ETCS'$ still includes the axiom of choice. Booleanness of Set is equivalent to saying that LEM holds for all bounded formulas, i.e. those involving only quantifiers of the form “for all elements $a\in A$” rather than “for all sets $A$”. Since most quantifiers encountered in mathematics are bounded, mathematics in $ETCS'$ doesn’t look much different than in ETCS.

At this point you might wonder, given the DGM theorem, why $ETCS'$ differs from ETCS at all. The reason is that $ETCS'$ lacks the full axiom of separation, and the DGM theorem only implies that LEM holds for propositions that we can apply the axiom of separation to. Specifically, in the proof of DGM we start with a proposition $\phi$ and quotient the set $\mathbf{2}=\{0,1\}$ by the equivalence relation such that $0\sim 1 \iff \phi$, but constructing such a quotient requires that such a relation exists, which requires separation: $[\sim] = \{ (x,y) \in \mathbf{2}\times\mathbf{2} \mid x=y \vee \phi \}$.

But didn’t Tom prove that replacement implies separation, so that $ETCS'+R$ would have separation and hence full LEM? Well, if you look at the proof he gave you’ll see that it uses LEM. That use of LEM may look inessential: it’s natural to think that instead of

$\Phi(i,S)\iff (\phi(i) \;\text{ is true and }\; S\cong 1) \;\text{ or }\;(\phi(i)\;\text{ is false and }\;S\cong \emptyset)$

we could just say

$\Phi(i,S)\iff S\;\text{ is a subsingleton and } \;(S \;\text{ is inhabited } \;\iff \phi).$

But for an arbitrary $\phi$ in intuitionistic logic, we can’t prove that such an $S$ exists without already having full separation! The only way to construct it is as something like $\{ \star \mid \phi\}$.

Thus, I would say that the strength of ETCS+R “really” resides in having LEM for unbounded formulas. Or, more precisely, in having separation for unbounded formulas: adding full separation to $ETCS'+R$ yields a theory equiconsistent to ETCS+R. This is related to Ulrik’s point about impredicativity, but at a higher consistency level: $ETCS'+R$ is already very impredicative, since it has bounded classical logic and powersets, but the extra bit of impredicativity in full separation makes it much stronger.

The equiconsistency of $ETCS'+R$ with $ETCS'$ comes from passing to the stack semantics. Specifically, if $E$ is a model of $ETCS'$ (such as, for instance, a model of ETCS), then its stack semantics is a model of $ETCS'+R$ (but not a model of ETCS+R, even if $E$ was a model of ETCS). Intuitively, the meaning of “existence” in the stack semantics is adjusted in such a way that replacement (and collection) become true essentially by definition: the hypothesis that “for each $i\in I$ there exists a set $S$” is essentially defined to mean that there is a family of such sets as in replacement/collection.

Finally, it’s worth noting that McLarty’s argument that ETCS suffices for EGA+SGA goes by way of something very much like a stack semantics translation. He avoids replacement and collection by working explicitly with families of objects all the time. This works but is tedious, and the stack semantics can “automate” it. Put differently, if you believe in the consistency of ETCS, or even just $ETCS'$, then you’re free to use replacement without committing yourself to any stronger consistency assumptions, as long as you stick to $ETCS'$, with only bounded LEM and bounded separation.

Posted by: Mike Shulman on July 19, 2021 7:27 PM | Permalink | Reply to this

### Colin

You say “you’re free to use replacement without committing yourself to any stronger consistency assumptions, as long as you stick to ETCS’, with only bounded LEM and bounded separation.” But I do not see how LEM comes into it at all. What matters is bounded separation.

Posted by: Colin S McLarty on July 21, 2021 5:53 PM | Permalink | Reply to this

### Re: Colin

Unbounded LEM together with replacement implies unbounded separation, by the proof that Tom gave. So if you want to avoid unbounded separation in the presence of replacement, you also need to avoid unbounded LEM.

Posted by: Mike Shulman on July 22, 2021 4:59 AM | Permalink | Reply to this

### Re: Large Sets 12

If ETCS+R were inconsistent, it seems likely to me that the “culprit” would not be replacement.

So wait, let me make sure I understand.

You take a trip to the cavern that is the home of the All-Knowing Oracle. The Oracle reveals to you that ETCS+R is, in fact, inconsistent, but tells you no more.

Shortly, the Oracle will also reveal whether ETCS alone is consistent. But first, you are required to place a bet on what you think the answer will be. In that situation, would you bet that ETCS is inconsistent?

My understanding is that yes, you would. In that case, do you have some feeling for which axioms of ETCS would likely be to blame — I mean, a minimal set of axioms that could be dropped to get a consistent system?

Posted by: Tom Leinster on July 22, 2021 6:23 PM | Permalink | Reply to this

### Re: Large Sets 12

Maybe I’m stating the obvious here, but if a system is inconsistent (or merely judged too strong), it’s not in general the fault of a single axiom (or rule), but a consequence of how everything in the system interacts. Each axiom may be fine on its own in another context.

Also, finding a weaker consistent system is not always as easy as just dropping an axiom. One would have to re-think the whole to get something meaningful, possibly adding new things to the mix to make up for old parts thrown away.

Posted by: Ulrik Buchholtz on July 22, 2021 8:42 PM | Permalink | Reply to this

### Re: Large Sets 12

I agree, you can change an axiom system in other ways than dropping axioms. Mike’s ETCS$'$ is an example of that.

But my question stands. If ETCS is inconsistent, then one can ask which subsets of the ETCS axioms are consistent, and one can ask what the maximal consistent subsets are. That’s a black and white mathematical question, and that’s the last thing I asked in my previous comment.

Of course, the answer might be tremendously complicated. I stated ETCS as 10 axioms, so there are 1024 subsets, each of which may or may not be consistent. But we’re probably not interested in statements like “ETCS becomes consistent if we drop the requirement that composition is associative and unital”. We might agree to keep some fragment of ETCS such as “well-pointed finite product category with an empty set”. Those conditions take up five of the axioms, so we’re only talking about switching the five remaining axioms on or off, and the possibilities are reduced to $2^5 = 32$.

I think we’re saying things to each other that we both know well! This is “just” a matter of communication. What I’m interested in is whether Mike (or you, or anyone else) has some intuition such as “I can imagine that ETCS is inconsistent, but I can’t imagine ETCS minus existence of a subset classifier being inconsistent”. I’ve stated my own position; I’m interested to hear other people’s.

Posted by: Tom Leinster on July 22, 2021 9:05 PM | Permalink | Reply to this

### Re: Large Sets 12

But my question stands. If ETCS is inconsistent, then one can ask which subsets of the ETCS axioms are consistent, and one can ask what the maximal consistent subsets are. That’s a black and white mathematical question, and that’s the last thing I asked in my previous comment.

My point was that maybe that isn’t the best question! :-P

At any rate, to your question: One very weak 4-element set of axioms is the one without a natural number system. This has a model in the finite sets, and this should be true even in very weak fragments of arithmetic (exponential function arithmetic plus a tiny bit more; primitive recursive arithmetic would certainly suffice). So that’s a maximal proper subset whose inconsistency I can’t really conceive of.

In your framework, does the existence of a subobject classifier follow from choice? I don’t think so, but I don’t have any intuition of what that system models. But if so, we don’t get anything new by just dropping that axiom. Otherwise, we’d have to drop both choice and a subobject classifier. But in that case, to get something useful, you’d have to add in a bunch of things (finite colimits to start), etc.

Posted by: Ulrik Buchholtz on July 23, 2021 9:13 AM | Permalink | Reply to this

### Re: Large Sets 12

I’ve stated my own position; I’m interested to hear other people’s.

Thanks for the nice series of posts!

I think it’s difficult to say much about possible inconsistency until someone comes up with a clever observation concerning how an inconsistency might arise.

Take for example the challenge I posed in the middle of the series: if one disallows the obvious contradictions and use of the one clever tool we have (the diagonal argument, or related things like Lawvere’s fixed point theorem) is there are any logical problem with assuming that $2^{\mathbb{N}} \lt \mathbb{N}$?

For another example, suppose that we disallow use of Girard’s paradox or related things. Is there then a logical problem with assuming that $Set:Set$ in Martin-Löf dependent type theory?

What I am trying to say is that deep thinkers (Martin-Löf, or Kronecker for example in the first example) often had good reasons to believe that certain foundations were reasonable before someone (Girard, Cantor, …) came up with some clever trick to illustrate a ‘problem’.

Thus maybe there could be a problem even with things that almost no-one questions, like induction, never mind more complicated things; just no-one has been clever enough to come up with an argument. There was a respected mathematician who died about 10 years ago I believe who did think there was a possible problem with induction in general; this was discussed on the n-café once upon a time. But, as in the examples I gave, I tend to think that an inconsistency wouldn’t really matter. I think one could perfectly well do mathematics in a system with $Set:Set$ or similar (e.g. the category of all categories is a category), and the same in a system with $2^{\mathbb{N}} \lt \mathbb{N}$, just as we are currently doing mathematics in foundations which could be inconsistent for all we know.

Posted by: Richard Williamson on July 24, 2021 9:25 AM | Permalink | Reply to this

### Re: Large Sets 12

Richard wrote:

There was a respected mathematician who died about 10 years ago I believe who did think there was a possible problem with induction in general; this was discussed on the n-café once upon a time.

Are you thinking of Edward Nelson and this post?

(If you are thinking of Nelson, then respect to you for your slight overestimation of the time elapsed since he died (2014). I’m now of an age where that feels like about three years ago.)

Posted by: Tom Leinster on July 24, 2021 12:42 PM | Permalink | Reply to this

### Re: Large Sets 12

Yes, thank you! Hehe!

Posted by: Richard Williamson on July 25, 2021 9:33 AM | Permalink | Reply to this

### Re: Large Sets 12

Hah! Good question.

To start with, the answer to a question “would you bet…?” always depends on how much I’m being asked to bet. So we’re talking about (conditional) probabilities here.

Now because it’s a fact that ETCS+R is much stronger in consistency strength than ETCS, I would still say that even if ETCS+R is inconsistent, the probability that ETCS is also inconsistent is still rather low.

So what did I mean? I think I meant that if ETCS+R is inconsistent (or, really, even in the current situation where we don’t know if it’s inconsistent), it’s more likely that ETCS is inconsistent than that $ETCS'+R$ is inconsistent. I think this is also fairly evident given the above observations, since $ETCS'+R$ is equiconsistent with $ETCS'$, which is weaker than ETCS.

As Ulrik says, it’s rarely correct to blame a single axiom for the inconsistency of a whole system, so I was rather sloppy in my phrasing. But if T+A+B is inconsistent, and it’s more likely that T+A is inconsistent than that T+B is inconsistent, one might be tempted to say that the inconsistency of T+A+B is more the fault of A than B.

However, after thinking about this more carefully, I’m not sure I would agree any more that that’s a good way to phrase it, so I’m glad you pushed me on it. The precise thing to say is what I said above: there is a theory that retains replacement but weakens ETCS in a different way that also doesn’t affect everyday mathematics very much, and which is less likely to be inconsistent than plain ETCS without replacement. It’s true that adding replacement makes ETCS much stronger, consistency-wise, but the reason it does that is because of the presence of unbounded classical logic in ETCS.

Posted by: Mike Shulman on July 23, 2021 2:37 PM | Permalink | Reply to this

### Re: Large Sets 12

Thanks, Mike. That’s food for thought.

Posted by: Tom Leinster on July 24, 2021 12:54 PM | Permalink | Reply to this

### Re: Large Sets 12

This is a comment brought on by this particular post, rather than being directly about the topic. [And is a partial transcription of what I posted on Twitter, but David Corfield urged me to post it here as well.]

I found it interesting when things went from quantifying over ‘semantics’ (sets, power sets, classes, etc) to ‘syntactic’: i.e. formulas. There seems to be an under-appreciation, in mathematics and computer science, of the non-trivial jump between quantifying over potentially large things (most ‘semantic’ spaces are quite large, i.e. not countable) down to a countable set of terms.

In CS, this occurs as the difference between saying “all things of type X” and “all terms that denote things of type X”.

My favourite example: classically, “all $\mathbb{R}\rightarrow\mathbb{R}$ functions are continuous” is nonsense but “all $\mathbb{R}\rightarrow\mathbb{R}$ functions that you can write down from a constructively-valid vocabulary” is true. One can also phrase this as: if your vocabulary only consists of uniformly continuous functions and operators that preserve uniform continuity, you can only write down continuous functions.

Andrej Bauer pointed to the mathematically deeper Continuity begets continuity work along the same vein.

Main point: “things you can actually write down” is an extremely powerful assumption that seems to often be overlooked.

Posted by: Jacques Carette on July 20, 2021 9:03 PM | Permalink | Reply to this

### Re: Large Sets 12

Thanks, that’s a good point. I did try to flag up this distinction in my aside about quantifying over formulas, but you’ve put it in much bolder and more compelling terms (as well as saying more).

Re syntax versus semantics, I half-remember reading that when Lawvere introduced Lawvere theories, someone like Tarski is supposed to have said something like “this man doesn’t know the difference between syntax and semantics!” (And I can see the benefit of blurring the distinction.) Does anyone know the quote I’m talking about and have the source?

(When I search the web, the main thing I find is me asking the same question here a couple of years ago.)

Posted by: Tom Leinster on July 20, 2021 11:32 PM | Permalink | Reply to this

### Re: Large Sets 12

Ah, I’ve finally found that Tarski quote — except it’s probably not an actual quote. It’s from a set of talk slides by Jean-Pierre Marquis, called Kreisel and Lawvere on Category Theory and the Foundations of Mathematics. Here are pages 38 and 39:

Text: “This man confuses mathematics and metamathematics, syntax and semantics!”

Posted by: Tom Leinster on July 20, 2021 11:44 PM | Permalink | Reply to this

### Re: Large Sets 12

I can see from Jean-Pierre’s From a geometrical point of view that Lawvere attended seminars by Tarski in Palo Alto (p. 198) and that he, Lawvere, gave a talk attended by Tarski (p. 203) (reported by Lambek).

Posted by: David Corfield on July 21, 2021 7:19 AM | Permalink | Reply to this

### Re: Large Sets 12

Thanks, David. It would be nice to know what Tarski actually said, if there’s a record. (I’m assuming Jean-Pierre would have put it in quote marks if it was a verbatim quote.)

Posted by: Tom Leinster on July 21, 2021 10:42 AM | Permalink | Reply to this

### Re: Large Sets 12

I don’t quite understand the following:

“In categorical set theory, a family of sets indexed by a set $I$ is usually defined as a set $X$ together with a function $p: X \to I$. Think of the fibre $p^{-1}(i)$ as the $i$th member of the family. In brief:

A family over $I$ is a function into $I$.”

I don’t know the ETCS-formalisation of preimage/fibre, so I may misunderstand something. Usually, fibres of distinct points are disjoint, so this definition would only permit families of disjoint sets. But that can’t be the case, otherwise you wouldn’t use this definition. Where do I go wrong?

In ZF, indexed families $P_i$ are defined as functions out of $I$ into the powerset of some set $A \supseteq \bigcup_{i\in I} P_i$. Is such a definition not natural for ETCS?

Posted by: Stéphane Desarzens on July 22, 2021 7:40 AM | Permalink | Reply to this

### Re: Large Sets 12

this definition would only permit families of disjoint sets

Good question. The answer is that in ETCS, there is no concept of two abstract sets being disjoint or not. More generally, there’s no notion of the intersection of two abstract sets.

I say “abstract set” to distinguish from subset. If $X$ and $Y$ are presented as subsets of another set $U$, meaning that we’re given injections $i: X \to U$ and $j: Y \to U$, then we can form their intersection. It’s the pullback of $i$ and $j$. So we can, then, ask whether $X$ and $Y$ are disjoint as subsets of $U$.

The fact that you can’t form the intersection of two abstract sets in ETCS can be seen as a feature rather than a bug.

E.g. in ordinary mathematics, what is the intersection of the circle $S^1$ and the 2-sphere $S^2$? One “ordinary” mathematician might say the question doesn’t make sense, because $S^1$ and $S^2$ are only really defined up to isomorphism. Another might visualize $S^1$ as the unit circle in $\mathbb{R}^2$, and $S^2$ as the unit sphere in $\mathbb{R}^3$, and $\mathbb{R}^2$ as embedded as the $x y$-plane of $\mathbb{R}^3$. Then they would answer “$\{(x, y, 0) \in \mathbb{R}^3: x^2 + y^2 = 1\}$”.

Both of our hypothetical mathematicians’ reactions agree with how ETCS behaves. The intersection of $S^1$ and $S^2$ as abstract objects is undefined. But when we choose certain maps $i: S^1 \to \mathbb{R}^3$ and $j: S^2 \to \mathbb{R}^3$, we can indeed intersect them.

A fundamental point that distinguishes ETCS from ZF and all its cousins is that in ETCS, an element can’t be an element of more than one set. We can’t have $x \in X$ and $x \in Y$ unless $X = Y$. (This is a special case of the idea that a function has a specified codomain.) So, for instance, the element of $\mathbb{N}$ that we would normally call $2$ is not the same as the element of $\mathbb{R}$ that we would normally call $2$. They’re related, in that the usual injection $\mathbb{N} \to \mathbb{R}$ maps the $2$ of $\mathbb{N}$ to the $2$ of $\mathbb{R}$. But they’re not the same.

In ZF, indexed families $P_i$ are defined as functions out of $I$ into the powerset of some set $A \supseteq \bigcup_{i\in I} P_i$. Is such a definition not natural for ETCS?

Yes and no.

“No” because $\bigcup_{i \in I} P_i$ isn’t defined in ETCS. The union of even two abstract sets isn’t defined, just as for intersections. (E.g. take finite sets $X$ and $Y$. If we could form their union then we’d know its cardinality, which by the inclusion-exclusion formula would tell us the cardinality of the intersection.)

But again as for intersections, we can always take the union of two subsets of a set. We can also take the disjoint union (coproduct) of two sets.

But “yes” because something similar can be said in ETCS. Preamble: for any set $X$, we can talk about families of elements of $X$: they’re just functions $I \to X$. Of course, we often use different notation when we’re viewing a function as a family (as is very familiar from the case of sequences): maybe $(x_i)_{i \in I}$ rather than $f$, but formally $x_i = f(i)$.

Now in one direction, suppose we start with a set $A$ and an $I$-indexed family $(X_i)_{i \in I}$ of subsets of $A$, i.e. a function $I \to \mathcal{P}(A)$. Then we can turn this $I$-indexed family of subsets of $A$ into an $I$-indexed family of abstract sets. That is, we can build a set $X$ and a map $p: X \to I$ into $I$ whose fibres are $X_i$. So, starting from a family of subsets, we built a corresponding family of abstract sets.

(How do we build $X$? Elementwise, it’s the set of pairs $(i, x)$ where $i \in I$ and $x \in X_i$ — like the classic ZF recipe for forming disjoint unions. From the axioms of ETCS, we take our function $I \to \mathcal{P}(A) = 2^A$, which corresponds to a function $I \times A \to 2$, which in turn corresponds to a subset $X$ of $I \times A$. That’s our $X$.)

In the other direction, suppose we start with an $I$-indexed family of sets, $X \to I$. Then each fibre $X_i$ comes equipped with an inclusion into $X$, so we should be able to define a map $I \to \mathcal{P}(X)$ such that $i \mapsto X_i$, and indeed we can. (I’ll skip the details, but it’s simple enough.) And a map $I \to \mathcal{P}(X)$ is exactly an $I$-indexed family of subsets of $X$. So, starting from a family of abstract sets, we built a corresponding family of subsets.

In summary, in ETCS we can pass back and forth between families of subsets of some fixed set and families of abstract sets. But they’re not quite the same thing.

Posted by: Tom Leinster on July 22, 2021 11:51 AM | Permalink | Reply to this

### Re: Large Sets 12

Thanks for the detailed answer! The precise distinction between “subsets” (monomorphisms) and objects is what I was missing.

This way it looks very much like how set-theory is done inside type-theory, which I’m quite familiar with. So I see the following notational equivalences IINM:

$\begin{array}[t]{lll} \in-based & object/morphism-based & Type-Theory \\ x \in A & 1 \overset{x}{\to} A & x : A \\ U \subseteq A & X \overset{U}{\hookrightarrow} A & U : A \to Type \end{array}$

Then operations are defined sometimes on subsets (inclusion, intersection, union), sometimes on objects/types (product, sum). What one looses, when passing to a multi-sorted theory coming from the one-sorted set theory, is that one now needs to keep track of the inclusion-mappings. And passing these inclusions around from one object to another.

Your example about $S^1$ and $S^2$ is very nice. Really saying: sometimes we aren’t interested in a subobject/subset relative to its ambient space, but standing freely on its own. In such cases we pass from “up” again to objects.

Finally, about the $I$-indexed families: you show/sketch how $I \to \mathscr{P}(X)$ and $X \to I$ are equivalent definitions. We can pass freely from one to the other, and in your setting the latter was more convenient. I’m intrigued and got something to think about. :)

Posted by: Stéphane Desarzens on July 22, 2021 9:57 PM | Permalink | Reply to this
Read the post Borel Determinacy Does Not Require Replacement
Weblog: The n-Category Café
Excerpt: Despite what they say.
Tracked: July 24, 2021 1:45 AM

Post a New Comment