Universe Polymorphism and Typical Ambiguity
Posted by Mike Shulman
Sorry I’ve been kind of quiet recently. There’s been a lot of good stuff happening at IAS this year, but (aside from my being busy helping to make it happen), not so much of it has been easily bloggable. But here’s something that I’ve just learned: do you know the difference between universe polymorphism and typical ambiguity (or even what either of them means)?
In category theory we use “universes” of one sort or another a fair amount. Any time we talk about “small” versus “large” categories, we are using a universe. It might be a Grothendieck universe, or it might be the proper-class universe of sets, but whatever it is, we are employing some device to separate out some of the “things” under consideration which belong to the “universe”, in such a way that we can treat the universe itself as a “thing”.
One of the problems with universes is that sometimes you need to change your universe. For instance, suppose you prove something about small categories. Later on, you find that you need to know that the corresponding theorem for large categories is also true. What do you do?
In practice, one may make some remark about this being obvious, or decline to remark on it at all. But when being absolutely precise, there are several options. One convenient one is to consider the theorem about small categories to have been (implicitly) universally quantified over the universe which defines the notion of “small”. Thus, if we choose a different universe, with respect to which the categories that we’re currently calling “large” are actually “small”, then our original theorem can be applied to them.
In type theory, this is called universe polymorphism: a notion (theorem, definition, proof, etc.) is universe polymorphic if it is universally quantified over one or more universes.
The quantification may be implicit or explicit. A few books and papers in category theory actually make it explicit: rather than “small category” they always write “-small category” (or “-category”) for some universe , and so on. I find this to be extremely tedious and to be eschewed at all costs. But is that ever a problem?
Well, when writing a paper where there’s only one universe in play, there’s no problem with just saying “small category” to mean “small category with respect to the universe”. In a sense, the entire paper is universally quantified over a single universe. If there are multiple universes in use, then one can introduce phrases like “small category”, “moderate category”, “large category”, “very large category”, etc.
However, if within the scope of one text, you want to prove something about small categories and then apply it to both small and large ones, then the universe polymorphism has to happen at the level of individual theorems. This is no problem if we make the quantification explicit: we prove one theorem of the form “for any , blah blah -small categories…” and later on we say “by theorem such-and-such with [resp. ] …”. Can we avoid that? Can the phrase “small category” mean one thing in one place and another thing in another one? That sounds… ambiguous.
It is ambiguous, of course, but sometimes ambiguity is okay, as long as there is a consistent way to resolve it. The sort of ambiguity we’re interested in is called typical ambiguity, and dates back to Bertrand Russell’s original theory of types.
Basically, we want to let ourselves use a single variable “” referring to a universe, but to allow that variable to mean different things when we use it in different places. Of course, this ambiguity will carry over to anything defined in terms of . For instance, if we define “small category” to mean “category whose objects and morphisms belong to ”, then the phrase “small category” will likewise be allowed to mean different things in different places.
But actually, in the world of typical ambiguity, it would be silly to define “small category” at all: since every category belongs to some universe, and our universe variable could refer to any universe, we might as well just define a category to be one whose objects and morphisms belong to . (Similarly, we ought to simply define “set” — or, in type theory, “type” — to mean “element of ”. This is, I guess, the origin of the name “typical ambiguity”: the meaning of the word “type” is ambiguous.) Now we may prove a theorem about categories in one place, and then later on apply that theorem both to a particular category , and to the category of categories of which is an object.
Why is this okay? Obviously, a completely unfettered use of language like this would allow us to reproduce the classical paradoxes. What we need to know is that there is some assignment of a particular universe to each occurrence of the variable which makes all the size issues come out okay. For instance, whenever we write the phrase “category of categories”, the universe assigned to the occurring in its first word “category” must contain the universes assigned to the s occurring in all particular categories that are ever said to be objects of this “category of categories”. Of course, we might say “the category of categories” in more than one place, in which case the universes that get assigned to them might also be different.
This sounds like it might be tedious, but fortunately, when we are doing formalized mathematics with a computer, our proof assistant can often check automatically for us that such an assignment of universe levels exists. For instance, Coq does this. In fact, Coq does not even support explicit universe polymorphism: the only thing you can write to refer to a universe is “”, which is typically ambiguous: Coq then automatically tries to assign consistent universe levels to all occurrences of “”.
(Actually, the current version of Coq, though typically ambiguous in this way, is not fully universe polymorphic: when we define “category” the occurrence of “” therein might refer to any universe, but it must refer to the same universe everywhere that we use the word “category”. This is obviously problematic. Fortunately, there are better algorithms available, due to Bob Harper and others, which can handle “true” universe polymorphism with typical ambiguity. Mathieu Sozeau is almost done implementing one of these in a new version of Coq. The proof assistant Agda, on the other hand, is universe polymorphic, but not typically ambiguous: you have to write universal quantifications over universes explicitly.)
I would like to argue for more use of typical ambiguity in non-formalized mathematics as well. It’s true that one has to verify that there is a consistent assignment of universe levels, which to do completely precisely might be tedious. However, in ordinary mathematics we do plenty of other things that would also be tedious if made completely precise, and which one might hope could be automated when we go to a formalized world. We often mark them with “it is easy to show that” or “it is obvious that” or “it is left to the reader to show that”, or just assert them without justification.
Typical ambiguity with universe polymorphism takes a little getting used to, but once you’ve internalized it, it is very convenient. For instance, here’s an example I was writing about the other day in the informal type theory project: we can simply define a “cardinal number” to be an isomorphism class of sets. Since there is a set of sets, there is no problem forming this quotient. (In univalent type theory, this would be the 0-truncation of the type of h-sets.) There is then a set of cardinal numbers, which inherits addition and multiplication operations making it into a semiring. Likewise, we can define an “ordinal number” to be an isomorphism class of well-ordered sets. There is a set of ordinal numbers, which is itself an ordinal number. And there is a function from ordinals to cardinals which forgets the orderings, which by the well-ordering principle is surjective and has a section, assigning to each cardinal number the least element of the fiber over it.
Even in a set-theoretic foundation, I find this use of language to be much more congenial than the usual circumlocutions regarding proper classes, axioms of replacement, representatives of least rank, etc. etc.
What about Cantor’s paradox that there is no set of sets, or Burali-Forti’s paradox that the set/class of ordinals is not an ordinal? Those proofs can be written in the language of typical ambiguity, but they fail the universe consistency check. However, as long as you aren’t deliberately perverse, you won’t usually have to worry about universe inconsistencies. Moreover, in non-formalized mathematics, we are always free to drop into explicit universe polymorphism if it ever seems necessary for clarity. But most of the time, it isn’t.
Re: Universe Polymorphism and Typical Ambiguity
Since you’ve mentioned typical ambiguity, I cannot resist mentioning that NF has typical ambiguity built right into its construction – and that is how it avoids Cantor’s paradox, Burali-Forti’s paradox, etc. Unfortunately, as you are doubtless aware, NF has its own problems…