Universe Polymorphism and Typical Ambiguity

December 9, 2012

Universe Polymorphism and Typical Ambiguity

Posted by Mike Shulman

$MathML-enabled post (click for more details).$

Sorry I’ve been kind of quiet recently. There’s been a lot of good stuff happening at IAS this year, but (aside from my being busy helping to make it happen), not so much of it has been easily bloggable. But here’s something that I’ve just learned: do you know the difference between universe polymorphism and typical ambiguity (or even what either of them means)?

$MathML-enabled post (click for more details).$

In category theory we use “universes” of one sort or another a fair amount. Any time we talk about “small” versus “large” categories, we are using a universe. It might be a Grothendieck universe, or it might be the proper-class universe of sets, but whatever it is, we are employing some device to separate out some of the “things” under consideration which belong to the “universe”, in such a way that we can treat the universe itself as a “thing”.

One of the problems with universes is that sometimes you need to change your universe. For instance, suppose you prove something about small categories. Later on, you find that you need to know that the corresponding theorem for large categories is also true. What do you do?

In practice, one may make some remark about this being obvious, or decline to remark on it at all. But when being absolutely precise, there are several options. One convenient one is to consider the theorem about small categories to have been (implicitly) universally quantified over the universe which defines the notion of “small”. Thus, if we choose a different universe, with respect to which the categories that we’re currently calling “large” are actually “small”, then our original theorem can be applied to them.

In type theory, this is called universe polymorphism: a notion (theorem, definition, proof, etc.) is universe polymorphic if it is universally quantified over one or more universes.

The quantification may be implicit or explicit. A few books and papers in category theory actually make it explicit: rather than “small category” they always write “ $U$ -small category” (or “ $U$ -category”) for some universe $U$ , and so on. I find this to be extremely tedious and to be eschewed at all costs. But is that ever a problem?

Well, when writing a paper where there’s only one universe in play, there’s no problem with just saying “small category” to mean “small category with respect to the universe”. In a sense, the entire paper is universally quantified over a single universe. If there are multiple universes in use, then one can introduce phrases like “small category”, “moderate category”, “large category”, “very large category”, etc.

However, if within the scope of one text, you want to prove something about small categories and then apply it to both small and large ones, then the universe polymorphism has to happen at the level of individual theorems. This is no problem if we make the quantification explicit: we prove one theorem of the form “for any $U$ , blah blah $U$ -small categories…” and later on we say “by theorem such-and-such with $U=V$ [resp. $U=V'$ ] …”. Can we avoid that? Can the phrase “small category” mean one thing in one place and another thing in another one? That sounds… ambiguous.

It is ambiguous, of course, but sometimes ambiguity is okay, as long as there is a consistent way to resolve it. The sort of ambiguity we’re interested in is called typical ambiguity, and dates back to Bertrand Russell’s original theory of types.

Basically, we want to let ourselves use a single variable “ $U$ ” referring to a universe, but to allow that variable to mean different things when we use it in different places. Of course, this ambiguity will carry over to anything defined in terms of $U$ . For instance, if we define “small category” to mean “category whose objects and morphisms belong to $U$ ”, then the phrase “small category” will likewise be allowed to mean different things in different places.

But actually, in the world of typical ambiguity, it would be silly to define “small category” at all: since every category belongs to some universe, and our universe variable $U$ could refer to any universe, we might as well just define a category to be one whose objects and morphisms belong to $U$ . (Similarly, we ought to simply define “set” — or, in type theory, “type” — to mean “element of $U$ ”. This is, I guess, the origin of the name “typical ambiguity”: the meaning of the word “type” is ambiguous.) Now we may prove a theorem about categories in one place, and then later on apply that theorem both to a particular category $A$ , and to the category of categories of which $A$ is an object.

Why is this okay? Obviously, a completely unfettered use of language like this would allow us to reproduce the classical paradoxes. What we need to know is that there is some assignment of a particular universe to each occurrence of the variable $U$ which makes all the size issues come out okay. For instance, whenever we write the phrase “category of categories”, the universe assigned to the $U$ occurring in its first word “category” must contain the universes assigned to the $U$ s occurring in all particular categories that are ever said to be objects of this “category of categories”. Of course, we might say “the category of categories” in more than one place, in which case the universes that get assigned to them might also be different.

This sounds like it might be tedious, but fortunately, when we are doing formalized mathematics with a computer, our proof assistant can often check automatically for us that such an assignment of universe levels exists. For instance, Coq does this. In fact, Coq does not even support explicit universe polymorphism: the only thing you can write to refer to a universe is “ $Type$ ”, which is typically ambiguous: Coq then automatically tries to assign consistent universe levels to all occurrences of “ $Type$ ”.

(Actually, the current version of Coq, though typically ambiguous in this way, is not fully universe polymorphic: when we define “category” the occurrence of “ $Type$ ” therein might refer to any universe, but it must refer to the same universe everywhere that we use the word “category”. This is obviously problematic. Fortunately, there are better algorithms available, due to Bob Harper and others, which can handle “true” universe polymorphism with typical ambiguity. Mathieu Sozeau is almost done implementing one of these in a new version of Coq. The proof assistant Agda, on the other hand, is universe polymorphic, but not typically ambiguous: you have to write universal quantifications over universes explicitly.)

I would like to argue for more use of typical ambiguity in non-formalized mathematics as well. It’s true that one has to verify that there is a consistent assignment of universe levels, which to do completely precisely might be tedious. However, in ordinary mathematics we do plenty of other things that would also be tedious if made completely precise, and which one might hope could be automated when we go to a formalized world. We often mark them with “it is easy to show that” or “it is obvious that” or “it is left to the reader to show that”, or just assert them without justification.

Typical ambiguity with universe polymorphism takes a little getting used to, but once you’ve internalized it, it is very convenient. For instance, here’s an example I was writing about the other day in the informal type theory project: we can simply define a “cardinal number” to be an isomorphism class of sets. Since there is a set of sets, there is no problem forming this quotient. (In univalent type theory, this would be the 0-truncation of the type of h-sets.) There is then a set of cardinal numbers, which inherits addition and multiplication operations making it into a semiring. Likewise, we can define an “ordinal number” to be an isomorphism class of well-ordered sets. There is a set of ordinal numbers, which is itself an ordinal number. And there is a function from ordinals to cardinals which forgets the orderings, which by the well-ordering principle is surjective and has a section, assigning to each cardinal number the least element of the fiber over it.

Even in a set-theoretic foundation, I find this use of language to be much more congenial than the usual circumlocutions regarding proper classes, axioms of replacement, representatives of least rank, etc. etc.

What about Cantor’s paradox that there is no set of sets, or Burali-Forti’s paradox that the set/class of ordinals is not an ordinal? Those proofs can be written in the language of typical ambiguity, but they fail the universe consistency check. However, as long as you aren’t deliberately perverse, you won’t usually have to worry about universe inconsistencies. Moreover, in non-formalized mathematics, we are always free to drop into explicit universe polymorphism if it ever seems necessary for clarity. But most of the time, it isn’t.

Posted at December 9, 2012 8:11 PM UTC

TrackBack URL for this Entry: https://golem.ph.utexas.edu/cgi-bin/MT-3.0/dxy-tb.fcgi/2579

11 Comments & 0 Trackbacks

Re: Universe Polymorphism and Typical Ambiguity

Since you’ve mentioned typical ambiguity, I cannot resist mentioning that NF has typical ambiguity built right into its construction – and that is how it avoids Cantor’s paradox, Burali-Forti’s paradox, etc. Unfortunately, as you are doubtless aware, NF has its own problems…

Posted by: Zhen Lin on December 9, 2012 10:49 PM | Permalink | Reply to this

Re: Universe Polymorphism and Typical Ambiguity

$MathML-enabled post (click for more details).$

In what sense does NF have typical ambiguity? I thought in NF there was literally one set of all sets.

Maybe you’re thinking of the stratification requirement on formulas in the comprehension scheme? That is similar, but not I think the same….

Posted by: Mike Shulman on December 10, 2012 12:55 AM | Permalink | Reply to this

Re: Universe Polymorphism and Typical Ambiguity

It goes back to Russellian type theory: the typing here refers not to set-theoretic universes but how many levels of sets of sets of … of sets we can have. The stratification (or rather, stratifiability) requirement in NF is essentially what we get by saying “everything that can be constructed in a stratified way exists, but we don’t bother distinguishing between different types once we’ve constructed what we want”. As far as I can tell, the difference is only in how finely we stratify the universe.

Posted by: Zhen Lin on December 10, 2012 8:44 AM | Permalink | Reply to this

Re: Universe Polymorphism and Typical Ambiguity

$MathML-enabled post (click for more details).$

It sounds like what you’re saying is that Russellian type theory (and NF) mean something different by “typical ambiguity” than is meant in modern type theory. That doesn’t surprise me in principle (although it does surprise me a little in this case, given what I thought I knew), since modern type theory bears little resemblance to the Russellian original. Anyway, if that’s right, then it sounds like the Russellian/NF version also won’t be solving the same problems (although of course NF has its own approach to “solving” the problem of universes).

Posted by: Mike Shulman on December 10, 2012 3:04 PM | Permalink | Reply to this

Re: Universe Polymorphism and Typical Ambiguity

$MathML-enabled post (click for more details).$

Hmmm. I think it is the same ‘typical ambiguity’: in both cases we have some stratification of the universe, and in both cases we are basically working locally and ignoring the precise level of the hierarchy we are at and focusing more on the relative levels of the various constructions under consideration. At least to me, there seems to be no conceptual difference between typical ambiguity in NF and typical ambiguity in say, the system $\mathrm{ZFC}/U_{\lt \omega}$ in [Feferman, 2004].

Posted by: Zhen Lin on December 10, 2012 6:47 PM | Permalink | Reply to this

Re: Universe Polymorphism and Typical Ambiguity

$MathML-enabled post (click for more details).$

In the systems I was writing about, there are many different universes, and when we write $Type:Type$ its meaning is ambiguous: it could mean $Type_n : Type_{n+1}$ for any $n$ .

Similarly, in Feferman’s system there are, actually, a countable infinity of universes, and a given statement can be interpreted relative to any of them. His reflectivity property of the universes means that its truth in one universe is equivalent to its truth in any other, but there still are all the different universes.

By contrast, my understanding of NF is that there is one set of all sets, $V = \{ x | \top \}$ , and that when we write $V\in V$ this is just a single true statement, not ambiguous at all.

Posted by: Mike Shulman on December 11, 2012 2:09 AM | Permalink | Reply to this

Re: Universe Polymorphism and Typical Ambiguity

$MathML-enabled post (click for more details).$

Hmmm. Perhaps I was not clear. Essentially, NF is the _result_ of extracting the typically ambiguous fragment of Russellian type theory TST. (If I understand correctly, Specker [1962] showed that NF is consistent if and only if TST with typical ambiguity is consistent.) For example, although $X : Type_n$ implies $\mathcal{P} X : Type_{n+1}$ in TST, once we erase the types we get the NF theorem “ $X \in V$ implies $\mathcal{P} X \in V$ ”. Similarly, $V \in V$ in NF corresponds to the uncontroversial $Type_n : Type_{n+1}$ in TST. The fact that the category of NF sets fails to be cartesian closed corresponds to the fact that the graph of the evaluation map cannot be consistently typed. (In TST, the entries of an ordered pair must have the same type.)

Anyway, maybe what I really want to say is that typical ambiguity is a subtle beast. But perhaps we can get away with it in the case where our universes $Type_n$ look like a model of ZFC…

Posted by: Zhen Lin on December 11, 2012 9:10 AM | Permalink | Reply to this

Re: Universe Polymorphism and Typical Ambiguity

$MathML-enabled post (click for more details).$

Okay, I think I get what you’re saying. NF is one thing that might result if you start from a typically ambiguous system and say “now let’s suppose instead that we actually have only one universe, but let’s restrict the things that we can do with it in some way roughly motivated by the typically ambiguous statements that passed the universe consistency check.”

I also didn’t realize that Russell’s type theory was as bizarre as NF is! Typical ambiguity in modern type theory is completely unproblematic and known to be consistent (with no need for any ZFC or reflection principles). Bob Harper’s algorithm just tells you how to assign universe levels, and if it can’t be done, then you fail.

Posted by: Mike Shulman on December 11, 2012 4:43 PM | Permalink | Reply to this

Re: Universe Polymorphism and Typical Ambiguity

Hey Mike, you forgot to mention Bob Harper here.
He is behind Mathieu’s work on this, and much earlier work.

Steve

Posted by: Steve Awodey on December 10, 2012 2:26 AM | Permalink | Reply to this

Re: Universe Polymorphism and Typical Ambiguity

$MathML-enabled post (click for more details).$

Sorry! I was aiming for mere exposition and hoping I could get away without giving lots of references. But I made the mistake of mentioning at least one person’s name, so to be fair I should try to mention everyone else as well. As Steve says, the formal theory of universe polymorphism and typical ambiguity in type theory which Mathieu is implementing is essentially from this paper.

On the set-theoretic side, one should perhaps also mention Solomon Feferman, who has been working on modifications of ZFC that support a related sort of typical ambiguity using reflection principles. I blogged about one of those theories back here (with some discussion in the comments that is interesting to look back on from three years down the road). There are a bunch of other interesting-looking things in his list of papers that I still intend to read…

Posted by: Mike Shulman on December 10, 2012 3:49 AM | Permalink | Reply to this

Re: Universe Polymorphism and Typical Ambiguity

$MathML-enabled post (click for more details).$

I’m interested in the relationship of this topic to classical Algebra and to universal algebra; however, I don’t see here a definition of “polymorphism” that is compatible with the notion of a homomorphism, even in an extended sense. Are there references where I can find out more about such a relationship or definitions?

For instance, to me, a natural use of the term “polymorphism” would in algebra-ish terms or model theoretical terms, mean a generalization of the notion of a homomorphism (the classical morphisms of traditional algebra or of universal algebra, and more generally, of relational structures in model theory, where the definition is expanded to include “weak” homomorphisms and “strong” homomorphisms”) to include multivariate functions from one algebra into another, but here the notion seems to not mean a kind of morphism; multivariate because of the prefix “poly”.

From my reading, it seems like computer science uses of the term “polymorphism” refers to what I know as “overloading operators”, which is a kind of intuitionally descriptive phraseology abbot a notion that is in mathematics referred to often as “function extension”, in which one extends the domain of a function in some way to create a broader usage for the properties of a given function, which in complex analysis usually means a variation on “analytic continuation”, but in other areas may merely mean finding a function on a large donation whose restriction to a given domain is a particular given function. A way to see this connection is that one can view a vector space as a sorted structure having a vector sort and a scalar sort, and view “addition” on such a structure (as a partial operation or as a total operation) as either scalar addition extended to the union of the sorts or as an overloaded binary operator that either adds scalars or adds vectors, and in some interpretations can add a scalar to a vector. In a given context, this can be made precise so that the computer science (programming) notion of an overloaded operator can be studied formally mathematically, or of an extension of a function to a larger domain, either extended totally or partially.

If you know of a “structure-preserving mapping” notion like this, will you please point me to the literature where careful deductions are given along these lines? I’m fine with category-theoretical definitions for what I’m requesting, but I need to see how the notion can be adapted to the traditional settings without having to do brain somersaults to get there.

Posted by: Matt on October 14, 2025 4:09 PM | Permalink | Reply to this

The n-Category Café

Skip to the Main Content

December 9, 2012