### In Praise of Dependent Types

#### Posted by Mike Shulman

I’ve written before about “structural” foundations for mathematics, which can also be thought of as “typed” foundations, in the sense that every object comes with a type, and typing restrictions prevent us from even asking certain nonsensical questions. For instance, whether the cyclic group of order 17 is equal to the square root of 2 is such a nonsensical question, since the first is a group while the second is a real number, and only things of the same type can be compared for equality.

The structural axiom systems we’ve talked about, such as ETCS and SEAR, are presented as first-order theories whose objects of study are things with names such as “sets,” “functions,” and “relations.” Although such presentations look a bit different (and hopefully less syntactically imposing) than what is usually called type theory, it’s not too hard to write down more traditional-looking “type theories” that capture essentially the same models as ETCS and SEAR. However, these theories are missing an important aspect that is present in many type theories, namely *dependent types*.

If your introduction to type theory has been through topos theory and categorical logic, as mine was, it may take some time to really appreciate dependent types. At least, it’s taken me until now. The Elephant spends all of one section of one chapter (D4.4) on dependent type theory as the internal logic of locally cartesian closed categories, closing with the remark that the generalization from topoi to such categories “…has been won at the cost of a very considerable increase in the complexity of the calculus itself, and so we shall not pursue it further.” However, while dependent types are not *necessary* in any theory containing powersets (which, I think, includes all the foundational theories of interest to most mathematicians), I’m coming to believe that they are really a very convenient, natural, and ubiquitous notion.

Dependent types are really everywhere in mathematics. Wherever we speak about “a family of sets” indexed by some other set, that’s a dependent type. For instance, let $C$ be a small category and consider the notion of a $C$-diagram in $Set$, i.e. a functor $F\colon C\to Set$. This consists of a *family* of sets $F_0(c)$ for each $c\in ob(C)$, together with a family of functions $F_1(f)\colon F_0(c)\to F_0(c')$ for each $f\colon c\to c'$ in $C$, satisfying functoriality axioms. In other words, $F_0$ is a *dependent type* which depends on specifying an element of the set $ob(C)$.

Dependent types are also everywhere in natural language. One standard example is that for any month $m$, there is a type (or set) $D(m)$ of the days in that month. Since different months have different numbers of days, the cardinality of $D(m)$ is different for different $m$. Of course, because of leap years, $D$ actually depends not only on $m$ but also on a year $y$. David gave a couple of nice examples here of the need for dependent types in expressing quantifiers in everyday language.
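In a proof assistant built on dependent type theory, such a family is literally a function into a universe of types. A minimal sketch in Lean 4, using the built-in family `Fin n` (the type of naturals below `n`) to stand in for days-in-a-month:

```lean
-- An indexed family of types is just a function into the universe `Type`.
-- `Fin n` plays the role of D(m): a type whose size varies with the index.
def D : Nat → Type := fun n => Fin n

#check D 28  -- a type with 28 elements
#check D 31  -- a type with 31 elements
```

The different indices give genuinely different types: an element of `D 28` is not an element of `D 31`.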

In a material set theory such as ZFC, where everything is a set, including the elements of other sets, we can talk about “sets of sets” instead of “families of sets.” This isn’t quite a dependent type, though, since (in the functor example) we need each set $F_0(c)$ to be associated to the element $c$ of the previously given set $ob(C)$. But we can instead consider $F_0$ to be the set of ordered pairs $(c,F_0(c))$, which again is possible since the components of an ordered pair can themselves *be* sets. Another way of saying this is that $F_0$ is a “function” from $ob(C)$ to the proper class of all sets.

In a structural set theory such as ETCS or SEAR, dependent types have to be expressed differently, since the elements of sets and ordered pairs are not themselves sets. The standard trick is to model $F_0$ by a single set which is intuitively the disjoint union $\coprod_{c\in ob(C)} F_0(c)$, equipped with the obvious function $F_0 \to ob(C)$. The particular sets $F_0(c)$ can then be recovered (up to isomorphism, of course) as the fibers of this projection.

However, there’s something a bit unsatisfying about this; shouldn’t the notion of a family of sets exist prior to the assertion that any such family has a coproduct? The alternative adopted by DTT is to simply take “dependent types” as one of the basic notions of our theory. (For now, consider “type” to be a synonym for “set”.) Thus, for every type $A$ we have a basic notion of “$A$-indexed family of sets/types.” If $B$ is a type depending on $A$, then for each $a\in A$ we have a type $B(a)$. Of course, we can also have further types dependent on $B(a)$, and so on. We can then assert, as an axiom (or “type constructor”), the possibility of taking the coproduct of any dependent type—this is usually called a *dependent sum* and written $\Sigma_{a\in A} B(a)$. An element of $\Sigma_{a\in A} B(a)$ consists of an element $a\in A$ together with an element $b\in B(a)$.
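In Lean 4, for instance, this dependent sum is the built-in `Sigma` type: an element really is a pair whose second component’s type depends on the first (a small sketch, not tied to any particular development):

```lean
-- Σ_{n : Nat} Fin n: a pair of an index n and an element of Fin n.
def p : (n : Nat) × Fin n := ⟨3, 1⟩  -- typechecks because 1 is an element of Fin 3

#eval p.1  -- the first projection recovers the index
```

Note that `⟨3, 2⟩` would also typecheck, but `⟨3, 5⟩` would not: the type of the second component changes with the first.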

Adding dependent types doesn’t make the theory any more *expressive* than ETCS or SEAR, since as long as we have dependent sums, we can always interpret a dependent type $B(a)$ by $\Sigma_{a\in A} B(a)$ equipped with its projection to $A$. So if your overriding goal is parsimony of notions, then you’d reject dependent types as unnecessary. But the point I’m making is that in mathematics and in everyday language, dependent types are what we really talk about, and their encoding in ZFC or ETCS is just that—an encoding. And once you start looking at them in their own right, the theory of dependent types is really quite beautiful, and has other important applications; let me try to give you a feel for a few.

First of all, there is the dual notion of a dependent sum, called a *dependent product* $\Pi_{a\in A} B(a)$. An element of $\Pi_{a\in A} B(a)$ consists of a function which assigns to each $a\in A$ an element of $B(a)$. This is exactly the sort of thing that the axiom of choice says will always exist whenever each $B(a)$ is nonempty. Dependent sums and products are dual because when a type dependent on $A$ is interpreted as a projection to $A$, i.e. an object of the slice category $Set/A$ (or with some more exotic category replacing $Set$), then they are represented by left and right adjoints, respectively, to the pullback functor $A^*\colon Set \to Set/A$.
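In Lean 4 the dependent product is the ordinary dependent function type: a term of `(a : A) → B a` assigns to each `a` an element of `B a`. A minimal sketch:

```lean
-- Π_{n : Nat} Fin (n + 1): for every n, pick an element of Fin (n + 1).
-- Each fiber is nonempty, so no choice principle is needed to write this down.
def zeroOf : (n : Nat) → Fin (n + 1) := fun n => 0

#eval zeroOf 5  -- an element of Fin 6
```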

Dependent products include, as a particular case, function spaces, since if $B$ is a type *not* dependent on $A$ (i.e. $B(a)$ is independent of $a\in A$), an element of $\Pi_{a\in A} B$ is just a function $A\to B$. Similarly, dependent sums include *binary* cartesian products, since if $B$ doesn’t depend on $A$, an element of $\Sigma_{a\in A} B$ is just a pair $(a,b)$ with $a\in A$ and $b\in B$. So dependent sums and products together encode all the structure of a locally cartesian closed category.
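Both collapses can be seen in Lean 4: over a constant family, the dependent function type is definitionally the ordinary function type, and the dependent sum carries exactly the data of a pair (a sketch):

```lean
-- Π over a constant family *is* the function type A → B:
example : ((_ : Nat) → Bool) = (Nat → Bool) := rfl

-- Σ over a constant family carries the same data as a cartesian product:
def toProd : Sigma (fun (_ : Nat) => Bool) → Nat × Bool := fun ⟨a, b⟩ => (a, b)
def toSigma : Nat × Bool → Sigma (fun (_ : Nat) => Bool) := fun (a, b) => ⟨a, b⟩
```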

Dependent types can also be used to represent predicate logic. Recall that we like to think of a *truth value* as a $(-1)$-category, or equivalently as a subsingleton set. Operations on truth values, like “and” and “or” and “implies,” can then be represented by operations in the poset of subsingletons. This is all well and good for propositional logic, but for predicate logic we need to talk about logical formulas with free variables, like $(x+y = y+x)$ for real numbers $x$ and $y$. But hey, if for each fixed $x,y\in \mathbb{R}$ the formula $(x+y = y+x)$ is a truth value, i.e. a subsingleton, then the whole formula $(x+y = y+x)$ is a dependent type which depends on $x,y\in \mathbb{R}$, each of whose individual types is a subsingleton.

Under this interpretation, called “propositions as types,” the operations on dependent types relate to quantifiers in predicate logic. For instance, if $\varphi(a)$ is a proposition dependent on $a\in A$, then an element of the dependent product $\Pi_{a\in A} \varphi(a)$ consists of a function assigning to each $a\in A$ an element of $\varphi(a)$, i.e. an assertion that $\varphi(a)$ is true. In other words, $\Pi_{a\in A} \varphi(a)$ is the subsingleton representing the universal quantifier $\forall a\in A.\, \varphi(a)$. If we have a particular $a_0\in A$ and an element $f\in \Pi_{a\in A} \varphi(a)$, then the evaluation $f(a_0)$ shows us that $\varphi(a_0)$ is true. It’s no coincidence that we use the same word for “applying a function” and “applying a theorem”!
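Lean 4 takes this interpretation literally: a proof of a universally quantified statement is a dependent function, and applying the theorem is function application. A small sketch using the core library lemma `Nat.add_comm`:

```lean
-- ∀ x y, x + y = y + x is the dependent product Π_{x,y} (x + y = y + x).
def allComm : ∀ x y : Nat, x + y = y + x := Nat.add_comm

-- "Applying the theorem" to particular values is literally evaluation:
example : 2 + 3 = 3 + 2 := allComm 2 3
```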

The propositions-as-types interpretation is very convenient if you’re implementing a computer system for doing mathematics, since it means that you can essentially write one piece of code for dealing with dependent types, and hey-presto you’ve also got code that works for predicate logic. One could argue that this really recovers whatever parsimony was lost by introducing dependent types as a basic notion, since whatever the system you use, you’ve got to have predicate logic somehow.

It also admits of vast and wonderful generalizations, for once we’re interpreting propositions as types, who says the types always have to be subsingletons? We can instead interpret a proposition $\varphi$ by “the set of all proofs of $\varphi$” or “the set of all reasons why $\varphi$ is true”—a set which contains strictly more information than just whether $\varphi$ is true. You might already have noticed that dependent sums take us out of the world of subsingletons: if $\varphi(a)$ is a proposition dependent on $a$, then $\Sigma_{a\in A}\varphi(a)$ isn’t just a subsingleton representing the proposition $\exists a\in A. \varphi(a)$ (which we might naively expect); rather, it’s the set of all pairs $(a,t)$ where $a\in A$ and $t\in\varphi(a)$ is an assertion/proof/reason why $\varphi(a)$ is true. In other words, it’s the set of *witnesses* to the assertion $\exists a\in A. \varphi(a)$, which is strictly more informative.
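Lean 4 makes the extra information visible: a `Subtype` (its Σ-type over a proposition) carries an explicit witness that can be projected back out, unlike a bare existential statement (a sketch):

```lean
-- A Σ-style element: the witness 4 together with a proof that it is even.
def evenWitness : { n : Nat // n % 2 = 0 } := ⟨4, rfl⟩

#eval evenWitness.val  -- the witness is recoverable: 4
```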

Another nice thing about dependent types is that they can be used to eliminate evil. The problem with a naive interpretation of “do no evil,” i.e. “never talk about two objects of a category being equal,” is that even in order to compose morphisms $f\colon a\to b$ and $g\colon c\to d$ in a category, we need to know that $b$ and $c$ are the same object; it’s not enough to know that they’re isomorphic. But with dependent types, we can say that a category consists of a type $Ob$ of objects, together with a *dependent* type $Hom(x,y)$, where $x,y\in Ob$, and composition operations $Hom(x,y)\times Hom(y,z)\to Hom(x,z)$. In other words, when we compose $f$ and $g$, the assertion that the target of $f$ equals the source of $g$ is a typing assertion, not a statement about equality of two separately given objects. I believe this really matches the way we talk about categories in ordinary mathematical language; we only ever consider a morphism in a category as a morphism in some particular hom-set. Saying that $f$ is a morphism from $a$ to $b$ isn’t an assertion that $s(f) = a$ and $t(f)=b$ for some functions $s,t:Arr\to Ob$; rather, it’s a typing assertion saying to what set $f$ belongs. We can then eliminate evil by simply saying that the type $Ob$ has no equality predicate, while each type $Hom(x,y)$ does. This also leads to what I call 2-categorical logic, and, when taken to the limit, to (∞,1)-categorical logic which we’ve been discussing a bit here.
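This definition transcribes directly into a dependently typed language. A minimal sketch in Lean 4 (the name `Cat` and its fields are just illustrative, not Mathlib’s `Category`):

```lean
structure Cat where
  Ob   : Type
  Hom  : Ob → Ob → Type                        -- a dependent type over pairs of objects
  id   : (x : Ob) → Hom x x
  comp : {x y z : Ob} → Hom x y → Hom y z → Hom x z
  -- (unit and associativity axioms omitted in this sketch)
```

Composition only typechecks when the intermediate object matches, so “the target of $f$ is the source of $g$” is enforced by typing, with no equality test on `Ob` anywhere.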

Finally, one can even argue that dependent types may be the future of computer programming. In general, proponents of strongly statically typed languages such as C++, Java, ML, and Haskell claim that unlike in dynamically typed languages such as Perl, Python, and PHP, static typing means that more errors can be caught at compile-time. And catching errors at compile-time is really what writing reliable code is all about—a “bug” is, by definition, an error in programming that escapes detection until run-time (the user’s run-time). For instance, in a dynamically typed language, we might have a function $multiply\underline{ }matrices(A,B)$ defined in one place, and somewhere else we could call that function like $multiply\underline{ }matrices("hello, world", -15)$ (maybe I was having an off-day when I wrote that). The compiler would notice nothing wrong until we ran the program and the function $multiply\underline{ }matrices$ complained about being given a string and an integer instead of a pair of matrices (and in practice, you’d be lucky if its error message were that explicit). If the offending call to $multiply\underline{ }matrices$ were embedded deep in some complicated logic in such a way that it only got invoked on the second Tuesday of August in a leap year, then every so often our users would notice an inexplicable crash that we might have a lot of trouble tracking down. But in a statically typed language, the function would be declared as $multiply\underline{ }matrices(matrix A, matrix B)$, so that when the compiler encountered the call $multiply\underline{ }matrices("hello, world", -15)$, it would complain to us, the programmers, about a typing error. We could then try to figure out what the heck we must have meant by $multiply\underline{ }matrices("hello, world", -15)$, without causing any such headaches to our users.

None of that requires *dependent* types, but there’s a closely related (and, unfortunately, even more plausible) error that ordinary static typing won’t save us from. In order to multiply two matrices $A$ and $B$, we need to know that the number of columns of $A$ equals the number of rows of $B$. Presumably the function $multiply\underline{ }matrices$ will do some checking along these lines, but it can only do that *at run-time*, since the single type $matrix$ has to include matrices of all sizes. So if it happens that buried somewhere in our complicated code we happened to call $multiply\underline{ }matrices$ on a pair of matrices whose dimensions don’t match (a much easier mistake to make than calling it on a string and an integer), then on the second Tuesday of August in leap years our users will be presented with the error “mismatch in matrix dimensions”.

However, if we’re programming in a *dependently typed* language, then we can have a type $(matrix n m)$ which depends on a pair of natural numbers $n$ and $m$. (This is different from parametric polymorphism, in which you have a type like $(List X)$ which depends on another *type* $X$; here the type $(matrix n m)$ depends on two *values* $n,m$ of type $Nat$.) Then our multiplication function can be defined as $multiply\underline{ }matrices( (matrix n m) A, (matrix m k) B)$, and when we call it elsewhere with a value $A$ of type $(matrix 2 4)$ and $B$ of type $(matrix 7 3)$, the *compiler* will complain about a type mismatch, sparing our users from experiencing the same error at run-time.
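A sketch of this in Lean 4, which has full dependent types (the names `Matrix` and `mulMat` are just illustrative):

```lean
-- Matrices indexed by their dimensions, which are *values* of type Nat.
def Matrix (n m : Nat) := Fin n → Fin m → Int

-- The type alone forces the inner dimensions to agree.
def mulMat {n m k : Nat} (A : Matrix n m) (B : Matrix m k) : Matrix n k :=
  fun i j => (List.range m).foldl (fun acc l =>
    if h : l < m then acc + A i ⟨l, h⟩ * B ⟨l, h⟩ j else acc) 0

def A : Matrix 2 4 := fun _ _ => 1
def B : Matrix 7 3 := fun _ _ => 1
-- `mulMat A B` is rejected at compile time: the checker cannot unify 4 with 7.
```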

If we take this to extremes and incorporate propositions-as-types as well, then we can actually define a function whose *type* includes the assertion that the behavior of the function is correct. For instance, suppose we’re defining a function to compute the smallest prime factor of some natural number. In an ordinary statically typed language, this function might be typed as $int \to int$. A more precise typing involving dependent types would be $\Pi_{n\in \mathbb{N}} \{ k | 0 \le k \le n\}$, since the smallest prime factor of $n$ must be no bigger than $n$. But with propositions as types, we can actually write a type for the function which is essentially “the type of all functions which take as input a natural number and output its smallest prime factor.” If our compiler does its type-checking job correctly, then if we manage to write a function with that type which compiles correctly, we will be *guaranteed* that the function does what we intended it to. This is what some people mean when they say that “all errors are type errors.”
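One way to write such a type in Lean 4, using only core notions (a sketch; the body is deliberately omitted, since the type is the point, and the smallest factor $\ge 2$ of $n$ is automatically its smallest prime factor):

```lean
-- "The type of all functions which, given n ≥ 2, return its smallest
-- nontrivial factor together with proofs of that specification."
def leastFactor :
    (n : Nat) → 2 ≤ n →
    { p : Nat // p ∣ n ∧ 2 ≤ p ∧ ∀ q, 2 ≤ q → q ∣ n → p ≤ q } :=
  sorry  -- any type-correct body is guaranteed to meet the specification
```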

## Re: In Praise of Dependent Types

Technically, the matrix example does not require dependent types as such (although examples like it are probably the most widely used to illustrate dependent types). For instance, Haskell, with some extensions, can do it, despite not *really* having dependent types. The price you have to pay is writing everything down twice. For the matrices, we might have:

```
datakind Nat = Z | S Nat

Matrix : Nat -> Nat -> *
```

where we have a *kind* of naturals (instead of a type), whose elements are at the type level, and thus can be used in type expressions. But if we wanted naturals at the value level, there’d be a wholly separate:

```
datatype Nat = Z | S Nat
```

Or, if we have generalized algebraic datatypes, which allow some information to pass from the value level to the type level, we might choose to index the value-level naturals by the type-level ones:

```
datatype Nat : Nat -> * where
  Z : Nat Z
  S : Nat n -> Nat (S n)

Natural : *
Natural = exists n. Nat n
```

The idea here is that inspection of the GADTs at the value level refines the corresponding types, giving effects very similar to genuine dependent typing, while keeping a technical demarcation between values and types (which is useful from a compiler perspective, because it clearly delineates all the stuff you can check statically and then erase). ATS, for instance, takes this tack, I believe.

Anyhow, I suppose that’s a bit of a digression, since that isn’t even necessary for the matrices; only the datakind is. Where genuine dependent types really start to show their worth (as far as programming goes) is in the areas described in your final paragraph, where we want to encode complex specifications in the types of our functions. We might have:

$sort : \Pi_{xs : List\,\mathbb{N}}\, \Sigma_{ys : List\,\mathbb{N}}\, (Sorted(ys) \wedge xs \cong ys)$

saying that given a list, we produce a list, together with proofs that it is sorted and a permutation of the original list. This is (I think) encodable into the digression scheme above (ATS has a verified sorting example), but the duplication gets more and more awkward (in my experience) the more you try to encode in the types.