### Native Type Theory (Part 2)

#### Posted by John Baez

*guest post by Christian Williams*

We’re continuing the story of Native Type Theory.

In Part 1 we introduced the internal language of a topos. Now, we describe the kinds of categories $T$ from which we will construct native type systems, via the internal language of the presheaf topos $\mathscr{P}T$.

We can represent a broad class of languages as categories with products and exponents. The idea of native type theory is simply that for any such language $T$, we can understand the internal language of $\mathscr{P}T$ as the language of $T$ *expanded* by predicate logic and dependent type theory.

The type system reasons about *abstract syntax trees*, the trees of constructors which form terms:

The types of native type theory are basically indexed sets of these terms, built from $T$ and the constructors of $\mathscr{P}T$. For example, given the “theory of a monoid” with operations $e:1\to M$ and $m:M^2\to M$, we can construct the predicate:
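One way to express “not the unit and not a product of two nonunits”, sketched informally (the variable names are illustrative):

$$\mathsf{single.thread}(x) \;:=\; \neg(x=\mathtt{e}) \;\wedge\; \neg\,\exists y.\,\exists z.\; \neg(y=\mathtt{e}) \,\wedge\, \neg(z=\mathtt{e}) \,\wedge\, (x=\mathtt{m}(y,z))$$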

(shorthand for formal notation in the presheaf topos.) This corresponds to the subset of terms $P\rightarrowtail T(-,M)$ which are not the unit and not the product of two nonunits. In a concurrent language where multiplication is parallel execution, this is the predicate for “single-threaded” processes.

By including a graph of rewrites between terms, $s,t:E\to V$, we can reason not only about term structure but also about *behavior*. Using logic on this graph, one can reconstruct modalities, bisimulation, and more. For example, given the type “single-threaded” we can take the preimage along $t$ to get the rewrites whose target is in that type, and then take the image along $s$ to get their sources. Then $s(t^\ast(\mathsf{single.thread}))$ is “terms which rewrite to be single-threaded”. One can likewise express “terms which *always* satisfy a security condition $\varphi$”.
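For a concrete, if toy, illustration: on a finite rewrite graph these modal operators are just preimage and image along the source and target maps. Here is a minimal Python sketch; the term and edge names are hypothetical, not from the paper.

```python
# Sketch (not from the post): modal operators on a finite rewrite graph,
# with edge set E, source map s and target map t, and a type as a set of terms.

def preimage(edge_map, edges, type_set):
    """Edges whose image under edge_map lands in type_set (this is t* or s*)."""
    return {e for e in edges if edge_map[e] in type_set}

def image(edge_map, edge_set):
    """Terms hit by edge_map on the given edges (direct image)."""
    return {edge_map[e] for e in edge_set}

# Toy graph: terms a, b, c with rewrites e1 : a -> b and e2 : b -> c.
edges = {"e1", "e2"}
s = {"e1": "a", "e2": "b"}   # source map
t = {"e1": "b", "e2": "c"}   # target map

single_thread = {"c"}        # hypothetical type of single-threaded terms

# s(t*(single.thread)): terms with some rewrite landing in single_thread.
step = image(s, preimage(t, edges, single_thread))
print(step)  # {'b'}
```

Iterating such operators (and combining them with negation) is how modalities like “always $\varphi$” get expressed.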

Though native types are built from the constructors of the original language, they naturally expand the language with logic and type theory, to give a sound and complete system for reasoning about the structure and behavior of terms.

First we describe the class of categories which represent our basic languages.

## Higher-order algebraic theories

Many structures can be represented abstractly as categories with finite products or limits, and then modelled by a functor, e.g. a ring object in sheaves. In logic and computer science (and mathematics), there are *higher-order operations* whose operands are themselves operations, such as universal quantification with predicates and $\lambda$-abstraction with expressions.

The concept of a higher-order algebraic theory was recently defined by Arkor and McDermott. A **higher-order algebraic theory of sort $S$ and order $n$** is a cartesian category with exponents up to order $n$, freely generated from a set of objects $S$ and a presentation of operations and equations.

An operation $f:\mathtt{[S,T]\to A}$ is a rule which inputs terms of the form $x:\mathtt{S}\vdash t(x):\mathtt{T}$ and outputs a term of sort $\mathtt{A}$. Higher-order operations bind the variables of their operands, which enables the definition of substitution. Our main example is the *input* operation in a concurrent language, which is essentially a $\lambda$-abstraction “located at a channel” (see the $\rho$-calculus below).

In the nexus of category theory, computer science, and logic, the idea of representing binding operations using exponents is well-known. But this model does not tell us how to actually *implement* these operations, i.e. it does not tell us how to implement capture-avoiding substitution. *There is a model which does*, using the “substitution monoidal structure” on $[\mathbb{F},\mathrm{Set}]$, as given by Fiore, Plotkin, & Turi. Fortunately this was proven equivalent to exponential structure by Fiore & Mahmoud.

Higher-order algebraic theories extend multisorted algebraic theories with binding operations. Because the concept is parameterized by order, the tower of $n$-theories provides a bridge all the way from cartesian categories to cartesian closed categories. Best of all, order-$(n+1)$ theories correspond to *monads* on order-$n$ theories: this justifies the idea that “higher-order operations are those whose operands are operations”. So, the theory-monad correspondence extends to incorporate binding.

An order-$(n+1)$ theory $T$ defines a monad $\overline{T}:\mathrm{Law}_n\to \mathrm{Law}_n$ on order-$n$ theories by a formula which generalizes the one for Lawvere theories:

$$\overline{T}(L)(X,Y) \simeq \int^{\Gamma :\Lambda_{n+1}(S)} \lceil L\rceil(1,\Gamma)\times T(\Gamma,Y^X)$$

where $\Lambda_{n+1}(S)$ is the free cartesian category with exponents up to order $n+1$ on the set of sorts $S$, and $\lceil L\rceil$ is the free $(n+1)$-theory on the $n$-theory $L$.

We now proceed with our main example of a language given by a higher-order algebraic theory.

### $\rho$-calculus

RChain is a distributed computing system based on the **$\rho$-calculus**, a *concurrent* language in which a program is not a sequence of instructions but a multiset of parallel processes. Each process can send or receive information along a channel, or “name”; when two processes sync up on a name, they “communicate” and transfer information. Computation is thus *interactive*, providing a unified language for both computers and networks.

The classic concurrent language is the $\pi$-calculus. Its basic rule is *communication*:

$comm_\pi : \mathtt{out(n,a) \vert in(n,\lambda x.p) \Rightarrow p[a/x]}$

$\{[\text{output on name}\; \mathtt{n} \; \text{the name}\; \mathtt{a}] \; \text{in parallel with}\; [\text{input on}\; \mathtt{n}\; \text{, then}\; \mathtt{p(x)}]\}\; \text{evolves to}\; \{\mathtt{p[a/x]}\}.$

The $\rho$-calculus adds *reflection*: operators which turn processes (code) into names (data) and vice versa. Names are not atomic symbols; they are references to processes. Code can be sent as a message, then evaluated in another context — this “code mobility” allows for deep self-reference.

$comm_\rho : \mathtt{out(n,q) \vert in(n, \lambda x.p) \Rightarrow p[﹫ q/x]}$

$\{[\text{output on name}\; \mathtt{n}\; \text{the process}\; \mathtt{q}]\; \text{in parallel with}\; [\text{input on}\; \mathtt{n}\; \text{, then}\; \mathtt{p(x)}]\}\; \text{evolves to}\; \{\mathtt{p[﹫ q/x]}\}.$

We can model the $\rho$-calculus as a second-order theory $T_\rho$ with sorts $\mathtt{P}$ for processes and $\mathtt{N}$ for names. These act as code and data respectively, and the operations of reference $﹫$ and dereference $\ast$ transform one into the other. There is a third sort $\mathtt{E}$ for rewrites. Terms are built up from one constant, the null process $\mathtt{0}$. The two basic actions of a process are output $\mathtt{out}$ and input $\mathtt{in}$, and parallel $-\vert-$ gives binary interaction: these get their meaning in the communication rule.
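To make the communication rule concrete, here is a small Python sketch of one top-level $comm_\rho$ step. The tuple encoding of terms is hypothetical, not the paper’s formalism, and the substitution handles only the shadowing needed for closed examples like this one.

```python
# Hypothetical encoding of rho-calculus terms as tuples:
#   ("nil",), ("out", name, proc), ("in", name, var, body),
#   ("par", p, q), ("deref", name); quoted names are ("ref", proc).

def subst(p, var, name):
    """Replace the bound variable `var` by the name `name` in process `p`,
    stopping under a binder that shadows `var`."""
    tag = p[0]
    if tag == "nil":
        return p
    if tag == "out":
        _, n, q = p
        return ("out", name if n == var else n, subst(q, var, name))
    if tag == "in":
        _, n, v, body = p
        n2 = name if n == var else n
        if v == var:                      # variable shadowed: leave body alone
            return ("in", n2, v, body)
        return ("in", n2, v, subst(body, var, name))
    if tag == "par":
        return ("par", subst(p[1], var, name), subst(p[2], var, name))
    if tag == "deref":
        return ("deref", name if p[1] == var else p[1])

def comm(p):
    """Apply comm once at the top of a parallel pair, if it matches:
       out(n, q) | in(n, lambda x. body)  =>  body[@q / x]."""
    if p[0] == "par":
        a, b = p[1], p[2]
        if a[0] == "out" and b[0] == "in" and a[1] == b[1]:
            return subst(b[3], b[2], ("ref", a[2]))
    return p

n = "n"
q = ("out", "m", ("nil",))   # the message process q
redex = ("par", ("out", n, q), ("in", n, "x", ("deref", "x")))
print(comm(redex))  # ('deref', ('ref', ('out', 'm', ('nil',))))
```

The result is $\ast﹫ q$: the received code is a reference, which the continuation dereferences back into a running process.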

We present the operations, equations, and rewrites of the $\rho$-calculus.

The basic operations:
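In summary, the standard signatures (reconstructed here from the surrounding discussion) are:

$$\begin{aligned}
\mathtt{0} &: 1 \to \mathtt{P} && \text{null process}\\
\mathtt{out} &: \mathtt{N}\times \mathtt{P} \to \mathtt{P} && \text{output}\\
\mathtt{in} &: \mathtt{N}\times [\mathtt{N},\mathtt{P}] \to \mathtt{P} && \text{input (binds a name)}\\
-\vert- &: \mathtt{P}\times \mathtt{P} \to \mathtt{P} && \text{parallel}\\
﹫ &: \mathtt{P} \to \mathtt{N} && \text{reference}\\
\ast &: \mathtt{N} \to \mathtt{P} && \text{dereference}
\end{aligned}$$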

The rewrites (bottom two allow a rewrite to occur inside a parallel):

The main rule, communication, plus the basic equations:
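In the standard presentation (sketched here as an assumption from the $\rho$-calculus literature), parallel is a commutative monoid with unit $\mathtt{0}$, and dereference undoes reference:

$$\mathtt{p\,\vert\,0} = \mathtt{p} \qquad \mathtt{p\,\vert\,q} = \mathtt{q\,\vert\,p} \qquad \mathtt{(p\,\vert\,q)\,\vert\,r} = \mathtt{p\,\vert\,(q\,\vert\,r)} \qquad \ast﹫\,\mathtt{q} = \mathtt{q}$$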

The rules ensuring that rewrites inside parallels make sense:

(here $f \rightsquigarrow g$ is shorthand for $s(e)=f$ and $t(e)=g$.)

As an example, we can construct recursive processes using reflection, without any special operator. On a given name $n:1\to \mathtt{N}$, we can define a context which replicates processes as follows.
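One such context (a sketch of the usual reflective construction; the name $r$ is ours) is:

$$!(p)_n \;=\; \mathtt{out}(n, ﹫ r) \;\vert\; r \qquad \text{where} \qquad r \;=\; \mathtt{in}(n, \lambda x.\,(\mathtt{out}(n,x) \;\vert\; \ast x \;\vert\; p))$$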

Then, plugging in a process $p$, one can check that $!(p)_n \;\; \rightsquigarrow \;\; !(p)_n \; \vert \; p$. This magical $!(-)_n$ operation acts as a “replicating machine” which duplicates a process, executing one copy and passing the other back as a message.

The $\rho$-calculus serves as the main running example in our paper: concurrency allows for reasoning about networks (for example, about security), and reflection enables the type system to reason about both code and data, exploring connections between these two basic notions of computation.

In Part 3 we present the native type system of the $\rho$-calculus, the internal language of the presheaf topos on $T_\rho$, and demonstrate applications. Thank you for reading, and let me know your thoughts.

## Re: Native Type Theory (Part 2)

Let me ask the same question I always do when I talk to you about this, which is why we care about order-$n$ theories for $n\notin \{1,\infty\}$?

A first-order algebraic theory is an ordinary algebraic theory, whose semantic incarnation is a category with finite products. An $\infty$-order algebraic theory is an ordinary $\lambda$-calculus, whose semantic incarnation is a cartesian closed category. What’s the value in stratifying things further?

I can sort of see the motivation for second-order algebraic theories, in which only a specified set of generating objects are exponentiable. They seem like sort of a simply-typed counterpart of the representable map categories introduced by Taichi Uemura for describing type theories, in which only a restricted class of judgments can be “hypothesized”. On the other hand, Gratzer and Sterling have recently shown that locally cartesian closed categories are conservative over representable map categories, so no expressivity is lost in the dependently typed case by just using LCCCs. It seems likely that something similar would be true in the simply typed case.

Are there important examples of $n$th order algebraic theories for $2\lt n\lt \infty$? Is there something concrete we gain by allowing $n\notin \{1,\infty\}$?