### Turing Categories

#### Posted by John Baez

*guest post by Georgios Bakirtzis and
Christian Williams*

We continue the Applied Category Theory Seminar with a discussion of the paper Introduction to Turing Categories by Hofstra and Cockett. Thank you to Jonathan Gallagher for all the great help in teaching our group, and to Daniel Cicala and Jules Hedges for running this seminar.

## Introduction

Turing categories enable the categorical study of computability theory, and thereby abstract and generalize the classical theory of computable functions on the natural numbers.

A **Turing category** is a category $C$ equipped with:

- cartesian products – to pair (the codes of) data and programs,
- a notion of partiality – to represent programs (morphisms) which do not necessarily halt, and
- a *Turing object* $A$ – to represent the “codes” of all programs.

The Turing object acts as a weak exponential for all pairs of objects
in the category. While this is impossible in most “set-theoretical”
categories, it is natural in computer science, because primitive
formal languages—the “machine code” of the higher languages—are
*untyped* (really *one*-typed).

For example, the untyped $\lambda$-calculus is represented by a
*reflexive object* $A$ in a cartesian closed category, i.e. an object
equipped with a retraction $app:A\to A^A$, which sends a $\lambda$-term
$M$ to its action induced by application, and a section $lam:
A^A\to A$, which sends an endo-hom term $M(-)$ to the lambda term
$\lambda x.M(x)$. The morphism $\bullet_A:= eval\circ (app\times
A):A\times A\to A$ is a *Turing morphism*, meaning that every
morphism $f: A^n\to A$ has a “code” $(f):1\to A$ such that $f =
\bullet_A^{n} \circ ((f)\times A^n).$

Turing categories are closely connected with *partial combinatory
algebras*, magma objects with a certain form of “computational
completeness”. The paradigmatic example is the “SK combinator
calculus”, a variable-free representation of the $\lambda$ calculus.

Turing categories and partial combinatory algebras (PCAs) are in a certain sense equivalent, which constitutes the main result of this work and justifies the notion of Turing category. (This is made into a precise equivalence of categories between relative PCAs and Turing categories over a fixed base category in the paper Categorical Simulations.)

Our purpose is to introduce the reader to some basic ideas in computability, and demonstrate how they can be expressed categorically.

## Computability Theory

In the early 1900s, mathematicians focused intently on determining “what can we prove (with formal systems)?”, or analogously, “what can we compute (with computers)?”. This simple question inspired great advances in logic, mathematics, and computer science, and motivated their mutual development and deepening interconnection.

Three main perspectives emerged independently in order to answer this question:

- Alan Turing thought about “hardware”, with Turing machines;
- Alonzo Church thought about “software”, with the $\lambda$-calculus; and
- Kurt Gödel thought about logic, with general recursive functions.

The Church-Turing thesis states that these are *equivalent* notions of
computation: a function $f:\mathbb{N}\to \mathbb{N}$ is Turing
computable iff $f$ can be written as a closed $\lambda$-term, iff $f$
is general recursive. This has led computer scientists to believe that
the concept of computability is accurately characterized by these
three equivalent notions.

Stephen Kleene, a student of Church, developed and unified these ideas in his study of “metamathematics”, focusing on the perspective of partial recursive functions (Introduction to Metamathematics – S.C. Kleene), which are equivalent to Gödel’s.

We can generate the class of *partial recursive functions* from basic
functions and operators, and utilize the idea of *Gödel
numbering* to consider the natural numbers both as the space of all
*data* and all *programs*. The two main theorems which characterize an
effective enumeration of all such functions—the Universality and
Parameter theorems described below—are the original motivation for
the definition of Turing categories.

### Recursive Functions

Computability is humbly rooted in simple arithmetic. Using only three
kinds of primitive functions, we can build up all computable
functions, the *partial recursive functions*:
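The three kinds are standardly taken to be the zero function, the successor, and the projections; in one common notation:

```latex
\begin{aligned}
Z &: \mathbb{N}^k \to \mathbb{N}, & Z(x_1,\dots,x_k) &= 0 \\
S &: \mathbb{N} \to \mathbb{N}, & S(x) &= x + 1 \\
P_i^k &: \mathbb{N}^k \to \mathbb{N}, & P_i^k(x_1,\dots,x_k) &= x_i
\end{aligned}
```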

The set of *primitive* recursive functions, $R$, is generated by two
operators. First, we of course need composition:
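In the standard formulation, composition builds an $m$-ary function $f$ from a $k$-ary function $h$ and $k$ many $m$-ary functions $g_i$:

```latex
f(x_1,\dots,x_m) = h\big(g_1(x_1,\dots,x_m),\,\dots,\,g_k(x_1,\dots,x_m)\big)
```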

Second, to derive real computational power we need the eponymous
*recursion*:
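The standard primitive recursion scheme, with $g$ the base case and $h$ the step, reads:

```latex
\begin{aligned}
f(\vec{x},\, 0) &= g(\vec{x}) \\
f(\vec{x},\, S(y)) &= h\big(\vec{x},\, y,\, f(\vec{x}, y)\big)
\end{aligned}
```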

This can be understood as a “for loop”—$g$ represents the initial parameters, $y$ is the counter, and $h$ describes a successive evaluation of the accumulated values, which are combined into the function $f$.

Now we have enough to do arithmetic, and we’re off. For example, we can define addition: numbers are sequences of successors, and adding is simply a process of peeling the $S$’s off of one number and appending them to the other, until the former reaches 0:
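Concretely, addition arises from the recursion scheme with $g = P_1^1$ and $h = S \circ P_3^3$:

```latex
\begin{aligned}
add(x,\, 0) &= x \\
add(x,\, S(y)) &= S\big(add(x, y)\big)
\end{aligned}
```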

As a simple demonstration:
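The recursion scheme can be sketched in Python (the helper names `prim_rec` and `succ` are our own, purely illustrative):

```python
def succ(x):
    return x + 1

def prim_rec(g, h):
    """Build f from g (base case) and h (step), per the recursion scheme:
    f(x, 0) = g(x);  f(x, y+1) = h(x, y, f(x, y))."""
    def f(x, y):
        acc = g(x)                    # initial parameters
        for counter in range(y):      # y plays the role of the loop counter
            acc = h(x, counter, acc)  # successively accumulate values
        return acc
    return f

# Addition: the base case is the identity, the step takes a successor.
add = prim_rec(lambda x: x, lambda x, y, acc: succ(acc))
```

For example, `add(2, 3)` unfolds as $S(S(S(2))) = 5$, peeling successors off the second argument.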

Of course, not all computations actually *halt*. We can use a
diagonalization argument to show that not all Turing-computable
functions are primitive recursive. To complete the equivalence, the
more general *partial* recursive functions are given by adding a
minimization operator, which inputs a $(k+1)$-ary function and outputs
a $k$-ary function, which searches for a minimal zero in the first
coordinate of $f$, and returns only if one exists:
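One standard formulation: given $f:\mathbb{N}^{k+1}\to\mathbb{N}$, minimization returns the least zero in the first coordinate, when it exists:

```latex
\mu f\,(\vec{x}) \;=\; \min \{\, y \mid f(y, \vec{x}) = 0
  \ \text{ and }\ f(z, \vec{x}) \text{ is defined for all } z \le y \,\}
```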

This allows for “unbounded minimization of safe predicates”, to check the consistency of rules on infinitary data structures. For example, with this operator we can explicitly construct a universal Turing machine. The outputs of $\mu$ are certainly partial in general, because we can construct a predicate whose minimal zero would decide the halting problem.
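A sketch of unbounded search in Python (`mu` is our name for the operator; the loop genuinely may not terminate, which is exactly the source of partiality):

```python
def mu(f):
    """Return the function searching for the least y with f(y, *x) == 0.
    Diverges (loops forever) when no such y exists -- a partial function."""
    def g(*x):
        y = 0
        while f(y, *x) != 0:  # unbounded search: may never halt
            y += 1
        return y
    return g

# Integer square root of a perfect square: search for the least y with
# y*y == n.  Diverges on non-squares, so only call it on squares.
isqrt = mu(lambda y, n: abs(y * y - n))
```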

All Turing-computable functions are represented by partial recursive functions, translating the complexity of Turing machines to a unified mathematical perspective of functions of natural numbers.

#### Gödel Numbering

Upon first hearing the Church-Turing thesis, one may ask “why do we
only care about functions of natural numbers?” By the simple idea of
*Gödel numbering*, we need only consider (the partial recursive
subset of) the hom $Set(\mathbb{N}, \mathbb{N})$ to think about the
big questions of computability.

Kurt Gödel proved that a formal language capable of expressing arithmetic cannot be both consistent, meaning it does not deduce a contradiction, and complete, meaning every sentence in the language can be either proven or disproven. The method of proof is remarkably simple: as a language uses a countable set of symbols, operations, and relations, we can enumerate all terms in the language. We can then use this enumeration to create self-referential sentences which must be independent of the system.

The idea is to use the fundamental theorem of arithmetic. Finite lists of natural numbers can be represented faithfully as single natural numbers, using an encoding with prime powers: order the primes $\{p_i\}$, and define the embedding
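One common convention (the exponent offset $+1$ keeps trailing zeros recoverable):

```latex
enc(a_1, \dots, a_n) \;=\; \prod_{i=1}^{n} p_i^{\,a_i + 1}
  \;=\; 2^{a_1+1} \cdot 3^{a_2+1} \cdots p_n^{\,a_n+1}
```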

This can be applied recursively, to represent lists of lists of
numbers, giving encoding functions *enc* for any depth and length of
nested lists. The decoding, *dec*, is repeated factorization, forming a tree
of primes. For example,
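A small Python sketch of this encoding and its decoding by repeated factorization (the `+1` offset is one common convention, and the trial-division prime generator is only meant for small examples):

```python
def primes():
    """Yield 2, 3, 5, ... by trial division (fine for small examples)."""
    n = 2
    while True:
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            yield n
        n += 1

def enc(xs):
    """Encode a list of naturals as the product of p_i ** (x_i + 1)."""
    result = 1
    for p, x in zip(primes(), xs):
        result *= p ** (x + 1)
    return result

def dec(n):
    """Decode by repeated factorization: read off prime exponents minus 1."""
    xs = []
    for p in primes():
        if n == 1:
            return xs
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        xs.append(e - 1)
```

For instance, `enc([3, 1, 0])` is $2^4 \cdot 3^2 \cdot 5^1 = 720$, and `dec(720)` recovers `[3, 1, 0]`.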

We can then enumerate the partial recursive functions: because they are generated by three basic functions and three basic operators, we can represent them as lists of natural numbers via a recursive encoding “#”.

Then, a partial recursive function gives a list of lists, which can be encoded with Gödel numbering.

So why do we want to enumerate all partial recursive functions? In contrast to the incompleteness theorems, this gives the theory of computability a very positive kind of self-reference, in the form of two theorems which characterize a “good” enumeration of partial recursive functions—this “good” means “effective” in a technical sense, namely that the kinds of operations one performs on functions, such as composition, can be computed on the level of codes.

### Kleene’s Recursion Theorems

There is an enumeration of partial recursive functions $R = \{\phi_i\}$, such that the following theorems hold.

**Universality**: The functions $\phi^{(n)}$ are partial recursive:
$\phi^{(n)}(e,x_1,\dots ,x_n) = \phi_e(x_1,\dots ,x_n)$.

**Parameter**: There are total recursive functions $S_m^n$ such that
$\phi^{(n+m)}(e,x_1,\dots ,x_m,u_1,\dots ,u_n)=
\phi^{(n)}(S_m^n(e,x_1,\dots ,x_m),u_1,\dots ,u_n).$

The first says $\mathbb{N}$ is a “universal computer”—we have partial $\phi^{(k)}$ for each arity, which takes (the code of) a program and (the code of) input data, and computes (the code of) the result. The second says this computer has generalized currying—there are total $S_m^n$ for each arity, which transform $(n+m)$-ary universal programs into $n$-ary ones, by taking the first $m+1$ arguments and returning the code of another program.

This is a remarkable fact—imagine having a computer which comes preloaded with every conceivable program, and one simply inputs the number of the program, and the number of the data. Furthermore, if one does not know which program to use, one can determine even that one step at a time.
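The two theorems can be modelled as a toy in Python, with a growing program table standing in for the Gödel enumeration (all the names here—`table`, `phi`, `smn`—are our illustrative choices, not notation from the paper):

```python
# A toy enumeration: "codes" are just indices into a program table.
table = {0: lambda x, y: x + y,
         1: lambda x, y: x * y}

def phi(e, *args):
    """Universality: one 'universal' function runs any coded program."""
    return table[e](*args)

def smn(e, *xs):
    """Parameter theorem: from a code e and fixed arguments xs, totally
    compute the code of a new program with those arguments built in."""
    new_code = max(table) + 1
    table[new_code] = lambda *us: table[e](*xs, *us)
    return new_code

add5 = smn(0, 5)   # specialize addition: a new program computing 5 + y
```

Here `phi(add5, 3)` returns `8`: the specialized code behaves exactly as the Parameter theorem's equation demands.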

Turing categories are meant to provide a minimal setting in which these two theorems hold. The Universality Theorem describes the universal property of the Turing object/morphism pair, and the Parameter Theorem describes the fact that the factorizations of the $n$-ary functions are determined by that of the unary ones, via the Turing morphism.

## Restriction Categories

We have already discussed that a notion of partiality is necessary to
model computation, because not all computations halt on all inputs.
This notion is concretely captured by *partial functions*, and
partiality can be described categorically through restriction
categories.

Formally, a **restriction category** is a category $C$ endowed with a
combinator $\overline{(-)}$, sending each map $f: A\to B$ to a map
$\overline{f}: A\to A$, its “domain of definition”, such that the
following axioms are satisfied.
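In the standard formulation (writing composition by juxtaposition), the axioms are:

```latex
\begin{aligned}
&\textbf{R1:} \quad f\, \overline{f} = f \\
&\textbf{R2:} \quad \overline{f}\,\overline{g} = \overline{g}\,\overline{f}
  \qquad (f, g \text{ with common domain}) \\
&\textbf{R3:} \quad \overline{g\, \overline{f}} = \overline{g}\,\overline{f}
  \qquad (f, g \text{ with common domain}) \\
&\textbf{R4:} \quad \overline{g}\, f = f\, \overline{g f}
\end{aligned}
```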

The first axiom states that precomposing a map with its own domain
does not change or otherwise restrict it. The second states that
restrictions behave like intersections, in the sense that they
commute. The third states that restricting a restriction amounts to
intersecting the two domains. The fourth states that a restriction
can be moved from one side of a map to the other, at the price of
restricting along the composite.

Some examples of restriction categories are sets with partial functions, where $\overline{f}$ is the partial identity defined exactly on the domain of $f$, and computable partial maps on $\mathbb{N}$.

If $\overline{f}=1_A$, then $f$ is *total*. Note that all monics are
total. The total maps form a subcategory $Tot(C)$, which is of
interest in the accompanying paper Total Maps of Turing
Categories by
Hofstra, Cockett, and Hrubeš. If $e=\overline{e}$, then $e$ is a
**domain of definition**. For any object $a\in C$, denote by $O(a)$
the set of domains of definition of $a$. In the examples above, these
are the subsets of a set and the recursively enumerable subsets of
$\mathbb{N}$, respectively.

### Enrichment and Cartesian Structure

The notion of restriction entails a partial order on homsets—given $f,g:A\to B$, define $(f\leq g) \coloneqq (f=g\overline{f}).$ Intuitively this means that the domain of $f$ is contained in the domain of $g$. In the enrichment, $O(X)$ forms a meet semilattice.

Accordingly, to define the cartesian structure we need to weaken the terminal object and product to the poset-enriched case.

**Partial Terminal Object**. Once we have partial maps, the terminal
object acts like a “domain classifier”. Rather than having a unique
map, we have a natural bijection $C(A,1)\simeq O(A)$: every map
$A\to 1$ is the restriction of a single total map along a unique
domain $e\in O(A)$. For partial recursive functions over
$\mathbb{N}$, these domains are precisely the *recursively
enumerable* subsets.

The restriction terminal object $1$ is thus characterized by a total map $!_A: A \rightarrow 1$ for each $A$, with $!_1 = 1_1$, such that for each $f: A \rightarrow B$ this family of maps satisfies $!_B f = !_A \bar{f}$.

**Partial Product**. The partial product weakens the usual universal
property: the pairing $\langle f, g \rangle$ into the product $A
\times B$ has domain contained in the domains of both $f$ and $g$.
This accounts for the case in which one of $f$ or $g$ fails to halt
on a given input. Accordingly, the domain of $\langle f, g \rangle$
is the intersection of the corresponding domains of $f$ and $g$.

### Splitting Idempotents

Turing categories are essentially meant to describe collections that
can be computed by the maps of the category. This can be expressed
with idempotent splitting. For example, in the “Turing category of
computable maps on $\mathbb{N}$”, explained later, a restriction
idempotent $e:\mathbb{N}\to \mathbb{N}$ is a recursively enumerable
set. Therefore the *splitting* of $e$, if it exists, makes this subset
explicit as an object of the category. Hence, the idempotent
completion of $\mathrm{Comp}(\mathbb{N})$ yields the category whose
objects are all recursively enumerable sets of natural numbers. This
will imply that these categories of computable maps are determined
uniquely *up to* Morita equivalence.

There is, however, a big difference between split restriction categories and general restriction categories. In the former, all domains are “out in the open”, rather than “hiding” as idempotents on some object. If we completely understood the idempotent completion of $Comp(\mathbb{N})$, we would know all recursively enumerable subsets of $\mathbb{N}$; one could argue that the study of computability would be complete. However, Turing categories are meant to abstract away from the natural numbers, in order to distinguish what is essential about computation, and remove anything about arithmetic that might be contingent.

## Turing Categories

Turing categories provide an abstract framework for computability: a “category with partiality” equipped with a “universal computer”, whose programs and codes thereof constitute the objects of interest.

Let $C$ be a cartesian restriction category, and let
$\tau_{X,Y}:A\times X\to Y$. A morphism $f:Z\times X\to Y$ admits a
**$\tau_{X,Y}$-index** when there exists $h: Z\to A$ such that the
following commutes.
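Concretely, the commuting condition says that the index $h$ codes $f$:

```latex
f \;=\; \tau_{X,Y} \circ (h \times 1_X) \;:\; Z \times X \to Y
```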

The map $\tau_{X, Y}$ can be thought of as an evaluation map, and $h$ as the curried function $\lambda^\ast(f)$. When every $f: Z \times X \rightarrow Y$ admits such a $\tau_{X, Y}$-index, then $\tau_{X, Y}$ is **universal**. If the object $A$ admits a universal $\tau_{X, Y}$ for every pair $X, Y$, then $C$ is a Turing category and $A$ is a Turing object.

Given such a Turing category, $C$, an *applicative family* for the
Turing object $A$ is a family of maps $\tau = \{\tau_{X, Y}: A \times
X \rightarrow Y\mid X, Y\}$, where $X$, $Y$ range over all objects in
$C$. An applicative family for $A$ is universal if for every $X,Y$, we
have that $\tau_{X, Y}$ is universal. In such a case $\tau$ is a
Turing structure on $A$. We can then think of a Turing structure on
$C$ as a pair $(A, \tau)$.

We can further express the universality of the Turing object: every object $X\in C$ is a retract of $A$, with section given by the $\tau_{1, X}$-index of the projection $X\times 1\xrightarrow{\ \sim\ } X$.

This extends to all finite powers $1, A, A^2, A^3,...$, which indicates two important points about the Turing object:

- The number of total points $1\to A$ is either one or infinite; and
- There is an internal pairing operation $A\times A\to A$, which is a section.

The latter morphism generalizes and abstracts the Gödel pairing $\mathbb{N}\times \mathbb{N}\to \mathbb{N}$.

So given such a universal object $A$ and a Turing structure
$\{\tau\}$, it is natural to wonder if a *universal self-application
map* $\tau_{A,A}$ is special. It is indeed, and it is called a
**Turing morphism** $\bullet:A\times A\to A$. Such a map, plus the
universal property of $A$, is in fact enough to reconstruct the entire
Turing structure of $\mathsf{C}$.

The classic example is Kleene application $\mathbb{N}\times \mathbb{N}\xrightarrow{\bullet}\mathbb{N}$. In general it can be thought of as a “universal Turing machine” that takes in a program (a Turing machine) and a datum, and computes the result.

This Turing morphism is the heart of the whole structure—we can construct all other universal applications $\tau_{X, Y}: A \times X \rightarrow Y$ as composites $A \times X \xrightarrow{A \times m_X} A \times A \xrightarrow{\bullet} A \xrightarrow{r_Y} Y$, where $(m_X, r_X)$ and $(m_Y, r_Y)$ exhibit $X$ and $Y$ as retracts of $A$.

To verify that these $\tau_{X,Y}$ are universal applications, note that any $f: Z\times X\to Y$ has as index the $\bullet$-index of the composite $m_Y\circ f\circ (Z\times r_X): Z\times A\to A$; unwinding the retractions shows that the required square commutes.

The important idea is that a Turing object represents the collection
of (codes of) all programs in some formal system; the universal
applicative family is a method of encoding, and the Turing morphism is
the universal program which implements all others. The innovation of
Turing categories is that they abstract away from $\mathrm{Set}$, and
can therefore describe programs which are not explicitly
set-theoretical (not necessarily *extensional*, only intensional).

The axioms of a Turing structure correspond precisely to the
Universality and Parameter theorems. This fact culminates in the
**Recognition Theorem**: a cartesian restriction category $C$ is a
Turing category iff there is an object $A$ of which every object is a
retract, and for which there exists a Turing morphism $A \times A
\xrightarrow{\bullet} A$.

## Partial Combinatory Algebras

From the Recognition Theorem, we know that Turing categories are
essentially generated by a single object equipped with a universal
binary application. We can alternatively take the internal view,
starting with a magma object $(A,\bullet)$ in a cartesian restriction
category $C$, and consider maps $A^n \to A$ which are “computable”
with respect to $\bullet$. These maps form a category, and indeed a
Turing category, iff $A$ is *combinatory complete*—essentially
meaning that $(A,\bullet)$ is expressive enough to represent the
(partial) lambda calculus. First, we should understand combinators.

### Combinatory Logic

The practice of variable binding is subtle: when a computation uses a
set of symbols as variables, one must distinguish between *bound* and
*free* variables—the former being placeholders, and the latter being
data. In formal languages, preserving this distinction requires
conditions of the semantic rules, which can add baggage to otherwise
simple systems. In the $\lambda$-calculus, beta reduction $\beta:
(\lambda x.M\; N) \to M[x:=N]$ precludes “variable
capture”—accidentally substituting free into bound, e.g.: $(\lambda
y.(\lambda z.(y\;z))\; z) \to \lambda z.(z\; z).$ One must rename the
bound variables of $M$ and $N$ so they are distinct from variables in
both terms: $(\lambda y.(\lambda z.(y\;z))\; z) \equiv (\lambda
y.(\lambda a.(y\;a))\; z) \to \lambda a.(z\; a).$

This problem was originally confronted by Moses Schönfinkel in the
1920s, who created a theory of *combinators*—primitive units of
computation defined only by function application. Lambda terms can be
translated into combinators by a process of *abstraction elimination*,
which does away with variables completely. This can be understood by
observing that we only need to handle the three basic things that
occur within $\lambda$-abstraction: variables, application, and
further abstraction. These are represented by the fundamental
combinators $S$ and $K$:
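These act by the standard reduction rules:

```latex
\begin{aligned}
K\, x\, y &= x \\
S\, x\, y\, z &= (x\, z)\,(y\, z)
\end{aligned}
```

The identity combinator $I := S\,K\,K$ (so that $I\,x = x$) is usually added as a primitive for convenience.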

We can then define a recursive translation of terms:
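One standard presentation of the abstraction-elimination rules:

```latex
\begin{aligned}
T[x] &= x \\
T[(M\; N)] &= (T[M]\;\; T[N]) \\
T[\lambda x.\, x] &= I \\
T[\lambda x.\, M] &= (K\;\; T[M]) &&(x \text{ not free in } M) \\
T[\lambda x.\, (M\; N)] &= (S\;\; T[\lambda x.\, M]\;\; T[\lambda x.\, N]) \\
T[\lambda x.\, \lambda y.\, M] &= T[\lambda x.\, T[\lambda y.\, M]] &&(x \text{ free in } M)
\end{aligned}
```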

For example, $T[\lambda x.\lambda y. (y\; x)] = (S\; (K\; (S\; I)))\; ((S\; (K\; K))\; I)$. This translation preserves exactly the way this term acts in the $\lambda$ calculus. For more information, see To Mock a Mockingbird, and Eliminating Binders for Easier Operational Semantics.
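Abstraction elimination is mechanical enough to sketch in Python. Below, lambda terms are nested tuples, combinator terms are the strings `'S'`, `'K'`, `'I'`, variable names, or application pairs; all representation choices are ours:

```python
# Lambda terms: ('var', x), ('app', M, N), ('lam', x, M).
# Combinator terms: 'S', 'K', 'I', variable names, or pairs (M, N).

def occurs(c, x):
    """Does variable x occur in combinator term c?"""
    if isinstance(c, tuple):
        return occurs(c[0], x) or occurs(c[1], x)
    return c == x

def ab(x, c):
    """Bracket abstraction [x]c: eliminate x from a combinator term."""
    if c == x:
        return 'I'
    if not occurs(c, x):
        return ('K', c)
    return (('S', ab(x, c[0])), ab(x, c[1]))

def T(t):
    """Abstraction elimination: translate a lambda term to S/K/I."""
    tag = t[0]
    if tag == 'var':
        return t[1]
    if tag == 'app':
        return (T(t[1]), T(t[2]))
    return ab(t[1], T(t[2]))        # 'lam': abstract the bound variable

def nf(c):
    """Reduce an applicative S/K/I term by the combinator rules."""
    if not isinstance(c, tuple):
        return c
    f, a = nf(c[0]), c[1]
    if f == 'I':                    # I x -> x
        return nf(a)
    if isinstance(f, tuple):
        if f[0] == 'K':             # K x y -> x
            return nf(f[1])
        if isinstance(f[0], tuple) and f[0][0] == 'S':
            x, y = f[0][1], f[1]    # S x y z -> (x z)(y z)
            return nf(((x, a), (y, a)))
    return (f, nf(a))

# The example from the text: T[\x.\y.(y x)] = S (K (S I)) (S (K K) I).
term = ('lam', 'x', ('lam', 'y', ('app', ('var', 'y'), ('var', 'x'))))
sk = T(term)
```

Applying the translated term to free variables `a`, `b` and reducing gives `('b', 'a')`, i.e. $(b\; a)$, exactly as $\beta$-reduction predicts.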

## Applicative Systems and Computable Maps

Let $C$ be a cartesian restriction category, and let
$\mathbb{A}=(A,\bullet:A^2\to A)$ be a magma object, a.k.a. an
**applicative system**, in $C$. A guiding example is Kleene
application, $\phi^{(-)}:\mathbb{N}\times \mathbb{N}\to
\mathbb{N}$. While we care about those which intuitively “act as
universal Turing machines”, these are in fact a far more general
concept. $\mathbb{A}$ will be seen to generate a “category of
computations”.

### Comp(A)

A morphism $f:A^n\to A$ is $\mathbb{A}$-$\mathbf{computable}$, or simply computable, when there exists a total point $p:1\to A$ (total meaning $\overline{p}=id_1$) which provides the “code” for $f$ (like the $\tau$-index):
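Spelled out, writing $\bullet^{(n)}$ for iterated application and suppressing the isomorphism $1\times A^n \cong A^n$, the condition reads:

```latex
f \;=\; \bullet^{(n)} \circ (p \times 1_{A^n}) : 1 \times A^n \cong A^n \to A,
\qquad \bullet^{(n)} := \bullet \circ (\bullet^{(n-1)} \times 1_A)
```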

This code $p$ picks out a program, i.e. an element of the universal Turing machine, which implements the function $f$ by application. When $n\geq 2$, we also require that $\bullet^{n-1}\circ (p\times 1): 1\times A^{n-1}\to A$ is total. A morphism $A^n \to A^m$ is computable if all its components are, and a morphism $A^n\to 1$ is computable if its domain is (as a map $A^n\to A^n$).

By weakening this diagram from commuting to being filled by a
restriction 2-cell, i.e. $f \leq \bullet (p\times 1)$, we generalize
from computability to the *realizability* of $f$, meaning that $f$ can
be implemented or “realized” by restricting the code-application
composite to the domain of $f$, even if the former is defined more
generally.

Of course in defining a morphism, we intend to form a category. With this general definition of computability, however, categorical structure does not come for free—for $\mathbb{A}$-computable morphisms to include identities and to be closed under composition, $\mathbb{A}$ needs to have the right structure. This structure is directly connected to combinatory logic.

Let $Poly(\mathbb{A})$ be the sub- (cartesian restriction) category of
$C$ generated by all total points $1\to A$ and application. These
morphisms are called $\mathbb{A}$-*polynomial morphisms*. This
essentially represents “all computations that $\mathbb{A}$ knows how
to do”. We should like these all to be computable, i.e. for there to
exist codes for all of these “computations”. An applicative system is
**combinatory complete** if every polynomial morphism is computable.

When is this true? Precisely when $\mathbb{A}$ has the combinators S and K; that is, the following morphisms are $\mathbb{A}$-computable.
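In element-style notation, this asks for total points $k_0, s_0: 1\to A$ coding maps $k$ and $s$ with (the last condition ensuring that partial application of $s$ always halts):

```latex
k \bullet x \bullet y = x, \qquad
s \bullet x \bullet y \bullet z \;\simeq\; (x \bullet z) \bullet (y \bullet z),
\qquad s \bullet x \bullet y \ \text{ total}
```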

Let’s think about why this is true. As in the original use of combinators, we give a translation from $\mathbb{A}$-polynomial morphisms to composites of $s_0,k_0$, application, and cartesian structure. This suffices because the generators of $Poly(\mathbb{A})$, the total points $p:1\to A$, are computable; then the polynomial construction essentially makes all possible application trees of these codes—and this is exactly what $s$ and $k$ do.

So, we define a **partial combinatory algebra** (PCA) to be an
applicative system which is combinatory complete. We can now prove the
simple categorical characterization of this completeness:

**Theorem**: For an applicative system $\mathbb{A} = (A,\bullet)$ in a
cartesian restriction category $C$, the following are equivalent:

- $\mathbb{A}$ is combinatory complete.
- The $\mathbb{A}$-computable morphisms form a cartesian restriction category.

(The forward direction: computable is a special case of being polynomial. The backward direction: the code of $id_A$ is the code of $\bullet$.)

What about the *restriction* structure? That is, given computable $f$
how do we know that $\overline{f}$ is computable? This is actually
given by the combinator $k$:

i.e., $\overline{f} = k_0 \circ \langle 1, f\rangle$.

Pairing a function with $id_A$ and projecting onto the first component gives the subdomain of $A$ on which the function is defined, and this operation is computable by the existence of the code $k$. These combinators are humble but very useful.

Hence, we have a direct equivalence between the classical notion of
combinatory completeness and the general, abstract computability
concept of cartesian restriction category. (Like many
“generalizations” in abstract mathematics, it’s really more an
*extension* than generalization, because we still depend on the
original idea of combinatory completeness; but “grounding” the latter
by seeing it as a natural structure in a fully abstract setting
demonstrates essential understanding, and connects the concept to the
rest of category theory and computer science.)

### Examples

The most important example of a category of computable maps is $Comp(\mathbb{N})$, of natural numbers and Kleene application.

Another example is the (partial) lambda calculus, modelled by a reflexive object $A$ as mentioned in the introduction.

One can generalize these examples by adding further expressive power such as oracles, or by considering “basic recursive function theories” which are sets of functions satisfying the Universality and Parameter theorems.

### Structure Theorem

Now, for the unsurprising and important conclusion: given a PCA
$\mathbb{A}$, the computable maps $Comp(\mathbb{A})$ form not only a
cartesian restriction category, but a *Turing category*. The
application $\bullet:A\times A\to A$ is the Turing morphism, and the
codes are the indices. The pairing operator *pair*: $A\times A\to A$
is represented by the lambda term $\lambda xyz.((zx)y)$, and the
unpairing maps *fst*, *snd*: $A\to A$ apply a code to $\lambda xy.x$
and $\lambda xy.y$ respectively. One can check that *fst* and *snd*
composed with *pair* recover $\pi_0$ and $\pi_1$. Then
(*pair*, $\langle$ *fst*, *snd* $\rangle): A\times A \triangleleft A$
exhibits the product as a retract of $A$, and this extends to $A^n$.
Hence, $A$ is a Turing object, and $Comp(\mathbb{A})$ is a Turing
category.

Thus, we have the main theorem:

**Structure Theorem**

Let $C$ be a cartesian restriction category with applicative system $\mathbb{A}=(A,\bullet)$.

When $\mathbb{A}$ is a $PCA$ in $C$, then $Comp(\mathbb{A})$ is a Turing category with Turing object $\mathbb{A}$.

If $C$ is a Turing category and $\mathbb{A}$ is a Turing object, then $\mathbb{A}$ is a $PCA$ in $C$ and $C$ is Morita-equivalent to $Comp(\mathbb{A})$.

## Conclusion

We learned about the roots of computability theory, and discussed how Turing categories abstract these ideas, using the guiding example of the natural numbers and partial recursive functions. Every Turing category arises from a partial combinatory algebra—in particular, the natural numbers with Kleene application—and conversely, every partial combinatory algebra determines a Turing category. Turing categories remain an active area of research, and our group looks forward to studying and contributing to this work.

## Re: Turing Categories

Thank you for the introductory post on this subject, it looks interesting!

I was just wondering about the remark that you made about Turing objects:

If I understand it correctly, there is the universal Turing morphism $\tau_{A,A} : A \times A \to A$, and it is a retraction with a section $\tilde{\tau}_{A,A}$. And there is a section $\tilde{\tau}_{1, A^2}$. I guess internal pairing is the latter operation, because the former one is an evaluation map?