Skip to the Main Content

Note:These pages make extensive use of the latest XHTML and CSS Standards. They ought to look great in any standards-compliant modern browser. Unfortunately, they will probably look horrible in older browsers, like Netscape 4.x and IE 4.x. Moreover, many posts use MathML, which is, currently only supported in Mozilla. My best suggestion (and you will thank me when surfing an ever-increasing number of sites on the web which have been crafted to use the new standards) is to upgrade to the latest version of your browser. If that's not possible, consider moving to the Standards-compliant and open-source Mozilla browser.

August 27, 2019

Turing Categories

Posted by John Baez

guest post by Georgios Bakirtzis and Christian Williams

We continue the Applied Category Theory Seminar with a discussion of the paper Introduction to Turing Categories by Hofstra and Cockett. Thank you to Jonathan Gallagher for all the great help in teaching our group, and to Daniel Cicala and Jules Hedges for running this seminar.

Introduction

Turing categories enable the categorical study of computability theory, and thereby abstract and generalize the classical theory of computable functions on the natural numbers.

A Turing category is a category C\C equipped with:

  • cartesian products – to pair (the codes of) data and programs,
  • a notion of partiality – to represent programs (morphisms) which do not necessarily halt,
  • and a Turing object AA – to represent the “codes” of all programs.

The Turing object acts as a weak exponential for all pairs of objects in the category. While this is impossible in most “set-theoretical” categories, it is natural in computer science, because primitive formal languages—the “machine code” of the higher languages—are untyped (really one-typed).

For example, the untyped λ\lambda-calculus is represented by a reflexive object AA in a cartesian closed category, i.e. an object equipped with an embedding app:AA Aapp:A\to A^A which sends a λ\lambda term MM to its action induced by application, and a retraction lam:A AA,lam: A^A\to A, which sends an endo-hom term M()M(-) to the lambda term λx.M(x)\lambda x.M(x). The morphism A:=eval(app×A):A×AA\bullet_A:= eval\circ (app\times A):A\times A\to A is a Turing morphism, meaning that every morphism f:A nAf: A^n\to A has a “code” (f):1A(f):1\to A such that f= A n((f)×A n).f = \bullet_A^{n} \circ ((f)\times A^n).

Turing categories are closely connected with partial combinatory algebras, magma objects with a certain form of “computational completeness”. The paradigmatic example is the “SK combinator calculus”, a variable-free representation of the λ\lambda calculus.

Turing categories and partial combinatory algebras (PCAs) are in a certain sense equivalent, which constitutes the main result of this work and justifies the notion of Turing category. (This is made into a precise equivalence of categories between relative PCAs and Turing categories over a fixed base category in the paper Categorical Simulations.)

Our purpose is to introduce the reader to some basic ideas in computability, and demonstrate how they can be expressed categorically.

Computability Theory

In the early 1900s, we focused intently on determining “what can we prove (with formal systems)?”, or analogously, “what can we compute (with computers)?”. This simple question inspired great advances in logic, mathematics, and computer science, and motivated their mutual development and deepening interconnection.

Three main perspectives emerged independently in order to answer this question:

  • Alan Turing thought about “hardware”, with Turing machines;
  • Alonzo Church thought about “software”, with the λ\lambda-calculus; and
  • Kurt Gödel thought about logic, with general recursive functions.

The Church-Turing thesis states that these are equivalent notions of computation: a function f:f:\mathbb{N}\to \mathbb{N} is Turing computable iff ff can be written as a closed λ\lambda-term, iff ff is general recursive. This has led computer scientists to believe that the concept of computability is accurately characterized by these three equivalent notions.

Stephen Kleene, a student of Church, developed and unified these ideas in his study of “metamathematics”, focusing on the perspective of partial recursive functions (Introduction to Metamathematics – S.C. Kleene), which are equivalent to Gödel’s.

We can generate the class of partial recursive functions from basic functions and operators, and utilize the idea of Gödel numbering to consider the natural numbers both as the space of all data and all programs. The two main theorems which characterize an effective enumeration of all such functions—the Universality and Parameter theorems described below—are the original motivation for the definition of Turing categories.

Recursive Functions

Computability is humbly rooted in simple arithmetic. Using only three kinds of primitive functions, we can build up all computable functions, the partial recursive functions:

(1)constant: C a(x 1,,x n) =a successor: S(x) =x+1 projection: P i k(x 1,,x k) =x i \begin{array}{lll} \text{constant: } & C_a(x_1,\dots ,x_n) & = a\\ \text{successor: } & S(x) & = x+1\\ \text{projection: } & P_i^k(x_1,\dots ,x_k) & = x_i\\\\ \end{array}

The set of primitive recursive functions, RR, is generated by two operators. First, we of course need composition:

(2)composition: (g i: k)R(h: n)R h(g 1,g n)S k n(h,g 1,,g n)R.\begin{array}{ll} \text{composition: } & (g_i:\mathbb{N}^k\to \mathbb{N})\in R \wedge (h:\mathbb{N}^n\to \mathbb{N})\in R\\ \circ & \implies h\circ (g_1,\dots g_n) \eqqcolon \mathbf{S_k^n}(h,g_1,\dots ,g_n)\in R. \end{array}

Second, to derive real computational power we need the eponymous recursion:

(3)recursion: g: kRh: k+2R ρ R k(g,h)fR f(0,x¯)=g(x¯) f(y+1,x¯)=h(y,f(y,x¯),x¯) \begin{array}{ll} \text{recursion: } & g:\mathbb{N}^k\to \mathbb{N}\in R \wedge h:\mathbb{N}^{k+2}\to \mathbb{N}\in R\\ \rho & \implies \mathbf{R^k}(g,h) \eqqcolon f\in R\\\\ & f(0,\overline{x}) = g(\overline{x})\\ & f(y+1,\overline{x}) = h(y,f(y,\overline{x}),\overline{x})\\\\ \end{array}

This can be understood as a “for loop”—gg represents the initial parameters, yy is the counter, and hh describes a successive evaluation of the accumulated values—these are combined into the function ff:

(4)f(n,x¯) {y=g(x¯) for(i=0;n) {y=h(x¯,i)}}\begin{array}{l} f(n,\overline{x})\coloneqq\\ \{y = g(\overline{x})\\ for (i = 0;n)\\ \{ y = h(\overline{x},i) \} \} \end{array}

Now we have enough to do arithmetic, and we’re off. For example, we can define addition: numbers are sequences of successors, and adding is simply a process of peeling the SS’s off of one number and appending them to the other, until the former reaches 0:

(5)base step: add(0,a)=g(a)=a=P 1 1(a) induction: add(S(b),a)=S(g(b,a))h(b,f(b,a),a)=S(P 2 3(b,c,a)) recursion: add:=R 2(P 1 1,S 1 3(S,P 2 3)) \begin{array}{ll} \text{base step: } & add(0,a) = g(a) = a = P_1^1(a)\\ \text{induction: } & add(S(b),a) = S(g(b,a)) \eqqcolon h(b, f(b,a), a) = S(P_2^3(b,c,a))\\ \text{recursion: } & add := R^2(P_1^1,S_1^3(S,P_2^3))\\ \end{array}

As a simple demonstration:

(6)add(2,1) =S 1 3(S,P 2 3)(1,add(1,1),1) =S(add(1,1)) =S(S 1 3(S,P 2 3)(0,add(0,1),1)) =S(S(P 1 1(1))) =S(S(S(0)))=3.\begin{array}{ll} add(2,1) & = S_1^3(S,P_2^3)(1,add(1,1),1)\\ & = S(add(1,1))\\ & = S(S_1^3(S,P_2^3)(0,add(0,1),1))\\ & = S(S(P_1^1(1)))\\ & = S(S(S(0))) = 3. \end{array}

Of course, not all computations actually halt. We can use a diagonalization argument to show that not all Turing-computable functions are primitive recursive. To complete the equivalence, the more general partial recursive functions are given by adding a minimization operator, which inputs a (k+1)(k+1)-ary function and outputs a kk-ary function, which searches for a minimal zero in the first coordinate of ff, and returns only if one exists:

(7)minimization: (f: k+1)R(f is total) μ (μ(f))R μ(f)(x 1,,x k)=z (f(z,x 1,,x k)=0) (iz1f(i,x 1,,x k)1)\begin{array}{ll} \text{minimization: } & (f:\mathbb{N}^{k+1}\to \mathbb{N})\in R \wedge (\text{f is total})\\ \mu & \implies (\mu\mathbf{(f)})\in R\\ & \mu(f)(x_1,\dots ,x_k) = z \iff\\ (f(z,x_1,\dots ,x_k) = 0) & \wedge (\forall i \leq z-1 \;\; f(i,x_1,\dots ,x_k) \geq 1) \end{array}

This allows for “unbounded minimization of safe predicates”, to check the consistency of rules on infinitary data structures. For example, with this operator we can explicitly construct a universal Turing machine. The outputs of μ\mu are certainly partial in general, because we can construct a predicate whose minimal zero would decide the halting problem.

All Turing-computable functions are represented by partial recursive functions, translating the complexity of Turing machines to a unified mathematical perspective of functions of natural numbers.

Gödel Numbering

Upon first hearing the Church-Turing thesis, one may ask “why do we only care about functions of natural numbers?” By the simple idea of Gödel numbering, we need only consider (the partial recursive subset of) the hom Set(,)Set(\mathbb{N}, \mathbb{N}) to think about the big questions of computability.

Kurt Gödel proved that a formal language capable of expressing arithmetic cannot be both consistent, meaning it does not deduce a contradiction, and complete, meaning every sentence in the language can be either proven or disproven. The method of proof is remarkably simple: as a language uses a countable set of symbols, operations, and relations, we can enumerate all terms in the language. We can then use this enumeration to create self-referential sentences which must be independent of the system.

The idea is to use the fundamental theorem of arithmetic. Finite lists of natural numbers can be represented faithfully as single natural numbers, using an encoding with prime powers: order the primes {p i}\{p_i\}, and define the embedding

(8)f: kf:\mathbb{N}^k\to \mathbb{N}
(9)[n 1,,n k] ip i n i.[n_1,\dots,n_k]\mapsto \prod_i p_i^{n_i}.

This can be applied recursively, to represent lists of lists of numbers, giving encoding functions enc for any depth and length of nested lists. The decoding, dec, is repeated factorization, forming a tree of primes. For example,

(10)enc([[1,4,8,2],[1,3]])=2 23 45 87 23 23 3enc([[1,4,8,2],[1,3]]) = 2^{2\cdot 3^4\cdot 5^8\cdot 7^2\cdot 3^{2\cdot 3^3}}
(11)dec(3553)=[0,0,0,0,1,0,1,1].dec(3553) = [0,0,0,0,1,0,1,1].

We can then enumerate the partial recursive functions: because they are generated by three basic functions and three basic operators, we can represent them as lists of natural numbers by the following recursive encoding “#”:

(12)#C k =[1,k] #S =[2] #P i k(x 1,,x k) =[3,k,i] #S k n(h,g 1,,g n) =[4,#h,#g 1,,#g n] #R k(g,h) =[5,k,#g,#h] #M(f) =[6,#f]\begin{array}{ll} \#C_k & = [1,k]\\ \#S & = [2]\\ \#P_i^k(x_1,\dots,x_k) & = [3,k,i]\\\\ \#S_k^n(h,g_1,\dots,g_n) & = [4,\#h,\#g_1,\dots ,\#g_n]\\ \#R^k(g,h) & = [5,k,\#g,\#h]\\ \#M(f) & = [6,\#f] \end{array}

Then, a partial recursive function gives a list of lists, which can be encoded with Gödel numbering. For example,

(13)add =R 2(P 1 1,S 1 3(S,P 2 3)) #add =[5,2,[3,2,2],[4,[2],[3,4,3]]] enc(#add) =2 53 25 2 3325 27 2 43 2 25 2 33 45 3.\begin{array}{ll} add & = R^2(P_1^1,S_1^3(S,P_2^3))\\ \#add & = [5,2,[3,2,2],[4,[2],[3,4,3]]]\\ enc(\#add) & = 2^5\cdot 3^2\cdot 5^{2^3\cdot 3\cdot 2\cdot 5^2}\cdot 7^{2^4\cdot 3^{2^2}\cdot 5^{2^3\cdot 3^4\cdot 5^3}}\text{.} \end{array}

So why do we want to enumerate all partial recursive functions? In contrast to the incompleteness theorems, this gives the theory of computability a very positive kind of self-reference, in the form of two theorems which characterize a “good” enumeration of partial recursive functions—this “good” means “effective” in a technical sense, namely that the kinds of operations one performs on functions, such as composition, can be computed on the level of codes.

Kleene’s Recursion Theorems

There is an enumeration of partial recursive functions R={ϕ i}R = \{\phi_i\}, such that the following theorems hold.

Universality: The functions ϕ (n)\phi^{(n)} are partial recursive: ϕ (n)(e,x 1,,x n)=ϕ e(x 1,,x n)\phi^{(n)}(e,x_1,\dots ,x_n) = \phi_e(x_1,\dots ,x_n).

Parameter: There are total recursive functions S m nS_m^n such that ϕ (n+m)(e,x 1,,x m,u 1,,u n)=ϕ (m)(S m n(e,x 1,,x m),u 1,,u n).\phi^{(n+m)}(e,x_1,\dots ,x_m,u_1,\dots ,u_n)= \phi^{(m)}(S_m^n(e,x_1,\dots ,x_m),u_1,\dots ,u_n).

The first says \mathbb{N} is a “universal computer”—we have partial ϕ (k)\phi^{(k)} for each arity, which takes (the code of) a program and (the code of) input data, and computes (the code of) the result. The second says this computer has generalized currying—there are total S m nS_m^n for each arity, which transforms (n+m)(n+m)-ary universal programs into mm-ary ones, by taking the first m+1m+1 arguments and returning the code of another program.

This is a remarkable fact—imagine having a computer which comes preloaded with every conceivable program, and one simply inputs the number of the program, and the number of the data. Furthermore, if one does not know which program to use, one can determine even that one step at a time.

Turing categories are meant to provide a minimal setting in which these two theorems hold. The Universality Theorem describes the universal property of the Turing object/morphism pair, and the Parameter Theorem describes the fact that the factorizations of the nn-ary functions are determined by that of the unary ones, via the Turing morphism.

Restriction Categories

We have already discussed that a notion of partiality is necessary to model computation because not all computations halt at all inputs. This notion is concrete captured through the notion of partial functions. Partiality can be categorically described through restriction categories.

Formally, a restriction category is a category CC endowed with a combinator ()¯\overline{(-)}, sending

(14)(f:AB)(f¯:AA)(f:A\to B) \mapsto (\overline{f}:A\to A)

such that the following axioms are satisfied.

(15)1 ff¯=f 2 f¯g¯=g¯f¯ dom(f)=dom(g) 3 gf¯¯=g¯f¯ dom(f)=dom(g) 4 g¯f=fgf¯ cod(f)=dom(g)\begin{array}{lll} \text{1 } & f\overline{f} = f &\\\\ \text{2 } & \overline{f}\overline{g} = \overline{g}\overline{f} & dom(f)=dom(g)\\\\ \text{3 } & \overline{g\overline{f}} = \overline{g}\overline{f} & dom(f)=dom(g)\\\\ \text{4 } & \overline{g}f = f\overline{gf} & cod(f)=dom(g) \end{array}

The first axiom states that precomposing with the domain of a map does not change or otherwise restrict the original. The second axiom states that restricting two functions behaves similarly to intersection in the sense that they commute. The third axiom states that taking the intersection of two domains is the same as taking either map and restricting it to the domain of the other map. The fourth axiom states that restricting a map in some other domain means that you restrict it on some other composite.

Some examples of restriction categories are:

(16)Par sets and partial functions. f¯(x)=x if f(x) is defined, else . Par(C,M) ob(C) and left-monic spans. PR N k and partial recursive functions.\begin{array}{ll} \mathsf{Par} & \text{sets and partial functions.}\\ & \overline{f}(x) = x \; \text{ if }\; f(x) \; \text { is defined, else }\; \bot.\\ \mathsf{Par(C,M)} & \mathsf{ob(C)}\; \text{ and left-monic spans.}\\ \mathsf{PR} & \N^k \; \text{ and partial recursive functions.} \end{array}

If f¯=1 A\overline{f}=1_A, then ff is total. Note that all monics are total. The total maps form a subcategory Tot(C)Tot(C), which is of interest in the accompanying paper Total Maps of Turing Categories by Hofstra, Cockett, and Hrubeš. If e=e¯e=\overline{e}, then ee is a domain of definition. For any aCa\in C Denote by O(a)O(a) the set of domains of definition of aa. For these examples:

(17)Par O(X)={subsets of X}. Par(C,M) O(c)={subobjects of c}. PR O(N k)={recursively enumerable subsets of N k}.\begin{array}{ll} \mathsf{Par} & O(X) = \{ \text{subsets of }\; X\}.\\ \mathsf{Par(C,M)} & O(c) = \{ \text{subobjects of }\; c\}.\\ \mathsf{PR} & O(\N^k) = \{ \text{recursively enumerable subsets of }\; \N^k\}. \end{array}

Enrichment and Cartesian Structure

The notion of restriction entails a partial order on homsets—given f,g:ABf,g:A\to B, define (fg)(f=gf¯).(f\leq g) \coloneqq (f=g\overline{f}). Intuitively this means that the domain of ff is contained in the domain of gg. In the enrichment, O(X)O(X) forms a meet semilattice.

Accordingly, to define the cartesian structure we need to weaken the terminal object and product to the poset-enriched case.

Partial Terminal Object. Once we have partial maps, the terminal object acts like a “domain classifier”. Rather than having a unique map, we have a natural bijection C(A,1)O(A).C(A,1)\simeq O(A). For partial recursive functions over \mathbb{N}, we have precisely the recursively enumerable subsets. The partial terminal object satisfies the weaker universal property that given a domain eO(A)e\in O(A) and a map f:ABf:A\to B, the domain of the composite with any domain of BB is contained in ee.

The restriction terminal object, 11, is then describing the case when there is a unique total map ! A:A1!_A: A \rightarrow 1 such that ! 1=1!_1 = 1. Furthermore, for each f:ABf: A \rightarrow B this family of maps must have ! Bf=! Af¯!_B f = !_A \bar{f} to make the following diagram commute.

Partial Product. The partial product has the property that maps f,g\langle f, g \rangle into the product, A×BA \times B, which has the domain contained in that of both ff and gg. This is to account for the case in which computing ff and gg simultaneously might produce a result that either does not halt in ff or gg. Therefore, the domain of f,g\langle f, g \rangle needs to be the intersection of the corresponding domains of ff and gg.

Splitting Idempotents

Turing categories are essentially meant to describe collections that can be computed by the maps of the category. This can be expressed with idempotent splitting. For example, in the “Turing category of computable maps on \mathbb{N}”, explained later, a restriction idempotent e:e:\mathbb{N}\to \mathbb{N} is a recursively enumerable set. Therefore the splitting of ee, if it exists, makes this subset explicit as an object of the category. Hence, the idempotent completion of Comp()\mathrm{Comp}(\mathbb{N}) yields the category whose objects are all recursively enumerable sets of natural numbers. This will imply that these categories of computable maps are determined uniquely up to Morita equivalence.

There is, however, a big difference between split restriction categories and general restriction categories. In the former, all domains are “out in the open”, rather than “hiding” as idempotents on some object. If we completely understood the idempotent completion of Comp()Comp(\mathbb{N}), we would know all recursively enumerable subsets of \mathbb{N}; one could argue that the study of computability would be complete. However, Turing categories are meant to abstract away from the natural numbers, in order to distinguish what is essential about computation, and remove anything about arithmetic that might be contingent.

Turing Categories

Turing categories provide an abstract framework for computability: a “category with partiality” equipped with a “universal computer”, whose programs and codes thereof constitute the objects of interest.

Let CC be a cartesian restriction category, and let τ X,Y:A×XY\tau_{X,Y}:A\times X\to Y. A morphism f:Z×XYf:Z\times X\to Y admits a τ X,Y\tau_{X,Y}-index when there exists h:ZAh: Z\to A such that the following commutes.

The τ X,Y\tau_{X, Y}-index can be thought of as an evaluation function and hh as the function λ *(f)\lambda^*(f). When every f:Z×XYf: Z \times X \rightarrow Y admits such an evaluation; that is τ X,Y\tau_{X, Y}-index, then τ X,Y\tau_{X, Y} is universal. If the object AA does this for any X,YX, Y then CC is a Turing category and AA is a Turing object.

Given such a Turing category, CC, an applicative family for the Turing object AA is a family of maps τ={τ X,Y:A×XYX,Y}\tau = \{\tau_{X, Y}: A \times X \rightarrow Y\mid X, Y\}, where XX, YY range over all objects in CC. An applicative family for AA is universal if for every X,YX,Y, we have that τ X,Y\tau_{X, Y} is universal. In such a case τ\tau is a Turing structure on AA. We can then think of a Turing structure on CC as a pair (A,τ)(A, \tau).

We can further express the universality of the Turing object in CC by taking every object of XCX\in C is a retract of AA, by taking the τ 1,X\tau_{1, X}-index for π X\pi_X.

This extends to all finite powers 1,A,A 2,A 3,...1, A, A^2, A^3,..., which indicates two important points about the Turing object:

  • The cardinality of the elements of AA, is either one or infinite; and
  • There is an internal pairing operation A×AAA\times A\to A, which is a section.

The latter morphism generalizes and abstracts the Gödel pairing ×\mathbb{N}\times \mathbb{N}\to \mathbb{N}.

So given such a universal object AA and a Turing structure {τ}\{\tau\}, it is natural to wonder if a universal self-application map τ A,A\tau_{A,A} is special. It is indeed, and it is called a Turing morphism :A×AA\bullet:A\times A\to A. Such a map, plus the universal property of AA, is in fact enough to reconstruct the entire Turing structure of C\mathsf{C}.

The classic example is Kleene application ×\mathbb{N}\times \mathbb{N}\xrightarrow{\bullet}\mathbb{N}. In general it can be thought of as a “universal Turing machine” that takes in a program (a Turing machine) and a datum, and computes the result.

This Turing morphism is the heart of the whole structure—we can construct all other universal applications τ X,Y:A×XY\tau_{X, Y}: A \times X \rightarrow Y by A×XA×m xA×AAr yYA \times X \xrightarrow{A \times m_x} A \times A \xrightarrow{\bullet} A \xrightarrow{r_y} Y where pairs (m X,r X)(m_X, r_X) and (m Y,r Y)(m_Y, r_Y) retractions on AA and YY respectively.

To verify that τ A,A\tau_{A,A} is a universal application, consider the following commutative diagram, which shows that all subsets can be represented as formal objects and all computable functions, like ff, can be computed by representing them in \mathbb{N}.

The important idea is that a Turing object represents the collection of (codes of) all programs in some formal system; the universal applicative family is a method of encoding, and the Turing morphism is the universal program which implements all others. The innovation of Turing categories is that they abstract away from Set\mathrm{Set}, and can therefore describe programs which are not explicitly set-theoretical (not necessarily extensional, only intensional).

The axioms of a Turing structure correspond precisely to the Universality and Parameter theorems. This fact culminates in the Recognition Theorem: a cartesian restriction category CC is a Turing category iff there is an object AA of which every object is a retract, and for which there exists a Turing morphism A×AAA \times A \xrightarrow{\bullet} A.

Partial Combinatory Algebras

From the Recognition Theorem, we know that Turing categories are essentially generated by a single object equipped with a universal binary application. We can alternatively take the internal view, starting with a magma object (A,)(A,\bullet) in a cartesian restriction category CC, and consider maps A nAA^n \to A which are “computable” with respect to \bullet. These maps form a category, and indeed a Turing category, iff AA is combinatory complete—essentially meaning that (A,)(A,\bullet) is expressive enough to represent the (partial) lambda calculus. First, we should understand combinators.

Combinatory Logic

The practice of variable binding is subtle: when a computation uses a set of symbols as variables, one must distinguish between bound and free variables—the former being placeholders, and the latter being data. In formal languages, preserving this distinction requires conditions of the semantic rules, which can add baggage to otherwise simple systems. In the λ\lambda-calculus, beta reduction β:(λx.MN)M[x:=N]\beta: (\lambda x.M\; N) \to M[x:=N] precludes “variable capture”—accidentally substituting free into bound, e.g.: (λy.(λz.(yz))z)λz.(zz).(\lambda y.(\lambda z.(y\;z))\; z) \to \lambda z.(z\; z). One must rename the bound variables of MM and NN so they are distinct from variables in both terms: (λy.(λz.(yz))z)(λy.(λa.(ya))z)λa.(za).(\lambda y.(\lambda z.(y\;z))\; z) \equiv (\lambda y.(\lambda a.(y\;a))\; z) \to \lambda a.(z\; a).

This problem was originally confronted by Moses Schonfinkel in the 1920s, who created a theory of combinators—primitive units of computation defined only by function application. Lambda terms can be translated into combinators by a process of abstraction elimination, which does away with variables completely. This can be understood by observing that we only need to handle the three basic things that occur within λ\lambda-abstraction: variables, application, and further abstraction. These are represented by the fundamental combinators SS and KK:

(18)(((Sx)y)z) =((xz)(yz)) ((Kx)y) =x \begin{array}{ll} (((S\; x)\; y)\; z) & = ((x\; z)\; (y\; z))\\ ((K\; x)\; y) & = x\\ \end{array}

We can then define a recursive translation of terms:

(19)T[x] =x T[(MN)] =(T[M]T[N]) T[λx.M] =(KT[M]) (x not free in M) T[λx.x] =((SK)K) T[λx.λy.M] =T[λx.T[λy.M]] (x free in M) T[λx.(MN)] =((ST[λx.M])T[λx.N]) (x free in M) \begin{array}{lll} T[x] & = x & \\ T[(M\; N)] & = (T[M]\; T[N]) &\\ T[\lambda x.M] & = (K\; T[M]) & \text{(x not free in M)}\\ T[\lambda x.x] & = ((S\; K)\; K) &\\ T[\lambda x.\lambda y.M] & = T[\lambda x.T[\lambda y.M]] & \text{(x free in M)}\\ T[\lambda x.(M\; N)] & = ((S\; T[\lambda x.M])\; T[\lambda x.N]) & \text{(x free in M)}\\ \end{array}

For example, T[λx.λy.(yx)]=((S(K(SI))((S(KK))I))T[\lambda x.\lambda y. (y\; x)] = ((S\; (K\; (S\; I))\; ((S\; (K\; K))\; I)). This translation preserves exactly the way this term acts in the λ\lambda calculus. For more information, see To Mock a Mockingbird, and Eliminating Binders for Easier Operational Semantics.

Applicative Systems and Computable Maps

Let CC be a cartesian restriction category, and let 𝔸=(A,:A 2A)\mathbb{A}=(A,\bullet:A^2\to A) be a magma object, a.k.a. an applicative system, in CC. A guiding example is Kleene application, ϕ ():×\phi^{(-)}:\mathbb{N}\times \mathbb{N}\to \mathbb{N}. While we care about those which intuitively “act as universal Turing machines”, these are in fact a far more general concept. 𝔸\mathbb{A} will be seen to generate a “category of computations”.

Comp(A)

A morphism f:A nAf:A^n\to A is 𝔸\mathbb{A}-computable\mathbf{computable}, or simply computable, when there exists a total point p:1Ap:1\to A, meaning p¯=id 1\overline{p}=id_1 has total domain, which provides the “code” for ff (like the τ\tau-index):

This code pp picks out a program, i.e. an element of the universal Turing machine, which implements the function ff by application. When n2n\geq 2, we also require that n1(p×1):1×A n1A\bullet^{n-1}\circ (p\times 1): 1\times A^{n-1}\to A is total. A morphism A nA mA^n \to A^m is computable if all components are, and a morphism A nAA^n\to A is computable if its domain is (as a map A nA nA^n\to A^n).

By weakening this diagram from commuting to being filled by a restriction 2-cell, i.e. f(p×1)f \leq \bullet (p\times 1), we generalize from computability to the realizability of ff, meaning that ff can be implemented or “realized” by restricting the code-application composite to the domain of ff, even if the former is defined more generally.

Of course in defining a morphism, we intend to form a category. With this general definition of computability, however, categorical structure does not come for free—for 𝔸\mathbb{A}-computable morphisms to include identities and to be closed under composition, 𝔸\mathbb{A} needs to have the right structure. This structure is directly connected to combinatory logic.

Let Poly(𝔸)Poly(\mathbb{A}) by the sub- (cartesian restriction) category of CC generated by all total points 1A1\to A and application. These morphisms are called 𝔸\mathbb{A}-polynomial morphisms. This essentially represents “all computations that 𝔸\mathbb{A} knows how to do”. We should like that these are all computable, i.e. there exists codes for all of these “computations”. An applicative system is combinatory complete if every polynomial morphism is computable.

When is this true? Precisely when 𝔸\mathbb{A} has the combinators S and K; that is, the following morphisms are 𝔸\mathbb{A}-computable.

(20)k 0=π 0:A×AA (K(a,b)=a) s 0:A×A×Aπ 13,π 23(A×A)×(A×A)×A×AA (S(a,b,c)=((ab)(ac)))\begin{array}{ll} k_0= \pi_0: A\times A\to A\\ (K(a,b) = a)\\ s_0: A\times A\times A \xrightarrow{\langle \pi_{13}{,} \pi_{23} \rangle} (A\times A)\times (A\times A) \xrightarrow{\bullet\times \bullet} A\times A \xrightarrow{\bullet} A\\ (S(a,b,c) = ((a\bullet b)\bullet (a\bullet c))) \end{array}

Let’s think about why this is true. As in the original use of combinators, we give a translation from 𝔸\mathbb{A}-polynomial morphisms to composites of s 0,k 0s_0,k_0, application, and cartesian structure. This suffices because the generators of Poly(𝔸)Poly(\mathbb{A}), the total points p:1Ap:1\to A, are computable; then the polynomial construction essentially makes all possible application trees of these codes—and this is exactly what ss and kk do.

So, we define a partial combinatory algebra (PCA) to be an applicative system which is combinatory complete. We can now prove the simple categorical characterization of this completeness:

Theorem: For an applicative system 𝔸=(A,)\mathbb{A} = (A,\bullet) in a cartesian restriction category CC, the following are equivalent:

  • 𝔸\mathbb{A} is combinatory complete.

  • 𝔸\mathbb{A}-computable morphisms form a cartesian restriction category.

(The forward direction: computable is a special case of being polynomial. The backward direction: the code of id Aid_A is the code of \bullet.)

What about the restriction structure? That is, given computable ff how do we know that f¯\overline{f} is computable? This is actually given by the combinator kk:

i.e., f¯=k 01,f\overline{f} = k_0\langle 1{,}f\rangle.

Pairing a function with id Aid_A and projecting onto the first component gives the subdomain of AA on which the function is defined, and this operation is computable by the existence of the code kk. These combinators are humble but very useful.

Hence, we have a direct equivalence between the classical notion of combinatory completeness and the general, abstract computability concept of cartesian restriction category. (Like many “generalizations” in abstract mathematics, it’s really more an extension than generalization, because we still depend on the original idea of combinatory completeness; but “grounding” the latter by seeing it as a natural structure in a fully abstract setting demonstrates essential understanding, and connects the concept to the rest of category theory and computer science.)

Examples

The most important example of a category of computable maps is Comp()Comp(\mathbb{N}), of natural numbers and Kleene application.

Another example is the (partial) lambda calculus, modelled by a reflexive object AA as mentioned in the introduction.

One can generalize these examples by adding further expressive power such as oracles, or by considering “basic recursive function theories” which are sets of functions satisfying the Universality and Parameter theorems.

Structure Theorem

Now, for the unsurprising and important conclusion: given a PCA 𝔸\mathbb{A}, the computable maps Comp(𝔸)Comp(\mathbb{A}) form not only a cartesian restriction category, but a Turing category. The application :A×AA\bullet:A\times A\to A is the Turing morphism, and the codes are the indices. The pairing operator pair: AA×AA\to A\times A is represented by the lambda term λxyz.((zx)y)\lambda xyz.((zx)y), and projections fst,snd:A×AAA\times A\to A are given by λxy.x\lambda xy.x, λxy.y\lambda xy.y. One can check that these satisfy fst pair = π 0\pi_0 and snd pair = π 1\pi_1. We then have (pair,\langle fst, snd ):A×AA\rangle): A\times A \triangleleft A gives the product as a retract of AA, and this extends to A nA^n. Hence, 𝔸\mathbb{A} is a Turing object, and Comp(𝔸)Comp(\mathbb{A}) is a Turing category.

Thus, we have the main theorem:

Structure Theorem

Let CC be a cartesian restriction category with applicative system 𝔸=(A,)\mathbb{A}=(A,\bullet).

  1. When 𝔸\mathbb{A} is a PCAPCA in CC, then Comp(𝔸)Comp(\mathbb{A}) is a Turing category with Turing object 𝔸\mathbb{A}.

  2. If CC is a Turing category and 𝔸\mathbb{A} is a Turing object, then 𝔸\mathbb{A} is a PCAPCA in CC and CC is Morita-equivalent to Comp(𝔸)Comp(\mathbb{A}).

Conclusion

We learned some about the roots of computability theory, and discussed how Turing categories abstract these ideas, using the guiding example of natural numbers and partial recursive functions. Every Turing category arises from a partial combinatory algebra, in particular the natural numbers with Kleene application. Conversely, every partial combinatory algebra determines a Turing category, establishing the latter and raising the former into more general abstraction. Turing categories remain an active area of research, and our group looks forward to studying and contributing to this work.

Posted at August 27, 2019 4:20 PM UTC

TrackBack URL for this Entry:   https://golem.ph.utexas.edu/cgi-bin/MT-3.0/dxy-tb.fcgi/3135

15 Comments & 0 Trackbacks

Re: Turing Categories

Thank you for the introductory post on this subject, it looks interesting!

I was just wondering about the remark that you made about Turing objects:

There is an internal pairing operation A×AAA\times A\to A, which is a section.

If I understand it correctly, there is the universal Turing morphism τ A,A:A×AA\tau_{A,A} : A \times A \to A, and it is a retraction with a section τ˜ A,A\tilde{\tau}_{A,A}. And there is a section τ˜ 1,A 2\tilde{\tau}_{1, A^2}. I guess internal pairing is the later operation, because the former one is an evaluation map?

Posted by: Dan Frumin on August 29, 2019 1:44 PM | Permalink | Reply to this

Re: Turing Categories

Yes, exactly. Thank you for catching that ambiguity.

Posted by: Christian Williams on August 30, 2019 4:42 PM | Permalink | Reply to this

Re: Turing Categories

I don’t understand the axioms of a restriction category (15). What do the equations “dom(f) = dom(g)” in the 2nd axiom mean? Similarly for 3 and 4.

Posted by: Max S. New on August 29, 2019 9:08 PM | Permalink | Reply to this

Re: Turing Categories

I think those just mean that both f and g have the same source, e.g. f:ABf : A \to B and g:ACg : A \to C. Otherwise the composite f¯g¯\bar{f} \circ \bar{g} would not be defined.

Posted by: Dan Frumin on August 30, 2019 8:46 AM | Permalink | Reply to this

Re: Turing Categories

Has anyone formalized the Halting problem in terms of Turing Categories yet? My brief glance of the literature didn’t turn anything up, which surprised me.

I’ve written something up which I think works, but I’m not sure if the proof is totally “categorified”, if I could get some feedback that would be great! It’s on my blog because I was too lazy to figure out how to get nice commutative diagrams to work on this page: Categorifying Halting.

Posted by: Owen Lynch on August 30, 2019 2:51 AM | Permalink | Reply to this

Re: Turing Categories

Wow, great blog post! That would have been very good to include in the original paper.

The paper actually does discuss the halting problem, though not the classic contradiction to which you refer - at the end of Section 3, it discusses the completeness of the Halting set: the fact that the Halting problem is “at least as hard” as any other problem.

They define m-reduction on domains, e mee \leq_m e' to be the existence of a total map f:XYf:X\to Y such that e=f *(e)e=f^*(e'); ff is “reducing” the problem for which e is the subobject of solutions to the problem of e’. Then m-completeness of a domain is that any other domain m-reduces to it.

The Halting set is the domain of (0)\bullet^{(0)}, which is very close to the map that you describe in your post. They prove that it is m-complete.

Thanks a lot; that’s really interesting.

Posted by: Christian Williams on August 30, 2019 6:02 PM | Permalink | Reply to this

Re: Turing Categories

Hi Owen,

I think your definition of the morphism n is specific to categories of partial maps (like partial recursive functions). At least I am not sure how to define in in an arbitrary Turing category. Also, from my understanding of the blog post, a Turing category doesn’t necessarily have coproducts, so the object 2 does not have to exist.

On a side note, a somewhat categorical derivation of the undecidability of Halting problem can be done by applying Lawvere’s fixed point theorem in the category of assemblies. There the hom set Hom(N, N) are exactly the total computable functions on natural numbers. Suppose that the Halting problem is decidable. Then it induces a weakly point-surjective function h:N2 Nh : N \to 2^N that computes a domain of every computable function. From that, by the fixed point theorem you get a contradiction. Perhaps a similar argument can be given in Turing categories?

Posted by: Dan Frumin on September 1, 2019 2:49 PM | Permalink | Reply to this

Re: Turing Categories

Hi, there’s a typo in equation (10), the last part of the exponent should be downstairs.

Posted by: Matteo Capucci on August 30, 2019 7:51 AM | Permalink | Reply to this

Re: Turing Categories

Good eye; thanks.

Posted by: Christian Williams on August 30, 2019 6:04 PM | Permalink | Reply to this

Re: Turing Categories

Great post. I have just few doubts about the definition of AA-computable morphisms.

In the definition you say that f:A nAf \colon A^n \to A is computable if and only if you can find a total code pp such that f= np,1f=\bullet_{n} \circ \langle p,1\rangle (or f np,1f\leq\bullet_{n} \circ \langle p,1\rangle, more generally). Nevertheless I missed where n\bullet_{n} is defined, it seems that the combinatorial algebra structure only defines 1=\bullet_{1}=\bullet. How do we get the other evaluation-morphisms from :A×AA\bullet \colon A \times A \to A?

Also you said

A morphism A nA mA^n \to A^m is computable if all components are, and a morphism A nAA^n \to A is computable if its domain is (as a map A nA nA^n \to A^n).

I do not understand the redefinition of AA-computability: we have defined above such notion for morphisms of type A nAA^n \to A, or am I missing something?

Thanks in advance for any reply.

Posted by: Giorgio Mossa on August 30, 2019 2:42 PM | Permalink | Reply to this

Re: Turing Categories

Bob Harper wittily expressed your “untyped means one-typed” point by calling dynamically typed languages “uni-typed”: https://existentialtype.wordpress.com/2011/03/19/dynamic-languages-are-static-languages .

Posted by: L Spice on August 30, 2019 8:58 PM | Permalink | Reply to this

Re: Turing Categories

This is a little diary-on-the-newsstand uncanny for me, since I’ve spent my spare time of the past few weeks investigating a very similar family of categories.

I cannot easily draw a tombstone diagram, so let me instead say that a tombstone is a compiler from some source language to some target language, written in some third host language. By “language” I mean code objects, but equipped with an intended semantics.

If we assume that the host language is nice enough, then we can have a category. Tombstones have a “horizontal” composition using program composition in the host language.

However, we also have a “vertical” composition, where we can compile the tombstone to a new host language. And what performs that compilation? Another tombstone! In particular, when the new host language is also the target language, then this tombstone has the effect of bringing every compiler from the old host language to the new host language. If it is abstract enough as a compiler, then it is effectively a functor from the old host language to the new host language.

Actually building the functor requires Kleene’s Parameter Theorem (also known as the s-m-n rule). This is important!

We can repeat this construction, yielding an ∞-category of tombstones interconnecting all of the languages. At this point in my research, I’m trying to do three things:

  • Write some code to demonstrate some cells in this ∞-category, and investigate internal Galois connections
  • See whether the Futamura projections exist here, and whether this is really a 4-category with all additional structure collapsed/trivialized
  • Don’t interconnect all of the languages

On this last point, I’m reminded of the trouble with defining “nice” topological spaces. We don’t want to allow every language, but we probably have some basic demands.

We probably want each language to admit a Turing category, for starters. (That’s the reason that I’m writing these words here, after all!) That alone will cover many requirements I’d independently found, including the code morphism 1 → A, the Cartesian closed structure, and Kleene’s Parameter Theorem. We’d also like for each compiler to always halt, which I think I might have to give up on. We could ask that every compiler is “Jones-optimal”, a bit of jargon meaning that it is easily optimized away by a chosen normalization technique.

Anyway, all of this is hopefully known. If not, then I’ve been calling this structure “Tomb, the ∞-category whose objects are compilers”.

Posted by: Corbin on September 7, 2019 2:48 AM | Permalink | Reply to this

Re: Turing Categories

This is extremely interesting. Thank you for your comment, and sorry for the slow reply.

After reading about tombstones and Futamura projections, there is still much in your comment that I do not understand. What is the categorical structure of the Futamura projections, and why would this collapse structure about dimension 4?

Do you have any notes or preprints about these ideas? Thank you.

Posted by: Christian Williams on September 13, 2019 2:54 AM | Permalink | Reply to this

Re: Turing Categories

Honestly, what I’ve written so far here is closer to real notes than anything else. I’ve got a lot of windows converted into temporary whiteboards, but they don’t really say anything interesting about the current topic.

I recently observed Turing categories and the extra structure of Tomb in a comment on a Computerphile video.

For Futamura projections, let’s focus on specializers only. When we generate a cogen program, the specializers used to create it each leave a layered mark of interpretative overhead, intermingling with each other. Each time we mix the contents of a cogen into another specializer, we pack another layer of snow onto the snowball, squishing it down. However, unlike with snowballs, it turns out that there are only up to three layers of residual interpreter that each specializer can leave on cogen. The fourth layer, when applied, forgets the first layer’s overhead entirely. The paper “Is There A Fourth Futamura Projection?” has many diagrams and justifications for my extremely informal explanation.

So, if 0-cells are Turing-complete languages, and 1-cells are compilers written in those languages, then 2-cells are compiler compilers. They send compilers to compilers in a way that ensures that each compiler still acts on its original source and target languages. (This is related to “fully abstract compilation”, I think, but I’m not exactly sure how yet.) And 3-cells are of the nature of cogen implementations; they send compiler compilers to compiler compilers.

What does this mean for, say, a 4-cell? Well, the 4-cell sends 3-cells to 3-cells by adding on its own layer of interpreter, and pushing out each of the layers underneath. But, because of the above nature of Futamura projections, a fourth projection can’t be aware of the very first projection’s language, can it? It seems to me that there is a cutoff, and that instead we get a universal applicator of some sort at the bottom, which can no longer treat individual languages as distinct, but instead gloms them all into one single object.

Hm, glomming everything into a single object sounds like Turing objects, though. At this point, I’m out of fully-formed ideas. But I’m still thinking about it. Sorry for the even-slower reply.

Posted by: Corbin on October 6, 2019 6:53 PM | Permalink | Reply to this

Post a New Comment