## November 21, 2006

### Classical vs Quantum Computation (Week 7)

#### Posted by John Baez

Here are this week’s notes on Classical versus Quantum Computation:

• Week 7 (Nov. 21) - The untyped lambda-calculus, continued. “Building a computer” inside the free cartesian closed category on an object $X$ with $X = \mathrm{hom}(X,X)$. Operations on booleans. The "if-then-else" construction. Addition and multiplication of Church numerals. Defining functions recursively: the astounding Fixed Point Theorem.

Last week’s notes are here; next week’s notes are here.

Now we’re finally getting to Church’s work on computable functions. We’re sketching how to define all partial recursive functions from the natural numbers to themselves using the untyped lambda calculus. The trick is to start with the free cartesian closed category on an object $X$ with $X = \mathrm{hom}(X,X)$ and think of the natural number $n$ as giving an operation $\overline{n} : \mathrm{hom}(X,X) \to \mathrm{hom}(X,X),$ namely “taking the $n$th power” of any endomorphism $f: X \to X$: $\overline{n}: f \mapsto f^n .$ These operations $\overline{n}$ are called Church numerals. They are just a way of getting natural numbers into our cartesian closed category. Using the lambda calculus, we can then define addition, multiplication, etc. for these morphisms.
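As a concrete sketch, Church numerals and their arithmetic can be mimicked with ordinary Python closures (the names `zero`, `succ`, `add`, `mul`, and `to_int` below are ours, not from the notes):

```python
# Church numeral n = "take the n-th power" of an endomorphism f.
zero = lambda f: lambda x: x                     # f^0 = identity
succ = lambda n: lambda f: lambda x: f(n(f)(x))  # one more application of f

# Arithmetic, exactly as in the lambda calculus:
add = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))  # f^(m+n) = f^m ∘ f^n
mul = lambda m: lambda n: lambda f: m(n(f))                  # f^(mn) = (f^n)^m

def to_int(n):
    """Decode a Church numeral by iterating the successor function on 0."""
    return n(lambda k: k + 1)(0)
```

For instance, `to_int(mul(succ(succ(zero)))(succ(succ(succ(zero)))))` evaluates to `6`.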

The big theorem is that using the lambda calculus, we can define any partial recursive function on Church numerals. In other words: any partial function $F: \mathbb{N} \to \mathbb{N}$ that any known sort of computer can compute, can be defined using the lambda calculus!

We don’t prove this theorem, but we sketch how to define the factorial function using the lambda calculus. This is tricky because the definition of the factorial is recursive: $\mathrm{factorial}(n) = \text{if } n = 0 \text{ then } 1 \text{ else } n \cdot \mathrm{factorial}(n-1).$ To handle this, we use an astounding result called the Fixed Point Theorem.
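Python evaluates eagerly, so Curry’s $Y$ combinator written verbatim would recurse forever; the usual call-by-value variant (the eta-expanded $Y$, often called $Z$) behaves the same way. Here is a sketch of the recursive factorial obtained as a fixed point (all names are ours, not from the notes):

```python
# Call-by-value fixed-point combinator: Z(g) = g(Z(g)) up to eta-expansion,
# so g receives "itself" as its first argument.
Z = lambda g: (lambda x: g(lambda v: x(x)(v)))(lambda x: g(lambda v: x(x)(v)))

# factorial(n) = if n == 0 then 1 else n * factorial(n - 1),
# with the recursive call supplied by the fixed point rather than by name:
factorial = Z(lambda rec: lambda n: 1 if n == 0 else n * rec(n - 1))
```

The inner `lambda v: x(x)(v)` delays the self-application `x(x)` until it is actually called, which is what keeps strict evaluation from looping.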

I’d like to know who first proved this theorem. Was it Church, Turing, both, or neither? The proofs I’ve seen often use something called the ‘Turing fixed point combinator’. Turing came to Princeton to do his PhD thesis under Church after he discovered that his Turing machine definition of ‘computable function’ had been scooped by Church’s work. So, I’m wondering if the proof I give is due to some combination of Church and Turing. It’s devastatingly elegant.

Posted at November 21, 2006 9:25 PM UTC

TrackBack URL for this Entry:   http://golem.ph.utexas.edu/cgi-bin/MT-3.0/dxy-tb.fcgi/1046

Read the post Classical vs Quantum Computation (Week 6)
Weblog: The n-Category Café
Excerpt: From lambda-terms to string diagrams. Building a computer inside a cartesian closed category with an object X with X ≅ hom(X,X).
Tracked: November 21, 2006 9:52 PM

### Exercises

Test for zero:

(1) $n(x\mapsto F)(T)$

If $n=0$ then $(x\mapsto F)$ is not applied to $T$, and the result is $T$; otherwise it maps $T\mapsto F\mapsto F \mapsto \cdots \mapsto F$ ($n$ times) and the result is $F$.
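A quick Python rendering of (1), using Church booleans (a sketch; all names are ours):

```python
# Church booleans: T selects its first argument, F its second.
T = lambda a: lambda b: a
F = lambda a: lambda b: b
to_bool = lambda b: b(True)(False)  # decode a Church boolean for inspection

# Church numerals, as before.
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))

# Test for zero: apply the constant-F function n times to T.
# Zero applications leave T; any positive number of applications yields F.
is_zero = lambda n: n(lambda _: F)(T)
```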

Decrement: Rather more complicated. The trick is to define a new function $g$ and new input $y$ in terms of $f$ and $x$ such that $g$ can distinguish the first application from the later ones. So we turn the input into a pair consisting of a flag and the original input, and unset the flag once we’ve applied the function. So let

(2) $y=\langle T,x\rangle$

and

(3) $g = f\mapsto y\mapsto \text{if } head(y) \text{ then } \langle F, x\rangle \text{ else } \langle F, f(tail(y))\rangle.$

Now define

(4) $dec = n\mapsto f\mapsto x\mapsto tail(\;n\;(g(f))\;(\langle T,x\rangle)\;),$

where

(5) $\langle a,b\rangle = z\mapsto z(a)(b),$
(6) $head(p) = p(T),$

and

(7) $tail(p) = p(F).$
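Putting (2)–(7) together in Python (a sketch with our own names; note that Python eagerly evaluates both branches of the encoded if-then-else, which is harmless here):

```python
# Church booleans and pairs, as in (5)-(7).
T = lambda a: lambda b: a
F = lambda a: lambda b: b
pair = lambda a: lambda b: lambda z: z(a)(b)
head = lambda p: p(T)
tail = lambda p: p(F)

# Church numerals and a decoder.
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))
to_int = lambda n: n(lambda k: k + 1)(0)

def dec(n):
    """Decrement: skip f on the first (flagged) step, apply it afterwards."""
    def step(f):
        # First application (flag T): keep the value, unset the flag.
        # Later applications (flag F): apply f to the stored value.
        return lambda y: head(y)(pair(F)(tail(y)))(pair(F)(f(tail(y))))
    return lambda f: lambda x: tail(n(step(f))(pair(T)(x)))
```

Starting from $\langle T, x\rangle$, $n$ applications of the step function leave $\langle F, f^{n-1}(x)\rangle$ for $n \ge 1$, so taking the tail gives $f^{n-1}(x)$; for $n = 0$ the tail of the untouched pair is just $x$.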

Exponentiation: Note that we write $n(f)$ as $f^n$. Taking this literally, write

(8) $f(x)$ as $x^f$,
(9) $n(f)(x)$ as $x^{f^n}$,
(10) $n(m(f))(x)$ as $x^{m(f)^n} = x^{(f^n)^m} = x^{f^{nm}}$,

and

(11) $n(m)(f)(x)$ as $x^{f^{m^n}}$.
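Taken literally in Python, (11) says exponentiation needs no new machinery at all: applying one numeral to another, $n(m)$ is the Church numeral for $m^n$ (a sketch; names ours):

```python
# Church numerals and a decoder.
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))
to_int = lambda n: n(lambda k: k + 1)(0)

# (11): applying n to m iterates "raise to the m-th power" n times,
# so n(m) is the Church numeral for m ** n.
power = lambda m: lambda n: n(m)
```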
Posted by: Mike Stay on November 22, 2006 7:49 PM | Permalink | Reply to this

### Re: Exercises

Excellent stuff! Thanks for doing these.

Let’s hope someone else takes up the other challenges in these course notes, including computing 3! using the method described.

Posted by: John Baez on November 23, 2006 12:33 AM | Permalink | Reply to this

### Re: Classical vs Quantum Computation (Week 7)

Tom Payne from the computer science department at UCR is sitting in on this course. Here is his answer to my question above about the Fixed Point Theorem:

The “astounding” theorem is commonly called Kleene’s Second Recursion Theorem or simply the Recursion Theorem.

It was discovered by Stephen Kleene in 1938, a couple of years after Turing discovered the notion of a Universal Turing Machine, which is essentially what is needed to prove this. It then reappeared as Curry’s Y combinator, which he called the “paradoxical combinator”. Of course, Gödel had already applied the self-application form of the diagonal argument in 1931 in showing his incompleteness theorem. Also, Tarski had applied it again in 1936 in showing the undefinability of truth.

A couple of decades before “Curry” had become a verb, I asked Burton Jones who was the first mathematician to play games with treating a function of two variables as a function of one variable into a space of functions of one variable, e.g., writing $f(x,y)$ as $f_x(y)$. He said that he thinks it was Schönfinkel.

OOPS! WOW!! I just now looked up Schönfinkel in the Wikipedia and look at what I found:

“In a talk given at Goettingen in 1920, Schönfinkel invented combinatory logic, which was further developed by Haskell Curry (who studied under Hilbert in the late 1920s). Heinrich Behmann revised the text of this talk and published it in 1924. This article also introduced what is now called currying, named after Haskell Curry. Schönfinkel wrote nothing else about combinatory logic; its entire development after 1924 is due to Curry and others.”

Very interesting.

Regards,
Tom

Posted by: John Baez on November 23, 2006 12:45 AM | Permalink | Reply to this

### Re: Classical vs Quantum Computation (Week 7)

Speaking of fixed points, there’s a fun way of formulating Cantor’s theorem in any CCC: Suppose given objects $A$, $B$ and an isomorphism $A^B \to B$. Then every morphism $f: A \to A$ has a fixed point, i.e., a morphism $a: 1 \to A$ such that $f(a) = a$.

[Cantor’s theorem of course follows by considering $f = \mathrm{negation}: 2 \to 2$, which has no fixed point. Of course, there is no fixed point for the negation operator on any Heyting algebra $A$.]
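For concreteness, here is a sketch of the argument (essentially Lawvere’s fixed-point theorem; the notation is ours, with $\psi$ the inverse of the given isomorphism):

```latex
% Given an isomorphism \varphi : A^B \to B with inverse \psi : B \to A^B,
% and any f : A \to A, write \psi(b)(b) for \mathrm{ev}(\psi(b), b) and set:
%
%   q(b) := f(\psi(b)(b)),                    a morphism q : B \to A,
%   b_0  := \varphi(\ulcorner q \urcorner),   the name of q carried into B.
%
% Then \psi(b_0) = \ulcorner q \urcorner, so for a := q(b_0) : 1 \to A,
%
%   f(a) = f(\psi(b_0)(b_0)) = q(b_0) = a,
%
% i.e. a is a fixed point of f.
```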

I don’t know how to make this really precise, but it reminds me of one of those “macrocosm/microcosm” situations: at the macrocosm level we have a “fixed point” $B$ of a contravariant functor $A^{(-)}: C \to C$, and we use this to produce fixed points of maps $A \to A$ at the microcosm level.

Posted by: Todd Trimble on November 23, 2006 3:22 PM | Permalink | Reply to this

### Re: Classical vs Quantum Computation (Week 7)

On the same theme, Yanofsky’s second paper here is worth reading.

Posted by: David Corfield on November 23, 2006 3:39 PM | Permalink | Reply to this

### Re: Classical vs Quantum Computation (Week 7)

That’s really cool, Todd! I need more time to think about this fixed-point theorem / diagonal argument stuff. I still don’t feel I get the essence. In a while I’ll post another comment by Tom Payne about this.

It would be neat if there were a systematic ‘microcosm principle’ thing going on here. The microcosm principle is one of those patterns that seems to extend beyond all our attempts to formalize it.

Posted by: John Baez on November 26, 2006 12:10 AM | Permalink | Reply to this

### Re: Classical vs Quantum Computation (Week 7)

Tom Payne sent me another email:

Hi John,

I’m doing a bit more research on the history of what you’ve called “The Astounding Theorem,” and I often call “The Fixed-Point Variant of the Diagonal Argument” or, more graphically, “The Scissor-Hinge Lemma.” According to the Wikipedia entry Cantor’s first uncountability proof, Cantor discovered the diagonal argument, his second proof of the uncountability of the reals, in 1877.

Visibly, self-application is a form of diagonalization, and the mathematical use of self-application seems to trace back to Richard’s Paradox, 1905. In that paradox we enumerate the properties of natural numbers, and ask which properties do not apply to their own index, i.e., position in the enumeration. Such indexes are called “Richardians.” We then consider the index of the property of being a Richardian, with obviously paradoxical consequences.

Gödel’s incompleteness theorem (1931) is essentially a formalization of Richard’s paradox:

“To every omega-consistent recursive set K of formulae, there corresponds” a non-member of K whose negation is also non-member of K. [I copied this from the Wikipedia, but the portion outside the quotes omits some technical tedium. THP]

which simply says that the composition of K’s characteristic function with negation has a fixed point (which can’t be in the set because that would make the set inconsistent and thus omega-inconsistent). Of course, the K of interest was the set of provable formulae.

Note that Gödel already had a mathematical definition of “recursive set,” and that he obviously realized that it was a relevant definition of the notion of “effective computability.” This was five years before Church’s Thesis.

(For what follows my primary reference is the 2006 history of these matters, The Logic of Curry and Church by Jonathan Seldin.)

At the time both Church and Curry were working on theories of functions they hoped could serve as the foundation of mathematics. Obviously, Gödel’s result posed questions about their systems. But, Curry was at Penn State and had a huge teaching load and no graduate students, Church was at Princeton and had Kleene and Rosser as graduate students, and he put them to work:

Next came Kleene’s dissertation [Kleene, 1935], which was based on Church’s definition of natural numbers. Kleene showed that Church’s system was adequate for the elementary theory of natural numbers by proving the third and fourth Peano axioms. In the course of doing this, he showed that a number of (recursive) functions could be represented as lambda-terms, including functions that Church had not thought could be represented that way. Rosser had also found some such representations in work that was never published, and by the fall of 1933, Church began to speculate that every function of positive integers that can be effectively calculated can be represented as a lambda-term. By early the following year, Church was convinced of what is now known as “Church’s Thesis.” See [Rosser, 1984, p. 334].

But meanwhile, in late 1934, Kleene and Rosser succeeded in proving that Church’s system of logic was inconsistent by showing that Richard’s paradox could be derived in it. [Kleene and Rosser, 1935]. Their proof also applied to the system of combinatory logic then being developed by Curry (see AS2.3 below).

Church, Kleene, and Rosser responded to this inconsistency by extracting from Church’s system of logic what is now called the pure lambda-calculus: the part of the system involving the formation of lambda-terms and the rules for reduction and conversion. In Church’s Thesis they already had an indication of the importance of pure lambda-calculus, and in the years following the discovery of the inconsistency, they obtained a number of major results, results which fully justified treating the pure lambda-calculus as a system on its own.

Per the Wikipedia:

One well-known (and perhaps the simplest) fixed point combinator in the untyped lambda calculus is called the Y combinator. It was discovered by Haskell B. Curry, and is defined as

$Y = \lambda f . ( \lambda x . f(x x) ) ( \lambda x . f(x x) )$

And, per Seldin: “Finally, [Curry] found a much simpler contradiction that could be derived under the assumptions of the Kleene-Rosser paradox in [Curry, 1942b].” I’m guessing that this 1942 work was where Curry introduced this fixed-point combinator, but I’d like to know for sure.

Note that Kleene had already published his Recursion Theorem in 1938, which involved the same construction, but in the context of recursive-function theory. Almost surely, he and Rosser used it in replicating Gödel’s work (i.e., constructing Richard’s paradox) within Church’s system of logic. And note that Seldin said that “Their proof also applied to the system of combinatory logic then being developed by Curry.” So, I’m betting that The Astounding Theorem appears in some form in:

S. C. Kleene and J. B. Rosser, The inconsistency of certain formal logics, Annals of Mathematics 36 (1935) 630-636.

Regards,
Tom

Posted by: John Baez on November 25, 2006 7:37 AM | Permalink | Reply to this

### Re: Classical vs Quantum Computation (Week 7)

John Baez wrote (in email):

I need to think harder to really grok the connection between all these diagonalization arguments… I studied a lot of them in school, but not the “astounding theorem”, so that one still seems mysterious to me.

Best,
jb

Tom Payne replied:

Hi John,

So we see in our mind’s eye that if a function $f$ maps the diagonal of a matrix componentwise to a row of that same matrix (i.e., closes the scissors) then the value at the point where that row crosses the diagonal has to be a fixed point of $f$; it’s the hinge of the scissors.

Now any function $g$ of two variables can be viewed as a matrix. And its diagonal is essentially $\lambda x.g(x,x)$. So what would it mean for $f$ to map $\lambda x.g(x,x)$ to the $b$th row? That would mean that $\lambda x. f(g(x,x)) = \lambda x. g(b,x).$ So at the diagonal $(\lambda x. f(g(x,x)))(b) = (\lambda x .g(b,x))(b).$ Which is to say that $f(g(b,b)) = g(b,b)$.

So, consider the case where $g$ is the evaluation function. Then we have $f(b(b)) = b(b)$, where $b$ denotes the lambda-definable function $\lambda x. f(x(x))$. That’s assuming that $b(b)$ is defined, which it’s assumed to be in lambda-calculus, but not in real life, i.e., not in the category of sets. Rather, we have to go to the category of continuous lattices (or semilattices) to make that happen.

But in real life we have universal Turing machines, which compute the application of a program (which is a piece of data) to another piece of data. Unfortunately the result isn’t always defined, since sometimes the calculation goes on forever.

But Currying can be “always defined,” in the sense that for every program $p$ there is a program $q$ such that, for every $x$, $q(x)$ is always defined, and $\lambda y . (q(x))(y) = \lambda y . (p(x))(y).$

If $r$ is any program, there’s a program $p$ with $p(x) = r(x(x))$. So there is an always defined $q$ such that $\lambda x \lambda y . (q(x))(y) = \lambda x \lambda y . (r(x(x)))(y).$ So $(\lambda x \lambda y . (q(x))(y))(q) = (\lambda x \lambda y . (r(x(x)))(y))(q),$ i.e., $\lambda y . (q(q))(y) = \lambda y . (r(q(q)))(y).$ Thus $q(q)$ exists, and $q(q)$ and $r(q(q))$ are programs for the same function, i.e., $q(q)$ is a fixed point of $r$ viewed as a transformation on functions. And that’s Kleene’s Second Recursion Theorem (1938), which is to programs what the Astounding Theorem is to functions.

Regards,
Tom

Posted by: John Baez on November 26, 2006 12:33 AM | Permalink | Reply to this
Read the post Classical vs Quantum Computation (Week 8)
Weblog: The n-Category Café
Excerpt: How Curry's Fixed Point Theorem is related to Cantor's diagonal argument.
Tracked: December 1, 2006 1:41 AM
