## June 10, 2017

### Eliminating Binders for Easier Operational Semantics

#### Posted by John Baez guest post by Mike Stay

Last year, my son’s math teacher introduced the kids to the concept of a function. One of the major points of confusion in the class was the idea that it didn’t matter whether he wrote $f(x) = x^2$ or $f(y) = y^2$, but it did matter whether he wrote $f(x) = x y$ or $f(x) = x z$. The function declaration binds some of the variables appearing on the right to the ones appearing on the left; the ones that don’t appear on the left are “free”. In a few years when he takes calculus, my son will learn about the quantifiers “for all” and “there exists” in the “epsilon-delta” definition of limit; quantifiers also bind variables in expressions.

Reasoning formally about languages with binders is hard:

“The problem of representing and reasoning about inductively-defined structures with binders is central to the PoplMark challenges. Representing binders has been recognized as crucial by the theorem proving community, and many different solutions to this problem have been proposed. In our (still limited) experience, none emerge as clear winners.” – Aydemir, Bohannon, Fairbairn, Foster, Pierce, Sewell, Vytiniotis, Washburn, Weirich, and Zdancewic, Mechanized metatheory for the masses: The PoplMark challenge. (2005)

The paper quoted above reviews around a dozen approaches in section 2.3, and takes pains to point out that their review is incomplete. However, recently Andrew Pitts and his students (particularly Murdoch Gabbay) developed the notion of a nominal set (introductory slides, book) that has largely solved this problem. Bengston and Barrow use a nominal datatype package in Isabell/HOL to formalize $\pi$-calculus, and Clouston defined nominal Lawvere theories. It’s my impression that pretty much everyone now agrees that using nominal sets to formally model binders is the way forward.

Sometimes, though, it’s useful to look backwards; old techniques can lead to new ways of looking at a problem. The earliest approach to the problem of formally modeling bound variables was to eliminate them.

# Abstraction elimination

$\lambda$-calculus is named for the binder in that language. The language itself is very simple. We start with an infinite set of variables $x, y, z, \ldots$ and then define the terms to be

$t, t' ::= \left\{ \begin{array}{lr}x & variable \\ \lambda x.t & abstraction \\ (t\; t') & application\end{array}\right.$

Schönfinkel’s idea was roughly to “sweep the binders under the rug”. We’ll allow binders, but only in the definition of a “combinator”, one of a finite set of predefined terms. We don’t allow binders in any expression using the combinators themselves; the binders will all be hidden “underneath” the name of the combinator.

To eliminate all the binders in a term, we start at the “lowest level”, a term of the form $t = \lambda x.u,$ where $u$ only has variables, combinators, and applications; no abstractions! Then we’ll try to find a way of rewriting $t$ using combinators instead. Since the lambda term $\lambda x.(v\; x)$ behaves the same as $v$ itself, if we can find some $v$ such that $v x = u$, then the job’s done.

Suppose $u=x$. What term can we apply to $x$ and get $x$ itself back? The identity function, obviously, so our first combinator is

$I = \lambda x.x.$

What if $u$ doesn’t contain $x$ at all? We need a “konstant” term $K_u$ such that $(K_u\; x)$ just discards $x$ and returns $u.$ At the same time, we don’t want to have to specify a different combinator for each $u$ that doesn’t contain $x$, so we define our second combinator $K$ to first read in which $u$ to return, then read in $x$, throw it away, and return $u:$

$K = \lambda u x.u.$

Finally, suppose $u$ is an application $u = (w\; w').$ The variable $x$ might occur in $w$ or in $w'$ or both. Note that if we recurse on each of the terms in the application, we’ll have terms $r, r'$ such that $(r\; x) = w$ and $(r'\; x) = w'$, so we can write $u = ((r\; x)\; (r'\; x)).$ This suggests our final combinator should read in $r, r'$, and $x$ and “share” $x$ with them:

$S = \lambda r r'x.((r\; x)\; (r'\; x)).$

If we look at the types of the terms $S, K,$ and $I$, we find something interesting:

$\begin{array}{rl}I:&Z \to Z\\K:&Z \to Y \to Z\\S:&(Z \to Y\to X)\to (Z\to Y) \to Z \to X\end{array}$

The types correspond exactly to the axiom schemata for positive implicational logic!

The $S K I$-calculus is a lot easier to model formally than the $\lambda$-calculus; we can use a tiny Gph-enriched Lawvere theory (see the appendix) to capture the operational semantics and then derive the denotational semantics from it.

# $\pi$-Calculus

As computers got smaller and telephony became cheaper, people started to connect them to each other. ARPANET went live in 1969, grew dramatically over the next twenty years, and eventually gave rise to the internet. ARPANET was decommissioned in 1990; that same year, Robin Milner published a paper introducing the $\pi$-calculus. He designed it to model the new way computation was occurring in practice: instead of serially on a single computer, concurrently on many machines via the exchange of messages. Instead of applying one term to another, as in the lambda and SK calculi, terms (now called “processes”) get juxtaposed and then exchange messages. Also, whereas in $\lambda$-calculus a variable can be replaced by an entire term, in the $\pi$-calculus names can only be replaced by other names.

Here’s a “good parts version” of the asynchronous $\pi$-calculus; see the appendix for a full description.

$\begin{array}{rll}P, Q ::= \quad& 0 & do\; nothing \\ \;|\; & P|Q & concurrency \\ \;|\; &for (x \leftarrow y) P & input \\ \;|\; & x!y & output \\ \;|\; & \nu x.P & new\; name \\ \;|\; & !P& replication \end{array}$

$x!z \;|\; for(y \leftarrow x).P \Rightarrow P\{z/y\}\quad communication\; rule$

There are six term constructors for $\pi$-calculus instead of the three in $\lambda$-calculus. Concurrency is represented with a vertical bar |, which forms a commutative monoid with 0 as the monoidal unit. There are two binders, one for input and one for introducing a new name into scope. The rewrite rule is reminiscent of a trace in a compact closed category: $x$ appears in an input term and an output term on the left-hand side, while on the right-hand side $x$ doesn’t appear at all. I’ll explore that relationship in another post.

The syntax we use for the input prefix is not Milner’s original syntax. Instead, we borrowed from Scala, where the same syntax is syntactic sugar for $M(\lambda x.P)(y)$ for some monad $M$ that describes a collection. We read it as “for a message $x$ drawn from the set of messages sent on $y$, do $P$ with it”.

For many years after Milner proposed the $\pi$-calculus, researchers tried to come up with a way to eliminate the bound names from a $\pi$-calculus term. Yoshida was able to give an algorithm for eliminating the bound names that come from input prefixes, but not those from new names. Like the abstraction elimination algorithm above, Yoshida’s algorithm produced a set of concurrent combinators. There’s one combinator $m(a, x)$ for sending a message $a$ on the name $x$, and several others that interact with $m$’s to move the computation forward (see the appendix for details):

$\begin{array}{rlr} d(a,b,c) | m(a,x) & \Rightarrow m(b,x) | m(c,x) & (fanout)\\ k(a) | m(a,x) & \Rightarrow 0 & (drop) \\ fw(a,b) | m(a,x) & \Rightarrow m(b,x) & (forward) \\ br(a,b) | m(a,x) & \Rightarrow fw(b,x) & (branch\; right)\\ bl(a,b) | m(a,x) & \Rightarrow fw(x,b) & (branch\; left) \\ s(a,b,c) | m(a,x) & \Rightarrow fw(b,c) & (synchronize)\end{array}$

Unlike the $S K I$ combinators, no one has shown a clear connection between some notion of type for these combinators and a system of logic.

# Reflection

Several years ago, Greg Meredith had the idea to combine the set of $\pi$-calculus names and the set of $\pi$-calculus terms recursively. In a paper with Radestock he introduced a “quoting” operator I’ll write & that turns processes into names and a “dereference” operator I’ll write * that turns names into processes. They also made the calculus higher-order: they send processes on a name and receive the quoted process on the other side.

$x!\langle Q\rangle \;|\; for(y \leftarrow x).P \Rightarrow P\{\&Q/y\}\quad communication\; rule$

The smallest process is 0, so the smallest name is &0. The next smallest processes are

$\&0\langle 0 \rangle \quad and \quad for(\&0 \leftarrow \&0) \; 0,$

which in turn can be quoted to produce more names, and so on.

Together, these two changes let them demonstrate a $\nu$-elimination transformation from $\pi$ calculus to their reflective higher-order (RHO) calculus: since a process never contains its own name ($\&P$ cannot occur in $P$), one can use that fact to generate names that are fresh with respect to a process.

Another benefit of reflection is the concept of “namespaces”: since the names have internal structure, we can ask whether they satisfy propositions. This lets Meredith and Radestock define a spatial-behavioral type system like that of Caires, but more powerful. Greg demonstrated that the type system is strong enough to prevent attacks on smart contracts like the \$50M one last year that caused the fork in Ethereum.

In our most recent pair of papers, Greg and I consider two different reflective higher-order concurrent combinator calculi where we eliminate all the bound variables. In the first paper, we present a reflective higher-order version of Yoshida’s combinators. In the second, we note that we can think of each of the term constructors as combinators and apply them to each other. Then we can use $S K I$ combinators to eliminate binders from input prefixes and Greg’s reflection idea to eliminate those from $\nu.$ Both calculi can be expressed concisely using Gph-enriched Lawvere theories.

In future work, we intend to present a type system for the resulting combinators and show how the types give axiom schemata.

# Appendix

## Gph-theory of $S K I$-calculus

• objects
• $T$
• morphisms
• $S, K, I\colon 1 \to T$
• $(-\; -)\colon T \times T \to T$
• equations
• none
• edges
• $\sigma\colon (((S\; x)\; y)\; z) \Rightarrow ((x\; z)\; (y\; z))$
• $\kappa\colon ((K\; x)\; z) \Rightarrow x$
• $\iota\colon (I\; z) \Rightarrow z$

## $\pi$-calculus

• grammar
• $\begin{array}{rll}P, Q ::= \quad & 0 & do\; nothing \\ \;|\; & P|Q & concurrency \\ \;|\; &for (x \leftarrow y).P & input \\ \;|\; & x!y & output \\ \;|\; & \nu x.P & new\; name \\ \;|\; & !P& replication \end{array}$
• structural equivalence
• free names
• $FN(0) = \{\}$
• $FN(P|Q) = FN(P) \cup FN(Q)$
• $FN(for (x \leftarrow y).P) = \{y\} \cup FN(P) - \{x\}$
• $FN(x!y) = \{x,y\}$
• $FN(\nu x.P) = FN(P) - \{x\}$
• $FN(!P) = FN(P)$
• 0 and | form a commutative monoid
• $P|0 \equiv P$
• $P|Q \equiv Q|P$
• $(P|Q)|R \equiv P|(Q|R)$
• replication
• $!P \equiv P\;|\;!P$
• $\alpha$-equivalence
• $for (x \leftarrow y).P\equiv for (z \leftarrow y).P\{z/x\}$ where $z$ is not free in $P$
• new names
• $\nu x.\nu x.P \equiv \nu x.P$
• $\nu x.\nu y.P \equiv \nu y.\nu x.P$
• $\nu x.(P|Q) \equiv (\nu x.P)|Q$ when $x$ is not free in $Q$
• rewrite rules
• $x!z \;|\; for(y \leftarrow x).P \Rightarrow P\{z/y\}$
• if $P \Rightarrow P'$, then $P\;|\; Q \Rightarrow P'\;|\; Q$
• if $P \Rightarrow P'$, then $\nu x.P \Rightarrow \nu x.P'$

To be clear, the following is not an allowed reduction, because it occurs under an input prefix:

$for(v \leftarrow u).(x!z \;|\; for(y \leftarrow x).P) \nRightarrow for(v \leftarrow u).P\{z/y\}$

## Yoshida’s combinators

• grammar
• atom: $Q ::= 0 \;|\; m(a,b) \;|\; d(a,b,c) \;|\; k(a) \;|\; fw(a,b) \;|\; br(a,b) \;|\; bl(a,b) \;|\; s(a,b,c)$
• process: $P ::= Q \;|\; \nu a.P \;|\; P|P \;|\; !P$
• structural congruence
• free names
• $FN(0) = \{\}$
• $FN(k(a)) = \{a\}$
• $FN(m(a,b)) = FN(fw(a,b)) = FN(br(a,b)) = FN(bl(a,b)) = \{a, b\}$
• $FN(d(a,b,c)) = FN(s(a,b,c)) = \{a,b,c\}$
• $FN(\nu a.P) = FN(P) - \{x\}$
• $FN(P|Q) = FN(P)\cup FN(Q)$
• $FN(!P) = FN(P)$
• 0 and | form a commutative monoid
• $P|0 \equiv P$
• $P|Q \equiv Q|P$
• $(P|Q)|R \equiv P|(Q|R)$
• replication
• $!P \equiv P\;|\;!P$
• new names
• $\nu x.\nu x.P \equiv \nu x.P$
• $\nu x.\nu y.P \equiv \nu y.\nu x.P$
• $\nu x.(P|Q) \equiv (\nu x.P)|Q$ when $x$ is not free in $Q$
• rewrite rules
• $d(a,b,c) | m(a,x) \Rightarrow m(b,x) | m(c,x)$ (fanout)
• $k(a) | m(a,x) \Rightarrow 0$ (drop)
• $fw(a,b) | m(a,x) \Rightarrow m(b,x)$ (forward)
• $br(a,b) | m(a,x) \Rightarrow fw(b,x)$ (branch right)
• $bl(a,b) | m(a,x) \Rightarrow fw(x,b)$ (branch left)
• $s(a,b,c) | m(a,x) \Rightarrow fw(b,c)$ (synchronize)
• $!P \Rightarrow P|!P$
• if $P \Rightarrow P'$ then for any term context $C$, $C[P] \Rightarrow C[P']$.

## Gph-theory of RHO Yoshida combinators

• objects
• $N, T$
• morphisms
• $0\colon 1 \to T$
• $k\colon T \to T$
• $m\colon N \times T \to T$
• $fw,br,bl\colon N^2 \to T$
• $d,s\colon N^3 \to T$
• $|\colon T^2 \to T$
• $*\colon N \to T$
• $\&\colon T \to N$
• equations
• 0 and | form a commutative monoid
• $P|0 = P$
• $P|Q = Q|P$
• $(P|Q)|R = P|(Q|R)$
• $*\&P = P$
• edges
• $\delta\colon d(a,b,c) | m(a,P) \Rightarrow m(b,P) | m(c,P)$
• $\kappa\colon k(a) | m(a,P) \Rightarrow 0$
• $\phi\colon fw(a,b) | m(a,P) \Rightarrow m(b,P)$
• $\rho\colon br(a,b) | m(a,P) \Rightarrow fw(b,\&P)$
• $\lambda\colon bl(a,b) | m(a,P) \Rightarrow fw(\&P,b)$
• $\sigma\colon s(a,b,c) | m(a,P) \Rightarrow fw(b,c)$

## Gph-theory of RHO $\pi$-like combinators

• objects
• $T$
• morphisms
• $C, 0, |, for, !, \&, *, S, K, I\colon 1 \to T$
• $(-\; -)\colon T\times T \to T$
• equations
• 0 and | form a commutative monoid
• $((|\; 0)\; P) = P$
• $((|\; ((|\; P)\; Q))\; R)=((|\; P)\; ((|\; Q)\; R)$
• $((|\; P)\; Q) = ((|\; Q)\; P)$
• edges
• $\sigma\colon(((S\; P)\; Q)\; R) \Rightarrow ((P\; R)\; (Q\; R))$
• $\kappa\colon((K\; P)\; Q) \Rightarrow P$
• $\iota\colon(I\; P) \Rightarrow P$
• $\xi\colon((|\; C)\; ((|\; ((for\; (\&\; P))\; Q))\; ((!\; (\&\; P))\; R))) \Rightarrow ((|\; C)\; (Q\; (\&\; R)))$
• $\epsilon\colon((|\; C)\;(*\; (\&\; P))) \Rightarrow ((|\; C)\; P)$
Posted at June 10, 2017 1:24 PM UTC

TrackBack URL for this Entry:   https://golem.ph.utexas.edu/cgi-bin/MT-3.0/dxy-tb.fcgi/2967

### Re: Eliminating Binders for Easier Operational Semantics

Clouston defined nominal Lawvere theories.

has a bogus link which should be http://cs.au.dk/~ranald/Clouston_WoLLIC.pdf.

In a paper with Radestock he introduced

Posted by: RodMcGuire on June 12, 2017 4:45 PM | Permalink | Reply to this

### Re: Eliminating Binders for Easier Operational Semantics

Posted by: Mike Stay on June 13, 2017 5:13 PM | Permalink | Reply to this

### Re: Eliminating Binders for Easier Operational Semantics

The “linear” version of SKI-calculus is BCI-calculus, whose combinators also have exactly the types of the basic operations in a symmetric closed category. Thus SKI-calculus ought to correspond to a sort of “cartesian (closed category)” (I added parentheses to distinguish it from the usual meaning of “cartesian closed category”, which is more precisely a “closed cartesian monoidal category”). Does eliminating binders from $\pi$-calculus make its categorical semantics any clearer?

Posted by: Mike Shulman on June 12, 2017 6:50 PM | Permalink | Reply to this

### Re: Eliminating Binders for Easier Operational Semantics

Does eliminating binders from $\pi$-calculus make its categorical semantics any clearer?

That’s really the point of this project that Greg and I are undertaking, but we haven’t proved exactly what categorical structure $\pi$-calculus is describing yet. Here’s what we have so far.

Given the Gph-enriched Lawvere theory of the SKI-calculus, we can mod out by edges and then consider the monoid of endomorphisms on the generating object $T$. This monoid is a cartesian closed category, corresponding to the “untyped” SKI calculus. In particular, we can represent a “pairing” combinator $P = \lambda xyz.((z\; x)\; y)$ with projections $\pi_1 = \lambda p.(p\; K)\quad and\quad \pi_2 = \lambda p.(p\; (K\; I)),$ so it’s cartesian. There’s also the obvious isomorphism between functions on pairs and functions that expect one argument and return a curried function expecting the second argument, so it’s closed with respect to the pairing operator.

If we do the same thing for the BCI combinators, then we can still encode the pairing operator (since it uses each variable only once), but we can’t encode the projections, so the resulting monoid is merely symmetric monoidal closed.

Let $G$ be the free model of the theory of some term calculus on the empty graph; suppose also that the theory has a sort $T$ for the terms of the calculus. We “oidify” the Gph-enriched monoid of endomorphisms on $T$ and get types by considering some theory of collections over the edges of $G.$ The primary example is the theory of Heyting algebras; this theory lets us describe (certain) subsets of edges, which correspond to (certain) subgraphs of $G.$ We get a new Gph-enriched category Typ whose objects are subgraphs of $G,$ whose morphisms are graph homomorphisms between them, and whose edges are graph transformations between the graph homomorphisms. Since one-hole term contexts induce graph homomorphisms, they provide a notion of a typing judgement in a typing context: given the type of the hole, we get the type of the term. Since one-hole collection constructors also induce graph homomorphisms, they provide a notion of subtyping: given a subgraph $H$ of $G,$ I can construct a larger subgraph $H'$ of $G$ around it, witnessing that $H\subset H'.$

Note that given an arbitrary term calculus, Typ may not even be monoidal, let alone cartesian closed. Those properties depend very much on the specifics of the calculus.

In applicative calculi like the $\lambda$-calculus and the SKI and BCI combinator calculi, the arrow type itself arises as a generalized “possibility” modality in the type system. Given a two-hole term context $K$ and two formulae $\phi, \psi,$ the modality $\phi\langle K\rangle \psi$ describes those terms $t$ such that there exists a term $u$ satisfying $\phi$ such that $K[t, u]$ may possibly reduce to a term satisfying $\psi.$ If we take $K[t, u] = (t\; u),$ then $\phi\langle K\rangle \psi = \phi \implies \psi.$

In $\pi$-calculus and its descendants, we can take $K[t,u] = t\;|\;u = u\;|\;t.$ When we do, we recover Caires’ “rely-guarantee” modality $\phi\langle K\rangle \psi = \phi \triangleright \psi.$ It captures a notion similar to “an object $\gamma$ such that I can trace out part of $\gamma \otimes \phi$ and get $\psi$.”

We’ve seen how to get types and judgements. Each reduction rule gives a different kind of cut rule; for example, in the SKI calculus, we get a rule

$\frac{\vdash S:(Z\implies Y \implies X)\implies (Z\implies Y)\implies Z \implies X \quad \Gamma \vdash t:Z\implies Y \implies X \quad \Delta \vdash u:Z\implies Y \quad \Phi \vdash v:Z}{\Gamma, \Delta, \Phi \vdash ((t\; v)\; (u\; v)):X}\sigma$

Composing a bunch of cut rules gives one possible way the program might execute, and the cut rules correspond to dinatural transformations between profunctors on Typ.

So I guess we’ve implicitly got a description of the kind of higer-categorical structure behind the RHO combinators I describe in the post, but I haven’t yet worked out all the details of that and checked whether it corresponds to something that’s already known. I can say that Typ for the RHO combinators has a tensor product given by |. Despite the fact that we can embed $\lambda$-calculus into this calculus, it’s not clear to me that it has cartesian products per se, since the embedding of a term has to be located at a name; my current guess is that names are related to dependent types.

So we’re close, and I’m working on writing all this stuff up in a paper as well as implementing a type checker for Rholang, the language for writing contracts on RChain.

Posted by: Mike Stay on June 13, 2017 7:57 PM | Permalink | Reply to this

### Re: Eliminating Binders for Easier Operational Semantics

Given the Gph-enriched Lawvere theory of the SKI-calculus, we can mod out by edges and then consider the monoid of endomorphisms on the generating object TT. This monoid is a cartesian closed category, corresponding to the untyped SKI calculus.

I’m a bit surprised by this – I would have expected something about Curry’s equations (see e.g. Selinger’s tutorial).

Posted by: Sam Staton on June 14, 2017 9:26 AM | Permalink | Reply to this

### Re: Eliminating Binders for Easier Operational Semantics

Excellent, thank you! I’d asked around before for equations that ought to hold between combinators, but nobody I talked to knew anything about it.

When we mod out by edges in this Gph-enriched theory approach, any two terms connected by a reduction end up in the same equivalence class. If two terms $t, u$ are declared equal by one of Curry’s axioms, then for all $x,$ $t x = u x$ in the monoid, but $t$ might not be equal to $u.$

It comes down to a question of extensionality; if we want extensionality to hold, we can add Curry’s equations to the theory; otherwise we can use the presentation I gave.

Posted by: Mike Stay on June 14, 2017 5:36 PM | Permalink | Reply to this

### Re: Eliminating Binders for Easier Operational Semantics

Wouldn’t it be easier to start out by considering a typed calculus in the first place, rather than trying to conjure types out of thin air in an untyped calculus?

Posted by: Mike Shulman on June 13, 2017 10:24 PM | Permalink | Reply to this

### Re: Eliminating Binders for Easier Operational Semantics

I agree that it would be easier, but there are a couple other considerations:

First, nearly all the existing programming languages we care about modeling, like C and Java, do not have spatial-behavioral type systems. Many programming languages in wide use, like JavaScript and Python, do not have static types at all. It is only recently that $\pi$-calculus got a type system more complicated than Milner’s simple sorting discipline on the channels.

We also want to reason about different ways of composing term calculi. Greg was working with Satnam Singh many years ago at Microsoft and they built a small RHO-calculus engine in hardware. It was so small that they could tile the chip with copies of the engine. Greg had the idea that they might manage which processes were running on which engines using a different calculus, the ambient calculus. The problem was that there was no clear way to reason about the combination of their type systems. By deriving the type system from the operational semantics, we can combine calculi in several different ways—whether by taking the coproduct of the theories, or by adding “switch language” instructions (rather like a tag in HTML), or any other way---and get a type system tailored to the result.

Posted by: Mike Stay on June 14, 2017 8:00 PM | Permalink | Reply to this

### Re: Eliminating Binders for Easier Operational Semantics

So I guess there are also other goals than understanding the categorical semantics of $\pi$-calculus? (-:

If it were me, I would still approach the problem by starting with a typed theory and then later specializing to a un(i)typed one.

Posted by: Mike Shulman on June 15, 2017 12:15 AM | Permalink | Reply to this

Post a New Comment