## June 7, 2017

### Enriched Lawvere Theories for Operational Semantics

#### Posted by John Baez

guest post by Mike Stay

Programs are an expression of programmer intent. We want the computer to do something for us, so we need to tell it what to do. We make mistakes, though, so we want to be able to check somehow that the program will do what we want. The idea of semantics for a programming language is that we assign some meaning to programs in such a way that we can reason about the behavior of a program. There are two main approaches to this: denotational semantics and operational semantics. I’ll discuss both below, but the post will focus for the most part on operational semantics.

There’s a long history of using 2-categories and related structures for term rewriting and operational semantics, but Greg Meredith and I are particularly fond of an approach using multisorted Lawvere theories enriched over the category of reflexive directed graphs, which we call Gph. Such enriched Lawvere theories are equal in power to, for instance, Sassone and Sobociński’s reactive systems, but in my opinion they have a cleaner categorical presentation. We wrote a paper on them:

Here I’ll just sketch the basic ideas.

# Denotational Semantics

Denotational semantics works really well for functional programming languages. The actual process of computation is largely ignored in denotational semantics; it doesn’t matter how you compute the function, just what the function is. John Baez’s seminar eleven years ago explored Lambek and Scott’s approach to the denotational semantics of lambda calculus, and there was extensive discussion on this blog. Lambek and Scott constructed a cartesian closed category of types and $\alpha$-$\beta$-$\eta$-equivalence classes of terms with one free variable, and then assigned meaning to the types and terms with a cartesian closed functor into Set.

Denotational semantics gets a lot harder once we move away from functional programming languages. Modern programs run on multiple computers at the same time and each computer has several cores. The computers are connected by networks that can mix up the order in which messages are received. A program may run perfectly by itself, but deadlock when you run it in parallel with another copy of itself. The notion of “composition” begins to change, too: we run programs in parallel with each other and let them interact by passing messages back and forth, not simply by feeding the output of one function into the input of another. All of this makes it hard to think of such programs as functions.

# Operational Semantics

Operational semantics is the other end of the spectrum, concerned with the rules by which the state of a computer changes. Whereas denotational semantics is inspired by Church and the lambda calculus, operational semantics is inspired by Turing and his machines. All of computational complexity lives here (e.g. $P \stackrel{?}{=} NP$).

To talk about the operational semantics of a programming language, there are five things we need to define.

First, we have to describe the layout of the state of the computer. For each kind of data that goes into a description of the state, we have a sort. If we’re using a programming language like lambda calculus, we have a sort for variables and a sort for terms, and the term is the entire state of the computer. If we’re using a Turing machine, there are more parts: the tape, the state transition table, the current state, and the position of the read/write head on the tape. If we’re using a modern language like JavaScript, the state is very complex: there are a couple of stacks, the heap, the lexical environment, the this binding, and more.

Second, we have to build up the state itself using term constructors. For example, in lambda calculus, we start with variables and use abstraction and application to build up a specific term.

Third, we say what rearrangements of the state we’re going to ignore; this is called structural congruence. In lambda calculus, we say that two terms are the same if they only differ in the choice of bound variables. In pi calculus, it doesn’t matter in what order we list the processes that are all running at the same time.

Fourth, we give reduction rules describing how the state is allowed to change. In lambda calculus, the state only changes via $\beta$-reduction, substituting the argument of a function for the bound variable. In a Turing machine, each state leads to one of five others (change the bit to 0 or 1, then move left or right; or halt). In pi calculus, there may be more than one transition possible out of a particular state: if a process is listening on a channel and there are two messages, then either message may be processed first. Computational complexity theory is all about how many steps it takes to compute a result, so we do not have equations between sequences of rewrites.

Finally, the reduction rules themselves may only apply in certain contexts; for example, in all modern programming languages based on the lambda calculus, no reductions happen under an abstraction. That is, even if a term $t$ reduces to $t'$, it is never the case that $\lambda x.t$ reduces to $\lambda x.t'$. The resulting normal form is called “weak head normal form”.

Here’s an example from Boudol’s paper “The $\pi$-calculus in direct style.” There are two sorts: $x$ or $z$ for variables and $L$ or $N$ for terms. The first line, labeled “syntax,” defines four term constructors. There are equations for structural congruence, and there are two reduction rules followed by the contexts in which the rules apply:

# Category theory

We’d like to formalize this using category theory. For our first attempt, we capture almost all of this information in a multisorted Gph-enriched Lawvere theory: we have a generating object for each sort, a generating morphism for each term constructor, an equation between morphisms encoding structural congruence, and an edge for each rewrite.

We interpret the theory in Gph. Sorts map to graphs, term constructors to graph homomorphisms, equations to equations, and rewrites map to things I call “graph transformations”, which are the obvious generalization of a natural transformation to the category of graphs: a graph transformation between two graph homomorphisms $\alpha:F\Rightarrow G$ assigns to each vertex $v$ an edge $\alpha_v:Fv \to Gv$. There’s nothing about a commuting square in the definition because it doesn’t even parse: we can’t compose edges to get a new edge.

This initial approach doesn’t quite work because of the way reduction contexts are usually presented. The reduction rules assume that we have a “global view” of the term being reduced, but the category theory insists on a “local view”. By “local” I mean that we can always whisker a reduction with a term constructor: if $K$ is an endomorphism on a graph, then given any edge $e:v\to v'$, there’s necessarily an edge $K e:K v \to K v'.$ These two requirements conflict: to model reduction to weak head normal form, if we have a reduction $t \to t',$ we don’t want a reduction $\lambda x.t \to \lambda x.t'.$

One solution is to introduce “context constructors”, unary morphisms for marking reduction contexts. These contexts become part of the rewrite rules and the structural congruence; for example, taking $C$ to be the context constructor for weak head normal form, we add a structural congruence rule that says that to reduce an application of one term to another, we have to reduce the term on the left first:

$C(T\; U) \equiv (C T\; U).$

We also modify the reduction reduction rule to involve the context constructors. Here’s $\beta$ reduction when reducing to weak head normal form:

$\beta: (C(\lambda x.T)\; U) \Rightarrow C T\{U/x\}.$

Now $\beta$ reduction can’t happen just anywhere; it can only happen in the presence of the “catalyst” $C$.

With context constructors, we can capture all of the information about operational semantics using a multisorted Gph-enriched Lawvere theory: we have a generating object for each sort, a generating morphism for each term constructor and for each context constructor, equations between morphisms encoding structural congruence and context propagation, and an edge for each rewrite in its appropriate context.

# Connecting to Lambek/Scott

The SKI combinator calculus is a formal system invented by Schönfinkel and Curry. It allows for universal computation, and expressions in this calculus can easily be translated into the lambda calculus, but it’s simpler because it doesn’t include variables. The SK calculus is a fragment of the SK calculus that is still computationally universal.

We can recover the Lambek/Scott-style denotational semantics of the SK calculus (see the appendix) by taking the Gph-theory, modding out by the edges, and taking the monoid of endomorphisms on the generating object. The monoid is the cartesian closed category with only the “untyped” type. Using Melliès and Zeilberger’s notion of a functor as a type refinement system, we “-oidify” the monoid into a category of types and equivalence classes of terms.

However, modding out by edges utterly destroys the semantics of concurrent languages, and composition of endomorphisms doesn’t line up particularly well with composition of processes, so neither of those operations are desirable in general. That doesn’t stop us from considering Gph-enriched functors as type refinement systems, though.

Let $G$ be the free model of a theory on the empty graph. Our plan for future work is to show how different notions of a collection of edges of $G$ give rise to different kinds of logics. For example, if we take subsets of the edges of $G,$ we get subgraphs of $G,$ which form a Heyting algebra. On the other hand, if we consider sets of lists of composable edges in $G,$ we get quantale semantics for linear logic. Specific collections will be the types in the type system, and proofs should be graph homomorphisms mapped over the collection. Edges will feature in proof normalization.

At the end, we should have a system where given a formal semantics for a language, we algorithmically derive a type system tailored to the language. We should also get a nice Curry-Howard style approach to operational semantics that even denotational semantics people won’t turn up their noses at!

# Appendix

## Gph

For our purposes, a graph is a set $E$ of edges and a set $V$ of vertices together with three functions $s\colon E \to V$ for the source of the edge, $t\colon E \to V$ for the target, and $a\colon V \to E$ such that $s\circ a = V$ and $t \circ a = V$—that is, $a$ assigns a chosen self-loop to each vertex. A graph homomorphism maps vertices to vertices and edges to edges such that sources, targets and chosen self-loops are preserved. Gph is the category of graphs and graph homomorphisms. Gph has finite products: the terminal graph is the graph with one vertex and one loop, while the product of two graphs $(E , V , s, t, a) \times (E' , V' , s' , t' , a')$ is $(E \times E', V \times V', s \times s', t\times t', a \times a').$

Gph is a topos; the subobject classifier has two vertices $t, f$ and five edges: the two self-loops, an edge from $t$ to $f,$ an edge from $f$ to $t,$ and an extra self-loop on $t$. Any edge in a subgraph maps to the chosen self-loop on $t,$ while an edge not in the subgraph maps to one of the other four edges depending on whether the source and target vertex are included or not.

A Gph-enriched category consists of

• a set of objects;
• for each pair of objects $x, y,$ a graph $\hom(x,y);$
• for each triple of objects $x, y, z,$ a composition graph homomorphism $\quad \circ\colon \hom(y, z) \times \hom(x, y) \to \hom(x, z);$ and
• for each object $x,$ a vertex of $\hom(x, x),$ the identity on $x,$

such that composition is associative, and composition and the identity obey the unit laws. A Gph-enriched category has finite products if the underlying category does.

Any category is trivially Gph-enrichable by treating the elements of the hom sets as vertices and adjoining a self loop to each vertex. The category Gph is nontrivially Gph-enriched: Gph is a topos, and therefore cartesian closed, and therefore enriched over itself. Given two graph homomorphisms $F, F'\colon (E, V, s, t, a) \to (E', V', s', t', a'),$ a graph transformation assigns to each vertex $v$ in $V$ an edge $e'$ in $E'$ such that $s'(e') = F(v)$ and $t'(e') = F'(v).$ Given any two graphs $G$ and $G',$ there is an exponential graph $G'^G$ whose vertices are graph homomorphisms between them and whose edges are graph transformations. There is a natural isomorphism between the graphs $C^{A\times B}$ and $(C^B)^A.$

A Gph-enriched functor between two Gph-enriched categories $C, D$ is a functor between the underlying categories such that the graph structure on each hom set is preserved, i.e. the functions between hom sets are graph homomorphisms between the hom graphs.

Let $S$ be a finite set, FinSet be a skeleton of the category of finite sets and functions between them, and $FinSet/S$ be the category of functions into $S$ and commuting triangles. A multisorted Gph-enriched Lawvere theory, hereafter Gph-theory is a Gph-enriched category with finite products Th equipped with a finite set $S$ of sorts and a Gph-enriched functor $\theta\colon FinSet^{op}/S \to Th$ that preserves products strictly. Any Gph-theory has an underlying multisorted Lawvere theory given by forgetting the edges of each hom graph.

A model of a Gph-theory Th is a Gph-enriched functor from Th to Gph that preserves products up to natural isomorphism. A homomorphism of models is a braided Gph-enriched natural transformation between the functors. Let FPGphCat be the 2-category of small Gph-enriched categories with finite products, product-preserving Gph-functors, and braided Gph-natural transformations. The forgetful functor $U\colon FPGphCat[\Th, \Gph] \to \Gph$ that picks out the underlying graph of a model has a left adjoint that picks out the free model on a graph.

Gph-enriched categories are part of a spectrum of 2-category-like structures. A strict 2-category is a category enriched over Cat with its usual product. Sesquicategories are categories enriched over Cat with the “funny” tensor product; a sesquicategory can be thought of as a generalized 2-category where the interchange law does not always hold. A Gph-enriched category can be thought of as a generalized sesquicategory where 2-morphisms (now edges) cannot always be composed. Any strict 2-category has an underlying sesquicategory, and any sesquicategory has an underlying Gph-enriched category; these forgetful functors have left adjoints.

## Some examples

### The SK calculus

Here’s a presentation of the Gph-theory for the SK calculus:

• objects
• $T$
• morphisms
• $S\colon 1 \to T$
• $K\colon 1 \to T$
• $(-\; -)\colon T \times T \to T$
• equations
• none
• edges
• $\sigma\colon (((S\; x)\; y)\; z) \Rightarrow ((x\; z)\; (y\; z))$
• $\kappa\colon ((K\; x)\; z) \Rightarrow x$

The free model of this theory on the empty graph has a vertex for every term in the SK calculus and an edge for every reduction.

### The SK calculus with the weak head normal form reduction strategy

Here’s the theory above modified to account for weak head normal form:

• objects
• $T$
• morphisms
• $S\colon 1 \to T$
• $K\colon 1 \to T$
• $(-\; -)\colon T \times T \to T$
• $R\colon T \to T$
• equations
• $R(x\; y) = (Rx\; y)$
• edges
• $\sigma\colon (((RS\; x)\; y)\; z) \Rightarrow ((Rx\; z)\; (y\; z))$
• $\kappa\colon ((RK\; x)\; z) \Rightarrow Rx$

If $M$ is an $SK$ term with no uses of $R$ and $M'$ is its weak head normal form, then $RM$ reduces to $RM'.$

Once we have $R$, we can think about other things to do with it. The rewrites treat the morphism $R$ as a linear resource. If we treat $R$ as “fuel” rather than “catalyst”—that is, if we modify $\sigma$ and $\kappa$ so that $R$ only appears on the left-hand side—then the reduction of $R^n M$ counts how many steps the computation takes; this is similar to Ethereum’s notion of “gasoline.”

Posted at June 7, 2017 8:51 AM UTC

TrackBack URL for this Entry:   https://golem.ph.utexas.edu/cgi-bin/MT-3.0/dxy-tb.fcgi/2966

### Re: Enriched Lawvere Theories for Operational Semantics

Nice! I wonder how easy all this material is to understand, for people who haven’t been talking to you for years. The basic idea feels very natural and right to me, but I suspect that’s because of our long interaction.

By the way, you wrote:

A graph homomorphism maps vertices to vertices and edges to edges such that the source and target are preserved

but since your graphs have a chosen self-loop for each vertex, your graph homomorphisms had better preserve this chosen self-loop too, or you’re in for a world of trouble. Should I fix this sentence?

Also by the way, what do you use these chosen self-loops for?

I guess one thing is that you need them for each graph homomorphism $F : G \to G'$ to have an identity graph transformation $1_F : F \Rightarrow F$.

Posted by: John Baez on June 7, 2017 10:18 AM | Permalink | Reply to this

### Re: Enriched Lawvere Theories for Operational Semantics

It makes sense to me, at least. Nice!

I would guess that the self-loops are in order that $Gph$-enriched categories have whiskering. In an application $M N$, for instance, you want to be able to reduce $M\to M'$ to obtain $M N \to M' N$, but technically the enrichment over graphs only maps pairs of edges in homs to edges, so you can only say that given a reduction $M \to M'$ and $N \to N'$ you can reduce $M N \to M' N'$.

Posted by: Mike Shulman on June 7, 2017 3:07 PM | Permalink | Reply to this

### Re: Enriched Lawvere Theories for Operational Semantics

I am guessing the distinguished self-loops are there for idle transitions. Without self-loops the product of $a\to b$ with itself would have four vertices and only 1 arrow (from $(a,a)$ to $(b,b)$). With self-loops the product would have 5 arrows. I think of reflexive graphs as poor man’s simplicial sets.

Posted by: Eugene on June 7, 2017 5:19 PM | Permalink | Reply to this

### Re: Enriched Lawvere Theories for Operational Semantics

John:

Also by the way, what do you use these chosen self-loops for?

A “functor”* between reflexive graphs should preserve identities but not necessarily reflect them. One cool thing a “functor” can do is to eat a non-id arc by mapping it to an identity. The free category on a reflexive graph has morphisms that are paths of non-identity arcs. The free category of the image of a functor will not have any path elements corresponding to those that have been eaten.

I learned of this from R.F.C. Walters’ 1989 paper A note on context-free languages where a grammar may contain empty transitions but these are eaten in the functor that is the generated language.

I don’t understand enough of Mike’s approach to tell if he uses uses arc eating.

*A reflexive graph is a structure intermediate between quivers and categories. I think it reasonable to call the expected morphisms between any types of graphs “functors” if they translate to functors on the free categories of the graphs.

Posted by: RodMcGuire on June 7, 2017 5:19 PM | Permalink | Reply to this

### Re: Enriched Lawvere Theories for Operational Semantics

Yes, self loops need to be preserved; sorry for the omission. First, whenever we have a term constructor like $S,K: 1 \to T$, they’ll pick out self loops anyway because the terminal graph has a self loop. Second, they’re needed so that the exponential of two graphs is the graph of graph homomorphisms and graph transformations. Without self loops, the cartesian product differs from the “box” product, and we get two different notions of closedness. The one corresponding to the cartesian product when there aren’t self loops says $H^G$ has arbitrary functions from the vertices of $G$ to the vertices of $H$ as vertices and graph homomorphisms $A\times G \to H$ as edges, where $A$ is the “arrow graph”.

Posted by: Mike Stay on June 7, 2017 5:35 PM | Permalink | Reply to this

### Re: Enriched Lawvere Theories for Operational Semantics

Another little question: why do you talk about the “SK calculus” instead of the full-fledged SKI calculus? Is the SK calculus computationally universal?

Posted by: John Baez on June 7, 2017 10:20 AM | Permalink | Reply to this

### Re: Enriched Lawvere Theories for Operational Semantics

Yes, the SK calculus is too computationally universal. You can, for instance, reconstruct $I$ as $SKK$: $SKKx = Kx(Kx) = x$.

Posted by: Ingo Blechschmidt on June 7, 2017 2:12 PM | Permalink | Reply to this

### Re: Enriched Lawvere Theories for Operational Semantics

That’s cool. Mike: is this why you used just the SK calculus—just to save time, basically?

Those who aren’t experts on everything you’re doing may 1) not know the SK calculus (which I believe is less famous than the SKI calculus) is computationally universal, 2) think there’s something tricky about the I combinator that your formalism can’t handle.

Posted by: John Baez on June 9, 2017 10:41 AM | Permalink | Reply to this

### Re: Enriched Lawvere Theories for Operational Semantics

That’s cool. Mike: is this why you used just the SK calculus—just to save time, basically?

Yes.

Those who aren’t experts on everything you’re doing may 1) not know the SK calculus (which I believe is less famous than the SKI calculus) is computationally universal, 2) think there’s something tricky about the I combinator that your formalism can’t handle.

It honestly didn’t occur to me that anyone who knew about the SKI calculus wouldn’t know that the I combinator can be encoded as SKK, sorry.

Posted by: Mike Stay on June 9, 2017 3:34 PM | Permalink | Reply to this

### Re: Enriched Lawvere Theories for Operational Semantics

Formatting. (markdown wants some different spaces is all, I think)

Here’s a presentation of the Gph-theory for the SK calculus:

• objects
• $T$
• morphisms
• $S \colon 1 \to T$
• $K \colon 1 \to T$
• $(-\; -)\colon T \times T \to T$
• equations
• none
• edges
• $\sigma\colon (((S\; x)\; y)\; z) \Rightarrow ((x\; z)\; (y\; z))$
• $\kappa\colon ((K\; x)\; z) \Rightarrow x$

The free model of this theory on the empty graph has a vertex for every term in the SK calculus and an edge for every reduction.

source: (¢ instead of thalers, because the Café doesn’t like finding math in pre tags…)

Here's a presentation of the Gph-theory for the SK calculus:

* objects
* ¢T¢
* morphisms
* ¢S \colon 1 \to T ¢
* ¢K \colon 1 \to T ¢
* ¢(-\; -)\colon T \times T \to T¢
* equations
* none
* edges
* ¢\sigma\colon (((S\; x)\; y)\; z) \Rightarrow ((x\; z)\; (y\; z))¢
* ¢\kappa\colon ((K\; x)\; z) \Rightarrow x¢

Posted by: Jesse C. McKeown on June 7, 2017 5:49 PM | Permalink | Reply to this

### Re: Enriched Lawvere Theories for Operational Semantics

That’s my fault. It was in the text when I composed it, but it either got stripped out when I copy-pasted it to email, or GMail did it.

Posted by: Mike Stay on June 8, 2017 12:43 AM | Permalink | Reply to this

### Re: Enriched Lawvere Theories for Operational Semantics

I shouldn’t call it “fault” at all; I was only able to manage the above because the Café insists on showing you what it thinks you meant for comments; and I would guess there isn’t quite the same feedback loop for main entries (esp. for Guest Posts). (it doesn’t quite help that Café markdown isn’t quite daringfireball markdown, either)

Posted by: Jesse C. McKeown on June 9, 2017 12:07 AM | Permalink | Reply to this

### Re: Enriched Lawvere Theories for Operational Semantics

I think I fixed the formatting, indenting those bullet points that need indenting.

Posted by: John Baez on June 9, 2017 10:38 AM | Permalink | Reply to this

### Re: Enriched Lawvere Theories for Operational Semantics

Does the choice of a cohesive topos (nLab) to enrich in allow any useful constructions? You’d have some ‘modalities’ (Def 2.3) to work with.

Posted by: David Corfield on June 10, 2017 1:31 PM | Permalink | Reply to this

### Re: Enriched Lawvere Theories for Operational Semantics

That’s interesting! I hadn’t known about those, thanks. We’ve considered generalizing the collection, but we haven’t yet considered generalizing the category we enrich over. We are certainly interested in modalities in the type system, particularly “it is possible that this term reduces to that one” and “it is necessary that this term reduces to that one”.

Posted by: Mike Stay on June 10, 2017 7:23 PM | Permalink | Reply to this

### Re: Enriched Lawvere Theories for Operational Semantics

The adjoint quadruple is $components \vdash discrete \vdash underlying \vdash codiscrete$. So

Any category is trivially Gph-enrichable by treating the elements of the hom sets as vertices and adjoining a self loop to each vertex

is arising from discrete. Any call for using codiscrete?

Any category is trivially Gph-enrichable by treating the elements of the hom sets as vertices and adjoining a single edge for every ordered pair of vertices.

Posted by: David Corfield on June 11, 2017 9:50 AM | Permalink | Reply to this

### Re: Enriched Lawvere Theories for Operational Semantics

Any call for using codiscrete?

Perhaps, but we haven’t come across one yet.

Posted by: Mike Stay on June 13, 2017 10:13 PM | Permalink | Reply to this

### Re: Enriched Lawvere Theories for Operational Semantics

I had some fun trying to write down the Lawvere theory for the $\pi$-calculus just having read your description of the “initial approach”. I figured that the substitution $[z/x]$ in the $\lambda$-reduction rule must treated inductively using the syntax constructors, and thus stumbled upon exactly the issue with $\lambda$-abstraction you addressed with the context constructor, which was a great motivation! I was also unsure how to deal with the “$x\neq z$” condition in the structural congruence rule, and I was unclear on what some of the “$\dots$” meant in the $def$ reduction rule. But I think I learned something about operational semantics, so thanks, this was great!

Posted by: Tim Campion on June 11, 2017 2:10 AM | Permalink | Reply to this

### Re: Enriched Lawvere Theories for Operational Semantics

Thanks!

Posted by: Mike Stay on June 11, 2017 10:47 PM | Permalink | Reply to this

### Re: Enriched Lawvere Theories for Operational Semantics

Have you found that this approach helps in any way to prove the usual correspondences you might have between an operational and denotational semantics (soundness, adequacy, full abstraction)?

Posted by: Max S. New on June 14, 2017 4:36 PM | Permalink | Reply to this

### Re: Enriched Lawvere Theories for Operational Semantics

Yes! This was in fact one of the major advantages in the HDRA paper. Our model has specific representation for ‘syntax’ which made it much easier to reason about strong bisimilarity in a setting where this is often quite subtle. The full abstraction result in that paper, in particular, falls out almost by definition.

Posted by: L. G. Meredith on June 20, 2017 10:23 PM | Permalink | Reply to this

Post a New Comment