
June 29, 2019

Behavioral Mereology

Posted by John Baez

guest post by Toby Smithe and Bruno Gavranović

What do human beings, large corporations, biological cells, and gliders from Conway’s Game of Life have in common?

This week in the Applied Category Theory School, we attempt some first steps towards answering this question. We turn our attention to autopoiesis, the ill-understood phenomenon of self-perpetuation that characterizes life. Our starting point is Behavioral Mereology, a new take on the ancient question of parthood: how should we understand the relationship between a part and a whole? The authors Fong, Myers, and Spivak suggest that we cleave a whole into parts by observing — and grouping together — regularities in its behavior. Parthood entails behavioral constraints, which are mediated by the whole. We describe the corresponding logic of constraint passing, and the associated modalities of compatibility and ensurance. We propose that autopoiesis entails a special kind of parthood — one that ensures its compatibility with its environment — and end with a list of open questions and potential research directions.


When we look at the world, we notice that it is full of objects. Some of these objects are immediate to our senses — such as a coffee cup, a bicycle, or your pet rabbit. Other objects we only come to know after reflection, or through the use of tools — such as the atoms of the cup, the ecosystem that led to the evolution of the rabbit, or the cells of the rabbit’s body. These objects are related by a kind of compositionality that we call parthood: the parts of the bicycle include its gears and pedals; the parts of the ecosystem include its population of rabbits; in turn, the rabbits are composed of organs and thence cells. Indeed, all of these objects are tautologically parts of the whole that we call ‘the universe’.

Behavioral Mereology seeks to formalize the intuitions behind these ascriptions of parthood. Its starting point is the observation that parthood captures a kind of behavioral coherence: “when you pull on a part, the rest comes with”. That is to say, being a part entails a restriction on possible behaviors. For instance, consider a cup together with its contents: as long as the two parts can be considered together as a whole, the possible movements of the contents are constrained by those of the cup. In general, the behavior of a part is not only constrained by its corresponding whole, but also by the parts around it. For instance, the behavior of the rabbit’s brain constrains the behavior of its muscles; and because the muscles change the relation of the rabbit to its environment, and the environment constrains the rabbit’s senses, we can say transitively that the rabbit’s muscle-behavior constrains its sense-behavior. Consequently, we can think of the whole ‘rabbit’ system as providing a context for the passage of constraints between parts of the rabbit, and we will see how this gives rise to a lattice structure on the corresponding category of parts.

These ideas sit at the confluence of two traditions of applied category theory. On one side, following Lawvere, we can conceptualize the aforementioned constraint passing as a formal change of context, via the pullback functor; precisely put, constraint passing is modelled by pulling a predicate back along the epi representing the parthood relation, from which we obtain two new adjoint inter-modalities, called compatibility and ensurance. On the other side of the confluence lies a behavioral approach to the categorical modelling of open and interacting systems. Let’s see how these two parts of applied category theory interact. We’ll start with the behavioral approach, which supplies the objects for the logic of constraint passing.

A behavioral perspective on parts and their interactions

In our everyday lives, we do not need to know the internal workings or detailed structure of objects in order to distinguish them. Instead, we perform a kind of informal ‘black-boxing’, and declare that “if it looks like a duck and quacks like a duck, then it probably is a duck.” This is to say, we distinguish objects by observational equivalence, calling two systems the same when they produce the same observations under our measurements; in the field of coalgebra, this idea gives rise to the concept of bisimulation.

For our purposes, it is this idea that associates parts with epimorphisms. To see how this works, let us make some definitions. First, we identify a system $S$ with its behavior type $B_S$; you can think of this as the set of all its possible behaviors. Because we take an abstract observational approach to systems, the internal structure of a behavior type is of secondary importance — but it might help to give a couple of toy examples.

  • Consider a bicycle. The bicycle pedals and wheels might be moving with real-valued speeds $p$ and $w$, related by some multiplier $r$ via the gears. Then we could give a behavior type as $B_{Bicycle} \coloneqq \{(p, w) : \mathbb{R} \times \mathbb{R} \mid w \geq rp\}$.

  • Suppose our rabbit came from an ecosystem of foxes and rabbits. Their population sizes at time $t \in \mathbb{N}$ are given by $f_t$ and $r_t$ accordingly. The populations at time $t$ determine those at time $t+1$ according to a recurrence relation, $R_t(f, r)$, that models how the two populations interact (rabbits and foxes are born, foxes kill rabbits, and so on). This gives us a behavior type $B_{Eco} \coloneqq \{(f, r) : \mathbb{N} \to \mathbb{R} \times \mathbb{R} \mid \forall t . R_t(f, r)\}$.
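To make the idea of a behavior type concrete, here is a minimal Python sketch of the bicycle example. We discretize the speeds and fix a hypothetical gear multiplier of 2; the names and bounds are illustrative choices, not part of the formalism:

```python
# A behavior type as a plain set: pairs (p, w) of pedal and wheel speed.
# We discretize to small integers purely to keep the model finite.
R = 2  # hypothetical gear multiplier

B_bicycle = {(p, w) for p in range(3) for w in range(7) if w >= R * p}

# Every element is a possible behavior of the whole bicycle,
# so each pair satisfies the defining constraint w >= R*p.
assert all(w >= R * p for (p, w) in B_bicycle)
```

Nothing here depends on the choice of bounds; any finite discretization gives a model on which the later constructions can be computed directly.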

We also identify the parts of a system with their behavior types: for us, there is nothing more to a system or part than its behavior. (We will occasionally refer both to $P$ and its behavior $B_P$: this distinction is purely for intuition and has no formal meaning.)

In this context, we can formalize our earlier assertion that “being a part entails a restriction on possible behaviors”, noting that every behavior of a part $P$ arises from some behavior of a system $S$. For instance, every behavior of the heart of your pet rabbit must come from some behavior of the entire rabbit, although two different rabbit-behaviors may involve the same heart-behavior (more on this ‘quotienting’ later!).

We emphasize that the part $P$ is only a part in the context of the larger system $S$: when severed from the system $S$, $P$ might behave in ways unrelated to $S$. For instance, every behavior of a baby rabbit before birth arises from a behavior of its mother; but after it is born, it gains independence.

Let’s make these ideas precise:

Definition 1: The behavior type of a part $P$ of a system $S$ is a regular epimorphism $\big|_P : B_S \twoheadrightarrow B_P$. $B_P$ therefore corresponds to a quotient of $B_S$.

The parts of $S$ form a category, its category of parts. Let $Quot(\mathcal{C})$ be the subcategory of quotients in our ambient category $\mathcal{C}$. Then the category of parts is its coslice by $B_S$, $B_S / Quot(\mathcal{C})$. Explicitly, it has parts as objects and as morphisms the commuting triangles in $\mathcal{C}$.

(Note that $\mathcal{C}$ really is very ‘ambient’ for us! We have deliberately avoided requiring much structure. You could think of it as a topos, or at least as being locally Cartesian closed.)

Amongst many others, our toy examples have parts such as

  • the set of possible pedal speeds: $B_{Pedal} \coloneqq \{p : \mathbb{R} \mid \exists w . (p, w) \in B_{Bicycle}\}$;

  • the set of possible rabbit population sizes at time $t$: $B_{Rabbit_t} \coloneqq \{x : \mathbb{R} \mid \exists (f, r) \in B_{Eco} . r_t = x\}$.

Thinking behaviorally, we can consider parts as measurement devices by which we make observations. Every behavior of a part $P$ arises from at least one behavior of the whole system $S$, but two $S$-behaviors may be indistinguishable when we look only at $P$: those $S$-behaviors thus form an equivalence class. For example, consider $S$ to be one of our rabbits, and $P$ its heart. By observing $P$, we may notice that it has a heightened pulse, but this doesn’t disambiguate between the rabbit being in a state of exertion or of anxiety, for instance: those behaviors fall into an equivalence class.
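In a finite toy model we can see the restriction epi and these equivalence classes directly. The sketch below (a hypothetical discretized bicycle with an illustrative multiplier of 2) treats the pedal as a part: the restriction map is surjective onto the part’s behavior type, and its fibres are the equivalence classes of whole-system behaviors:

```python
from collections import defaultdict

R = 2
B_S = {(p, w) for p in range(3) for w in range(7) if w >= R * p}

def restrict_pedal(s):
    """The restriction map |_Pedal : B_S ->> B_Pedal (here, a projection)."""
    p, w = s
    return p

# The part's behavior type is the image of the restriction map.
B_pedal = {restrict_pedal(s) for s in B_S}

# Its fibres partition B_S: two S-behaviors are indistinguishable at the
# pedal exactly when they land in the same class.
classes = defaultdict(set)
for s in B_S:
    classes[restrict_pedal(s)].add(s)

assert sum(len(c) for c in classes.values()) == len(B_S)
```

Here $B_{Pedal}$ really is a quotient of $B_S$: the classes partition the whole behavior type.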

The lattice of parts: behavioral compatibility and determination

We may also ask what a $P$-behavior means for other parts of $S$: for instance, what having a heightened heartbeat means for the legs of the rabbit. For this purpose, Fong, Myers and Spivak introduce the formal concept of compatibility: we say that the rabbit’s exertion or fear behaviors, amongst others, are compatible with its having a heightened heartbeat. Formally, we make the following definition.

Definition 2: Suppose $P$ and $Q$ are parts of $S$, and that $a : B_P$ and $b : B_Q$ are behaviors of those parts. We say that $a$ and $b$ are compatible if there is a behavior $s : B_S$ that restricts to both $a$ and $b$. We write this as

$$\mathfrak{c}(a, b) :\equiv \exists s : S . a = s\big|_P \wedge s\big|_Q = b$$

The definition extends in the obvious way to compatibility of an $i$-indexed family of behaviors $a_i : B_{P_i}$.

We can easily obtain examples of compatible behaviors by restricting behaviors of $S$ to its parts. From our toy systems, we find:

  • a speed $p$ of the bicycle’s pedal is compatible with a speed $w$ of its wheel iff $w \geq rp$: $\mathfrak{c}(p, w) = w \geq rp$;

  • two fox and rabbit populations at time $t$ are compatible iff there is a history of the ecosystem achieving those population sizes at $t$.
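Definition 2 is easy to check mechanically on a finite model. Below is a sketch against a hypothetical discretized bicycle (multiplier 2), with the pedal and wheel as the two parts; `compatible` is a direct transcription of $\mathfrak{c}$:

```python
R = 2
B_S = {(p, w) for p in range(3) for w in range(7) if w >= R * p}

def restrict_P(s):  # pedal part
    return s[0]

def restrict_Q(s):  # wheel part
    return s[1]

def compatible(a, b):
    """c(a, b): some whole-system behavior restricts to both a and b."""
    return any(restrict_P(s) == a and restrict_Q(s) == b for s in B_S)

# Pedal speed p is compatible with wheel speed w exactly when w >= R*p:
assert compatible(1, 2)
assert not compatible(2, 1)
```

The existential over $B_S$ is exactly the quantifier in the definition; on an infinite behavior type it would of course not be computable this way.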

Using compatibility, we can observe that the category of parts of $S$ has a lattice structure. Firstly, we write $P \geq Q$ if $Q$ is a part of $P$. For any two parts $P$ and $Q$, the behavior type of their meet $P \cap Q$ is given by

$$B_{P \cap Q} \cong \frac{B_P + B_Q}{\mathfrak{c}}$$

making the following diagram a pushout:

Assuming that the ambient category $\mathcal{C}$ admits it (as would be the case in any topos or, more generally, any regular category), the behavior type of the join of $P$ and $Q$ is given by the image factorization of the map $B_S \to B_P \times B_Q$:

Concretely, we have $B_{P \cup Q} \cong \{(a, b) : B_P \times B_Q \mid \mathfrak{c}(a, b)\}$.

If every behavior of $P$ is compatible with every behavior of $Q$, then they are disjoint as parts, with empty meet:

$$\forall a : B_P .\, \forall b : B_Q .\, \mathfrak{c}(a, b) \quad\Rightarrow\quad B_{P \cap Q} = \bot$$

Our toy systems supply some simple illustrations:

  • the bicycle is just the join of the pedal and wheel: $B_{Bicycle} = B_{Pedal \cup Wheel}$;

  • the initial population sizes of foxes and rabbits are disjoint as parts: $B_{Fox_0 \cap Rabbit_0} = \bot$.

We will see that compatibility has a right adjoint, called ensurance, corresponding to a relation called determination. Informally, we can say that whilst the rabbit’s being still is compatible with its having a rapid heartbeat, strenuous exertion would ensure it. So the $Q$-behavior “strenuous exertion” is said to determine the $P$-behavior “rapid heartbeat”. Formally, we have:

Definition 3: Suppose $P$ and $Q$ are parts of $S$, and that $a : B_P$ and $b : B_Q$ are behaviors of those parts. We say that $a$ determines $b$ if every behavior $s : B_S$ of the whole system which restricts to $a$ also restricts to $b$. That is,

$$\mathfrak{d}(a, b) :\equiv \forall s : S . s\big|_P = a \Rightarrow s\big|_Q = b$$

If every behavior of a part $P$ determines a behavior of $Q$, then we say that $P$ determines $Q$. Indeed, we can say more than this: if $P$ determines $Q$, then $Q$ is a part of $P$, $P \geq Q$. Hence, determination recovers the order of the parts of a system.
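Definition 3 also transcribes directly onto a finite model (again, a hypothetical discretized bicycle with multiplier 2): a stationary wheel forces a stationary pedal, so the former determines the latter, while a fast wheel leaves the pedal undetermined:

```python
R = 2
B_S = {(p, w) for p in range(3) for w in range(7) if w >= R * p}

def determines(a, b, restrict_P, restrict_Q):
    """d(a, b): every whole behavior restricting to a also restricts to b."""
    return all(restrict_Q(s) == b for s in B_S if restrict_P(s) == a)

wheel = lambda s: s[1]
pedal = lambda s: s[0]

# A stationary wheel determines a stationary pedal (w = 0 forces p = 0)...
assert determines(0, 0, wheel, pedal)
# ...but a fast wheel is consistent with several pedal speeds.
assert not determines(6, 2, wheel, pedal)
```

Note the asymmetry with compatibility: the existential quantifier has become a universal one over the fibre of whole-system behaviors.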

The logic of constraint passing

Usually, we do not have enough information to describe a behavior completely. Instead, we pick out sets of behaviors by giving predicates, such as “has a raised pulse”. We can think of these predicates as expressing constraints on the behaviors of parts, and we are interested in how constraints are passed between the parts of a system. For instance, how does running fast constrain the behaviors of the other parts of the rabbit?

There are two principal directions to consider. Suppose we want to obtain a constraint on the whole from a constraint on a part; for instance, we want to move from a heart-constraint “raised pulse” to a body-constraint “behaves such that the heart has a raised pulse”. There is a canonical way to do this: we pull the heart-constraint back along the restriction epi witnessing its parthood.

Let’s formalize this, noting that all the structure we need will exist in any topos. First, constraints are given by predicates: characteristic maps into some ‘classifying object’ $\Omega$ of truth values, such that those behaviors for which the predicate holds are mapped to $\mathtt{true}$. In a Boolean topos — such as Set, the category of our examples — $\Omega \cong 1 + 1$ is just the two-element set $\{\mathtt{true}, \mathtt{false}\}$.

Two predicates $\phi$ (e.g., “runs fast”) and $\psi$ (e.g., “runs”) may be related by entailment, which we write as $\phi \vdash \psi$. This makes the predicates on an object $B_P$ into a poset $\text{Pred}(B_P)$, whose objects are predicates and whose morphisms represent these entailments. If our ambient category is Cartesian closed, like Set or any topos, then $\text{Pred}(B_P)$ is internalized as the object $\Omega^{B_P}$.

Then, in moving from a part $Q$ to its parent $P \geq Q$, given a constraint $\psi : B_Q \to \Omega$, we obtain a constraint on $P$-behaviors by restricting $P$-behaviors to those that satisfy $\psi$ on $Q$; that is, by pulling $\psi$ back along the restriction epi $\big|_Q : B_P \twoheadrightarrow B_Q$. This pullback effects a change of context, and in categorical logic it is also known as base change, inverse image, or substitution. It is given by precomposing $\psi$ with the restriction map $\big|_Q$, so that $P$-behaviors are substituted for $Q$-behaviors:

$$\Delta^Q_P \psi \coloneqq \psi \circ \big|_Q$$

We can see how this precomposition corresponds to a pullback by noting that maps into $\Omega$ correspond to subobjects: if $\mathcal{P}$ is the powerset functor, then we have a natural isomorphism $\Omega^{(-)} \cong \mathcal{P}(-)$. The pullback square on the right of the following commutative diagram witnesses this correspondence; the pullback square on the left commutes by definition. By the two-pullbacks lemma, the whole diagram is thus a pullback, and we observe that the composite $\Delta^Q_P \psi = \psi \circ \big|_Q$ does therefore correspond to pulling (the monomorphism corresponding to) $\psi$ back along $\big|_Q$:

If $\psi$ is our rabbit’s heart-constraint “has a raised pulse”, then $\Delta^Q_P \psi$ is the rabbit-constraint “behaves such that the heart has a raised pulse”. What about the other direction, from a part $P$ to a subpart $Q \leq P$? That is, from a constraint on the rabbit’s behavior (such as “forages for food”), what constraints can we obtain on the behaviors of its heart or legs?

Given a constraint $\phi : B_P \to \Omega$ with $P \geq Q$, we have two canonical choices, the left and right adjoints of the pullback functor $\Delta^Q_P$:

We will see that these adjoints supply weak (on the left) and strong (on the right) forms of constraint passing. Logically, they correspond to existential and universal quantification. This is easiest to see if we make a brief detour into basic predicate logic, following Awodey (§9.5).

Rather than pulling back along a restriction epi, let us consider pulling back along a projection $\pi$ of a product, writing $\Delta_\pi$ for the corresponding functor. $\Delta_\pi$ takes a predicate $\phi(x)$ in one variable to a predicate in that variable alongside a dummy variable, $\phi(x, y) \simeq \phi(x) \times \top(y)$, such that $\phi(x, y)$ is true whenever $\phi(x)$ is true on $x$, for any $y$ in the domain. We then observe that we have the adjoint triple $\exists_\pi \dashv \Delta_\pi \dashv \forall_\pi$ given by the following correspondences, as expressed in predicate logic:

$$\frac{\exists y . \psi(x, y) \vdash \phi(x)}{\psi(x, y) \vdash \Delta_\pi \phi(x)}$$


$$\frac{\Delta_\pi \phi(x) \vdash \psi(x, y)}{\phi(x) \vdash \forall y . \psi(x, y)}$$

The (co)units of this adjoint triple encode the familiar introduction and elimination rules for these quantifiers. For instance, the unit of $\exists_\pi \dashv \Delta_\pi$ encodes $\exists$-introduction: $\psi(x, y) \vdash \exists y . \psi(x, y)$. And the counit of $\Delta_\pi \dashv \forall_\pi$ encodes $\forall$-elimination: $\forall y . \psi(x, y) \vdash \psi(x, y)$.

We can now return to our restriction pullback $\Delta^Q_P$. In light of the preceding discussion, its adjoints $\exists^P_Q \dashv \Delta^Q_P \dashv \forall^P_Q$ have the unsurprising forms

$$\exists^P_Q \phi(q) \coloneqq \exists p : B_P . \big( (p\big|_Q = q) \wedge \phi(p) \big)$$


$$\forall^P_Q \phi(q) \coloneqq \forall p : B_P . \big( (p\big|_Q = q) \Rightarrow \phi(p) \big)$$

Where the parent part is obvious, such as in the case of the whole system $S$, we omit it, writing just $\exists_Q$, $\Delta^Q$, and $\forall_Q$.
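On a finite model, the triple $\exists \dashv \Delta \dashv \forall$ becomes three short set comprehensions. The sketch below (a hypothetical discretized bicycle, multiplier 2) represents a predicate by its extension, i.e. the subset of behaviors satisfying it:

```python
R = 2
B_S = {(p, w) for p in range(3) for w in range(7) if w >= R * p}
restrict = lambda s: s[0]            # |_Q : B_S ->> B_Q, with Q the pedal
B_Q = {restrict(s) for s in B_S}

def pullback(phi_Q):                  # Delta^Q: Q-constraint -> S-constraint
    return {s for s in B_S if restrict(s) in phi_Q}

def exists_Q(phi_S):                  # left adjoint: 'weak' direct image
    return {restrict(s) for s in B_S if s in phi_S}

def forall_Q(phi_S):                  # right adjoint: 'strong' direct image
    return {q for q in B_Q
            if all(s in phi_S for s in B_S if restrict(s) == q)}

# Because the restriction is surjective, forall . pullback is the identity:
slow_pedal = {0, 1}
assert forall_Q(pullback(slow_pedal)) == slow_pedal
```

`pullback` is just precomposition with the restriction map, exactly as in the definition of $\Delta^Q_P$ above.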

Compatibility and ensurance

We’ve seen how the familiar logical operators $\exists$, $\Delta$ and $\forall$ model the passage of constraints between a part and its parent or child. For example, the heightened heartbeat of our rabbit is witnessed by a predicate $\psi$ on heart-behaviors. But any heart-behavior satisfying $\psi$ may arise from a number of rabbit-behaviors $\Delta^Q_P \psi$, such as fear, or exertion. To learn more, we look at other parts of the rabbit, such as its legs. How does having a heightened heartbeat constrain leg behavior? In general, how does the behavior of one part constrain the behavior of another?

It turns out we have already defined all the necessary machinery to answer these questions. We show the hierarchy of the rabbit and its two parts graphically as follows.

These two parts of the rabbit induce two adjoint triples, allowing us to visualize the passage of constraints between parts.

This allows us to define two canonical inter-modalities to answer this question, generalizing both the usual modal operators of possibility and necessity as well as the compatibility and determination relations introduced earlier. Accordingly, we call these new operators compatibility and ensurance, and denote them by the symbols $\Diamond^P_Q$ and $\Box^P_Q$ respectively:

These inter-modalities are thus obtained by composing the ‘context-change’ functor $\Delta^Q_P$ with its left ($\exists$) and right ($\forall$) adjoints, giving ‘weak’ and ‘strong’ constraint-passing. Let’s unpack this, starting with compatibility.

Given a behavior $q$ of $Q$, we have

$$\Diamond^P_Q \phi(q) \;=\; \exists s : B_S . (s\big|_Q = q) \wedge \phi(s\big|_P) \;=\; \exists p : B_P . \mathfrak{c}(p, q) \wedge \phi(p)$$

That is, if $Q$ can be doing $q$ while $P$ is satisfying $\phi$, then the $Q$-behavior $q$ is compatible with $\phi$ on $P$: $\Diamond^P_Q \phi(q)$. $\Diamond^P_Q \phi$ is thus a predicate on $Q$ that picks out all the $Q$-behaviors compatible with $\phi$ on $P$. Thinking back to our rabbit, this is a constraint on leg-behaviors satisfied by just those which are compatible with having a heightened heartbeat.

Alternatively, ensurance expands as

$$\Box^P_Q \phi(q) \;=\; \forall s : B_S . (s\big|_Q = q) \Rightarrow \phi(s\big|_P) \;=\; \forall p : B_P . \mathfrak{c}(p, q) \Rightarrow \phi(p)$$

That is, if $Q$ doing $q$ entails that $P$ must satisfy $\phi$, then we say that $q$ ensures the constraint $\phi$ on $P$: $\Box^P_Q \phi(q)$. The predicate $\Box^P_Q \phi$ on $Q$ thus picks out all those $Q$-behaviors that ensure $\phi$ on $P$. For our rabbit, this means a constraint on leg-behaviors satisfied by just those which ensure the rabbit’s heightened heartbeat. Ensurance is thus a much stronger form of constraint-passing than compatibility.
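The contrast between the two modalities shows up clearly on a finite sketch (a hypothetical discretized bicycle, multiplier 2, with $P$ the wheel and $Q$ the pedal): compatibility asks whether some joint behavior realizes the constraint, ensurance whether every joint behavior does:

```python
R = 2
B_S = {(p, w) for p in range(3) for w in range(7) if w >= R * p}
rP = lambda s: s[1]                   # P = wheel
rQ = lambda s: s[0]                   # Q = pedal
B_Q = {rQ(s) for s in B_S}

def diamond(phi_P):                   # <>^P_Q: weak constraint-passing
    return {q for q in B_Q
            if any(rQ(s) == q and rP(s) in phi_P for s in B_S)}

def box(phi_P):                       # []^P_Q: strong constraint-passing
    return {q for q in B_Q
            if all(rP(s) in phi_P for s in B_S if rQ(s) == q)}

fast_wheel = {4, 5, 6}
# Every pedal speed is compatible with a fast wheel...
assert diamond(fast_wheel) == {0, 1, 2}
# ...but only pedalling flat out ensures one:
assert box(fast_wheel) == {2}
```

The same comprehensions are just the second expansions in the two displays above, with the existential or universal quantifier ranging over $B_S$.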

More formally, considering again our rabbit-fox ecosystem example, let’s suppose that we want to choose an initial fox population $f_0$ to ensure that the rabbit population $r_t$ stays within some bounds $(k_1, k_2)$ after some deadline $d$. We can write this using the constraint $\text{inCheck}(r_t) \coloneqq k_1 < r_t < k_2$:

$$\Box^{\cup_{t \geq d} \text{Rabbit}_t}_{\text{Fox}_0} (\forall t \geq d . \text{inCheck})(f_0)$$

It is easy to see that these compatibility and ensurance operators have the compatibility and determination relations as special cases. Let $\delta_p : B_P \to \Omega$ be the ‘Dirac’ predicate satisfied by $p'$ iff $p = p'$. Then for any $p : B_P$ and $q : B_Q$ we have

$$\mathfrak{c}(p, q) = \Diamond^P_Q \delta_p(q) = \Diamond^Q_P \delta_q(p)$$


$$\mathfrak{d}(p, q) = \Box^Q_P \delta_q(p)$$

Because they are defined using the left and right adjoints of the functor $\Delta^P_Q$, compatibility and ensurance inherit a number of useful properties. For example, if the internal logic of our ambient category $\mathcal{C}$ satisfies the law of excluded middle, then just as the adjoints $\exists$ and $\forall$ are De Morgan duals, so are compatibility and ensurance: $\neg \Diamond^P_Q \neg = \Box^P_Q$. More generally, $\Diamond^P_Q$ and $\Box^P_Q$ inherit the monotonicity and adjointness of $\exists$ and $\forall$. Monotonicity means that if a constraint $\phi$ entails a constraint $\psi$, then compatibility with $\phi$ entails compatibility with $\psi$, and ensuring $\phi$ entails ensuring $\psi$. And from the adjunctions $\exists^P_Q \dashv \Delta^Q_P$ and $\Delta^Q_P \dashv \forall^P_Q$, we inherit that $\Diamond^P_Q \dashv \Box^Q_P$.

What does this adjunction mean? An adjunction is a natural family of isomorphisms between hom-sets, which for us represent logical entailment. So $\Diamond^P_Q \dashv \Box^Q_P$ means we have $\phi \vdash \Box^Q_P \psi \Leftrightarrow \Diamond^P_Q \phi \vdash \psi$. Interpreted in our rabbit example, this means that the implication “rapid leg movement ensures heightened heartbeat” is the same as the implication “if the heart’s behavior is compatible with the legs running fast, then the heart has a raised pulse”.

Alternatively, we can look at the unit and counit of the adjunction. The unit is a map $\text{id} \to \Box^P_Q \Diamond^Q_P$: it witnesses the fact that, whatever the rabbit’s heartbeat, its having that heart behavior ensures that the legs are behaving compatibly. Dually, the counit is a map $\Diamond^P_Q \Box^Q_P \to \text{id}$, and we leave its rabbit-interpretation to the reader.

Finally, because right adjoints preserve limits, $\Box^P_Q$ commutes with $\wedge$ and $\forall$, and $\Diamond^P_Q$ commutes with $\vee$ and $\exists$.
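Both the De Morgan duality and the $\Diamond \dashv \Box$ adjunction can be verified exhaustively on a finite Boolean model. The following sketch (a hypothetical discretized bicycle, multiplier 2) checks them over every pair of predicates on the two parts:

```python
from itertools import combinations

R = 2
B_S = {(p, w) for p in range(3) for w in range(7) if w >= R * p}
rP = lambda s: s[1]                                  # P = wheel
rQ = lambda s: s[0]                                  # Q = pedal
B_P, B_Q = {rP(s) for s in B_S}, {rQ(s) for s in B_S}

def diamond_PQ(phi):  # <>^P_Q : Pred(P) -> Pred(Q)
    return {q for q in B_Q if any(rQ(s) == q and rP(s) in phi for s in B_S)}

def box_PQ(phi):      # []^P_Q : Pred(P) -> Pred(Q)
    return {q for q in B_Q if all(rP(s) in phi for s in B_S if rQ(s) == q)}

def box_QP(psi):      # []^Q_P : Pred(Q) -> Pred(P)
    return {p for p in B_P if all(rQ(s) in psi for s in B_S if rP(s) == p)}

def powerset(xs):
    xs = list(xs)
    return [set(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

for phi in powerset(B_P):
    # De Morgan duality in the Boolean setting: not <> not = []
    assert box_PQ(phi) == B_Q - diamond_PQ(B_P - phi)
    # The adjunction as a hom-set (entailment) bijection:
    for psi in powerset(B_Q):
        assert (diamond_PQ(phi) <= psi) == (phi <= box_QP(psi))
```

Here entailment between predicates is subset inclusion of extensions, so the adjunction is literally the bi-implication of two inclusions.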

We can use these adjunction properties to simplify the ‘inCheck ensurance’ ecosystem constraint introduced above. Firstly, we can rewrite it as

$$\forall t \geq d . \Box^{\cup_{t \geq d} \text{Rabbit}_t}_{\text{Fox}_0} \text{inCheck}(r_t)$$

(This is just the constraint as given above, but with the $\forall$ moved outside of the scope of the $\Box$, using the commutativity of $\Box$ with limits.)

We then note that $\text{inCheck}(r_t)$ can be spelled out as $\Delta^{\text{Rabbit}_t}_{\cup_{t \geq d} \text{Rabbit}_t} \text{inCheck}$, and that $\Box^P_Q \Delta^Q_{Q'} = \Box^P_{Q'}$, which together mean we can simplify the example further as

$$\forall t \geq d . \Box^{\text{Rabbit}_t}_{\text{Fox}_0} \text{inCheck}(r_t)$$

Possibility and necessity

For the final trick with our rabbit, consider the constraint of heightened pulse $\phi$ on its part $P$: the heart. Given any such $\phi$, we might be interested in the two alethic modalities of possibility and necessity, given by the predicates “does any heart behavior satisfy $\phi$?” and “do all heart behaviors satisfy $\phi$?”.

Those modalities are given by the operators $\Diamond_P^{\bot} \Diamond_{\bot}^P$ and $\Box_P^{\bot} \Box_{\bot}^P$, respectively, where $\bot$ is the system whose behavior type $B_{\bot}$ consists of just a single element.

Both possibility and necessity of a constraint $\phi$ on $P$ can be thought of as a map $B_P \to B_P$ that goes through a one-element bottleneck. $\Diamond_{\bot}^P$ and $\Box_{\bot}^P$ both compress all $P$-behaviors, mapping the single $\bot$-behavior to $\mathtt{true}$ if any behavior of $B_P$ satisfies $\phi$, or if all behaviors of $B_P$ satisfy $\phi$, respectively. Thus, the modality $\Diamond_P^{\bot} \Diamond_{\bot}^P$ detects whether $\phi$ is possible, and $\Box_P^{\bot} \Box_{\bot}^P$ whether $\phi$ is necessary.
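A finite sketch of these composite modalities (a hypothetical discretized bicycle, multiplier 2, with $P$ the wheel): each returns a constant predicate on $B_P$, everywhere true or everywhere false, because the information has passed through the one-element part:

```python
R = 2
B_S = {(p, w) for p in range(3) for w in range(7) if w >= R * p}
B_P = {w for (p, w) in B_S}           # P = the wheel

def possibly(phi_P):
    """<>_P^bot <>_bot^P: constant predicate, true iff phi is satisfiable."""
    return B_P if any(b in phi_P for b in B_P) else set()

def necessarily(phi_P):
    """[]_P^bot []_bot^P: constant predicate, true iff phi holds everywhere."""
    return B_P if all(b in phi_P for b in B_P) else set()

moving = {w for w in B_P if w > 0}
assert possibly(moving) == B_P        # some wheel behavior is moving
assert necessarily(moving) == set()   # but not all of them are
```

The bottleneck is visible in the code: the answer depends only on a single boolean, which is then spread back out over all of $B_P$.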

Behavior, mereology, and autopoiesis

Behavioral Mereology supplies a first step towards formalizing a notion of autopoiesis. Informally, a system is autopoietic if its behavior enables it to persist within its environment. In other words: it behaves to ensure compatibility with its environment. Our aim is to unpack this conjecture, and to investigate how it relates to extant accounts of autopoiesis using category theory.

Consider again our rabbit. It is autopoietic within its ecological niche, foraging for food, sheltering from the elements, and fleeing from predators: in particular, it responds to environmental changes (such as the seasons, or the appearance of a fox), and it changes the environment (by building a burrow), to improve its chances of survival. Similar things can be said of the societies of rabbits and foxes. For instance, in our simple example, the foxes collectively must symbiotically ensure that there are at least some rabbits, in order that they do not starve. We can write this behavioral-mereologically as the implication

$$r_0 > 0 \wedge r < k \vdash \Box^F_R \big( f > 0 \wedge \Box^R_F (r > 0) \big)$$

That is, if there is an initial rabbit population, and the ecosystem bounds the rabbit population independently of time, then the rabbits ensure that there are foxes, and that the foxes ensure there are rabbits.

These ideas extend to numerous familiar examples of adaptive or living systems, from single cells to corporations and political parties. In each case, the distinguishing feature seems to be the same: the autopoietic system can anticipate and respond to external fluctuations (‘perception’), and can act in order to bring about sustainable states of affairs (‘action’). This is evocative of the Good Regulator theorem of Conant and Ashby: if we consider input-output machines, with ‘good regulation’ meaning to minimize surprising outputs, then any good regulator of a machine must be in isomorphism with said machine.

Considering again our ecosystem—the parts of which include rabbits and foxes—we conjecture that an autopoietic part AA is one which contains an internal model: a subpart that sits in isomorphism with the environment EE. In order to ensure compatibility between AA and EE, AA must behave to maintain this isomorphism, and this entails the aforementioned perception and action. Because AA is not omniscient, it can only do so approximately, which suggests we must leave the Boolean world behind, and frame this behavior in terms of approximate Bayesian inference. Likely to be important here is the work of Karl Friston on the free energy principle, which says that perception, action, and the Good Regulator theorem are all consequences of an information-theoretic principle of least action. Categorically, it seems we may need a probabilistic logic, such as that supplied by effectus theory.

There are of course many questions for discussion. To give a few:

  • How should we draw the boundary between an autopoietic system and its environment? A Bayesian analysis suggests that the concept of Markov blanket will be important. Just as there are innumerable parts in the Boolean logic explored above, we should expect there to be innumerable Markov blankets. Is there a distinguished class?

  • Can we quantify the degree of autopoiesis exhibited by a system? In a probabilistic setting, we expect to obtain a notion of stochastic constraint passing, and hence graded compatibility and ensurance.

  • Is our informal notion of persistence connected to ergodicity?

  • Do coalgebras of a probability monad supply appropriate behavior types for our stochastic interacting processes? Can we use the tools of coalgebraic modal logic to formalize the notions of measurement and interaction introduced above? Can we use (approximate) bisimulation to characterize the ‘good regulator’ isomorphism?

  • How should we model societies of agents, which pass behavioral constraints between levels? Is said stochastic constraint passing relevant here?

  • Under what conditions does an autopoietic system reproduce itself? Does this require some kind of resource limitation, or competition?

  • Autopoietic systems don’t appear in the environment already configured as such, but rather undergo a process in which their internal model becomes more compatible with the environment. Can we characterize the process of becoming a good autopoietic system in categorical terms? Can we substantiate the claim that “a good autopoietic system must be a good learner”?

Autopoiesis seems to be a characteristic of those systems that we might call living or intelligent. By taking a behavioral-mereological approach, by considering the interactions between systems and their parts, and the constraints that these impose, we hope to pin down categorically precisely what this means.

Posted at June 29, 2019 3:16 AM UTC


18 Comments & 0 Trackbacks

Re: Behavioral Mereology

What’s a “Markov blanket”?

Posted by: John Baez on June 29, 2019 3:35 AM | Permalink | Reply to this

Re: Behavioral Mereology

Hi John! A “Markov blanket” is one formalization of the idea of the surface of a thing – in the sense that the surface isolates the thing from its environment. If the objects of a category are the things in the world and the morphisms are somehow their information-theoretic interactions, then the Markov blanket of a thing $A$ is roughly the collection of all the things through which any interaction with $A$ factors.

This idea can be interpreted in the monoidal category of Bayesian networks. So in the usual formalism, the Markov blanket of a node $A$ is the set $MB(A)$ of its parent nodes, child nodes, and “coparent” nodes (the other parents of its children). Then every other collection of nodes in the network is conditionally independent of $A$ given $MB(A)$; this is just a generalization of the usual Markov property.
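For concreteness, that definition can be sketched in a few lines of Python on a hypothetical five-node DAG (edges run parent → child; the node names are arbitrary):

```python
# Edges of a hypothetical Bayesian network, written parent -> child.
edges = {("C", "A"), ("A", "D"), ("B", "D"), ("E", "B")}

def markov_blanket(node):
    """Parents, children, and co-parents (other parents of the children)."""
    parents = {u for (u, v) in edges if v == node}
    children = {v for (u, v) in edges if u == node}
    coparents = {u for (u, v) in edges if v in children} - {node}
    return parents | children | coparents

# A's blanket: parent C, child D, and co-parent B (the other parent of D).
assert markov_blanket("A") == {"B", "C", "D"}
```

Note that E lies outside A’s blanket: it can only influence A through B, which is exactly the factorization property described above.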

The reason this seems to be an important notion for autopoiesis is that an autopoietic thing is one that maintains itself, and in particular, maintains its boundary. Karl Friston actually has a paper on this, in which he (roughly) identifies things with their Markov blankets and argues that being autopoietic somehow just amounts to being ergodic. The idea is that autopoietic systems ought to minimize encounters with surprising events or perturbations (because these might be adverse), and that this entails a kind of dynamic “good regulator”. Friston says that this process is entailed by having a Markov blanket and being ergodic, as follows.

The existence of a Markov blanket entails a partition of the world into ‘internal’ and ‘external’ (and ‘blanket’), and Friston’s claim is that the existence of an ergodic measure with respect to the dynamics of the system means that the flow of the dynamics can be written as a gradient ascent on this measure. By using the partition induced by the Markov blanket, this gradient ascent can itself be written as a minimization of the relative entropy between an arbitrary density implicit in the ‘internal’ states and the true density over the ‘external’ states. In the end, this amounts to maintaining the kind of “good regulator” I mentioned above.

Markov blankets can of course be nested – there’s some nice operadic stuff there – and this leads to lots of interesting ideas about the structure of multi-scale autopoietic systems such as large corporations, or packs of animals. (Behavioural mereology example: how does a large-scale autopoietic system constrain the behaviours of a part that is itself autopoietic such that both remain autopoietic?) The question in the blog post arises similarly: how should we judge which Markov blanket of a system is ‘canonical’? Also: do we need to account for temporal variation in the structure or existence of the Markov blanket?

Finally, I am no expert in ergodic theory, so I cannot judge the technical details of that part of the argument; for instance, assuming the argument goes through, I would like to know whether ergodicity really is necessary and/or sufficient. It is not clear from the paper, and I am not sure I trust my intuition (which suggests that ergodicity gives a nice story, but is perhaps unnecessarily strong).

Posted by: Toby Smithe on June 29, 2019 12:49 PM | Permalink | Reply to this

Re: Behavioral Mereology

There should be an “and” in the final sentence of my comment: “It is not clear from the paper, and I am not sure I trust my intuition …”.

Posted by: Toby Smithe on June 29, 2019 12:52 PM | Permalink | Reply to this

Re: Behavioral Mereology

Could you clarify the fourth bullet point? You wrote:

Do coalgebras of a probability monad supply appropriate behavior types for our stochastic interacting processes?

At first I thought this was just a typo for “algebras for a probability monad”. It always makes sense to talk about the algebras for a monad, but “coalgebra for a monad” is not in general defined. Conceivably there’s a definition of coalgebra for this particular family of monads, but the link doesn’t supply one, or indeed mention coalgebras at all.

But the next sentence is:

Can we use the tools of coalgebraic modal logic to …?

which suggests that it wasn’t a typo and it’s something you’ve started to think through. If that’s the case, what do you mean by a coalgebra for a probability monad?

Posted by: Tom Leinster on June 29, 2019 8:18 PM | Permalink | Reply to this

Re: Behavioral Mereology

Hi, I imagine they meant coalgebras for the endofunctor part of the probability monad, since these are Markov chains. Van Breugel, Hermida, Makkai, Worrell and others looked at pseudometrics from this position, which might be the notion of “approximate bisimilarity” that is referred to.

Posted by: Sam Staton on June 30, 2019 1:31 PM | Permalink | Reply to this

Re: Behavioral Mereology

Hi Tom,

Yes, I have encountered this confusion before, and probably we should have been more explicit. The notion of a coalgebra of an endofunctor is slightly different from the common notion of a coalgebra of a comonad; the former is the one we mean, and it seems to be found mainly in the theoretical computer science literature, where it really just refers to a map X → FX. The idea is that X is some state space and FX is the space of possible successor states, so these are abstract models of dynamical systems, particularly ones involving some kind of explicit state transition. As Sam points out, where F is, or derives from, the distribution or Giry monads (say), the coalgebras correspond to Markov chains or processes. Where there is a terminal object in the category of coalgebras, its elements correspond to behaviours (e.g., distributions over sequences of states), and this seems to me a nice place to start thinking about behaviour types in an uncertain setting. We say two processes or coalgebras are bisimilar when they coincide in the terminal coalgebra, but, as Sam points out, there are also various notions of approximate bisimulation.
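A minimal sketch of this, with a made-up two-state chain: a coalgebra X → DX for the finite-distribution functor is just a table of transition probabilities, and “unfolding” the coalgebra samples a finite behaviour (a sequence of states):

```python
import random

# The coalgebra X -> DX: each state maps to a distribution over next states.
# The "weather" chain here is purely illustrative.
weather = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}

def run(coalg, state, steps, rng):
    """Unfold the coalgebra from `state`, sampling a finite state sequence."""
    trace = [state]
    for _ in range(steps):
        states, probs = zip(*coalg[state].items())
        state = rng.choices(states, weights=probs)[0]
        trace.append(state)
    return trace

print(run(weather, "sunny", 5, random.Random(0)))
```

The full behaviour of a state – the element of the terminal coalgebra it maps to – is then the induced distribution over all such sequences.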

Another aspect that seems to make coalgebras of this sort suitable for our purposes is that in many circumstances it is possible to lift a logic defined on X, via some sort of distributive law, to one defined on FX, and in doing so we can define various modal operators: if p is some predicate, then we can define notions like “henceforth p”. There is a kind of Stone or Isbell duality between the algebras of this kind of logic and coalgebras that satisfy the corresponding algebraic specifications; see “testing semantics” in this paper by Hasuo, Jacobs, and Sokolova.
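For a taste of how such a modality works, here is a toy sketch (transition system and predicate invented for this comment) of “henceforth p” on a coalgebra X → P(X), i.e. a plain transition system, computed as a greatest fixpoint:

```python
def henceforth(succ, p):
    """Largest S ⊆ p such that succ(x) ⊆ S for every x in S
    (the greatest fixpoint of S ↦ {x in p : succ(x) ⊆ S})."""
    s = set(p)
    while True:
        s2 = {x for x in s if succ[x] <= s}
        if s2 == s:
            return s
        s = s2

# A coalgebra X -> P(X): each state maps to its set of successors.
succ = {0: {1}, 1: {1, 2}, 2: {0}, 3: {3}}
p = {0, 1, 3}                      # the predicate "p holds here"

print(henceforth(succ, p))         # states from which p holds forever
```

Here state 3 loops within p forever, while 0 and 1 can eventually reach state 2, where p fails – so only 3 satisfies “henceforth p”.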

I believe that there should be a connection between the idea mentioned in the blog of “ensuring compatibility with the environment”, potentially formalized using the tools I’ve mentioned here, and the ergodic account of autopoiesis supplied by Friston (as described in my earlier reply to John Baez). But there are still lots of steps missing; for instance, I’m not aware of a categorical account of ergodicity, although I of course have some ill-defined ideas of my own about that.

In any case, perhaps this is all severely overcomplicating things: perhaps there is a much simpler way to look at these questions. If you can think of one, please do let us know!

Posted by: Toby Smithe on June 30, 2019 4:20 PM | Permalink | Reply to this

Re: Behavioral Mereology

Thanks, both. So, they just meant coalgebras for the underlying endofunctor.

Posted by: Tom Leinster on June 30, 2019 9:40 PM | Permalink | Reply to this


FWIW, the free energy approach of Friston is just one manifestation of the more general idea that information-theoretical or similar objective functions involving action now and sensing later ought to be useful for intrinsic motivation.

There are highly nontrivial categorical considerations that emerge from demanding that such an objective function be a) compositional and b) obey some basic constraints. Ideally one would also have c) some useful algebraic properties.

For instance, David Spivak and Brendan Fong showed that channel capacity C (the objective function for “empowerment” qua intrinsic motivation; resp. exp C) is a strict monoidal lax functor w/r/t the Kronecker product (resp., the direct sum) on Stoch. But laxness is not very useful! My bet is that conditional mutual information gives the right place to start, courtesy of the chain rule.
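To make the chain-rule point concrete, here is a quick numerical check – on a joint distribution invented for this comment – of the identity I(X; Y, Z) = I(X; Z) + I(X; Y | Z):

```python
import itertools, math

# p[(x, y, z)] — an arbitrary joint distribution over three binary variables.
p = {xyz: w for xyz, w in zip(
    itertools.product([0, 1], repeat=3),
    [0.10, 0.05, 0.20, 0.05, 0.15, 0.10, 0.05, 0.30])}

def marg(keep):
    """Marginal over the coordinate indices listed in `keep`."""
    out = {}
    for xyz, w in p.items():
        key = tuple(xyz[i] for i in keep)
        out[key] = out.get(key, 0.0) + w
    return out

def mi(a, b):
    """Mutual information I(A; B) for coordinate index lists a, b (bits)."""
    pa, pb, pab = marg(a), marg(b), marg(a + b)
    return sum(w * math.log2(w / (pa[k[:len(a)]] * pb[k[len(a):]]))
               for k, w in pab.items() if w > 0)

lhs = mi([0], [1, 2])                                   # I(X; Y, Z)
# I(X; Y | Z) expanded as I(X,Z; Y) - I(Z; Y):
rhs = mi([0], [2]) + (mi([0, 2], [1]) - mi([2], [1]))   # I(X;Z) + I(X;Y|Z)
print(abs(lhs - rhs) < 1e-9)
```

Because the chain rule composes across conditioning, it is plausibly the kind of “basic constraint” that a compositional objective function should obey.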

Posted by: Steve Huntsman on July 1, 2019 4:49 PM | Permalink | Reply to this

Re: Steve

Forgot to include a link re: various flavors of intrinsic motivation objective functions:

Also obviously forgot to include a subject line that got autopopulated :/

Posted by: Steve Huntsman on July 1, 2019 4:52 PM | Permalink | Reply to this

Re: Steve

Hi Steve, thanks for the comment and link; I’ll make sure to read that paper soon. Meanwhile, whilst I’m aware of (but still unfamiliar with!) the work of Baez, Fritz and Leinster on entropy, a few quick searches didn’t reveal the Spivak & Fong work you refer to. Do you have a reference for that, too?

It’s not surprising that the various components of active inference are somewhat ‘modular’. For instance, it also turns out that the approximate inference procedure – variational inference – adopted by Friston and many others is itself modular; see this paper by Knoblauch et al for more on that (though they do also show that assuming your ‘distance’ for probability distributions is the KL divergence, standard variational inference is the optimal choice of algorithm).

Posted by: Toby Smithe on July 1, 2019 7:36 PM | Permalink | Reply to this

Re: Steve

David and Brendan’s result isn’t published. It’s essentially a formalization of results originally due to Shannon that I thought smelled relevant to intrinsic motivation from a compositional perspective. I watched David and Brendan work out this formalization in real time on a brief visit to MIT, where along the way they also rediscovered a partial order on channel matrices that was again originally due to Shannon. To me (very much a dilettante) this was a vivid reminder of how categorical things frequently look to an observer…i.e., there’s a bunch of formalism, and it can seem like not much of substance is happening, and then suddenly a highly nontrivial result falls out “for free.”

Posted by: Steve Huntsman on July 2, 2019 2:54 PM | Permalink | Reply to this

Re: Steve

Compare with what Deligne said about Grothendieck’s highly categorical style (I’m quoting from The Rising Sea, an essay by Colin McLarty):

Deligne describes a characteristic Grothendieck proof as a long series of trivial steps where “nothing seems to happen, and yet at the end a highly non-trivial theorem is there”.

Posted by: Todd Trimble on July 2, 2019 9:46 PM | Permalink | Reply to this

Re: Steve

Yep, that’s precisely what Steve’s comment recalled to my mind, as well (perhaps not unintentionally).

Posted by: Toby Smithe on July 3, 2019 12:42 AM | Permalink | Reply to this


I’m sure this was an instance of cryptomnesia on my part.

Posted by: Steve Huntsman on July 3, 2019 12:57 PM | Permalink | Reply to this

Modalities on types

Of course, that adjoint triple construction of Lawvere carries over to a higher topos setting, so providing modalities for dependent type theory:

dependent sum ⊣ base change ⊣ dependent product.
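Decategorified to posets of subsets, this triple is the familiar ∃_f ⊣ f* ⊣ ∀_f for a function f : A → B: direct image, preimage, and “dual image”. A toy verification (the function and sets are invented), checking both adjunctions as equivalences of inclusions:

```python
import itertools

A, B = {0, 1, 2, 3}, {"a", "b"}
f = {0: "a", 1: "a", 2: "b", 3: "b"}   # a function f : A -> B

def exists_f(s):   # ∃_f S = { b : some a in S with f(a) = b }
    return {f[a] for a in s}

def preimage(t):   # f* T = { a : f(a) in T }
    return {a for a in A if f[a] in t}

def forall_f(s):   # ∀_f S = { b : every a with f(a) = b lies in S }
    return {b for b in B if all(a in s for a in A if f[a] == b)}

def subsets(xs):
    return [set(c) for r in range(len(xs) + 1)
            for c in itertools.combinations(xs, r)]

# ∃_f S ⊆ T  iff  S ⊆ f* T,   and   f* T ⊆ S  iff  T ⊆ ∀_f S:
ok = all((exists_f(s) <= t) == (s <= preimage(t)) and
         (preimage(t) <= s) == (t <= forall_f(s))
         for s in subsets(A) for t in subsets(B))
print(ok)
```

Logically, ∃_f and ∀_f are existential and universal quantification along f, which is the shape the modalities take in the dependent type theory.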

The intermodalities when working with a pair of epimorphisms to the same target give something like temporal logic.

I presented some of these ideas, worked out at the nLab, in an article which will become a chapter in a book I’m publishing with Oxford University Press.

Such transformations across spans/correspondences seem to crop up all over, e.g., as a kind of integral transform.

Posted by: David Corfield on July 5, 2019 3:12 PM | Permalink | Reply to this

Re: Modalities on types

Hi David, yes, the temporal dimension is of course very relevant to our discussion of autopoiesis, as an autopoietic thing is a thing that persists. As you probably observed from my response to Tom Leinster, this is one important reason why I’m interested in coalgebra. I suppose that coalgebraic treatments of modal logic have considered these adjoints in a temporally extended form, though I do not yet know that part of the literature well enough.

Two particular thoughts come to my mind. Firstly, your chapter remarks on instants and intervals, and the temporal type theory of Spivak and Schultz. My (limited) exposure to coalgebraic modal logic suggests that in that line of work, the “instant-based” approach is the usual one. This work seems to have originated in theoretical computer science, where time tends to come in discrete ticks, so perhaps this is unsurprising. Clearly there should be a formal link between that work and the more general interval-based temporal type theory – perhaps it is as simple as your inclusion Interval ↪ Instant^2. Maybe it would be interesting to work out that connection in detail: though a philosopher by training, I am a theoretical neuroscientist by trade, and neurons straddle both categories.

I note also that David Spivak has some work on mode-dependence in dynamical systems; your remarks on mode theory feel familiar in that context. This leads me to my second – somewhat cruder – thought: that there is logically little special about time. Is the difference between an instant and an interval not just the same as the difference between a point and a subset of a space?

Really, that’s just the same acknowledgement you make at the end of your chapter, of the power of categorical logic (and hence, say, the promise of its application to natural language). And in turn, that acknowledgement seems just to say that mathematics is indifferent to how we apply it!

Finally, thank you for your point about integral transforms. I was not aware of that connection before (I have much to learn); it’s very pleasing. I look eagerly forward to a future where mathematics is universally taught from the ‘nPOV’.

Posted by: Toby Smithe on July 8, 2019 12:05 AM | Permalink | Reply to this

Re: Modalities on types

Yes, the coalgebraic connection is interesting. I was looking at that back here. (Can that really be a decade ago?) I was wondering about some form of modal hyperdoctrine and semantics in (indexed) descriptive general frames over here.

I guess as regards time one might stipulate more structure (branching, continuity, etc.), otherwise, as you say, time is not that special.

Posted by: David Corfield on July 10, 2019 5:43 AM | Permalink | Reply to this

Re: Modalities on types

Oh yes, there was further elaboration of that frames idea here.

Posted by: David Corfield on July 10, 2019 5:48 AM | Permalink | Reply to this
