Concrete Groups and Axiomatic Theories I
Posted by Guest
Guest post by Todd Trimble
I’d like to take a shot here at explaining some of the ideas on logic that Jim Dolan has been alluding to in his talks in the Geometric Representation Theory seminar, and eventually give an argument for a perhaps surprising idea of his, that “concrete groupoid theory” and “axiomatic theories” are really the same subject, from a kind of Galois theory point of view. This is meant to set the stage for a whole slew of interesting developments, in which we view Jim’s orbi-simplex idea as a geometric description of a general axiomatic theory, which in turn is related to the idea of viewing Tits buildings as “quantized” axiomatic theories, and also perhaps to the theory of classifying toposes and their “Galois theory”. But we’ll get to all that later!
Right now I’d just like to set the scene, and try to flesh out the (somewhat skeletal) description Jim gave of axiomatic theories in precise terms, up to the point where we can at least state the amazing Galois correspondence between groups and theories. In part II, we’ll have a look at proving that this correspondence really works, in part by adapting an interesting argument of Joyal that characterizes analytic functors of species (themselves closely related to the Tale of Groupoidification!).
First, what do we mean by an “axiomatic theory”? At one point Jim drew up a little table under the heading “Axiomatic Theories” (for a quick reminder, see page 2 of Alex Hoffnung’s notes): a theory involves types, predicates on types, and axioms on predicates. For example, in Euclidean geometry, the types might be geometric figures like point, line, etc.; the predicates would be formal relations between figures, such as incidence, or how two planes can be related to each other, and of course there are axioms these predicates are assumed to satisfy.
Here, to keep things simple, we’ll consider theories involving just a single type $X$; for now just think of it being modeled by a set. Now I’m sure everyone has at least a syntactic idea of what is we mean by a first-order or predicate theory: in addition to expressions called predicates, there are logical (let’s say Boolean) connectives $\wedge$, $\vee$, $\neg$ and also quantifiers $\forall$, $\exists$ which are used to construct new predicates from old, and these logical operators are manipulated according to standard rules of inference. But before we get anywhere with this, I’ll need to clean that description up a bit, and say it a bit more crisply and precisely, and algebraically, so that we’ll be able to apply some standard algebraic or categorical tools.
For a familiar example of a predicate theory, consider the theory where the predicates are identified with finitary relations on a set $X$. An $n$-ary relation is just a subset of $X^n$, or alternatively a function $p: X^n \to 2$; typically one writes this as $p(x_1, \ldots, x_n)$ where the $x_i$ are variables over the domain $X$. The set of $n$-ary relations on $X$ is thus the power set $P(X^n) = hom(X^n, 2)$. Since $2 = \{0, 1\}$ carries a Boolean algebra structure, we obtain a Boolean algebra structure on $P(X^n)$ (defining the Boolean operations on relations $p: X^n \to 2$ in the usual pointwise fashion).
These Boolean algebras may be collected together into a functor
$Fin \to Bool$
$n \mapsto P(X^n)$
which we will denote as $P(X^{*})$ or $P(hom(-, X))$, from the category of finite sets to the category of Boolean algebras, and this functor provides a way for the Boolean algebras $P(X^n)$ to “talk to one another”. Namely, for each function $f: m \to n$ between finite sets, the function $P(X^f)$ pulls back an $m$-ary predicate $p: X^m \to 2$ to the composite
$X^n \stackrel{X^f}{\to} X^m \stackrel{p}{\to} 2.$
Because the Boolean operations on $P(X^n)$ are defined pointwise, it is obvious that $P(X^f)$ is a Boolean algebra map.
The functorial point of view on theories is typically missing from traditional accounts of first-order logic, which is a shame, because actually it sheds light on some logical practices [typically picked up in apprentice-like fashion as it is] which usually go unmentioned and perhaps usually unnoticed. For example, consider a predicate expression like
$p(x, y) \wedge q(x).$
Strictly speaking, this makes no sense: the predicates $p$ and $q$ are of different type (one is binary, the other unary); how could you conjoin them? To make sense of this, one thinks of the predicate $q(x)$ as having an extra free variable $y$ “implicitly there”. In reality, what one is doing is pulling back $q$ from an unary predicate to a binary predicate, by pulling back along projection $\pi: (x, y) \mapsto x$. In other words, we are applying the pullback $P(X^f): P(X^1) \to P(X^2)$ to $q$, where $f$ is the inclusion map $\{1\} \subseteq \{1, 2\}$.
The functorial approach with its pullback operations thus provides a precise apparatus for maintaining a “typing discipline” that pays careful attention to the matching up of types; it describes not just pulling back but how one is pulling back.
A beautiful insight of Lawvere is that there is an intimate relationship between the pullback operations and logical quantification:
- Quantification is adjoint to pulling back. (Existential quantification is left adjoint to pulling back, and universal quantification is right adjoint to pulling back.)
To understand this, let’s think about it concretely in terms of subsets. Consider a binary relation $A(x, y)$, which we picture as a blob $A$ in the plane $X^2$. The expression $\exists_y A(x, y)$ represents an unary relation (the set of $x$ for which there exists $y$…), which is nothing but the direct image of $A$ under the projection $\pi: (x, y) \mapsto x$ to the $x$-axis. Now direct image is adjoint to inverse image in the sense that there is an equivalence
$\frac{\pi(A) \subseteq B}{A \subseteq \pi^{-1}(B)} iff$
The syntactic counterpart is that existential quantification along a variable $y$ is left adjoint to pulling back along $\pi: (x, y) \mapsto x$ (which “adds in” the variable $y$):
$\frac{\exists_y A(x, y) \vdash B(x)}{A(x, y) \vdash P(\pi)(B)(x, y)} iff$
Thus, we define $\exists_f$ to be the left adjoint of $P(f)$. This means we can quantify with respect to other maps, not just projections. For example, for the diagonal map $\delta: X \to X \times X$, there is an existential quantification operator $\exists_{\delta}: P(X) \to P(X \times X)$, which involves taking the direct image along $\delta$. In gunky syntactic terms, $\exists_{\delta}$ takes an unary predicate $q(x)$ to the binary predicate $\exists_x q(x) \wedge [x = x_1] \wedge [x = x_2]$ in two free variables $x_1$, $x_2$. As a matter of fact, we can define the equality predicate $[x_1 = x_2]$ in this set-up as $\exists_{\delta}(\top)$, where $\top$ is the top element of the Boolean algebra $P(X^1)$.
(We won’t say much about universal quantification; in our vanilla Boolean logic, universal quantification $\forall_f$ may be defined simply as $\neg (\exists_f) \neg$; in intuitionistic variants, we would define it as right adjoint to the pullback along $f$).
There is one more condition which automatically holds for our functor $P(X^{*}): Fin \to Bool$, and which we will want to require of general typed predicate theories: the pullback operations should “preserve quantification”. To make sense of this, consider first quantifying or pushing forward by taking direct image along a map $X^f: X^n \to X^m$, followed by pulling back or taking inverse image along $X^g: X^p \to X^m$:
$\array{ X^n & & \\ \downarrow \; X^f & & \\ X^m & \stackrel{X^g}{\leftarrow} & X^p }$
The result is the same as pulling back along a map $X^k: X^n \leftarrow X^q$ and then pushing forward or quantifying along $X^h: X^q \to X^p$, where $(X^k, X^h)$ is the pair of projections out of the corner $X^q$ in a pullback square (roughly speaking, we pass through the pullback because we want $\exists_{X^h}$ to ‘simulate’ $\exists_{X^f}$, by ensuring that fibers $(X^h)^{-1}(x)$ match the fibers $(X^f)^{-1}(y)$, where $x$ maps to $y$ under $X^g$; this is achieved by taking the fiber product or pullback). Notice this pullback square is obtained by applying $\hom(-, X)$ to a pushout square in Fin:
$\array{ m & \stackrel{g}{\to} & p \\ \downarrow \;f & & \downarrow \; h \\ n & \stackrel{k}{\to} & q }$
Thus, the precise expression of the slogan “pullback preserves quantification” is the
- Beck-Chevalley condition: given such a pushout square in Fin, the following equation holds (“pullback preserves quantification”): $\exists_{X^h} \circ P(X^k) = P(X^g) \circ \exists_{X^f}.$
(Question: What did Beck and Chevalley have to do with the Beck-Chevalley condition?)
We are now ready to introduce our key player, which is our notion of first-order theory with equality:
- Definition. A polyadic Boolean algebra (which here will simply be called a theory) is a covariant functor $T: Fin \to Bool$ such that for every $f: m \to n$ in Fin, the Boolean algebra map $T(f): T(m) \to T(n)$ has a left adjoint $E(f): T(n) \to T(m)$, and the Beck-Chevalley condition is satisfied: $E(h) \circ T(k) = T(g) \circ E(f)$ as above.
Notes:
- The term “polyadic Boolean algebra”, suggested to me a few years ago by Jim, was purloined from the literature on “algebraic logic”, which has been pursued in various forms by, e.g., Halmos (“polyadic algebras”) and Tarski (“cylindric algebras”). The categorical form it assumes here is a special case of the much more general notion of hyperdoctrine, due to Lawvere.
- I should mention that the word “theory” here is a slight abuse of language. Technically, the term “theory” as usually considered by logicians connotes a presentation. Think for example of a propositional theory: we start with a propositional language, or in algebraic terms a free Boolean algebra on a set of primitive elements called ‘literals’, and then consider a subset of this Boolean algebra whose elements are called axioms, which in turn generate a filter whose elements are called theorems. In ring-theoretic terms, we could just as well consider the ideal generated by negations of axioms, instead of the filter. Either way, this is describing a presentation of the quotient Boolean algebra. The situation for first-order theories is similar, and our definition above is really describing the class of “theories” in direct algebraic terms, without passing through a description via presentations.
- Definition. A morphism between theories $S, T: Fin \to Bool$ is a natural transformation $\theta: S \to T$ which preserves the quantification operations. A model of a theory $T$ consists of a set $X$ and a morphism of theories $\theta: T \to P(X^{*})$, called a $T$-structure on $X$.
Every $T$-structure $\theta: T \to P(X^{*})$, as a morphism of theories, factors through an image theory $Im(\theta)$, i.e., the subtheory of $n \mapsto P(X^n)$ consisting of relations of the form $\theta(p)$, where $p$ ranges over the predicates of $T$. This image may be called the theory of the $T$-structure; the theorems are all the things you can say in $T$ which become true in the structure. Because $P(X^0) = P(1) = 2$, this theory is ‘complete’ in the sense that its sentences (the predicates with 0 free variables) are ‘decidable’ – are either true or false.
- Definition. Let $(X, \theta)$ and $(Y, \zeta)$ be two models of a theory $T$. A homomorphism from the $T$-model $X$ to the $T$-model $Y$ is a function $f: X \to Y$ such that for all $n \geq 0$, $(T(n) \stackrel{\zeta_n}{\to} P(Y^n) \stackrel{P(f^n)}{\to} P(X^n)) = (T(n) \stackrel{\theta_n}{\to} P(X^n)).$
Now we are ready to state the main point:
Jim, in his lecture, stressed that there was a duality between concrete groups (subgroups of a permutation group $G \subseteq X!$), and structures on $X$, or rather the logically complete theories of structures on $X$. This type of thing is familiar from Galois theory, certainly, but it’s also part of the philosophy of Klein’s Erlanger Programm, where the transformation group determines the geometric theory. Part of Jim’s point is that this is more than German philosophy – there’s an actual theorem here. The simplest versions of this theorem occur in toy examples where we impose draconian assumptions, such as assuming $X$ is finite, as we do when considering geometric structures over finite fields $F_q$:
Theorem. Let $X$ be a finite set. There is a Galois correspondence (a bijective correspondence) between
- (Complete) theories of structures on $X$, i.e., subtheories of $P(X^{*})$
- Groups of transformations of $X$, i.e., subgroups $G \subseteq X! = Aut(X)$
Stay tuned for more…
Re: Concrete Groups and Axiomatic Theories I
Thank you, O Guest (Todd, right?). For those of us who aren’t watching the lectures, and probably for those who are too, that was extremely helpful.