Here is a simple type theory, which we may call the equational theory of a group.
A word of warning before I begin: This is so simple that people might hesitate to actually call it a ‘type theory’ at all. The reason is that the types are all fixed in advance; there is no theory of how to build new types out of old. For that reason, people might prefer to call it a ‘typed theory’ (a theory with types) rather than a ‘type theory’ (a theory about types). In fact, not only are all of the types fixed in advance; there is only one type! So people might prefer to call it an ‘untyped theory’ (this is an example of the [[red herring principle]]: an untyped theory is actually a typed theory in which there is exactly one type). I start with such a simple example because it is one that you should already know, so it will help you to learn the language and notation.
We define everything recursively, so to tell you what (for example) a context is, I simply define what data you might use to build a context, and then give notation for that context. In fact, I will define through mutually recursive definitions what a context is, what a term in a context is, and what an equational identity in a context is; as a technical device, I will also define the number of free variables (which will always be a natural number) in a context.
One normally uses uppercase Greek letters for contexts. To indicate that $a$ is a term in the context $\Gamma$, we write $\Gamma \vdash a : G$. (The ‘$: G$’ indicates that $G$ is the type of the term $a$; since there is only one type in this theory, that bit is superfluous. But I include it so that we can generalise later.) An equational identity in a context will always be a pair of terms in that context; to indicate that $a$ and $b$ constitute such a pair, we write $\Gamma \vdash a = b$. Finally, to indicate the number of free variables in $\Gamma$, we write $|\Gamma|$.
So: a context in the equational theory of a group may be constructed in any of three ways:
- There exists a context, called the empty context and denoted $()$.
- For every context $\Gamma$, there exists another context, called the extension of $\Gamma$ by a new variable and denoted $\Gamma, x_n : G$, where $n$ is the number of free variables in $\Gamma$.
- For every context $\Gamma$, every term $a$ in $\Gamma$, and every term $b$ in $\Gamma$, there exists a context, called the extension of $\Gamma$ by the equation of $a$ and $b$ and denoted $\Gamma, a = b$.
The notation for these recursively built contexts mirrors the notation for terms and equational identities in a context, which is no coincidence. However, notice that we have commas here, not ‘$\vdash$’.
Since the only way to begin constructing a context is with the empty context, we usually leave out ‘$()$’ and the first comma: thus, $(), x_0 : G, x_1 : G$, for example, becomes $x_0 : G, x_1 : G$. Similarly, we abbreviate ‘$() \vdash a : G$’ to simply ‘$\vdash a : G$’.
Since contexts are constructed recursively, I may define the number of free variables in a context recursively:
- The number of free variables in the empty context is $0$.
- For every context $\Gamma$, the number of free variables in the extension of $\Gamma$ by a new variable is $n + 1$, where $n$ is the number of free variables in $\Gamma$.
- For every context $\Gamma$, the number of free variables in the extension of $\Gamma$ by any equation is the same as the number of free variables in $\Gamma$.
So every time we extend a context by a new variable, we get one more free variable.
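For instance, here is how the count comes out on one small context (an example of my own, which will reappear below):
$$|x_0 : G,\, x_1 : G,\, x_0 \cdot x_1 = x_1 \cdot x_0| \;=\; |x_0 : G,\, x_1 : G| \;=\; |x_0 : G| + 1 \;=\; |()| + 1 + 1 \;=\; 0 + 1 + 1 \;=\; 2 .$$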
Next, a term in a context in the equational theory of a group may be constructed in any of four ways.
- For every context $\Gamma$ and every natural number $i \lt |\Gamma|$, there is a term in $\Gamma$, called the $i$th variable of $\Gamma$ and denoted $x_i$.
- For every context $\Gamma$, there is a term in $\Gamma$, called the identity element and denoted $e$.
- For every context $\Gamma$ and term $a$ in $\Gamma$, there is a term in $\Gamma$, called the inverse of $a$ and denoted $a^{-1}$.
- For every context $\Gamma$ and terms $a$ and $b$ in $\Gamma$, there is a term in $\Gamma$, called the product of $a$ and $b$ and denoted $(a \cdot b)$.
As always in algebra, we tend to omit the outer pair of parentheses for a term constructed as a product, although we need to keep those parentheses if we continue to construct a new term out of it.
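If it helps to see these mutually recursive definitions in a more familiar functional-programming dress, here is a rough Haskell sketch (my own gloss, not part of the theory itself; the constructor names are invented, and I ignore the side conditions that a variable $x_i$ only makes sense when $i \lt |\Gamma|$ and that an equation may only be added between terms already valid in the context so far):

```haskell
-- A sketch of the raw syntax of contexts and terms, with the
-- free-variable count.

data Context
  = Empty                       -- the empty context ()
  | ExtendVar Context           -- Gamma, x_n : G  (n = freeVars Gamma is implicit)
  | ExtendEq Context Term Term  -- Gamma, a = b
  deriving (Show)

data Term
  = Var Int         -- x_i  (only sensible in Gamma when i < freeVars Gamma)
  | Unit            -- e
  | Inv Term        -- a^{-1}
  | Prod Term Term  -- (a . b)
  deriving (Show)

-- The number of free variables in a context, by the same recursion as in
-- the text: extending by a variable adds one, extending by an equation
-- adds none.
freeVars :: Context -> Int
freeVars Empty            = 0
freeVars (ExtendVar g)    = freeVars g + 1
freeVars (ExtendEq g _ _) = freeVars g
```

For example, `freeVars (ExtendEq (ExtendVar (ExtendVar Empty)) (Prod (Var 0) (Var 1)) (Prod (Var 1) (Var 0)))` evaluates to `2`, matching the count worked out above.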
Finally, an equational identity in a context $\Gamma$ in the equational theory of a group is any pair of terms in $\Gamma$ that may be obtained in any of the following ways.
- For every context $\Gamma$ and every equation $a = b$ used in constructing $\Gamma$ (that is, $\Gamma$ is the extension of some context by the equation of $a$ and $b$, possibly followed by further extensions), we have $\Gamma \vdash a = b$, by hypothesis at $a$ and $b$.
- For every context $\Gamma$ and term $a$ in $\Gamma$, we have the equational identity $\Gamma \vdash a = a$, by reflexivity at $a$.
- For every context $\Gamma$, terms $a$ and $b$ in $\Gamma$, and equational identity $\Gamma \vdash a = b$, we have $\Gamma \vdash b = a$, by symmetry at $a$ and $b$.
- For every $\Gamma \vdash a = b$ and $\Gamma \vdash b = c$, we have $\Gamma \vdash a = c$, by transitivity at $a$, $b$, and $c$.
- For every $\Gamma \vdash a = b$, we have $\Gamma \vdash a^{-1} = b^{-1}$, called substitution into an inverse at $a$ and $b$.
- For every $\Gamma \vdash a = b$ and $\Gamma \vdash c : G$, we have $\Gamma \vdash a \cdot c = b \cdot c$, called left substitution into a product at $a$, $b$, and $c$.
- For every $\Gamma \vdash a : G$ and $\Gamma \vdash b = c$, we have $\Gamma \vdash a \cdot b = a \cdot c$, called right substitution into a product at $a$, $b$, and $c$.
- For every term $a$ in a context $\Gamma$, we have $\Gamma \vdash e \cdot a = a$, called the left unit law at $a$.
- For every term $a$ in a context $\Gamma$, we have $\Gamma \vdash a \cdot e = a$, called the right unit law at $a$.
- For every term $a$ in a context $\Gamma$, we have $\Gamma \vdash a^{-1} \cdot a = e$, called the left inverse law at $a$.
- For every term $a$ in a context $\Gamma$, we have $\Gamma \vdash a \cdot a^{-1} = e$, called the right inverse law at $a$.
- For every $\Gamma \vdash a : G$, $\Gamma \vdash b : G$, and $\Gamma \vdash c : G$, we have $\Gamma \vdash (a \cdot b) \cdot c = a \cdot (b \cdot c)$, called associativity at $a$, $b$, and $c$.
That's it!
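Continuing the Haskell sketch from above (again my own gloss, with invented names): a derivation built from the rules just listed can itself be represented as a tree, together with a function reading off which pair of terms it proves. I repeat the `Term` type so that this block stands on its own, and I leave out the checking that the terms really live in the ambient context, that a `Hyp` really occurs among its equations, and that the middle terms of a `Trans` match.

```haskell
data Term = Var Int | Unit | Inv Term | Prod Term Term
  deriving (Show)

-- One constructor per way of obtaining an equational identity.
data Derivation
  = Hyp Term Term                -- a = b, provided that equation appears in the context
  | Refl Term                    -- a = a
  | Sym Derivation               -- from a = b, infer b = a
  | Trans Derivation Derivation  -- from a = b and b = c, infer a = c
  | InvSubst Derivation          -- from a = b, infer a^{-1} = b^{-1}
  | ProdSubstL Derivation Term   -- from a = b and term c, infer a.c = b.c
  | ProdSubstR Term Derivation   -- from term a and b = c, infer a.b = a.c
  | UnitL Term                   -- e.a = a
  | UnitR Term                   -- a.e = a
  | InvL Term                    -- a^{-1}.a = e
  | InvR Term                    -- a.a^{-1} = e
  | Assoc Term Term Term         -- (a.b).c = a.(b.c)
  deriving (Show)

-- The pair of terms that a derivation establishes as equal.
conclusion :: Derivation -> (Term, Term)
conclusion (Hyp a b)        = (a, b)
conclusion (Refl a)         = (a, a)
conclusion (Sym d)          = let (a, b) = conclusion d in (b, a)
conclusion (Trans d e)      = (fst (conclusion d), snd (conclusion e))
conclusion (InvSubst d)     = let (a, b) = conclusion d in (Inv a, Inv b)
conclusion (ProdSubstL d c) = let (a, b) = conclusion d in (Prod a c, Prod b c)
conclusion (ProdSubstR a d) = let (b, c) = conclusion d in (Prod a b, Prod a c)
conclusion (UnitL a)        = (Prod Unit a, a)
conclusion (UnitR a)        = (Prod a Unit, a)
conclusion (InvL a)         = (Prod (Inv a) a, Unit)
conclusion (InvR a)         = (Prod a (Inv a), Unit)
conclusion (Assoc a b c)    = (Prod (Prod a b) c, Prod a (Prod b c))
```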
There are many variations that we can introduce for the rules handling variables. Here I've adopted the practice of numbering the variables; as you can see if you start building up a few contexts using the first two rules, the variables are simply numbered in order. Thus a typical context is $x_0 : G, x_1 : G, x_0 \cdot x_1 = x_1 \cdot x_0, x_2 : G$, which has $3$ free variables. It is more common in practice to allow one to introduce arbitrary notation for the free variables; so this same context might be denoted $a : G, b : G, a \cdot b = b \cdot a, c : G$ or $x : G, y : G, x \cdot y = y \cdot x, z : G$ or $g : G, h : G, g \cdot h = h \cdot g, k : G$, etc. It is possible to write down a formal definition that takes this into account and avoids the counting, but it is technically more complicated, because you have to keep track of which variable names are taken. (In particular, $x : G, y : G, x : G$ is not allowed!) So for my purposes, all of the actual variable names are of the form ‘$x_n$’ and anything else is an informal notation.
There are standard ways of symbolically stating all or most of the rules above. I'm not sure about how to put the rules for forming contexts or counting the number of free variables in a context; people would probably prefer to just say in words that a context is a list of variable declarations and equational hypotheses, leaving it to the reader to realise that you have to keep track of how many variables appear (or which variables appear, without repeating any) and that you can only insert an equation that goes between terms that are valid in the context so far. But the other rules may be written like this:
$$\frac{i \lt |\Gamma|}{\Gamma \vdash x_i : G} \qquad \frac{}{\Gamma \vdash e : G} \qquad \frac{\Gamma \vdash a : G}{\Gamma \vdash a^{-1} : G} \qquad \frac{\Gamma \vdash a : G \quad \Gamma \vdash b : G}{\Gamma \vdash (a \cdot b) : G}$$
$$\frac{(a = b) \;\text{appears in}\; \Gamma}{\Gamma \vdash a = b} \qquad \frac{\Gamma \vdash a : G}{\Gamma \vdash a = a} \qquad \frac{\Gamma \vdash a = b}{\Gamma \vdash b = a} \qquad \frac{\Gamma \vdash a = b \quad \Gamma \vdash b = c}{\Gamma \vdash a = c}$$
$$\frac{\Gamma \vdash a = b}{\Gamma \vdash a^{-1} = b^{-1}} \qquad \frac{\Gamma \vdash a = b \quad \Gamma \vdash c : G}{\Gamma \vdash a \cdot c = b \cdot c} \qquad \frac{\Gamma \vdash a : G \quad \Gamma \vdash b = c}{\Gamma \vdash a \cdot b = a \cdot c}$$
$$\frac{\Gamma \vdash a : G}{\Gamma \vdash e \cdot a = a} \qquad \frac{\Gamma \vdash a : G}{\Gamma \vdash a \cdot e = a} \qquad \frac{\Gamma \vdash a : G}{\Gamma \vdash a^{-1} \cdot a = e} \qquad \frac{\Gamma \vdash a : G}{\Gamma \vdash a \cdot a^{-1} = e}$$
$$\frac{\Gamma \vdash a : G \quad \Gamma \vdash b : G \quad \Gamma \vdash c : G}{\Gamma \vdash (a \cdot b) \cdot c = a \cdot (b \cdot c)}$$
The idea is that if you have everything above a line, then you get the judgement below that line.
If you play around with this, you should find that you can start proving theorems in the equational theory of a group. For example, you can prove that the identity element is its own inverse as follows:
- As always, $()$ is a context (the empty context).
- As in any context, we have the term $\vdash e : G$ (remember that this abbreviates $() \vdash e : G$).
- Continuing, we have its inverse, the term $\vdash e^{-1} : G$, and their product, the term $\vdash e^{-1} \cdot e : G$.
- By the right unit law at $e^{-1}$, we have $\vdash e^{-1} \cdot e = e^{-1}$.
- By symmetry at $e^{-1} \cdot e$ and $e^{-1}$, we have $\vdash e^{-1} = e^{-1} \cdot e$.
- By the left inverse law at $e$, we have $\vdash e^{-1} \cdot e = e$.
- Finally, by transitivity at $e^{-1}$, $e^{-1} \cdot e$, and $e$, we have $\vdash e^{-1} = e$.
This formalises the equational proof
$$e^{-1} = e^{-1} \cdot e = e ,$$
only noting explicitly where the properties of equality and the group axioms are used.
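If you prefer, the same derivation can be drawn as a single tree in the rule notation above (here I suppress the premises that merely assert that the various terms are well formed, and label each bar with the rule used):
$$\frac{\dfrac{\dfrac{}{\vdash e^{-1} \cdot e = e^{-1}}\;\text{(right unit)}}{\vdash e^{-1} = e^{-1} \cdot e}\;\text{(symmetry)} \qquad \dfrac{}{\vdash e^{-1} \cdot e = e}\;\text{(left inverse)}}{\vdash e^{-1} = e}\;\text{(transitivity)}$$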
I carried out this proof in the empty context, but I could just as well have done it in an arbitrary context $\Gamma$. Sometimes it's nice to do that explicitly; just write ‘$\Gamma$’ before every ‘$\vdash$’ in the explicit proof above. But also, there is a metatheorem that any equational identity in the empty context is valid in every context; more generally, any equational identity valid in some context $\Gamma$ is also valid in any context that is built from $\Gamma$ by any number of extensions. This metatheorem can be proved by structural induction on the ways in which equational identities (and hence terms and contexts) are constructed.
Here is another theorem, that the inverse of a product is the product of the inverses in the opposite order. I will formalise this argument:
$$\begin{aligned}
(a \cdot b)^{-1} &= e \cdot (a \cdot b)^{-1} \\
&= (b^{-1} \cdot b) \cdot (a \cdot b)^{-1} \\
&= ((b^{-1} \cdot e) \cdot b) \cdot (a \cdot b)^{-1} \\
&= ((b^{-1} \cdot (a^{-1} \cdot a)) \cdot b) \cdot (a \cdot b)^{-1} \\
&= (((b^{-1} \cdot a^{-1}) \cdot a) \cdot b) \cdot (a \cdot b)^{-1} \\
&= ((b^{-1} \cdot a^{-1}) \cdot (a \cdot b)) \cdot (a \cdot b)^{-1} \\
&= (b^{-1} \cdot a^{-1}) \cdot ((a \cdot b) \cdot (a \cdot b)^{-1}) \\
&= (b^{-1} \cdot a^{-1}) \cdot e \\
&= b^{-1} \cdot a^{-1} .
\end{aligned}$$
- As always, $()$ is a context.
- Extending it by a new variable twice, we have the context $x_0 : G, x_1 : G$; but I'll write this $a : G, b : G$ instead.
- Since $|a : G, b : G| = 2$, we have $0 \lt 2$ and $1 \lt 2$, so we have the terms $x_0$ and $x_1$ in the last context. But again, I'm writing these differently, so it comes out like this: $a : G, b : G \vdash a : G$ and $a : G, b : G \vdash b : G$.
- Now I can also build more terms:
- $a : G, b : G \vdash a^{-1} : G$;
- $a : G, b : G \vdash b^{-1} : G$;
- $a : G, b : G \vdash a \cdot b : G$;
- $a : G, b : G \vdash (a \cdot b)^{-1} : G$;
- etc, ultimately building all of the terms that appear in the long equation above.
- Then I start building the equational identities:
- $a : G, b : G \vdash e \cdot (a \cdot b)^{-1} = (a \cdot b)^{-1}$ by the left unit law;
- $a : G, b : G \vdash (a \cdot b) \cdot (a \cdot b)^{-1} = e$ by the right inverse law;
- $a : G, b : G \vdash (a \cdot b)^{-1} = e \cdot (a \cdot b)^{-1}$ by symmetry;
- $a : G, b : G \vdash (b^{-1} \cdot a^{-1}) \cdot ((a \cdot b) \cdot (a \cdot b)^{-1}) = (b^{-1} \cdot a^{-1}) \cdot e$ by right substitution into a product;
- etc, etc, etc, finishing with applications of transitivity.
But we could also formalise this by working in an arbitrary context $\Gamma$ and using any two arbitrary terms (not necessarily variables!) $a$ and $b$. Then the same argument works, only writing $\Gamma$ in place of the context $a : G, b : G$ (and skipping the beginning where I derive that $a$ and $b$ are terms in that context). Once again, there is a metatheorem that anything that can be proved one way can also be proved the other way.
Finally, consider the theorem that the inverse of a commutative product is the product of the inverses in the same order. That is, if $a \cdot b = b \cdot a$, then $(a \cdot b)^{-1} = a^{-1} \cdot b^{-1}$. The argument is
$$(a \cdot b)^{-1} = (b \cdot a)^{-1} = \cdots = a^{-1} \cdot b^{-1} ,$$
where the dots are filled in the same way as in the previous argument (only with $a$ and $b$ swapped). To formalise this, we work in the context $a : G, b : G, a \cdot b = b \cdot a$ (officially, $x_0 : G, x_1 : G, x_0 \cdot x_1 = x_1 \cdot x_0$); that is, our context now consists of two free variables and an equation stating that they commute. Alternatively, we formalise the argument in an arbitrary context $\Gamma$ with arbitrary terms $a$ and $b$, such that $\Gamma \vdash a \cdot b = b \cdot a$ is valid.
Of course, having just proved $(a \cdot b)^{-1} = b^{-1} \cdot a^{-1}$, it's silly to go through the entire proof of $(b \cdot a)^{-1} = a^{-1} \cdot b^{-1}$. Instead, we should use the previous result in the course of proving this new one. We actually cannot do so directly, since $(a \cdot b)^{-1} = b^{-1} \cdot a^{-1}$ is irrelevant; it's $(b \cdot a)^{-1} = a^{-1} \cdot b^{-1}$ that we need. However, suppose that we take the broader view, in which we have proved that $\Gamma \vdash (a \cdot b)^{-1} = b^{-1} \cdot a^{-1}$ is valid in any context $\Gamma$ and for any two terms $a$ and $b$ in that context. Then we are OK: if we take $\Gamma$ to be $a : G, b : G, a \cdot b = b \cdot a$, take $a$ to be $b$, and take $b$ to be $a$, then we have $\Gamma \vdash (b \cdot a)^{-1} = a^{-1} \cdot b^{-1}$, and a single application of transitivity gives us the new theorem.
In fact, we can think of the arbitrary-context versions of these three theorems as derived rules:
$$\frac{}{\Gamma \vdash e^{-1} = e} \qquad \frac{\Gamma \vdash a : G \quad \Gamma \vdash b : G}{\Gamma \vdash (a \cdot b)^{-1} = b^{-1} \cdot a^{-1}} \qquad \frac{\Gamma \vdash a : G \quad \Gamma \vdash b : G \quad \Gamma \vdash a \cdot b = b \cdot a}{\Gamma \vdash (a \cdot b)^{-1} = a^{-1} \cdot b^{-1}}$$
Then we use the second derived rule in the proof of the third; more generally, we can use a derived rule to shorten any later proofs.
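In the Haskell sketch (still my own gloss, reusing the invented constructor names from the earlier blocks, and with no claim that these are the shortest derivations), a derived rule is literally a function that manufactures a derivation; here is one possible rendering of the three derived rules, with the third reusing the second exactly as described above:

```haskell
-- Repeated from the earlier sketches, so that this block compiles on its own.
data Term = Var Int | Unit | Inv Term | Prod Term Term deriving (Show)
data Derivation
  = Hyp Term Term | Refl Term | Sym Derivation | Trans Derivation Derivation
  | InvSubst Derivation | ProdSubstL Derivation Term | ProdSubstR Term Derivation
  | UnitL Term | UnitR Term | InvL Term | InvR Term | Assoc Term Term Term
  deriving (Show)

-- First derived rule: a closed derivation of e^{-1} = e.
invUnitDeriv :: Derivation
invUnitDeriv = Trans (Sym (UnitR (Inv Unit)))  -- e^{-1} = e^{-1}.e
                     (InvL Unit)               -- e^{-1}.e = e

-- Second derived rule: for any terms a and b, a derivation of
-- (a.b)^{-1} = b^{-1}.a^{-1}, following the chain displayed earlier.
invProdDeriv :: Term -> Term -> Derivation
invProdDeriv a b = foldr1 Trans
  [ Sym (UnitL ab')                                               -- (a.b)^{-1} = e.(a.b)^{-1}
  , ProdSubstL (Sym (InvL b)) ab'                                 -- ... = (b^{-1}.b).(a.b)^{-1}
  , ProdSubstL (ProdSubstL (Sym (UnitR b')) b) ab'                -- ... = ((b^{-1}.e).b).(a.b)^{-1}
  , ProdSubstL (ProdSubstL (ProdSubstR b' (Sym (InvL a))) b) ab'  -- ... = ((b^{-1}.(a^{-1}.a)).b).(a.b)^{-1}
  , ProdSubstL (ProdSubstL (Sym (Assoc b' a' a)) b) ab'           -- ... = (((b^{-1}.a^{-1}).a).b).(a.b)^{-1}
  , ProdSubstL (Assoc ba' a b) ab'                                -- ... = ((b^{-1}.a^{-1}).(a.b)).(a.b)^{-1}
  , Assoc ba' (Prod a b) ab'                                      -- ... = (b^{-1}.a^{-1}).((a.b).(a.b)^{-1})
  , ProdSubstR ba' (InvR (Prod a b))                              -- ... = (b^{-1}.a^{-1}).e
  , UnitR ba'                                                     -- ... = b^{-1}.a^{-1}
  ]
  where
    a'  = Inv a
    b'  = Inv b
    ab' = Inv (Prod a b)
    ba' = Prod b' a'

-- Third derived rule: given a derivation of a.b = b.a, reuse the second rule
-- at b and a, and spend a single transitivity, exactly as in the text.
invCommProdDeriv :: Term -> Term -> Derivation -> Derivation
invCommProdDeriv a b comm = Trans (InvSubst comm) (invProdDeriv b a)
```

Here `invCommProdDeriv a b` expects, as its third argument, a derivation of $a \cdot b = b \cdot a$ (for instance, a use of the hypothesis rule in the context above), so it plays exactly the role of the third derived rule.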
Well, this is all very long and tedious, but that is not really the fault of type theory as such, but rather of formal logic. Any way of making completely formal the equational reasoning of elementary group theory would be just as long and tedious. One reason that people don't ever write down anything so long and tedious when describing group theory is that they take basic logic for granted. Here I have spelt out that logic — and it is only a very weak logic! (In my next post, I hope to go beyond the equational theory of a group to describe the first-order theory of a group, spelling out a more powerful logic that allows us to state things such as ‘Only the identity equals its inverse.’ as a hypothesis and prove consequences of it. You can't do that in the equational theory.)
The benefit of taking logic for granted is obvious. But the benefit of spelling it out in so much detail is flexibility. The equational theory of a group can be interpreted in any category with finite products, after all, while first-order logic requires more. And you can even take completely different approaches to logic, as with propositions as types or Mike's -logic.
Re: Stack Semantics
I am intrigued by the role the self-indexing stack (on slide 30) plays.
Do you see a role for the under-self-indexing? Elsewhere the claim is that a tremendous amount of interesting geometric information is encoded in the objectwise stabilized/abelianized self-under-indexing of a higher “category of spaces”. This must have a logical counterpart, I suppose.