Skip to the Main Content

Note:These pages make extensive use of the latest XHTML and CSS Standards. They ought to look great in any standards-compliant modern browser. Unfortunately, they will probably look horrible in older browsers, like Netscape 4.x and IE 4.x. Moreover, many posts use MathML, which is, currently only supported in Mozilla. My best suggestion (and you will thank me when surfing an ever-increasing number of sites on the web which have been crafted to use the new standards) is to upgrade to the latest version of your browser. If that's not possible, consider moving to the Standards-compliant and open-source Mozilla browser.

October 19, 2009

Syntax, Semantics, and Structuralism, I

Posted by Mike Shulman

Sparked by Arnold Neumaier’s work on “a computer-aided system for real mathematics,” we had a very stimulating discussion, mostly about various different foundations for mathematics. But that thread started getting rather long, and John pointed out that people without some experience in formal logic might be feeling a bit lost. In this post I’ll pontificate a bit about logic, type theory, and foundations, which have also come up recently in another thread. Hopefully this too-brief summary of background will make the discussion in the next post about the meaning of structuralism more readable. (Some of this post is intended as a response to comments on the previous discussions, but I probably won’t carefully seek them out and make links.)

At the most basic level, what we do when we do mathematics is manipulate symbols according to specified rules. Just as in chess the rules state that a knight moves like so and not like so, in mathematics the rules state that a quantifier can be eliminated like so and not like so. (The opinion that this is all there is to mathematics is called formalism, but I’m not espousing that. You are free to believe that these symbols have whatever meaning you want, but even the most Platonist mathematician does mathematics by manipulating symbols on paper.)

Now, the actual rules of the game of mathematics are extremely complicated. But the idea of foundations is to derive these complicated rules from a much simpler list of fundamental rules. I think many people would agree that it’s reasonable to take these fundamental rules to be some sort of calculus of terms. That is, the game is played with sequences of symbols called terms, and the rules specify what terms are valid and what assertions (called judgments) can be made about them. Here are some sample terms:

  • x 2y 2x^2-y^2
  • 0 e x 2dx\int_0^\infty e^{-x^2}\;dx
  • 𝒫(𝒫())\mathcal{P}(\mathcal{P}(\mathbb{R}))
  • n3.x.y.z.x n+y nz n\forall n\ge 3. \forall x.\forall y. \forall z. x^n + y^n \neq z^n

Now many people (myself included) believe that there is more than one fundamentally different type of term. For instance, 0 e x 2dx\int_0^\infty e^{-x^2}\;dx is a real number, 𝒫(𝒫())\mathcal{P}(\mathcal{P}(\mathbb{R})) is a set, and n3.x.y.z.x n+y nz n\forall n\ge 3. \forall x.\forall y. \forall z. x^n + y^n \neq z^n is a claim or a theorem, and it’s illogical to confuse them (and makes it harder for a computer to guess what you meant). Thus, the rules should define one or more types, and one of the judgments should be of the form “tt is a well-formed term of type AA,” written as t:At:A. But if you don’t believe in types, you can just use a version of these rules where there is only one type, call it (say) ThingThing, and read t:Thingt:Thing as merely “tt is a well-formed term.”

In this way we can build many different “type theories”. Most familiar ones include a special type called PropProp of “propositions”, and a special type of judgment which asserts that a given proposition (i.e. a term of type PropProp) is “true.” We generally also include “logical inference” rules for constructing complex propositions (e.g. if φ:Prop\varphi:Prop then (x:A.φ):Prop(\forall x:A. \varphi) :Prop) and deducing the truth of propositions (e.g. if φ\varphi and ψ\psi then also φψ\varphi\wedge\psi). We may also include “axioms”, i.e. propositions φ\varphi such that the judgment “φ\varphi is true” can be made unconditionally. For example, in Peano arithmetic, there are two types NN and PropProp, a constant term 0:N0:N, a rule that if n:Nn:N then s(n):Ns(n):N, a rule that if n:Nn:N and m:Nm:N then (n=m):Prop(n=m):Prop, the usual rules of logical inference, and various axioms.

Now if we want a foundation for all of mathematics, of course we need a more powerful theory than these. But ZFC can also be stated in this style: it has two types SetSet and PropProp, rules saying that if x:Setx:Set and y:Sety:Set then (x=y):Prop(x=y):Prop and (xy):Prop(x\in y):Prop, the usual rules of logical inference and a collection of axioms. The same with ETCS, which it is convenient to write with three types SetSet, FunctionFunction, and PropProp. Especially when we intend a theory like ZFC or ETCS as a foundation for all of mathematics, it is convenient to call the type-theoretic language in which these theories are written the “meta-language” or “meta-theory,” while ZFC/ETCS is the “object language” or “object theory.”

Things get a bit confusing because there are various ways to augment type theories. It’s common to include type constructors, which allow us to form new types such as A×BA\times B out of types AA and BB. Then we need an additional judgment T:TypeT:Type, so that such a type-constructor is a rule saying that (for example) if A:TypeA:Type and B:TypeB:Type then (A×B):Type(A\times B):Type. We can also have type-constructors which take terms as input, in addition to types; these are called dependent types. When we include exponential or dependent-product types, the complexity and power of the type-theoretic meta-language begins to approach that of the object languages ZFC and ETCS, enabling it itself to serve as a foundation for mathematics. That is, instead of defining a group to be a set equipped with (among other things) a function G×GGG\times G\to G, we could interpret a group as a type GG equipped with (among other things) a term m(x,y):Gm(x,y):G with free variables x:Gx:G and y:Gy:G. Or, which amounts to the same thing, we could change our terminology so that what we have been calling “types” are instead called “sets.” In fact, as Toby has pointed out, it is natural to say that a type theory (especially one with many type-constructors) is itself the object-theory in a meta-meta-theory having meta-types (or “kinds”) such as TypeType, TermTerm, and JudgmentJudgment.

The point is that words like “type” and “set” and “class” are really quite fungible. This sort of level-switch is especially important when we want to study the mathematics of type theory, i.e. the mathematical theory of manipulating symbols according to the rules of type theory, analogous to the mathematical theory of moving pieces around on a chessboard according to the usual rules. When we study type theory in this way, we are doing mathematics, just like when we’re doing group theory or ring theory or whatever. It’s just that our objects of study are called “types”, “terms”, and so on. However, what we do in this mathematical theory can, like any other area of mathematics, be formalized in any particular chosen foundation, be it ZFC or ETCS or a type theory at a higher level. Now the type theory is itself the “object-theory” and ZFC is the “meta-theory”!

That was syntax, the rules of the game; semantics is about its “meaning.” Intuitively, we generally think of a type AA as denoting some “collection” of “things”, and a term t:At:A as indicating a “particular one” of those things. In order for this to make sense, the type theory has to exist in some metatheory (which might or might not be formalized) having a notion of “set” to specify the relevant “collections of things”. In particular, there must be a set of types, and for each type there is a set of terms which can be judged to be of that type. The judgment rules for propositions then become the study of formal logic; we say that a proposition is “provable” or is a “theorem” if it can be judged to be true.

Now, a model of this theory assigns a set [A][A] (in the meta-theoretic sense) to every type AA and a function of appropriate arity to every term, in a way so that the rules and axioms are satisfied. Thus, for instance, a model of Peano arithmetic consists of a set [N][N], an element [0][N][0]\in [N], a function [s]:[N][N][s]\colon [N]\to[N], and so on. Likewise, a model of the type theory of ZFC (here the levels get confusing) consists of a set [Set][Set], a function []:[Set]×[Set][Prop][{\in}]\colon [Set]\times [Set] \to [Prop], and so on.

One can then prove, under certain hypotheses, various things about the relationship between syntax and semantics, such as:

  • The Soundness Theorem: if φ\varphi is a proposition which is provable from the axioms of a theory, then the corresponding statement [φ][\varphi] in any model is actually true (in the sense of the metatheory). Equivalently, if a theory has at least one model, then it doesn’t prove a contradiction.
  • The Completeness Theorem: if [φ][\varphi] is true in every model of a theory, then φ\varphi is provable in that theory. Equivalently, if a theory doesn’t prove a contradiction, then it has at least one model.
  • The (first) Incompleteness Theorem: if a theory doesn’t prove a contradiction, then there exist statements φ\varphi such that neither φ\varphi nor ¬φ\not\varphi is provable in the theory.
  • Corollary to the completeness and incompleteness theorems: if a theory doesn’t prove a contradiction, then it has more than one model.

The “certain hypotheses” is where we get into the difference between first-order and higher-order. We say that a type theory is higher-order if it involves type constructors such as function-types B AB^A (intended to represent the “type of all functions ABA\to B”) or power-types PAP A (intended to represent the “type of all subtypes of AA). Otherwise it is first-order. (We have to deal with PropProp specially in first-order logic. If we actually have a type PropProp, then the theory should be higher-order, since PropP1Prop \cong P 1; thus in first-order logic we take PropProp to be a “kind” on the same level as TypeType, which doesn’t participate in type operations.) We say “second-order” if we never iterate the power-type operation.

The Soundness Theorem is true for all theories, but the Completeness Theorem is true only for first-order theories. I believe that the Incompleteness Theorem as I have stated it is true for higher-order theories (if I’m wrong, someone please correct me), but the corollary fails since the completeness theorem does. In particular, a higher-order theory can sometimes be categorical in the logician’s sense: having exactly one model (at least, up to isomorphism). The second-order version of Peano Arithmetic has this property.

Now at the level we’re talking about, it seems that there is no fundamental difference between first-order and higher-order theories; they each have advantages and disadvantages. However, when we move up to the metalevel and talk about the term calculus itself, we always get a first-order theory. This is what I was trying to get at when I said elsewhere that higher-order logic has “no foundational significance”. The point is that what we do when we do mathematics is manipulate symbols on paper, and that is a first-order notion.

In particular, higher-order logic doesn’t allow you to escape any of the philosophical consequences of the Incompleteness Theorem. You are free to believe in a Platonic collection of “actual” or “standard” natural numbers. (I don’t really, myself, but I can’t argue you out of such an essentially metaphysical belief.) Now by the corollary to the Incompleteness Theorem, first-order Peano Arithmetic can’t capture those natural numbers uniquely; there will always be alternate models containing “nonstandard” natural numbers. By contrast, in second-order logic, the second-order versions of the Peano axioms can be shown to have a unique model. However, this second-order metatheory involves basic notions like “power-types” or “power-sets,” so really what you’ve done with this proof is to explain the notion of “natural number” in terms of the notion of “set,” which is (I think most people would agree) far more ontologically slippery! And indeed, when your second-order metatheory is interpreted in some meta-meta-theory, it will have lots of different models, each of which has its own “unique” natural numbers. You are, of course, also free to believe in a Platonic notion of “sets,” but first-order axioms such as ZF can’t characterize those either. There may be a “second-order” version of ZF which uniquely characterizes a model, but you’ve just explained the notion of “set” in terms of the notion of “class,” which is not much of an improvement. The initial ZF-algebra in a category of classes is a similar idea.

To put it another way, suppose that you have some Platonic universe of numbers, sets, and so on in mind, but I’m an alien from another dimension who has a different such Platonic universe in mind. If I persist in interpreting all of your words like “number,” “set,” and so on as referring to things in my Platonic universe, there’s no way you can convince me that I’m wrong, no matter whether the logic in which we speak is first-order or higher-order.

Now there’s another way to build a “canonical” model of a theory, which is how one usually proves the Completeness Theorem: we make a “tautological” model out of the theory itself. That is, for each type AA we simply take the set [A][A] to be the set of terms of type AA with no free variables (or “ground terms”). Without modification, this naive idea fails for two reasons.

First of all, there might not be enough ground terms. Some of the axioms of the theory might assert that there exists something with some property, without there being a corresponding term constructor actually producing something with that property. This is obviously the case for the usual version of ZFC, which has no term constructors at all (hence no ground terms at all!) but lots of axioms that assert the existence of things. This problem is easily remedied, however, by introducing new constant terms or term constructors into the language.

The second problem is that we may not know how to define all the necessary relations on the ground terms in order to have a model. Suppose, for instance, we have a couple of ground terms t 1t_1 and t 2t_2 in some augmented version of ZFC; how can we tell whether t 1t 2t_1\in t_2 should hold in our tautological model? Certainly if the axioms of the theory imply t 1t 2t_1\in t_2, then it should hold, and if they imply t 1t 2t_1\notin t_2, then it shouldn’t, but they might not imply either one. The usual way to remedy this is to enumerate all such statements and one by one decide arbitrarily whether to make them true or false in the model we’re constructing.

This works, but the model we get (though small, even countable, and concrete) isn’t really canonical; we had to make a bunch of arbitrary choices. In the case of Peano Arithmetic, we can avoid introducing new constant terms and obtain a model which is “canonical” and in fact the “smallest” in some sense: it consists of the terms s(s((s(0))))s(s(\dots(s(0))\dots)), which can of course be identified with “the” natural numbers in the meta-theory. (Note that talking about “the set of terms” always depends on the meta-theory having something that it calls the natural numbers, so that terms can be defined inductively.) But I don’t think this is true for many other theories. Suppose, for instance, that we augment ZF with term constructors for all of its existence axioms. Let φ\varphi be a sentence independent of ZF; then our term-constructor for the axiom of separation gives us a term {|φ}\{\emptyset | \varphi\}. Does the relation {|φ}\emptyset \in \{\emptyset | \varphi\} hold in the term model? We have to make an arbitrary choice.

There’s a slicker categorial approach which does produce a really canonical model, but only with an expanded notion of “model”: instead of each [A][A] being a set, we take it to be an object of some fixed category 𝒮\mathcal{S} with enough structure. We can then build a much more “tautological” model because we have the freedom to build the category 𝒮\mathcal{S} along with the model. In the resulting model, the true statements are precisely the statements provable in the theory, and it’s even initial among all models of the theory in the appropriate sort of category.

In conclusion:

  • Types, sets, classes and so on are fungible.
  • Model theory only makes sense in some meta-theory having a notion of “set.”
  • Models for first-order theories are not unique. Models for higher-order theories can be unique relative to a fixed model of the meta-theory, but that meta-theory will have multiple models.
  • Models are also generally not even canonical, unless you expand the notion of “model” to categories other than SetSet.
Posted at October 19, 2009 8:55 PM UTC

TrackBack URL for this Entry:   https://golem.ph.utexas.edu/cgi-bin/MT-3.0/dxy-tb.fcgi/2089

196 Comments & 2 Trackbacks

Re: Syntax, semantics, and structuralism, I

MS: In particular, higher-order logic doesn’t allow you to escape any of the philosophical consequences of the Incompleteness Theorem.

The idea that Realists resort to HOL to escape the PC of IT is a new one to me. After all, IT was put forth by a rank Platonist of the 1st rank, the point being that any world whose Reality amounts to a hill of beings, physical, mathematical, or otherwise, is just bound to exceed our grasp — or what’s a heaven for, Plato’s or otherwise?

Posted by: Jon Awbrey on October 19, 2009 10:40 PM | Permalink | Reply to this

Re: Syntax, semantics, and structuralism, I

A lot to think about here! I just wanted to make one comment, about this remark:

Now at the level we’re talking about, it seems that there is no fundamental difference between first-order and higher-order theories; they each have advantages and disadvantages. However, when we move up to the metalevel and talk about the term calculus itself, we always get a first-order theory. This is what I was trying to get at when I said elsewhere that higher-order logic has “no foundational significance”. The point is that what we do when we do mathematics is manipulate symbols on paper, and that is a first-order notion.

I disagree here. It is often a convenient metaphor to think of proofs/syntax as higher-order objects, as done in infinitary proof theory. The best-known example is the so-called “ω\omega-rule” (which allows you to conclude x.A(x)\forall x.A(x) given proofs of A(n)A(n) for every natural number nn, i.e., given a dependent product Πn:N.pf(A[n/x])\Pi n:N.pf(A[n/x])), used by Schütte to simplify Gentzen’s proof of the consistency of arithmetic.

Of course, now there is a question about how to interpret these infinitary rules, which should be answered by formalizing a meta-meta-theory…and at some point, I agree, it all has to be explained with first-order stuff. But I don’t see a priori why the entire explanation can’t be simpler with more meta-levels.

Posted by: Noam Zeilberger on October 19, 2009 11:48 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Of course, now there is a question about how to interpret these infinitary rules, which should be answered by formalizing a meta-meta-theory…and at some point, I agree, it all has to be explained with first-order stuff.

That sounds to me like we don’t actually disagree.

But I don’t see a priori why the entire explanation can’t be simpler with more meta-levels.

I didn’t mean to say that adding meta-levels isn’t sometimes be a helpful thing to do, for lots of reasons. I was just saying that for philosophical purposes, everything eventually boils down to first-order.

Posted by: Mike Shulman on October 20, 2009 1:35 AM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Mike wrote:

… even the most Platonist mathematician does mathematics by manipulating symbols on paper.

Since I agree with almost everything you wrote, I have to pick on this one tiny digression to find anything with which I take issue.

I do a lot of mathematical work by thinking and visualizing things, not manipulating symbols on paper. If I were a Platonist of a certain ilk, I’d say this was a direct connection to the realm of Ideas.

Unfortunately I don’t seem able to use this direct connection to figure out whether Diophantine equations have solutions in cases where the existence of these solutions is independent from ZFC.

Posted by: John Baez on October 20, 2009 1:05 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

I do a lot of mathematical work by thinking and visualizing things, not manipulating symbols on paper.

Fair enough; so do I, and I expect so do many people. So the phrase “does mathematics” was probably an overbroad choice of words. What about if I said instead that the portion of “doing mathematics” which we have a hope of studying mathematically, and hence of writing down foundations for, consists of manipulating symbols on paper?

Posted by: Mike Shulman on October 20, 2009 1:37 AM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Hi Mike!

You might ask David Corfield what he thinks about these claims!

This is the point I was going to contest as well. Lakatos is probably one of the first philosophers of mathematics to actively fight back against the symbol-manipulation model of “doing mathematics”.

“Doing mathematics” is obviously something that happens in the physical world, and not in an idealized mathematical world. Just like other aspects of the physical world, we can clearly come up with very good mathematical models of it. Frege/Gödel/Tarski give a very nice and appealing model that basically comes down to proof theory, and this looks like a slight generalization of that.

Unfortunately, that model leaves certain things (axioms) as boundary conditions that need to be specified. What would be nice is a model that can help explain where these boundary conditions come from as well. I suspect that such a model will talk about things like reasoning by analogy, induction (in the scientific sense), inference to the best explanation, and other sorts of non-deductive reasoning procedures. But of course, I don’t have a better model to propose than the formal-proof type model, despite the flaws in that model. But perhaps there could be a hope of mathematically studying more of “doing mathematics” than just this sort of (deductively valid) manipulation of symbols on paper, just as it turned out there was a hope of mathematically studying more of classical mechanics than just frictionless planes and massless pulleys and the like.

Posted by: Kenny Easwaran on October 20, 2009 9:17 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Well, as I’ve said in several places, Lakatos’s opposition between the formal and non-formal is far too simplistic, but he pointed us in some useful directions. He would have been delighted that so much mathematical conversation is taking place these days out in the open. Being here is like living in a Lakatosian dialogue.

As for the mathematical study of more aspects of mathematics than symbol manipulation, it does seem to be much harder. Bayesian reconstructions of how your strength of belief in a conjecture changes as new evidence emerges doesn’t take you too far.

About the choice of axioms as boundary condition that Kenny mentions, one strong point in favour of category theory for me is its ability to relate systematically different such choices. This discussion – ‘Group = Hopf Algebra’ – is a good example. But there’s nothing quite to compare with Lawvere’s papers, e.g., the one mentioned here, or Metric spaces, generalized logic and closed categories, or Taking categories seriously.

Posted by: David Corfield on October 20, 2009 10:24 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Maybe what I really mean is “the portion of ‘doing mathematics’ which we take into account when evaluating its correctness consists of manipulating symbols on paper.” Maybe that’s not quite right either. I agree that it would be very interesting to come up with a mathematical theory of other parts of ‘doing mathematics’; but that’s not really what the enterprise of foundations is about.

This reminds me of the old joke that category theory is not a foundation for mathematics but instead a ‘co-foundation’, since it describes things from the ‘top down’ rather than the ‘bottom up’.

Posted by: Mike Shulman on October 20, 2009 6:22 PM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Perhaps the word is ‘roof’?

Posted by: John Baez on October 20, 2009 6:45 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Co-foundation?

Too close to con-foundation for comfort, if you ask me.

Seeing as how we always begin in medias res, I think a “medium” would serve the analysis of the practical situation a little better than any foundation.

Insert your own joke about the Mathematical Séances here.

Posted by: Jon Awbrey on October 20, 2009 6:56 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

More generally speaking, we operate on physical symbol systems — yet another purloin of Peirce that AI “re-discovered” from hints on the mantel.

More insightfully speaking, we are a physical symbol system, with all the constraints on channel capacity that implies.

I think most physicists probably comprehend that models come before theories, and I think most mathematicians thought that way until recent times, when model theorists detached themselves from reality and sought to recreate the world from syntax.

If you comprehend that the original models are generally more concrete than the theories thereof, then it’s clear why theories under-determine models — there has been a loss of information in forming a theory.

Posted by: Jon Awbrey on October 20, 2009 1:56 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

when model theorists detached themselves from reality and sought to recreate the world from syntax

I really couldn’t say what precisely you have in mind here, but it makes me wonder whether you have any idea of the profound mathematical advances actual model theorists have made. Why on earth do you impugn model theorists (actual model theorists, not caricatures) of all people?

Posted by: Todd Trimble on October 20, 2009 3:07 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

That was a too terse remark that I dashed of while hurrying to an appointment.

I learned what little formal model theory I know from Chang and Keisler (1973) and what I fancy to be the general spirit of model theory is basic to the way I see the whole task crafting computational support for the practice of mathematical inquiry.

So I’m not really talking about “model theorists” as math practitioners, or the things that we learn by pursuing their particular forms of abstraction. I take that much as understood.

I am talking about a tendency to forget the bearing of abstractions on the things from which they are abstracted. That is a general tendency in mathematics — there are reasons for it and reasons for being wary of it.

But the object case at hand arises with the over-emphasis on term models. I can remember when I was a bit too fascinated myself with the models that we generate out of syntax alone, and I didn’t know why others set them aside as being something other than “real models”.

I think I understand that better now. Term models, and artificial models in general, can be nice for establishing consistency if nothing else comes to mind, but that amounts to a very small consistency if you can’t connect them with anything outside their own realm.

Posted by: Jon Awbrey on October 20, 2009 4:08 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

George Boole (father of symbolic logic, grandfather of computer science, great grandfather of AI) wrote about laws of thought when Peirce was only 15 years old so the logical conclusion is that Peirce purloined Boole’s original idea regarding signs, if your charge about AI is true.

“In Boole’s opinion, the mental processes that underlie reasoning are made manifest in the way in which we use signs. Algebra and natural language are systems of signs, and so the study of the laws that the signs of these systems meet should allow us to arrive at the laws of reasoning.”

Posted by: Stephen Harris on October 20, 2009 9:44 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

“Surprisingly, given his standing in the academic community, Boole’s idea was either criticized or completely ignored by the majority of his peers. Luckily, American logician Charles Sanders Peirce was more open-minded.

Twelve years after Boole’s ‘Investigation’ was published, Peirce gave a brief speech describing Boole’s idea to the American Academy of Arts and Sciences - and then spent more than 20 years modifying and expanding it, realizing the potential for use in electronic circuitry and eventually designing a fundamental electrical logic circuit.

Peirce never actually built his theoretical logic circuit, being himself more of a logician than an electrician, but he did introduce Boolean algebra into his university logic philosophy courses.

Eventually, one bright student - Claude Shannon - picked up the idea and ran with it.”

Posted by: Stephen Harris on October 20, 2009 9:54 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Now, Stephen, you should know by now I always use that quaint old word “purloin” in the same humorous vein that rhymes with MacLane.

Posted by: Jon Awbrey on October 20, 2009 12:22 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

“I do a lot of mathematical work by thinking and visualizing things, not manipulating symbols on paper. If I were a Platonist of a certain ilk, I’d say this was a direct connection to the realm of Ideas. ” Humm,but sometimes the “thinking ” process in mathematics (specially in algebra )involves symbols manipulation.Isn`t it ?
Here is an interesting paper on the issue:http://www.stanford.edu/~kdevlin/Berlin06.pdf

Regards

Posted by: Serifo on October 20, 2009 8:18 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I


From review of “The Birth of Model Theory”

What are model theoretic methods? Model theory is the activity of a
‘self- conscious’ mathematician. This mathematician distinguishes an object language(syntax) and a class of structures for this language and ‘definable’ subsets of those structures (semantics). Semantics provides an interpretation of inscriptions in the formal language in the appropriate structures. At its most basic level this allows the recognition that syntactic transformations can clarify the description of the same set of numbers.

Posted by: Stephen Harris on October 20, 2009 9:03 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

JB wrote: Unfortunately I don’t seem able to use this direct connection to figure out whether Diophantine equations have solutions in cases where the existence of these solutions is independent from ZFC.

When you speak of solutions, those are concrete things that your computer computes for you using the integers. Would your computer find different solutions is you removed Choice or used PA rather than ZF. Aren’t all the solutions independent of ZFC? You have a very capable colleague at UCR and I think he uses the term “grade-school” (+/-) arithmetic.

Almost the first consequence of formalizing computation was that we can formally establish its limits. Kurt Godel showed that the process of proof in any formal axiomatic system of logic can be simulated by the basic arithmetical functions that computation is made of. Then he proved that any sound formal system that is capable of stating the grade-school rules of arithmetic can make statements that can be neither be proved nor disproved in the system. Put another way, every sound formal system is incomplete in the sense that there are mathematical truths that cannot be proved in the system. Turing realized that Godel’s basic method could be applied to computational models themselves, and thus proved the first computational unsolvability results.
Since then, problems from many areas, including group theory, number theory, combinatorics, set theory, logic, cellular automata, dynamical systems, topology, and knot theory, have been shown to be unsolvable. In fact, proving unsolvability is now an accepted “solution” to a problem. It is just a way of saying that the problem is too general for a computer to handle—that supplementary information is needed to enable a mechanical solution.

Posted by: Stephen Harris on October 27, 2009 6:15 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

OK. I think I know what’s going on here. Please tell me if I’m crazy or if I’ve misunderstood (yet another) something.

You’re telling me that you can write down a Diophantine equation where the existence of solutions is independent of Peano Arithmetic.

I’m fine with this because I don’t care about Peano Arithmetic; I care about the Peano axioms in their full second-order glory. Because those are my integers, dammit. And besides, I want a Diophantine equation that’s independent of ZFC.

Now, you tell me that I can’t talk about the Peano Axioms outside of some set theory like ZFC which, if it isn’t inconsistent, has many models.

So, I’m happy to assume ZFC is consistent and pick a model. Now, we write down our Diophantine equation, but because we’ve chosen a model for ZFC our statement that was independent of ZFC has a definite truth value. Thus, the Diophantine equation either has a solution in the standard integers that I’ve grown rather attached to, or it doesn’t. But all this depends on the fact that are working in the given model. Thus, we can’t ever prove in ZFC that the Diophantine equation has a solution or not, but as soon as we pick a model there is a definite answer.

And we all lived happily ever after?

Posted by: Aaron Bergman on October 20, 2009 3:36 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

I think that’s all right, except maybe:

Thus, the Diophantine equation either has a solution in the standard integers that I’ve grown rather attached to, or it doesn’t.

That’s okay as long as you realize that “the standard integers” will be different in different models of ZFC.

Posted by: Mike Shulman on October 20, 2009 4:31 AM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

I’d agree with Michael, and I’d go slightly further: if you have an idea of what your integers are to start with, and are happy to use the second-order Peano axioms for them, then you must also be implicitly believing in your model of ZFC (in which they live) as well.

The tricky thing with all of this (one of the main sources of trickiness, at least), is that while the syntactic statements (“such-and-such is provable in XYZ”) are very cut-and-dried and finitist, any statements about models depend, often heavily, on the meta-theory we’re discussing models in, and generally such a meta-theory will have to be at least as complicated as the theory was itself.

This is part of what makes Platonism really hard to analyse: if we have the real integers, where do they live? In the real model of ZFC, or type theory? So where does that live, in turn?

Posted by: Peter LeFanu Lumsdaine on October 20, 2009 4:47 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Hi Peter,

I found Dedekind’s article “What are the numbers, and what should they be?” to be very helpful in clarifying my thinking on what Platonism means. In terms of actual math, what Platonists want is the ability to talk about “the” set of integers, and so on. That is, they want to be able to appeal to unique existence properties.

So the question is, can this style of argument be justified? What’s interesting about Dedekind’s article is that he tries to give a construction of the natural numbers in this style, and that a) his argument has a lot of intuitive sense to it, and b) yet it doesn’t actually work set-theoretically.

Basically, he shows some construction of the natural numbers is possible, and then he claims the process of abstraction lets you forget all the irrelevant structure and get “the” natural numbers. So you get existence by an explicit construction, with uniqueness following by a abstraction-as-forgetting procedure.

Of course, this doesn’t work set-theoretically, because you can use the membership operation to discover the internal structure of a construction. So his abstraction operation isn’t actually possible.

However, this can be made to work type-theoretically! Because terms have types, but not vice-versa, you can extend type theory with parametricity axioms, which lets you turn many categoricity results into *equality* results. For example, with parametricity you can show that any two terms of type “exists nat:type. Peano_axioms(nat)” are *equal*.

(Of course, you need some very fancy models of type theory to understand what this equality means set-theoretically – you’ll need relationally parametric PER models, or something like that, at which point equality is seen to imply contextual equivalence.)

The point is that this seems to give you a syntactic version of what Platonists want: the natural numbers are unique, and moreover this is a fact you can use without resorting to the metatheory (ie, leaving the internal language). And even in the metatheory, you will be able to prove that object-level-equal things are contextually equal in the metatheory, which exactly captures the idea of being unable to observe irrelevant structure, and hence of Dedekind’s abstraction operation.

So these days I tend to regard Platonism as a weird way of making a set of perfectly mathematically-natural demands upon a logic.

Posted by: Neel Krishnaswami on October 20, 2009 11:32 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

PLF: This is part of what makes Platonism really hard to analyse: if we have the real integers, where do they live? In the real model of ZFC, or type theory? So where does that live, in turn?

There is nothing in Platonic or Scholastic Realism that forces one to toe Kronecker’s line about the integers.

It is merely the Big Idea — to borrow yet another line from Peirce — that there are real generals in nature that persist in making their impression on us. In saying this Peirce uses the traditional definition of “real” that means “having properties”. What those properties are, exactly, is always subject to further inquiry.

Posted by: Jon Awbrey on October 20, 2009 5:18 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Yep, I’d agree! I wasn’t meaning to suggest that all Platonists would assert these existences, nor indeed that this a philosophical problem for Platonism in general. I meant it just as part of the reason why questions like Aaron’s above, about precisely how well our formal syntax and semantics reflect the Big Idea, often end in skepticism of frustration.

I wouldn’t go so far as to claim that compating Platonic and formal worlds of mathematics is a category error, but it does often lead to them. It’s not obvious how to compare a Platonically conceived object (eg “the standard integers we know and love”) with a formal one (“the integers as described by 2nd-order PA”) without putting both into some common universe (eg a model of ZFC) — but then, should that universe be Platonic or formal?

(This all seems curiously reminiscent of comparing theories of higher categories!)

Posted by: Peter LeFanu Lumsdaine on October 20, 2009 5:57 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

I wouldn’t go so far as to claim that compating Platonic and formal worlds of mathematics is a category error

Is ‘compating’ a portmanteau of ‘comparing’ and ‘conflating’? If so, I like it!

Posted by: Toby Bartels on October 20, 2009 9:47 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

compate, vt [MF, fr. ML compatibilis, lit., sympathetic, fr. LL compati] (21c) 1 : to render capable of existing or operating together in harmony theoriesdevices\langle \sim theories \rangle \; \langle \sim devices \rangle … see PAT.

Posted by: Jon Awbrey on October 20, 2009 10:10 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

compate

Thanks! Someday I'll remember that dictionaries exist …

Posted by: Toby Bartels on October 21, 2009 10:02 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

And if dictionary entries do not exist, then it is necessary to invent them.

Posted by: Jon Awbrey on October 22, 2009 1:46 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

What a nice self-referential word! Back in my New Age days I read “The Implicate Order” by David Bohm wherein he blends some nice Latin roots (maybe a few Greek?) to coin some new words for conceptual categories. From “Quantum Implications” essays,

In fact according to the emphasis that I would like to get across here, categorizing is simply a cognitive extension of the process of naming. It may seem to some readers that I have got this relationship back to front, but it is a feeling among contemporary semanticists that the classical idea that language is a portmanteau for meaning ‘la vision ferroviaire du langage,’ as Fauconnier so aptly names it, is an obstacle in semantics, in the same way that David Bohm has suggested that the mind-body distinction has inhibited the growth of scientific thought in general.

Posted by: Stephen Harris on October 22, 2009 10:22 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

PLFL: This is part of what makes Platonism really hard to analyse: if we have the real integers, where do they live? In the real model of ZFC, or type theory? So where does that live, in turn?

Let me go back to a facet of the question that I tiptoed past the first time, and try to answer it in light of your followup remarks.

Where do mathematical objects live?

Putting aside the dodgeful artistry about Plato’s Heaven, they live in Nature and they live in our systems of cultural practices, which a naturalist will regard as a “second nature”, part of the first, however well nurtured and guided by deliberate norms.

Viewing the scene through Peircean scopes, I see three worlds here — the world of signs catalyzing interactions amid the world of referent objects and the world of interpretant ideas.

From that point of view the missing term that should follow “syntax” and “semantics” is “pragmatics”, a reminder to put the interpreter back into the picture of how things mean anything at all to anyone.

None of that tells us whether the whole system of whole numbers is One or Many — it only indicates the dimensions of the full context of inquiry within which a question like that can have any hope of sense or settlement.

Posted by: Jon Awbrey on October 21, 2009 4:00 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Peter said

…while the syntactic statements (“such-and-such is provable in XYZ”) are very cut-and-dried and finitist, any statements about models depend, often heavily, on the meta-theory we’re discussing models in, and generally such a meta-theory will have to be at least as complicated as the theory was itself.

Does this tie in with discussions we’ve had elsewhere on the algebra-coalgebra split? Formal languages can often be represented as initial algebras for a term-formation construction. So take a set XX of ground terms and a set Σ\Sigma of term constructors, then the language is an initial algebra for the functor F:YX+Σ(Y)F: Y \to X + \Sigma (Y).

If models are duals, formed by mapping sentences of a language into SetSet, then we’d expect them to lie on the coalgebraic side.

What’s forced on us here? Can’t we have coalgebraically defined languages, where the destructors unpack the terms, potentially endlessly?

Posted by: David Corfield on October 20, 2009 6:00 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Can’t we have coalgebraically defined languages, where the destructors unpack the terms, potentially endlessly?

That’s a very interesting idea! I would be inclined to describe the relationship somewhat differently, though. It seems to me that a “model” is basically an algebra for the term-formation functor. Hence why the terms, elements of the initial algebra, form an (initial) “term model” (at least when there are only terms and no logic to mess things up). The “coterms” in the terminal coalgebra would also form a “coterm model” (since terminal coalgebras are also algebras), but also a “comodel,” i.e. a coalgebra for the term-formation functor. In a model, we can apply term-constructors to any suitable collection of elements to get a new element, whereas in a comodel, we can unpack any element into an application of a well-defined term-constructor and a suitable collection of elements.

Posted by: Mike Shulman on October 20, 2009 6:49 PM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Ghani et al. in Dualizing Initial Algebras take there to be two ways to dualize the term model, i.e. the initial algebra for the endofunctor based on a signature. For a signature Σ\Sigma there is a free endofunctor F ΣF_{\Sigma}. The carrier of the initial algebra for XX+F Σ(X)X \to X + F_{\Sigma}(X) consists of finite terms of the language. Its terminal coalgebra consists of possibly infinite terms. In some sense the former is the Cauchy completion of the latter.

But they want to point to another dualization via the cofree endofunctor generated by Σ\Sigma, G ΣG_{\Sigma}. Then the functor XX×G ΣX \to X \times G_{\Sigma} has both initial algebra and terminal coalgebra.

Posted by: David Corfield on October 21, 2009 11:43 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

In some sense the former is the Cauchy completion of the latter.

At first I was confused by this, but then when I went to look at the paper I realized you meant Cauchy completion in the good old-fashioned sense of metric spaces. (-:

But they want to point to another dualization via the cofree endofunctor generated by Σ\Sigma, G ΣG_\Sigma.

Looks interesting. I don’t get any intuition for what it might say about logic, though.

Posted by: Mike Shulman on October 21, 2009 5:36 PM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

All Cauchy completion being the ‘same’ construction, of course. Hmm, I wonder if we have to factor completions for initial algebras through the metric space completion, as Barr does in Terminal coalgebras for endofunctors on sets. Perhaps Adamek’s work on Final Algebras are Ideal Completions of Initial Algebras is more natural.

Posted by: David Corfield on October 22, 2009 11:50 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Regarding Ghani et al.’s second dualization of initial algebras, from what I can glean quickly, instead of the usual algebraic study of operations X mX nX^m \to X^n, i.e., from Hom(m,X)Hom(m, X) to Hom(n,X)Hom(n, X), we now look at operations from Hom(X,m)Hom(X, m) to Hom(X,n)Hom(X, n). For example, if m=n=2m = n = 2 we’re looking at what are called predicate transformers.

The basic idea of predicate transformer semantics going back to Dijkstra is that the meaning of a program is a map sending a postcondition to its weakest precondition. One starts with a set of states (e.g. a mapping from variable locations to integers) and then considers pre- and post conditions as predicates over the set of states. Thus the semantics of a program is a mapping from predicates to predicates, hence the term predicate transformer. (p. 12)

If you buy into the ‘program = proof’ paradigm, then you’d see a proof as a predicate transformer too, sending the predicate of the result to the predicate of the premisses.

There’s a whole book dedicated to the subject by Ernest Manes of ‘Manes and Arbib’ fame.

Posted by: David Corfield on October 22, 2009 4:27 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

I was struck by Alexandre Borovik’s talk at the Hertford conference where he said of the proof of the classification of all finite simple groups that it was really a recipe for producing a proof as needed, rather than an existing finite entity. His analogy with lazy evaluation in computer science put me in mind of coalgebra. Recall the idea of zipping streams (i.e., splicing two infinite lists together). The algebraic way of zipping lists would want repeated unfolding of the lists so that what remains to be done is getting smaller at each step, i.e., it needs finite lists. The coalgebraic way just needs to reveal more information about the result at each step, so can happily deal with infinite lists.

So could one have a coalgebraic proof activity where at each step you’re revealing more of the proof without any assurance of termination?

Posted by: David Corfield on October 21, 2009 3:30 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

So could one have a coalgebraic proof activity where at each step you’re revealing more of the proof without any assurance of termination?

In principle, maybe. But I doubt that was what Borovik meant; it seems much more likely to me that he was thinking of a potential finite proof for which we have a recipe for constructing pieces of it as needed. I think a truly “coalgebraic proof activity” would be so different from what we are used to that I would be hesitant to call it a “proof.” Granted we don’t have a definition of “coalgebraic proof activity,” but it feels to me as if I could prove anything at all if my derivations are allowed to stretch backwards infinitely far, without ever having to “bottom out” at an axiom.

The idea does, however, make me think of zero-knowledge proofs and other probabilistic proof methods, which can go on forever with the probability of truth approaching 11 asymptotically.

Posted by: Mike Shulman on October 21, 2009 5:20 PM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

I don’t know if this counts, but dependently typed functional languages that include coinduction can quite easily have coalgebraic proofs, if one thinks in terms of Curry-Howard. One well known use for coinduction, for instance, is defining a monad to model partiality (in Agda):

data D (A : Set) : Set where
now : A D A
later : ∞ (D A) → D A

⊥ : ∀{A} → D A
⊥ = later (♯ ⊥)

Where ∞ is used for defining greatest fixed points, with ♯ being read something like “delay” (this is due to coinduction still being a work-in-progress in Agda, nothing fundamental. D A above is a (weakly) final coalgebra of the functor D’_A X = A + X). Observation is denoted ♭.

Anyhow, one might like to have a notion of divergence for these objects, which is typically written:

data Diverges {A : Set} : D A → Set where
step : ∀{da} → ∞ (Diverges (♭ da)) → Diverges (later da)

⊥-diverges : ∀{A} → Diverges (⊥ {A})
⊥-diverges = step (♯ ⊥-diverges)

and, obviously, these proofs of divergence are themselves coalgebraically defined (somehow; I’m not really up on the (co)algebraic semantics of (co)inductive families, but I’m pretty sure there’s work in that area, at least). One could, of course, define divergence in ways that aren’t (obviously) coalgebraic (like, da diverges if for all n, da doesn’t converge in n steps; such a proof is isomorphic to a stream of proofs, though, as Neel Krishnaswami points out), but the above version is quite natural to work with.

Posted by: Dan Doel on October 24, 2009 9:09 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

So could one have a coalgebraic proof activity where at each step you’re revealing more of the proof without any assurance of termination?

Sure – this idea arises in structural proof theory, albeit for technical rather than philosophical reasons.

Consider what a proof of a proposition forall n. P(n) might be. The “most naive” thing we could do is to imagine giving a case analysis over the natural numbers, with each branch of the proof proving P(0), P(1), P(2), and so on.

Now, a finite representation of such an infinite sequence of proofs can be expressed as a coalgebraic object, such as a stream of proofs. But note that the universal property of final coalgebras is a natural transformation taking arrows A -> F(A) to arrows A -> nu(F). This means that when you construct such an object, it will contain as data a function from your metalogic, telling you what to do at each branch. So these coalgebraic proofs correspond to the omega-rule of Burcholz, and is a way of saying “I accept as an induction principle for the natural numbers any induction principle my metalogic justifies”.

This lets you focus on the interesting properties of your logic, without necessarily getting bogged down in issues of “logical strength” (which are often the least interesting feature of a logic from the structural proof theorist’s POV).

Posted by: Neel Krishnaswami on October 22, 2009 11:58 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

while the syntactic statements (“such-and-such is provable in XYZ”) are very cut-and-dried and finitist, any statements about models depend, often heavily, on the meta-theory we’re discussing models in

I’m not sure I agree with that. The Gödel sentence G XYZG_{XYZ} of XYZ is of the form “such-and-such is not provable in XYZ” (where such-and-such is G XYZG_{XYZ} itself), but its meaning depends on the metatheory as well. If the metatheory is UVW + “XYZ is consistent”, then G XYZG_{XYZ} is true, but if the metatheory is UVW + “XYZ is inconsistent” then G XYZG_{XYZ} is false.

generally such a meta-theory will have to be at least as complicated as the theory was itself.

Actually, I think part of the point is that any theory, no matter how complicated, can eventually be expressed in a simple type-theoretic meta-theory. Certainly the metatheory will have the same philosophical issues surrounding incompleteness, but I don’t think that necessarily translates into complication.

Posted by: Mike Shulman on October 20, 2009 6:38 PM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

I don’t know how to make sense of that statement. What does ‘different’ mean, and how do we compare things in different models?

Posted by: Aaron Bergman on October 20, 2009 11:08 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

A ‘model’ of ZFC involves a map that assigns an actual set to each expression in ZFC that denotes a set. So, what Mike is saying is that the set of integers will be different in different models of ZFC. And not only will it be a different set, the operations on it — like addition, multiplication and so on — will have different properties.

Notice: you need to use set theory to talk about models of ZFC! If this sounds circular, well, that’s why logicians need to be careful about distinguishing the ‘object language’ from the ‘metalanguage’. First you develop set theory using your favorite axioms. Then a model of ZFC involves picking, for each expression in ZFC that denotes a set, an ‘actual set’ — that is, a set in the set theory you’ve already developed.

So, it’s not actually circular. But it is a bit dizzying. And it means that ‘picking a model’ for ZFC is no panacea, because this model is defined within some other set theory. So various questions have just been pushed back one notch.

This is why Mike wrote stuff like this:

In particular, higher-order logic doesn’t allow you to escape any of the philosophical consequences of the Incompleteness Theorem. You are free to believe in a Platonic collection of “actual” or “standard” natural numbers. (I don’t really, myself, but I can’t argue you out of such an essentially metaphysical belief.) Now by the corollary to the Incompleteness Theorem, first-order Peano Arithmetic can’t capture those natural numbers uniquely; there will always be alternate models containing “nonstandard” natural numbers. By contrast, in second-order logic, the second-order versions of the Peano axioms can be shown to have a unique model. However, this second-order metatheory involves basic notions like “power-types” or “power-sets,” so really what you’ve done with this proof is to explain the notion of “natural number” in terms of the notion of “set,” which is (I think most people would agree) far more ontologically slippery! And indeed, when your second-order metatheory is interpreted in some meta-meta-theory, it will have lots of different models, each of which has its own “unique” natural numbers. You are, of course, also free to believe in a Platonic notion of “sets,” but first-order axioms such as ZF can’t characterize those either. There may be a “second-order” version of ZF which uniquely characterizes a model, but you’ve just explained the notion of “set” in terms of the notion of “class,” which is not much of an improvement.

Posted by: John Baez on October 20, 2009 6:03 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

A ‘model’ of ZFC involves a map that assigns an actual set to each expression in ZFC that denotes a set.

Do you mean this in the sense of “model theory”? More generally, I’d say that a model assigns an object to each expression that denotes a “set”, and what you’re describing is something like a “set-theoretic model”.

Posted by: John Armstrong on October 20, 2009 7:18 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

What does ‘different’ mean, and how do we compare things in different models?

Good question! You’re right that a priori, the “set” N 1N_1 of natural numbers in one model V 1V_1 of ZFC and the “set” N 2N_2 of natural numbers in another model V 2V_2 are basically incomparable (i.e. they have different types), since the meaning of “set” is completely different. For instance, if V 1V_1 is a term model, then N 1N_1 will be a term, whereas if V 2V_2 is a “real model” then N 2N_2 will be a “real set.” I suppose one can argue about whether that makes N 1N_1 and N 2N_2 “different,” but they certainly aren’t “the same,” which is why I think it’s misleading to call any one of them “standard.”

However… in the special case of a set theory, like ZFC, there is a way to compare “sets” in different models. In order to have models, we have to have a meta-theory with some set-like notion. Suppose we call these “meta-sets” classes. Then a model of ZFC is a class VV, whose elements are called “sets”, with a binary relation on VV called \in, satisfying various axioms. Now suppose we have two such models V 1V_1 and V 2V_2; each has their own notion of “set” and their own notion of \in (say 1\in_1 and 2\in_2). Now suppose given x 1:V 1x_1: V_1 (a “set” according to V 1V_1) and x 2:V 2x_2: V_2 (a “set” according to V 2V_2). I’m using typing-syntax to denote “membership” in classes, which belongs to the metalanguage, to avoid confusion with the “membership” symbol \in which belongs to the object-language. Now we can form two classes x 1¯={y:V 1|y 1x 1}\overline{x_1} = \{ y: V_1 | y \in_1 x_1\} and x 2¯={z:V 2|z 2x 2}\overline{x_2} = \{ z : V_2 | z \in_2 x_2\}. Since these are both classes, we can compare them in various ways; for instance we could ask whether they are isomorphic. It will then be true that there exist a pair of models V 1V_1 and V 2V_2, with corresponding “sets” N 1N_1 and N 2N_2 of natural numbers, such that the classes N 1¯\overline{N_1} and N 2¯\overline{N_2} are not isomorphic.

Posted by: Mike Shulman on October 20, 2009 6:16 PM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

I’m pretty sure that I don’t care that they’re non-isomorphic in this sense. For me, standard integers means that they obey the second-order Peano axioms. You tell me that this means that they are only standard in the context of a model of ZFC, and that doesn’t bother me.

Posted by: Aaron Bergman on October 21, 2009 12:34 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Aaron wrote:

You’re telling me that you can write down a Diophantine equation where the existence of solutions is independent of Peano Arithmetic.

Yes, infinitely many — and there’s no way to settle all these questions by any systematic procedure whatsoever. That’s a theorem.

So, I’m happy to assume ZFC is consistent and pick a model.

Okay. How do you plan to pick a model?

Now, we write down our Diophantine equation, but because we’ve chosen a model for ZFC our statement that was independent of ZFC has a definite truth value.

Right.

But I hope you see how useless this is: for infinitely many cases, there’s still no procedure for you to discover this ‘definite truth value’.

So, you really haven’t done much more than say this: “Every Diophantine equation either has a solution or not — one way or the other, it’s a definite fact.” You didn’t need a model of ZFC to claim this; a model of Peano arithmetic would do just fine.

And regardless of whether you use Peano arithmetic or ZFC, no matter how you ‘pick a model’, there’s no systematic method to use it to decide for all Diophantine equations whether they have solutions or not!

Indeed, no matter what consistent procedure you pick to settle whether Diophantine equations have solutions, I can write an infinite list of them for which it’s perfectly consistent with your procedure to flip a coin to decide whether they have solutions or not!

As long as you’re happy with this, I have no problem with anything you’re saying. But I keep feeling you haven’t accepted how strange arithmetic actually is. And that’s why I’m sort of going after you…

Posted by: John Baez on October 20, 2009 5:22 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

So, you really haven’t done much more than say this: “Every Diophantine equation either has a solution or not — one way or the other, it’s a definite fact.”

But that’s all I ever wanted to say. You guys just weren’t letting me do so.

You didn’t need a model of ZFC to claim this; a model of Peano arithmetic would do just fine.

But I don’t want Peano Arithmetic; I want the integers.

And regardless of whether you use Peano arithmetic or ZFC, no matter how you ‘pick a model’, there’s no systematic method to use it to decide for all Diophantine equations whether they have solutions or not!

This doesn’t surprise me in the slightest. It’s exactly how I originally thought things worked, in fact. Then you all tried to confuse me with these crazy fake integers.

Posted by: Aaron Bergman on October 21, 2009 12:26 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

So, you really haven’t done much more than say this: “Every Diophantine equation either has a solution or not — one way or the other, it’s a definite fact.”

But that’s all I ever wanted to say. You guys just weren’t letting me do so.

All Mike said was that there are some Diophantine equations that have solutions in some models and none in others; you're the one who took this to mean that this contradicts the statement above.

But even so, I don't see why you want to bother to make that statement. You can only make it because you chose to work with classical logic; it wouldn't follow if you used Heyting arithmetic or constructive Zermelo–Frankel set theory in the meta-reasoning. It doesn't actually tell you anything useful!

Posted by: Toby Bartels on October 21, 2009 12:43 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

John wrote:

So, you really haven’t done much more than say this: “Every Diophantine equation either has a solution or not — one way or the other, it’s a definite fact.”

Aaron wrote:

But that’s all I ever wanted to say. You guys just weren’t letting me do so.

Right: because it’s odd to say something is a ‘definite fact’ when there’s no way to tell whether it’s true or not, and you’re allowed to flip a coin to make up you’re mind whether it’s true!

I would call that ‘something you get to decide’, rather than a ‘definite fact’.

John wrote:

And regardless of whether you use Peano arithmetic or ZFC, no matter how you ‘pick a model’, there’s no systematic method to use it to decide for all Diophantine equations whether they have solutions or not!

Aaron wrote:

This doesn’t surprise me in the slightest. It’s exactly how I originally thought things worked, in fact. Then you all tried to confuse me with these crazy fake integers.

Maybe the problem is that you’re not comfy with Gödel’s Completeness Theorem, which says that if a formula in first-order logic holds in all models, then it’s provable. The converse is true too: this is the Soundness Theorem.

So, the fact you claim to be comfortable with is exactly equivalent to the existence of those crazy integers.

Let me lay it out:

There are lots of Diophantine equations where you can’t prove they do have a solution and you can’t prove they don’t.

Whenever this happens, there’s a model of the integers where this equation does have a solution, and one where it doesn’t. There’s generally no way to tell which one is the ‘standard’ integers and which one is a ‘crazy nonstandard’ model of the integers.

Whenever this happens, you are free to make up your mind whether this equation has a solution or not. You can do this using your intuitions about math, your esthetic judgement, or by flipping a coin — whatever you like.

When you tell us your decision, you are telling us a bit about which models of the integers you like and which you don’t like. The models you like are the ones that do what you want!

But you’ll never pin down a specific model this way: there will always be infinitely many models that do what you want. And there will always be further Diophantine equations which will require further choices on your part.

So, I’ll never know what you consider the ‘standard integers’ and what you consider the ‘crazy fake ones’.

And neither will you! Because it’s a theorem that there’s no systematic way to make enough choices to settle this question.

If despite all this it comforts you to imagine that one of these models is the ‘real integers’, all I can do is bow my head in despair. I can’t argue you out of it. All I can do is make you ashamed for believing it.

It’s like this. Everyone in the world thinks of themselves as ‘me’. But you’re saying that of all the people in the world, one is ‘really me’, while the rest just ‘feel like they’re me’ — even though nobody can tell which one it is.

Posted by: John Baez on October 21, 2009 3:29 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

John wrote:

There are lots of Diophantine equations where you can’t prove they do have a solution and you can’t prove they don’t

and

Whenever this happens, you are free to make up your mind whether this equation has a solution or not. You can do this using your intuitions about math, your esthetic judgement, or by flipping a coin — whatever you like.

For such an equation, I’d feel compelled to decide that it doesn’t have a solution. For if x\mathbf{x} were a solution then the simple computation showing that x\mathbf{x} is a solution would constitute a proof that the equation had a solution.

Would you put this argument under the heading of ‘your intuitions about math’? It doesn’t seem like intuition to me: it seems like a correct logical deduction.

?

Posted by: Tom Leinster on October 21, 2009 4:29 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

For such an equation, I’d feel compelled to decide that it doesn’t have a solution. For if xx were a solution then the simple computation showing that xx is a solution would constitute a proof that the equation had a solution.

Since, as I said below, you can’t expect to ever prove anything unprovable without the additional hypothesis that your theory is consistent, I think this argument is valid but under this additional hypothesis. That is, if the theory is consistent, then this equation cannot be proven to have a solution, and therefore it doesn’t have a solution. (This is basically how you prove the second incompleteness theorem from the first one: if the theory could prove its own consistency, then by this argument it would prove that the equation has no solution—but we already proved that (since the theory is consistent) it can’t prove whether or not the equation has a solution, so the theory is inconsistent.)

Thus, if you believe your theory to be consistent, then you must indeed conclude that this equation doesn’t have a solution. But I think that belief in consistency does fall under “your intuitions about math.” Of course, for theories in common use such as ZFC and PA, many people share that same intuition!

Posted by: Mike Shulman on October 21, 2009 4:49 PM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

When you tell us your decision, you are telling us a bit about which models of the integers you like and which you don’t like. The models you like are the ones that do what you want!

But you’ll never pin down a specific model this way: there will always be infinitely many models that do what you want. And there will always be further Diophantine equations which will require further choices on your part.

Though it feels strange for me to be playing advocatus dei for a platonist, could there be some way to single out a “preferred model” via some method akin to optimal fixedpoint?

Posted by: J-L Delatre on October 21, 2009 6:04 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Tom wrote:

For such an equation, I’d feel compelled to decide that it doesn’t have a solution.

Okay, good point! I overstepped my case and will have to rethink it a bit. I apologize to Aaron for getting carried away.

The issue is this:

The existence of solutions of a Diophantine equation is what folks call a Σ 1 0\Sigma^0_1 formula. Roughly: it’s of the form

n 1n kP(n 1,,n k),\exists n_1 \cdots \exists n_k P(n_1,\dots,n_k),

where P(n 1,,n k)P(n_1, \dots , n_k) is an equation built from the variables n 1,,n kn_1, \dots, n_k and arithmetic operations.

Remember: a formula is undecidable if you can’t prove it or prove its negation.

Here’s a cute fact: if a Σ 1 0\Sigma^0_1 formula is undecidable (in Peano arithmetic or any extension thereof), its undecidability is unprovable.

You basically gave the reason why, but I’ll expand on your argument a bit.

For a formula like

Φ=n 1n kP(n 1,,n k),\Phi = \exists n_1 \cdots \exists n_k P(n_1,\dots,n_k),

if you can’t prove it, it must mean you can’t find concrete examples of n 1,,n kn_1, \dots, n_k for which P(n 1,,n k)P(n_1, \dots, n_k) is true. So, Φ\Phi must be false.

But with some work this implies: if you can’t prove Φ\Phi, you can prove its negation. (To see this, this we need to check that Peano arithmetic is powerful enough for us to formalize the common-sense argument in the previous paragraph.)

And with a bit more work this implies: if you can prove you can’t prove Φ\Phi, you can prove its negation. (Again, to see this we need to check that Peano arithmetic is powerful enough to formalize a common-sense argument: if you can prove you can prove something, you can prove it!)

So: if Φ\Phi is undecidable, you can’t prove it’s undecidable.

So, I am not actually able to present Aaron with any Diophantine equations for which I can prove that the existence of solutions is undecidable.

My embarrassment would go away if instead of Σ 1 0\Sigma^0_1 formulas I were allowed to use formulas that were a bit more complex, with more of a mix of \exists’s and \forall’s. Then I could present concrete examples of provably undecidable formulas.

I’ll have to check to see how much this mistake of mine affects various things I’ve been claiming. It’s been a long since I’ve thought about this stuff, and I could be more rusty than I think.

In any case, it’s really good to spend some time thinking about these things so I don’t forget everything I used to know about logic. So — thanks! And thanks to Mike and Aaron, too.

Posted by: John Baez on October 21, 2009 6:10 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

So, I am not actually able to present Aaron with any Diophantine equations for which I can prove that the existence of solutions is undecidable.

If “undecidable” means, as you said, that “you can’t prove Φ\Phi or ¬Φ\not\Phi,” then this is right. But I think usually when folks talk about something being “undecidable” (in the logical sense), they mean that “if ZFC (or whatever) is consistent, then it can’t prove Φ\Phi or ¬Φ\not\Phi.” And with that meaning, I think you can present a specific Diophantine equation, using for instance this argument, for which the existence of solutions is undecidable. Your argument for the unprovability of undecidability doesn’t then apply because proving that something is undecidable doesn’t actually prove that you can’t prove it, it only proves that conditionally on consistency of ZFC (or whatever)—and we know (by 2nd incompleteness) that we can’t prove that (as long as it’s true).

Posted by: Mike Shulman on October 21, 2009 6:35 AM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

JB wrote: “It seems to me that we need to take a sentence whose undecidability is undecidable.”

JB: “So, I am not actually able to present Aaron with any Diophantine equations for which I can prove that the existence of solutions is undecidable.”

MS: “If “undecidable” means, as you said, that “you can’t prove Φ or ¬Φ,” then this is right.”

Diophantine equations (DEs) are countably infinite. There are an infinite number of yes or no decision questions. What does consistency have to do with establishing the claim of Tom Leinster?

For such an equation, I’d feel compelled to decide that it doesn’t have a solution. For if x were a solution then the simple computation showing that x is a solution would constitute a proof that the equation had a solution. …

Would you put this argument under the heading of ‘your intuitions about math’? It doesn’t seem like intuition to me: it seems like a correct logical deduction.

Suppose we compute the first five of a finitely unbounded (or just a whole bunch) of DEs. They all happen to produce a solution in 5 minutes. This shows that some DEs are decidable. But what does this have to do with establishing that some DEs are not decidable? Suppose the next DE computes for 5 years and yields no solution. One still can’t decide if it has a solution in some finite number of years or whether it has no solution. How is that connected to whether ZFC is consistent which is not provable within ZFC, it remains a majority intuition? What about Tom’s remark prompted JB to re-evaluate his position or claim?

What does this have to do with non-standard models. I thought this meant allowing infinitesimals. Mabye my idea of what is non-standard is way too narrow?

Posted by: Stephen Harris on October 21, 2009 8:01 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

What does this have to do with non-standard models. I thought this meant allowing infinitesimals.

We've mostly been talking about nonstandard natural numbers; none of them are infinitesimal. In fact, they're all infinite, in the sense that each nonstandard natural number is greater than each standard natural number. So their inverses in the nonstandard rational numbers are infinitesimal (but nonzero) in the sense of being smaller (in absolute value) than every standard rational number.

Posted by: Toby Bartels on October 21, 2009 10:28 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Stephen wrote:

What does this have to do with non-standard models? I thought this meant allowing infinitesimals.

Infinitesimals show up in nonstandard analysis. We’re not talking about that.

We’re talking about the fact that theories of the natural numbers have nonstandard models.

Posted by: John Baez on October 23, 2009 8:44 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Infinitesimals show up in nonstandard analysis. We’re not talking about that.

We’re talking about the fact that theories of the natural numbers have nonstandard models.

These are very closely related, as in my answer above. Your wording seems to imply that they are unrelated, like the two senses in which the solubility of Diophantine equations might be ‘undecidable’. (Actually, I wouldn't be surprised if those are related in some way too.)

Robinson's nonstandard analysis needs to take some care about which nonstandard models it's using, to have the desired power and still keep the transfer principle.

Posted by: Toby Bartels on October 24, 2009 1:06 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Infinitesimals show up in nonstandard analysis. We’re not talking about that.
We’re talking about the fact that theories of the natural numbers have nonstandard models.

These are very closely related, as in my answer above.

Yes, and of course this is the origin of the term “nonstandard analysis.”

However, the point of view is, I think, quite different. Nonstandard analysis uses a nonstandard model and its relationship to the standard model in order to prove things about the standard model (or, at least, a model that you’re given and choose to call “standard”). Here the nonstandard models are something people want to “get rid of,” except the point is that you can’t characterize the “standard model.”

Posted by: Mike Shulman on October 24, 2009 4:03 PM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

And actually, I think that as long as your theory is consistent, then for no formula Φ\Phi can it prove that it can’t prove Φ\Phi (as opposed to proving that if it is consistent, then it can’t prove Φ\Phi). For since an inconsistent theory proves everything, if there’s anything it can’t prove, it must be consistent, and by 2nd incompleteness a consistent theory can’t prove its own consistency.

Posted by: Mike Shulman on October 21, 2009 6:44 AM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

JB originally: “There are lots of Diophantine equations where you can’t prove they do have a solution and you can’t prove they don’t.”

JB: “Okay, good point! I overstepped my case and will have to rethink it a bit.”

SH: I certainly don’t see how. There are an infinite number of Diophantine equations which are solvable and an infinite number which are not. Tom wrote:

For such an equation, I’d feel compelled to decide that it doesn’t have a solution. For if x were a solution then the simple computation showing that x is a solution would constitute a proof that the equation had a solution.

The second sentence is true, if you find a solution, that serves as a proof enough and that logic works for an infinite number of solvable Diophantine equations (DEs).

The first sentence has no impact on what you claimed. As you pointed out before, some DEs, infinitely many, won’t provide a solution in some short finite amount of time, and one can’t tell if they have no solution, or a solution which will take a very long time to compute. So I’ve never seen a reference that because one can determine one solvable DE, that the next one that you can’t determine is more likely to be unsolvable, rather than solvable, but in a much larger span of time, which seems to be Tom’s claim.
There is no inference to be drawn.

Later you made another statement which was in agreement with the statement that you retracted, “So, I am not actually able to present Aaron with any Diophantine equations for which I can prove that the existence of solutions is undecidable.”

However, some apparent facts are open to dispute.

Every Diophantine equation either has a solution or not — one way or the other, it’s a definite fact.

That strikes me as a philosophical claim because what seems to be a similar claim for the Goldbach Conjecture, that it is either true or not true, is not considered a fact, but a philosophical claim, neither true nor untrue. I’m not certain of this analogy, because there is a distinction between truth and verifiability.


Posted by: Stephen Harris on October 21, 2009 2:41 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

That strikes me as a philosophical claim because what seems to be a similar claim for the Goldbach Conjecture, that it is either true or not true, is not considered a fact, but a philosophical claim, neither true nor untrue

Assuming classical logic, it is a mathematical fact that the Goldbach Conjecture is either true or false. It’s also a mathematical fact that every Diophantine equation either has a solution or it doesn’t. Even if, for some particular Diophantine equation, we can’t prove it either way. That’s just because it’s a law of classical logic that for every statement Φ\Phi, the statement Φ¬Φ\Phi\vee \not\Phi is true.

A person of a certain philosophical persuasion may find this to be an argument against classical logic.

Posted by: Mike Shulman on October 21, 2009 6:34 PM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Ok, I thought there was some conflict between Platonists and Constructivists about the status of Goldbach’s Conjecture (Brouwer -> no law of excluded middle).
I’m more interested in making sense of Tom Leinster’s remark to John Baez. I thought JB was right all along. Also that non-standard model I may have confused with non-standard arithmetic. Thanks for your reply. You are being worked pretty hard!
I’m on the side that there are yes or no answers so I don’t need convincing there.

Posted by: Stephen Harris on October 21, 2009 8:21 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Ok, I thought there was some conflict between Platonists and Constructivists about the status of Goldbach’s Conjecture

Don't worry, there is; Constructivists are the people that Mike was talking about here:

A person of a certain philosophical persuasion may find this to be an argument against classical logic.

Posted by: Toby Bartels on October 21, 2009 10:38 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Someone wrote:

Every Diophantine equation either has a solution or not — one way or the other, it’s a definite fact.

Stephen wrote:

That strikes me as a philosophical claim…

That’s true. I just want to point out that I wasn’t making this claim. I was claiming that Aaron was making this claim. And I think he agreed.

Posted by: John Baez on October 23, 2009 8:52 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Okay, so the easiest way to fix this post is to replace ‘existence of solutions to a Diophantine equation’ by ‘truth of a formula in arithmetic’. I need to use formulas with a few alternating \forall’s and \exists’s, not just \exists’s.

So let me say it right:

Gödel’s Completeness Theorem says that if a formula in first-order logic holds in all models, then it’s provable. The converse is true too: this is the Soundness Theorem. If we combine these with Gödel’s First Incompleteness Theorem, we can see the following:

There are lots of statements about arithmetic that you can’t prove, but where you also can’t prove their negation.

Whenever this happens, there’s a model of the integers where the statement is true, and one where it’s false.

Whenever this happens, you are free to make up your mind whether the formula is true or not. You can do this using your intuitions about math, your esthetic judgement, or by flipping a coin — whatever you like.

When you tell us your decision, you are telling us a bit about which models of the integers you like and which you don’t like. The models you like are the ones that do what you want!

But you’ll never pin down a specific model this way: there will always be infinitely many models that do what you want. And there will always be further statements about arithmetic which will require further choices on your part.

So, I’ll never know what you consider the ‘standard integers’ and what you consider the ‘crazy fake ones’.

And neither will you! Because it’s a theorem that there’s no systematic way to make enough choices to settle this question.

If despite all this it comforts you to imagine that one of these models is the ‘real integers’, all I can do is bow my head in despair. I can’t argue you out of it. All I can do is make you ashamed for believing it.

It’s like this. Everyone in the world thinks of themselves as ‘me’. But you’re saying that of all the people in the world, one is ‘really me’, while the rest just ‘feel like they’re me’ — even though nobody can tell which one it is.

Posted by: John Baez on October 21, 2009 6:47 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Okay, so the easiest way to fix this post is to replace ‘existence of solutions to a Diophantine equation’ by ‘truth of a formula in arithmetic’.

I think that post was fine as far as it went; by picking Yes or No for each Diophantine equation one at a time, Aaron would never finish.

However, he could answer them all No in one fell swoop, and then he would finish; and this is what he'd do if he believed that the system he was using (PA, ZFC, whatever) is consistent.

So it's better to generalise, as you just did, from Diophantine equations to arbitrary formulas. For one thing, assuming consistency will not answer all of them. For another thing, there's no way to consistenty answer them all Yes or No; some of these formulas (in fact all of them, assuming that Aaron is using a system with classical logic) are negations of other formulas.

Posted by: Toby Bartels on October 21, 2009 10:38 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

That is, the game is played with sequences of symbols called terms…

Something that has confused me for quite a while: how do knot diagrams fit in to this framework? For example, the Reidemeister moves appear to behave as if they are axioms in a first-order theory where the models are links and the elements are one-dimensional subsets. Is it the case that terms that are most naturally written as diagrams can always be translated into symbol-sequences with appropriately careful handling?

Posted by: Scott McKuen on October 20, 2009 8:48 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Yes

Posted by: Bruce Westbury on October 20, 2009 10:15 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Is it the case that terms that are most naturally written as diagrams can always be translated into symbol-sequences with appropriately careful handling?

Yes, assuming that your diagrams only convey a finite number of bits of information. Consider, for example, the XYpic/TikZ/postscript/pdf/etc. code that draws your diagram! That’s a sequence of symbols.

Which isn’t, of course, to say that a computer-based system for mathematics wouldn’t ideally be able to deal with diagrams as terms in their own right, at least from the user’s point of view.

Posted by: Mike Shulman on October 20, 2009 4:15 PM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

I’ll be more verbose than Bruce, while agreeing with him :-)

The first point is that you can express knot diagrams in terms of symbols from a finite diagram. If you had the patience, you could draw any knot diagram as ASCII art, using symbols such as | and — and //.

The second point is that while this translation is possible, it is unnatural. As you say, the terms are ‘most naturally written as diagrams’. My feeling is that a knot diagram is just as much a piece of syntax as an expression such as x 2+y 2x^2 + y^2.

Posted by: Tom Leinster on October 20, 2009 4:35 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Yes, you could use ASCII art, but a more natural translation into syntax would be algebraic. The category of tangle diagrams (knots are just tangles from 0 to 0) is finitely-generated as a monoidal category. You have generators for the cup and cap (let’s say ‘U’ and ‘A’), and for the left and right crossings (‘L’ and ‘R’), and the identity on one strand (‘I’), and of course the operations of composition and monoidal product (‘*’ and ‘@’).

For example, a projection of the trefoil might be written

(A@A)*(I@L@I)*(R@R)*(I@U@I)*(U)

Rewrite rules are a lot more of a headache, since this really is essentially a two-dimensional syntax we’re forcing into one dimensional strings, but it’s possible. And it’s probably the simplest way I know to feed a tangle into a computer program to calculate, say, a matrix invariant.

Posted by: John Armstrong on October 20, 2009 5:11 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

OK, the fact we can represent a knot diagram in TeX is a great example of how to pass back to plain-old one-dimensional syntax, as is the category of tangles. What concerns me is this kind of interpretation can kill off interesting structures in the original language, like Fraisse limits. Is there some kind of bi-interpretability theorem that prevents this, or is it possible that there are standard first-order constructions that, for the right class of structures, could only be carried out in a two-dimensional syntax because every sequential syntax rules out the necessary embeddings or creates extra substructures that interfere?

For example, naively it seems like there ought to be a “Fraisse link” in the “language” of knot diagrams: the hereditarity, joint embedding, and amalgamation properties all look like they should work if we treat the relationship between a link and its components as “substructure”. On the other hand, I’m not clear how you would preserve this setup in a POODS.

(Just to be clear for everyone: Fraisse limits are how model theory generalizes the idea that every finite order embeds in the rationals, or that every finite graph embeds in the Erdos-Renyi random graph. For knots, if it existed, this would be some kind of wild link that every finite link embeds into in a way that preserves knot diagrams on finite sets of the individual components.)

Posted by: Scott McKuen on October 20, 2009 7:20 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

I’m not sure what you’re worried about. If I have an injection from any set DD (like the set of knot diagrams) to a set TT (like the set of one-dimensional terms), then any structure on DD can equally well be placed on its image in TT. (This is getting us into structuralism, the topic of the followup post…)

Posted by: Mike Shulman on October 21, 2009 2:46 AM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Whoops - now I get it, after rereading the main article more carefully. That’s certainly true, since my original question was just whether you could take a particular set of terms and represent them consistently in another.

And since you’ve defined “terms” for this discussion to live at the meta-level, so they include the formulas and sentences of the base level, the rest of the stuff I was bringing up in my second response (interpretations in the style of model theory) doesn’t really apply here. Sorry about the confusion.

Posted by: Scott McKuen on October 21, 2009 2:43 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

the fact we can represent a knot diagram in TeX is a great example of how to pass back to plain-old one-dimensional syntax

Is it not the case that everything has to be funneled thru “plain-old one-dimensional syntax”?
Computer RAM as well as written or spoken discourse.

Posted by: J-L Delatre on October 21, 2009 5:49 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

JLD: Is it not the case that everything has to be funneled thru “plain-old one-dimensional syntax”? Computer RAM as well as written or spoken discourse.

Is it not the case that none of us will ever know more than a finite number of sentences in our lives, and so our competence with language is adequately covered by a finite state machine?

Posted by: Jon Awbrey on October 21, 2009 1:12 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

No, because there is not (necessarily) any uniform upper bound on the number of sentences anyone will ever know.

Posted by: Tom on October 21, 2009 4:42 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Yes, the point is the art of choosing appropriate idealizations, the rational(ist) ideas that best fit the empiric(ist) data, even though a nominal(ist) thinker can always object that these general terms are “mere names”.

A simpler example occurs in many statistical settings, where “Infinity begins at N=30N = 30” — when for all the difference it makes you might as well pass from the actuality of the binomial distribution to the idealization of the normal distribution.

Posted by: Jon Awbrey on October 21, 2009 5:34 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Is it not the case that everything has to be funneled thru “plain-old one-dimensional syntax”?

Computer RAM as well as written or spoken discourse.

No, I don't think so. If I draw a diagram on a board to explain something to you, nothing goes through a one-dimensional syntax.

What seems to be the case is that everything can be funneled thru a syntax consisting of finite strings from a finite alphabet. But (like the Church–Turing thesis) this is an empirical observation, not anything that I could imagine proving.

Posted by: Toby Bartels on October 21, 2009 10:11 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

To me it’s interesting that while you can continuously deform a knot diagram in certain ways without changing its meaning, the same is true of ordinary symbols. The truth of a mathematical proposition should not depend on exactly how I write the letter ‘E’.

In both cases, the space of things you can write is effectively continuous, but it’s divided up into regions on which the meaning of what you write is constant… separated by regions in which what you write is ‘ambiguous’ or ‘ill-defined’. If I write the letter ‘E’ illegibly, you simply won’t count my proposition as legitimate.

So, in both cases our symbolism takes an effectively continuous space of possible drawings and makes it play the role of a discrete space. It’s this discreteness that makes it possible for mathematical statements to have sharp yes-or-no truth values.

Whereas, for example, the quality of a painting could vary continuously as we morphed it from the Mona Lisa to a cartoon smiley face.

Posted by: John Baez on October 20, 2009 7:24 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

?\exists?

In cognitive psychology this phenomenon is known as … wait for it … categorical perception.

Posted by: Jon Awbrey on October 20, 2009 10:56 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

As a mathematical physicist I must say the following: for a physicist a model always means some approximation of reality. As a convinced Platonist, reality (our universe) for me is a part of the World of Ideas which is realized in time. The key difference between mathematics and physics is the absence of time in mathematics (math is timeless), reflecting the fact that math deals with the timeless part of the World of Ideas. Our brains have the capacity to see the abstract mathematical ideas, and those we see to live in our world represent the physical laws. As far as the various constructions of the model theorists are concerned, for me all these constructions exist in the World of Ideas. However, an interesting question is which of these are realized in our world.

Posted by: Aleksandar Mikovic on October 20, 2009 9:24 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Isn’t it a bit more complicated in that there’s “approximation in both directions”? In the that one can have theories with multiple models for which the extent of observable consequences of any possible physical realisation (assuming the physical universe contains a finite amount of “stuff” to embody the entities) in which some property is differently true or false in the different theories but for which any direct physical instantiation will be size limited enough that there’s no observable difference. (Obviously the concrete reasoning that the models have different properties is itself a physcial instantiation of the difference, but being able to show that there’s a “platonic” difference between the model’s properties is distinct from there being an “observable” difference within the physical universe.)

Of course, my understanding is that Mike Shulman is talking about differences that may also arise within “physically realisable” regimes.

Posted by: bane on October 20, 2009 12:40 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

my understanding is that Mike Shulman is talking about differences that may also arise within “physically realisable” regimes

I don’t know what “physically realizable” means when applied to math. I suppose one could consider three apples to be some sort of physical realization of the number three. But I’m a bit skeptical that there exists a “physical realization” in this sense of anything infinite.

The way I see it, mathematics is a game played with symbols and ideas. We choose the symbols and ideas and the rules of the game in such a way that its results can be used (we hope) to predict certain sorts of observations about the “real world” (or, to be even more careful, to certain sorts of sensory input we receive). But I don’t think it’s reasonable to say that anything in the “real world” really is anything mathematical. I don’t think that spacetime is a 4-manifold (or a 10-manifold or whatever), or that an electron is a complex-valued function, or even that they could be said to form a “model” of them.

Posted by: Mike Shulman on October 20, 2009 7:00 PM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

But it is precisely the physical properties and signs that you have to consider when you try to do anything computational.

If by “idea” you have in mind referring in any way to the states of a physical agent, then you are back to the Kantian notion that concepts are just mental symbols.

Posted by: Jon Awbrey on October 20, 2009 7:16 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

But it is precisely the physical properties and signs that you have to consider when you try to do anything computational.

You have to find computational ways of representing ideas, but that doesn’t make the ideas themselves physical.

Since I’m not a philosopher, I’m not going to even try to define “idea.”

Posted by: Mike Shulman on October 20, 2009 7:21 PM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

JA: But it is precisely the physical properties [of] signs that you have to consider when you try to do anything computational.

MS: You have to find computational ways of representing ideas, but that doesn’t make the ideas themselves physical.

MS: Since I’m not a philosopher, I’m not going to even try to define “idea.”

I’m glad we all understand that there have to be undefined terms somewhere.

So let us save ourselves a lot of futile fumbling with FOL and make our foundation this:

“Object”, “Arrow”, “Natran” are undefined terms, the sum of whose meanings reside solely in the relations defined among them.

Posted by: Jon Awbrey on October 21, 2009 4:28 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

So let us save ourselves a lot of futile fumbling with FOL and make our foundation this: “Object”, “Arrow”, “Natran” are undefined terms, the sum of whose meanings reside solely in the relations defined among them.

Unfortunately, simply stating what your undefined terms are doesn’t provide much of a foundation for anything. The role of FOL is to describe the “relations defined among them.”

Posted by: Mike Shulman on October 21, 2009 5:06 AM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

MS: Unfortunately, simply stating what your undefined terms are doesn’t provide much of a foundation for anything. The role of FOL is to describe the “relations defined among them.”

And if FOL + YFAST (Your Favorite Axiomatic Set Theory) were a natural medium for doing that, then we’d all be happy — well, as happy as math folk get — no matter whether it was point,line,plane\lang point, line, plane \rang or object,sign,idea\lang object, sign, idea \rang or whatever in the paradigmatic slots.

But a long list of putative reductions of actual mathematical practice — from PM and YFAST to FOL and Ada — have failed to do that, not just in principle, but in practice.

The way I remember it, Category Theory was instituted to recapture the aspects of actual mathematical practice that had been lost or obscured by previous fundamentalist and reformist programmes. Among its first quarry were the natural equivalence classes or “natural kinds” of mathematical objects that seemed to be escaping through through the nets of purely set-theoretic isomorphisms.

That was the saving grace, the spirit of the enterprise, as I remember it.

Posted by: Jon Awbrey on October 21, 2009 2:26 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

The way I remember it, Category Theory was instituted to recapture the aspects of actual mathematical practice that had been lost or obscured by previous fundamentalist and reformist programmes. Among its first quarry were the natural equivalence classes or “natural kinds” of mathematical objects that seemed to be escaping through through the nets of purely set-theoretic isomorphisms.

Something about this sounds off to my ears. I wouldn’t say Category Theory was ever “instituted”, and the sentiment here seems to be that Eilenberg and Mac Lane, who invented the notion of natural transformation, were on some sort of philosophic mission and out to capture quarry.

Trying to read this comment slightly more sympathetically, it sounds vaguely like a call to renew the program of Lawvere (who did and does have a kind of missionary zeal) to develop a category of categories as foundations, rather than something like ETCS. But it’s so vaguely worded that I can’t tell.

Posted by: Todd Trimble on October 21, 2009 4:12 PM | Permalink | Reply to this

Lawvere and Trimbling

No, you exe.gize me all wrong — I’m way too Unitarian for that.

Cf. My Old Testimony “On Fundamentalism”.

Posted by: Jon Awbrey on October 21, 2009 4:46 PM | Permalink | Reply to this

Re: Lawvere and Trimbling

That I may be, but then again I don’t feel like I’m particularly to blame for that. If you want us to interpret you correctly, try writing more clearly for those of us who are not familiar with your ways of thinking. Then we can respond more meaningfully and not waste each other’s time. Cf. how Mike responded a few comments ago.

A little less punning and a little more real substance would help.

Posted by: Todd Trimble on October 21, 2009 5:38 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Maybe an acronym will serve to concentrate attention on “Actual Mathematical Practice” (AMP).

Arnold Neumaier set out his vision for a computational system that could do some good for actual mathematical practice — I’ll avoid the term “real mathematics” except in quotations as it seems like a pun on “real analysis” and might lend itself to the misconception that all mathematics is about real numbers.

A vision like that falls in line with a venerable tradition extending way back through Wolfram, Knuth, Wirth, McCarthy, Backus, as far as you want to take it, so let’s stop for now at Peirce, Boole, Leibniz.

This spurred a couple of very long tangents, that seemed to be mostly about the difference between REC and R.E. and the fact that some Diophantom Equations are mathematical puns, all of which I’m sure Arnold understands well enough — and so I phased out for a while — but somehow the whole idea of supporting actual mathematical practice looks like it got lost in the fray.

And that would be a bad thing …

Posted by: Jon Awbrey on October 21, 2009 7:25 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

The word “model” is obviously used in many different ways — by artists, engineers, mathematicians, and physicists.

The sense of “model” used in model theory is “something that satisfies a theory, in particular, a formal object that a theory is true of”.

But it’s fairly easy to see the relation between the different senses.

In physics or any empirical science the object being modeled, that is, the objective phenomenon being approximated, imaged, or represented, is also a model in the logical sense.

Consider the painter …

Posted by: Jon Awbrey on October 20, 2009 12:42 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

As a convinced Platonist, reality (our universe) for me is a part of the World of Ideas which is realized in time. The key difference between mathematics and physics is the absence of time in mathematics (math is timeless)

This seems to me a strange point of view in the 21st century, given what general relativity says about how time can’t be separated or even distinguished from space and the rest of the universe, and can even (in theory) flow sideways or backwards or in loops.

Posted by: Mike Shulman on October 20, 2009 6:59 PM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

This is a short summary of the book written by Matiyasevich on Hilbert’s 10th Problem.

That implies that H_o is not Turing decidable. In other words, it is impossible to construct a Turing machine that, beginning with the representation of a number k on the tape, will halt after a finite number of steps in state q_2 or q_3, depending on whether the equation with code k is or is not solvable.
The set H-o is itself Diophantine, since it is defined by equation (4.6.7). This fact shows that Turing machines are incapable of of deciding whether or not the equations belonging _to one particular family of Diophantine equations_ have solutions, to say nothing of _arbitrary Diophantine equations_.
In the previous sections of this chapter we have obtained two, no doubt remarkable, results. Namely, we have establish that
the class of Diophantine sets
is identical to
the class of Turing semidecidable sets
and Hilbert’s Tenth Problem is Turing undecidable.
How is formal Turing (semi)decidability related to (semi)decidability in the intuitive sense? One relation is evident: Turing (semi)decidable sets are recognized by most mathematicians as (semi)decidable in the intuitive sense. The converse is know as
CHURCH’S THESIS: _Every set of n-tuples that is (semi)decidable in the intuitive sense is also Turing (semi)decidable.
In this section we shall see that two other famous problems, seemingly having little to do with Diophantine equations, can nevertheless be reformulated as statements about the unsolvability of _particular_ Diophantine equations. We begin with _Goldbach’s Conjecture_, …

I brought this up because, besides being interesting, I don’t see this result as depending on either the consistency of PA nor ZFC etc. nor do I see how this result is impacted by a non-standard model? Although there is an alternate proof given using Godel’s Incompleteness Theorem.

Posted by: Stephen Harris on October 22, 2009 4:29 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

I don’t think the resolution of Hilbert 10 depends on anything logical. Remember that there are two meanings of “undecidability”—this problem is talking about the computability meaning, not the provability meaning.

Posted by: Mike Shulman on October 22, 2009 5:25 AM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Yes, the Wikipedia backs you up.

Relationship with Gödel’s incompleteness theorem

The concepts raised by Gödel’s incompleteness theorems are very similar to those raised by the halting problem, and the proofs are quite similar. In fact, a weaker form of the First Incompleteness Theorem is an easy consequence of the undecidability of the halting problem. This weaker form differs from the standard statement of the incompleteness theorem by asserting that a complete, consistent and sound axiomatization of all statements about natural numbers is unachievable. The “sound” part is the weakening: it means that we require the axiomatic system in question to prove only true statements about natural numbers (it’s very important to observe that the statement of the standard form of Gödel’s First Incompleteness Theorem is completely unconcerned with the question of truth, but only concerns the issue of whether it can be proven).

Posted by: Stephen Harris on October 22, 2009 8:58 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

MS wrote:

I don’t think the resolution of Hilbert 10 depends on anything logical. Remember that there are two meanings of “undecidability”—this problem is talking about the computability meaning, not the provability meaning.

Aaron asked about solutions to Diophantine equations (DE) and specified integers. He was told that there are models in which a DE solution which worked in one model would not work in another (non-standard) model. Is that a fair summary?

I read Peter Suber’s webpage about this topic where he mentions the The Lowenheim-Skolem Theorem (LST). Perhaps I didn’t understand the article, but I got the impression that non-standard models required the domain of the reals and not the integers which Aaron specified. So, now I have some doubt that there are non-standard models which obviate a DE solution in another model which specifies the integers as the domain?

Posted by: Stephen Harris on October 23, 2009 7:06 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

I guess the following quote answers my question, Asher M. Kach, kach.pdf

“Remark: Note that a diophantine equation with parameters from the standard model is solvable in the standard model if and only if it is solvable in some non-standard model.”

Posted by: Stephen Harris on October 23, 2009 7:39 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Stephen wrote:

Aaron asked about solutions to Diophantine equations (DE) and specified integers. He was told that there are models in which a DE solution which worked in one model would not work in another (non-standard) model. Is that a fair summary?

Not clear. When you say “a Diophantine equation solution which worked in one model”, people probably think you mean a solution like this:

3 2+4 5=5 23^2 + 4^5 = 5^2

Any equation like this which is true in one model of Peano arithmetic is true in all.

But what do I mean by an equation “like this”? I mean an equation we can write down with a finite number of symbols 0,S,+,×0, S, +, \times. Here SS means successor and the above equation is an abbreviation for

(SSS0) 2+(SSSS0) 2=(SSSSS0) 2 (SSS0)^2 + (SSSS0)^2 = (SSSSS0)^2

But in what I just said, I’m presuming you agree with what I mean by a “finite number of symbols”. I think we can agree in the example above. But to agree in general, we need to already agree on what the standard natural numbers are, at least in our metalanguage!

So, I’m really just saying that any standard solution of a Diophantine equation is a solution in all models. But as I’ve repeatedly tried to explain, the concept of ‘standard natural number’ is impossible to define in a non-circular way. You might as well define it using the first sentence in this paragraph!

But anyway, the place where different models give visibly different results is not in equations like above, but equations above with quantifiers in front of them! Here we can easily write down examples of formulas which are valid in one model and invalid in another.

For a preliminary discussion of this, see my remarks over here and Mike Shulman’s comments. But I’d like to say more someday, to settle this matter in my own mind.

Posted by: John Baez on October 23, 2009 8:21 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Stephen wrote:
Aaron asked about solutions to Diophantine equations (DE) and specified integers. He was told that there are models in which a DE solution which worked in one model would not work in another (non-standard) model. Is that a fair summary?

JB: Not clear. When you say “a Diophantine equation solution which worked in one model”, people probably think you mean a solution like this:

3^2 + 4^2 = 5^2

Any equation like this which is true in one model of Peano arithmetic is true in all. … So, I”m really just saying that any standard solution of a Diophantine equation is a solution in all models.

Yes, I did mean 3^2 + 4^2 = 5^2, and I thought that is what Aaron meant also. Your example is not exactly what I mean because I think 2nd order Diophantic Equations are solvable, but with that proviso, yes, and this is why. I’m going to include some background though it’s a pain, because not everybody remembers the earlier posts in this thread.

JB writes October 20:
1)So, you really haven’t done much more than say this: “Every Diophantine equation either has a solution or not — one way or the other, it’s a definite fact.” You didn’t need a model of ZFC to claim this; a model of Peano arithmetic would do just fine.

2)So, what Mike is saying is that the set of integers will be different in different models of ZFC. And not only will it be a different set, the operations on it — like addition, multiplication and so on — will have different properties.

3)And regardless of whether you use Peano arithmetic or ZFC, no matter how you ‘pick a model’, there’s no systematic method to use it to decide for all Diophantine equations whether they have solutions or not!

SH: So it is your #2 statement in conjunction with Toby’s remark, which follows that led to my thinking.

Aaron Bergman wrote:
“Because those are _my_ integers, dammit.
Thus, the Diophantine equation either has a solution in the standard integers
that I’ve grown rather attached to, or it doesn’t.”

MS replied: “That’s okay as long as you realize that “the standard integers”
will be different in different models of ZFC.”

Toby Bartels posted October 21:
JB: So, you really haven’t done much more than say this: “Every Diophantine equation either has a solution or not — one way or the other, it’s a definite fact.”

AB: “But that’s all I ever wanted to say. You guys just weren’t letting me do so.”

TB: “_All Mike said was that there are some Diophantine equations[DEs] that have solutions in some models and none in others_; you’re the one who took this to mean that this contradicts the statement above.”

I’m going to requote your point which I think is relevant,
“Any equation like this which is true in one model of Peano arithmetic is true in all. … So, I’m really just saying that any standard solution of a Diophantine equation is a solution in all models.”

SH: Now, didn’t Toby assert that there are _some_ DEs that have solutions in some models [meaning particular DEs with solutions] but none in other models. How does that agree with your assertion “any standard solution of a Diophantine equation is a solution in all models.”?
It seems it would have to trivialize Toby’s remark into meaning that some non-standard models are different than other non-standard models. Both of you guys were talking about DEs with known solutions. The relationship of some to all means that some DEs are identical to DEs in other models, and while some of these have solutions in one model, there are none (no solutions) in others (TB).
Also, it seems to me that your 2) above,
agrees more with Toby (choice of ZFC or PA not being relevant).
2)So, what Mike is saying is that the set of integers will be different in different models of ZFC. And not only will it be a different set, the operations on it — like addition, multiplication and so on — will have different properties.

SH: Why does 2) suggest that solutions to equations will remain the same?
About the non-standard arithmetic. I’ve read something about “infinite integers” being used for non-standard models, which is not the same as: the set of integers is infinite.
“So, I’m really just saying that any standard solution of a Diophantine equation is a solution in all models.”

SH: How does this happen or hold true without the choice or description of the “standard set of integers” remaining the same across all models where the solutions remain the same? How does it change so that in non-standard models
TB:”_All Mike said was that there are some Diophantine equations[DEs] that have solutions in some models and none in others_;” is true, considering?

MS: “That’s okay as long as you realize that “the standard integers” will be different in different models of ZFC.

Posted by: Stephen Harris on October 23, 2009 11:43 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

didn’t Toby assert that there are some DEs that have solutions in some models [meaning particular DEs with solutions] but none in other models. How does that agree with your assertion “any standard solution of a Diophantine equation is a solution in all models.”?

Well, just looking at the phrasing of those two sentences, it should be clear that some Diophantine equations in some models have nonstandard solutions.

I’m having trouble parsing the rest of your comment because I can’t make out what is being quoted, what is being replied to, and what is new. Why don’t you try using blockquotes more? Even nesting blockquotes a few times can be helpful to see what was a reply to what. Remember, if you use a “Markdown” text filter then you can put things in blockquotes with just a “>” at the beginning of the line (or “>>” for nested, etc.)

Posted by: Mike Shulman on October 24, 2009 4:09 PM | Permalink | PGP Sig | Reply to this

Block quote tips (Was: Syntax, Semantics, and Structuralism, I)

Remember, if you use a “Markdown” text filter then you can put things in blockquotes with just a “>” at the beginning of the line (or “>>” for nested, etc.)

And if you don't want to use a Markdown filter, then follow the instructions here (scroll down just a little).

Posted by: Toby Bartels on October 24, 2009 5:43 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Aaron Bergman wrote:

“Because those are _my_ integers, dammit.

Thus, the Diophantine equation either has a solution in the standard integers that I’ve grown rather attached to, or it doesn’t.”

MS replied: “That’s okay as long as you realize that “the standard integers”
will be different in different models of ZFC.”

Toby Bartels posted October 21:

JB: “So, you really haven’t done much more than say this: “Every Diophantine equation either has a solution or not — one way or the other, it’s a definite fact.”

AB: “But that’s all I ever wanted to say. You guys just weren’t letting me do so.”

TB: “_All Mike said was that there are some Diophantine equations[DEs] that have solutions in some models and none in others_; you’re the one who took this to mean that this contradicts the statement above.”

MS wrote: “Well, just looking at the phrasing of those two sentences, it should be clear that some Diophantine equations in some models have nonstandard solutions.”

SH: I provided a context of quotes so that there would be background to interpret isolated quotes that refer back to this background later. The long quote shows up in blockquotes in my browser now and in blockquotes in my last post. I quote MS next normally to provide contrast. Guidelines for writing say don’t use blockquotes for one line of quoted text and rarely for two lines, and technically, not to use quote marks.

Well, “just looking at the phrasing of those two sentences,”; I don’t think that is relevant since I took care to provide many sentences, imo quite legibly, to provide the context for understanding my later partial quotes. Notice me putting quote marks around your quoted phrase?
That means I’m quoting you.

Likewise, when you use “standard integers”
in a post responding to where Aaron has used “standard integers”, it means _standard integers_. It is not secret code for _nonstandard_.

MS: “Well, just looking at the phrasing of those two sentences, it should be clear that some Diophantine equations in some models have nonstandard solutions.”

JB said that all Diophantine equations which have solutions in the standard model have solutions in _all_ models, which includes nonstandard models.

I’ll again quote what you and Toby said which seems inconsistent with this clarity.

MS replied: “That’s okay as long as you realize that “the standard integers”
will be different in different models of ZFC.”

TB: “_All Mike said was that there are some Diophantine equations[DEs] that have solutions in some models and none in others_; you’re the one who took this to mean that this contradicts the statement above.”

Aaron was asking about whether DE solutions were the same in other models,

AB: “So, I’m happy to assume ZFC is consistent and pick a model. Now, we write down our Diophantine equation, but because we’ve chosen a model for ZFC our statement that was independent of ZFC has a definite truth value. Thus, the Diophantine equation either has a solution in the standard integers that I’ve grown rather attached to, or it doesn’t. But all this depends on the fact that are working in the given model. Thus, we can’t ever prove in ZFC that the Diophantine equation has a solution or not, but as soon as we pick a model there is a definite answer.”

For emphasis:

MS wrote: I think that’s all right, except maybe:

AB: “Thus, the Diophantine equation either has a solution in the standard integers that I’ve grown rather attached to, or it doesn’t.”

MS finishes with “That’s okay as long as you realize that “the standard integers” will be different in different models of ZFC.

MS only disagreed or chose to emphasize that one point. Since JB said
‘that all Diophantine equations which have solutions in the standard model have solutions in _all_ models’, which includes nonstandard models. That means you and Toby are pointing out that some different nonstandard models will have different DE results with each other, which seems trivial to me, or, that nonstandard models can have solutions to DEs that won’t work in the standard model. Which is nonstandard –> nonstandard, or nonstandard –>standard model. JB’s statement covers all standard models –> working in all nonstandard models. That seems to me what Aaron is asking about.

But MS did not disagree with Aaron’s claim, “Thus, we can’t ever prove in ZFC that the Diophantine equation has a solution or not, but as soon as we pick a model there is a definite answer.”

Well, in the standard model whether there is a solution to the class of DEs is undecidable. Why would it be that “as soon as we pick a model there is a definite answer”? This would mean that in some chosen nonstandard model, once it was chosen, that the class of all Diophantine Equations was no longer undecidable? That is what it should mean since the question of whether Diophantine Equations in general have a definite answer or not is a philosophical claim and I don’t see how it has anything to do with the discussion of actual DE solutions obtained from the standard model carrying over to nonstandard models, or the converse.

As I preview this, there is a substantial amount of blockquoting. My style of writing is intended for people who read all of a thread before posting a reply, so thereby their reading comprehension has not been attenuated. That certainly applies to people who don’t read all of one specific post before starting a reply.

Posted by: Stephen Harris on October 24, 2009 10:34 PM | Permalink | Reply to this

blockquotes

Guidelines for writing say don’t use blockquotes for one line of quoted text and rarely for two lines

That may be appropriate for writing an essay or a paper, but I don’t think it applies to email or a blog discussion, where there is much more back-and-forth and it’s harder to keep track of who said what and what was a response to what. I use blockquotes for everything I quote (at least, everything longer than a few words), and also nested blockquotes when I reply to a reply. I only suggested Markdown because I think it makes it much easier to do nested blockquotes.

Posted by: Mike Shulman on October 24, 2009 11:32 PM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Well, just looking at the phrasing of those two sentences, it should be clear that some Diophantine equations in some models have nonstandard solutions.

JB said that all Diophantine equations which have solutions in the standard model have solutions in all models, which includes nonstandard models.

Indeed. But if a Diophantine equation doesn’t have a solution in the standard model, then it can also have nonstandard solutions in some nonstandard models.

I think possibly there’s a confusion of levels going on here. In a given model of ZFC, there is a uniquely defined set called “the natural numbers.” It can be defined uniquely because in ZFC we can use second-order properties. It satisfies the Peano axioms, but there are also many other sets which satisfy those axioms (in any model of ZFC). We may call it “standard” and those others “nonstandard”—in the context of our given model of ZFC. Any Diophantine equation which has a solution in this “standard” model will also have a solution in any “nonstandard” model, but some equations that do not have solutions in the standard model may have solutions in some nonstandard models.

So far PA was our object-theory and ZFC was our meta-theory, but we can also move up a level and consider ZFC the object-theory in some ambient meta-theory. Now no matter what the metatheory is, it will contain multiple models of ZFC. And each such model will contain its own “standard” set of natural numbers, but those sets may be different in different models. That’s why I took issue with Aaron’s comment that

the Diophantine equation either has a solution in the standard integers that I’ve grown rather attached to, or it doesn’t

I was pointing out that “the standard integers” only makes sense relative to a model of ZFC, and there is no standard model of ZFC (unless you assume a metatheory strong enough to characterize it uniquely in a second-order way… but then your metatheory itself has no “standard model,”and so on). So I think it’s misleading to believe that there is a set of “standard integers” that one can have “grown attached to”.

Is that at all helpful? I’m still having trouble figuring out why you think the statements you’re quoting are contradictory.

That means you and Toby are pointing out that some different nonstandard models will have different DE results with each other, which seems trivial to me, or, that nonstandard models can have solutions to DEs that won’t work in the standard model.

Yes, those are both true, and are probably the essence of the point. I don’t know if I’d call either of them trivial. After all, by the Completeness Theorem, the existence of nonstandard models with different solvability properties for Diophantine equations is equivalent to saying there are Diophantine equations whose solvability is undecidable/independent of the axioms. I think some people have been finding that unintuitive.

Well, in the standard model whether there is a solution to the class of DEs is undecidable.

What is “a solution to the class of DEs”? We’re just talking about the existence of solutions to some particular Diophantine equation.

The meaning of “undecidable” here is that “if the axioms are consistent, then they cannot prove the existence or nonexistence of solutions to this particular Diophantine equation.” Whether or not we’ve chosen a model has nothing to do with undecidability in this sense, because picking a model doesn’t change what the axioms are or what the rules of proof are. The equation will either have or not have a solution in the particular model we’ve chosen, but that doesn’t change statements about provability.

Posted by: Mike Shulman on October 24, 2009 11:32 PM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

In a given model of ZFC, there is a uniquely defined set called “the natural numbers.” It can be defined uniquely because in ZFC we can use second-order properties. It satisfies the Peano axioms, but there are also many other sets which satisfy those axioms (in any model of ZFC).

I thought that we agreed (following Wikipedia) that ‘the Peano axioms’ means the second-order system while ‘Peano arithmetic’ and ‘PA’ mean the first-order system. So the uniquely defined set called “the natural numbers” is the only second-order model of the Peano axioms (in a given model of ZFC, up to unique isomorphism of models), but it's not the only model of Peano arithmetic (nor is it the only Henkin model of the Peano axioms).

Of course, even the unique second-order model of the Peano axioms is only unique relative to the given model of ZFC.

we can also move up a level and consider ZFC the object-theory in some ambient meta-theory. Now no matter what the metatheory is, it will contain multiple models of ZFC.

I would say, it will contain multiple models of ZFC if it contains any at all; it might not be strong enough to prove the consistency of ZFC, in which case it may have none at all. (And I suppose you want to say it's a model of the metatheory that contains these models of ZFC, just as it was a model of ZFC that contained a model of PA before?)

Posted by: Toby Bartels on October 25, 2009 12:41 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

I thought that we agreed (following Wikipedia) that ‘the Peano axioms’ means the second-order system while ‘Peano arithmetic’ and ‘PA’ mean the first-order system.

I don’t remember agreeing to that myself. It seems to me like a confusing way to make the distinction. How am I supposed to remember the difference between “axioms” and “arithmetic” and whether the “A” in “PA” stands for “Arithmetic” or “Axioms”? If this really is “standard” usage even outside Wikipedia (can you point to references?) then I guess I’ll try to fall in line. But my natural inclination is to always mean first-order unless I specifically say “second-order.”

Posted by: Mike Shulman on October 25, 2009 3:24 AM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

I would say, it will contain multiple models of ZFC if it contains any at all; it might not be strong enough to prove the consistency of ZFC, in which case it may have none at all. (And I suppose you want to say it’s a model of the metatheory that contains these models of ZFC, just as it was a model of ZFC that contained a model of PA before?)

I remember about Peano Axioms/Postulates. The quote of your post directed me in the direction of the FOM and Timothy Chow.

In order to deduce “ZFC is inconsistent” from “ZFC |- ~con(ZFC)” one needs something more than the consistency of ZFC, e.g., that ZFC has an omega-model (i.e., a model in which the integers are the standard integers).

To put it another way, why should we “believe” a statement just because there’s a ZFC-proof of it? It’s clear that if ZFC is inconsistent, then we *won’t* “believe” ZFC-proofs. What’s slightly more subtle is that the mere consistency of ZFC isn’t quite enough to get us to believe arithmetical theorems of ZFC; we must also believe that these arithmetical theorems are asserting something about the standard naturals. It is “conceivable” that ZFC might be consistent but that the only models it has are those in which the integers are nonstandard, in which case we might not “believe” an arithmetical statement such as “ZFC is inconsistent” even if there is a ZFC-proof of it.

So you need to replace your initial statement that “we assume throughout
that ZFC is consistent” to “we assume throughout that ZFC has an omega-model”; then you should see that your “paradox” dissipates. Tim

Thanks for your thoughtful comments.

Posted by: Stephen Harris on October 25, 2009 7:57 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

I will think more on your post later. Your comment discussed below caught my eye, now.

MS quoted Stephen:
“Well, in the standard model whether there is a solution to the class of DEs is undecidable.”

MS queried:
What is “a solution to the class of DEs”? We’re just talking about the existence of solutions to some particular Diophantine equation.
The meaning of “undecidable” here is that “if the axioms are consistent, then they cannot prove the existence or nonexistence of solutions to this particular Diophantine equation.”

Yuri Matiyasevich wrote:
“The set H-o is itself Diophantine, since it is defined by equation (4.6.7). This fact shows that Turing machines are incapable of of deciding whether or not the equations belonging _to one particular family of Diophantine equations_ have solutions, to say nothing of _arbitrary Diophantine equations_.

SH: It is why I often quote. Yuri said one particular family, not class, and also included particular Diophantine equations.
There are some DEs which can be solved, but there are countably infinite DEs which cannot be solved=undecidable.
I was thinking of those countably infinite unsolvable DEs in terms of a collection.


Posted by: Stephen Harris on October 25, 2009 1:17 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

I was thinking of those countably infinite unsolvable DEs in terms of a collection.

Ah, okay, so “whether there is a solution to the class of DEs is undecidable” meant “for each DE in some class, whether it has a solution is undecidable.”

I think the rest of my reply is unchanged:

Whether or not we’ve chosen a model has nothing to do with undecidability …because picking a model doesn’t change what the axioms are or what the rules of proof are. The equation will either have or not have a solution in the particular model we’ve chosen, but that doesn’t change statements about provability.

Posted by: Mike Shulman on October 25, 2009 3:28 AM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

I think the rest of my reply is unchanged:
Whether or not we’ve chosen a model has nothing to do with undecidability …because picking a model doesn’t change what the axioms are or what the rules of proof are. The equation will either have or not have a solution in the particular model we’ve chosen, but that doesn’t change statements about provability.

That isn’t under dispute (I don’t think). I don’t understand provability that well yet. Especially the difference between the computability result and provability result, although it seems to be soundness.

Posted by: Stephen Harris on October 25, 2009 5:31 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

The equation will either have or not have a solution in the particular model we’ve chosen, but that doesn’t change statements about provability.

That isn’t under dispute (I don’t think).

Okay. You said:

This would mean that in some chosen nonstandard model, once it was chosen, that the class of all Diophantine Equations was no longer undecidable?

which seemed to me to be saying that undecidability was dependent on the choice of model.

Posted by: Mike Shulman on October 25, 2009 6:13 AM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

MS quoted Stephen:
“Well, in the standard model whether there is a solution to the class of DEs is undecidable.”

MS queried:
What is “a solution to the class of DEs”? We’re just talking about the existence of solutions to some particular Diophantine equation.

The meaning of “undecidable” here is that “if the axioms are consistent, then they cannot prove the existence or nonexistence of solutions to this particular Diophantine equation.”

SH: Why is your claim about “particular” true? My information from a book by Martin Davis and another book, use the word “class”. I don’t think we have the same assumptions.

“A class of questions is recursively unsolvable if there is no single
algorithm which provides answers for all the questions in that class.

2. Decision Problems. We are at last in a position to treat, in a precise
manner, the question of decision problems, discussed in the Introduction.
As indicated there, a decision problem inquires “as to the existence of
an algorithm for deciding the truth or falsity of a whole class of statements.
. .. A positive solution to a decision problem consists of giving an algorithm
for solving it; a negative solution consists of showing that no algorithm for
solving the problem exists, or, as we shall say, that the problem is unsolvable.”

SH: I see my usage of “class” in evidence.

“R.A. Mollin, On the insolubility of a class of Diophantine equations” …

“On a class of diophantine equations related to the numbers of” …

“On the solvability of a class of diophantine equations and” …

SH: Just before I posted this I checked the replies and saw this reply of yours. But I’ve put too much work into this post to cancel it now.

MS: “Ah, okay, so whether there is a solution to the class of DEs is undecidable” meant “for each DE in some class, whether it has a solution is undecidable.

SH: I didn’t mean trivial in the mathematical sense. I meant unimportant to clearing up Aaron’s concerns. He is a physicist who probably solves DEs with computer software. He cared mostly about whether his solutions, his integers were going to be have their integrity shorn in some of other model.


Aaron Bergman wrote:
If I give you a solution to a Diophantine equation, I can phrase that solely using the Peano axioms (having established that every natural number is some successor of “zero”).

*Does it not therefore have to hold in all models thereof?*

The answer to his question, imo, is a simple yes. I don’t think your answer or Toby’s assure Aaron. JB’s answer, says yes it holds in any model.

Other Aaron Bergman wrote:
“Because those are _my_ integers, dammit.
Thus, the Diophantine equation either has a solution in the standard integers
that I’ve grown rather attached to, or it doesn’t.”

SH: So replies like

TB: “All Mike said was that there are some Diophantine equations that have solutions in some models and none in others”

MS replied: “That’s okay as long as you realize that “the standard integers”
will be different in different models of ZFC.”

SH: What those answers don’t do is answer Aaron’s expressed concern, AB wrote:
“*Does it not therefore have to hold in all models thereof?*”

Your answers give him some room to suspect that his DE solutions obtained in “the standard integers” don’t hold in all models thereof” because your answers aren’t clear like JB’s ‘yes they hold in any model” so your answers are obfuscations, because they present unimportant information to Aaron at the expense of a direct answer to his question? Particularly Toby’s reply could be interpreted as some DEs have solutions in some of the models which concern Aaron, but not in _all_ models which concern Aaron. Your replies concerned exceptions that Aaron did not ask about.

I guess if you don’t understand my point we may as well drop this. I don’t think you understood the trend of Aaron’s questions, what he wanted to know.

Also PA does mean 1st-order. The 2nd-order Peano Axioms which have an axiom of inductance is replaced by an inductive schema in Peano Arithmetic.

Here is something I may be wrong about. I think that Q, Robinson Arithmetic, has the axioms of Peano Arithmetic minus the inductive schema. It still is complex to fall under Godelian Incompleteness, which I think means it is sufficient to prove that Hilbert’s 10th Problem regarding the unsolvability of DEs can be established in Q also. I’m going the other way with this than what Toby wrote, which I didn’t understand really.


Posted by: Stephen Harris on October 25, 2009 5:18 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Remember that there are two different meanings of “undecidable.” Anything that you see talking about undecidability of a “class” of problems is almost surely talking about the computability meaning, which as we’ve said several times is distinct from the logical one, which is the one in use here.

If I give you a solution to a Diophantine equation, I can phrase that solely using the Peano axioms (having established that every natural number is some successor of “zero”).

Does it not therefore have to hold in all models thereof?

Certainly, the answer to this is a simple “yes” as long as “I give you” means “I write out explicitly on a piece of paper” or something like that, because in that case the integers making up the solution certainly must exist in any model. I guess I didn’t feel the need to say that explicitly. If Aaron was confused about that, then I’m sorry. But I thought it sounded like he was fairly satisfied with the outcome.

The philosophical point I was making is that it doesn’t follow from this that you can identify exactly what “your” integers are in order to assert that the Diophantine equation has a solution in them or doesn’t. Because the mere existence of a solution in some model doesn’t mean that that solution could necessarily be “given” in such a way. But when working in ZFC, you can identify “the” integers, and that’s fine. Although John pointed out that a model of PA is almost as good for identifying “the” integers. And in neither case do we have any canonical way to pick a model without being given one to start out with. But for practical purposes, of course one can just work in ZFC or whatever (or in no foundation at all, for that matter); this is all basically philosophy.

Posted by: Mike Shulman on October 25, 2009 6:31 AM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Remember that there are two different meanings of “undecidable.” Anything that you see talking about undecidability of a “class” of problems is almost surely talking about the computability meaning, which as we’ve said several times is distinct from the logical one, which is the one in use here.

I agree that the logical one was the usage for undecidability here. Why is it that it is useful and true to speak about classes in the computability usage, but not in the logical usage? Is it an advantage or a limitation that the logical usage of undecidability apparently doesn’t permit talking about a collection of like undecidable problems?
You apparently had a good reason for remarking,

What is “a solution to the class of DEs”? We’re just talking about the existence of solutions to some particular Diophantine equation.

So another way of asking my question is how is it a conceptual improvement to talk about the existence of solutions only to a particular DE when it seems true that there are true statements which apply to families of DEs with solvability problems and their solutions?
Perhaps there is a trade off and the logical approach enjoys an advantage in another area.

I haven’t finished reading Yuri’s book on Hilbert’s 10th. He mentions that integers are used but I haven’t read anything about a problem defining the standard integers, yet.

“In the Light of Logic” by Sol Feferman

It was only when Turing, in 1937, offered the definition in terms of his “machines” that Godel was ready to accept such an identification, and thereafter he referred to Turing’s work as having provided the “precise and unquestionably adequate definition of formal system” by his “analysis of the concept of ‘mechanical procedure’” needed to give a general formulation of the incompleteness results.

Thank you for explaining your reasoning in regard to the other issues.

Posted by: Stephen Harris on October 25, 2009 1:05 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

I agree that the logical one was the usage for undecidability here.

Great. I think this means that the quotes you gave in this post about undecidability of “a class of questions” are irrelevant, since they are talking about the computability usage.

Why is it that it is useful and true to speak about classes in the computability usage, but not in the logical usage?

The logical usage is about a single statement: we say PP is undecidable in the theory TT if (assuming TT is consistent) neither PP nor ¬P\not P is provable from TT. One can of course speak about classes of statements, for instance one can assert that “for every nn, the statement P nP_n is undecidable”, or one can move the quantifier inside and assert “the statement n.P n\forall n. P_n is undecidable” or “the statement n.P n\exists n. P_n is undecidable.”

By contrast, the computability usage can only be about a class of problems, since any single problem is always decidable in the computability sense. (If there is a solution SS, then the problem is solved by the algorithm print S, and otherwise it is solved by the algorithm print "no solution".) Since the meaning of “undecidable” in the two cases is different, I don’t think it makes sense to say that one has an “advantage” or “disadvantage” in any sense; it’s comparing apples and oranges.

So another way of asking my question is how is it a conceptual improvement to talk about the existence of solutions only to a particular DE when it seems true that there are true statements which apply to families of DEs with solvability problems and their solutions?

It’s a more precise statement to say “here is a particular Diophantine equation pp whose solvability is undecidable (in the sense of logic)” than to say “there exists a Diophantine equation whose solvability is undecidable (in the sense of logic)”. And both of them are totally different (apples and oranges) from saying “the class of problems that consist of solving Diophantine equations is undecidable (in the computability sense).”

That isn’t to say there aren’t relationships between the two kinds of undecidability, for instance I believe that at least one of the Incompleteness Theorem and the Halting Problem can be used in the proof of the other. But the meanings of the word in the two cases are apples and oranges. That’s why you probably won’t find anything about nonstandard integers in a book on Hilbert 10, and you won’t necessarily find anything about Turing machines in a book on model theory.

Posted by: Mike Shulman on October 26, 2009 3:37 AM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

By contrast, the computability usage can only be about a class of problems, since any single problem is always decidable in the computability sense. (If there is a solution S, then the problem is solved by the algorithm print S, and otherwise it is solved by the algorithm print “no solution”.) Since the meaning of “undecidable” in the two cases is different, I don’t think it makes sense to say that one has an “advantage” or “disadvantage” in any sense; it’s comparing apples and oranges.

This seemed to be your major point as to why the usages of logical undecidability and computability undecidability were apples and organges. This is not how I learned it =/= your idea that any _single_ problem is always decidable in the computability sense. I think you gave the definition of a decidable problem, not (semi)decidable) or undecidable.
I believe that computability undecidability applies to particular Diophantine equations as well as to families of Diophantine equations. One such DE isn’t always decidable in the computability sense.

Decidability
“A decision problem A is called decidable or effectively solvable if A is a recursive set. A problem is called partially decidable, semidecidable, solvable, or provable if A is a recursively enumerable set. Partially decidable problems and any other problems that are not decidable are called undecidable.”

Decidable Problem: Definition: A decision problem that can be solved by an algorithm that halts on all inputs in a finite number of steps. The associated language is called a decidable language.

Partially Decidable Problem
Definition: One whose associated language is a recursively enumerable language. [SH: Diophantine equations are r.e. but the undecidable DEs are not recursive.]

Equivalently, there exists an algorithm that halts and outputs 1 for every instance having a “yes” answer, but for instances having a “no” answer is allowed either not to halt or to halt and output 0.

The above partially decidable does halt and output “print” or 1 for every instance with a “yes” answer, but the “no” answer does not output “no solution” (unless there is an artificial time constraint applied to the algorithm) because it cannot be determined whether there is a solution, but not found in the allotted time, or there really is no solution. One can’t tell why it doesn’t halt when it is partially decidable, or undecidable, and there is no way of knowing that ahead of time. Supposing, you pick some arbitrary, particular DE equation from an urn which contains a countably infinite number of decidable DEs and also a countably infinite number of undecidable DEs, which is why the set is (semi)decidable.

This is a different short summary of the book written by Matiyasevich on Hilbert’s 10th Problem.

The essence of a decision problem lies inthe requirement to find a single method that can be used for obtaining the answer to any of the individual subproblems. Since Diophantus’s time, number theorists have found solutions to many Diophantine equations and have established the unsolvability for many other equations. However, for many particular classes of equations, and even for certain individual equations, it was necessary to invent a special method. In the Tenth Problem, Hilbert asked for a universal method for deciding the solvability of Diophantine equations. Although Hilbert asked for a general method for deciding whether a single Diophantine equation does or does not have a solution, Diophantus had considered systems of equations as well. However, it is easy to see that a positive solution of Hilbert’s Tenth Problem would also give us a method for deciding whether a system of Diophantine equations does or does not have a solution.

This fact shows that Turing machines are incapable of of deciding whether or not the equations belonging _to one particular family of Diophantine equations_ have solutions, to say nothing of _arbitrary Diophantine equations_.

In the previous sections of this chapter we have obtained two, no doubt remarkable, results. Namely, we have establish that

the class of Diophantine sets

is identical to

the class of Turing semidecidable sets

and Hilbert’s Tenth Problem is Turing undecidable.

It may be the case that the two “undecidabilities” are indeed apples and oranges, but it doesn’t appear to me that the reason is this one that you provided (partially quoted).

Posted by: Stephen Harris on October 27, 2009 12:59 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

The reason they are apples and oranges isn’t because one applies to classes and one doesn’t; it’s just because their definitions are different. One is about provability, the other is about computability. They use different terms; they’re from different branches of mathematics.

That said, I stand by my assertion that any single problem is always decidable in the computability sense, for the reason I gave above; I didn’t see anything in the references or quotes you gave that contradicts this. For a fixed Diophantine equation PP, the question of whether PP has a solution is always computability-decidable for this reason. A different problem is whether the set of solutions of PP is computability-decidable, but it is, because for any putative solution we can simply plug it into the equation and see whether we get zero.

However, the set of all Diophantine equations that have solutions is semidecidable (we can enumerate all (equation,solution) pairs and will eventually get to every equation that has a solution), but not computability-decidable. This follows from Hilbert 10, which as in the quote you quoted, exhibits a particular family of Diophantine equations such that the set of all equations in that family which have a solution is not computability-decidable.

Posted by: Mike Shulman on October 27, 2009 1:40 AM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

qcpages.qc.cuny.edu/~rmiller/Notices.pdf

Of course, in computability theory, our algorithms often involve simple (perhaps even simple-minded) search procedures. An analogous situation arose in Hilbert’s Tenth Problem, which demands an algorithm for determining whether an arbitrary Diophantine equation (i.e. a polynomial in Z[X1, … , Xn], for any n) possesses a solution in Zn.
Intuitively, the real question is to find such a solution, not merely to prove that one exists. However, a simple search procedure suffices: just check, for each m ≥ 0 in order, whether any ~a ∈ Zn with Pi |ai| = m solves the equation. This algorithm certainly produces a solution, assuming only that one exists; the difficulty is that if no solution exists, the algorithm will simply run forever, without ever giving an answer. So the question of effectively finding a solution reduces to Hilbert’s question of how to decide whether such a solution exists.
Matijasevic proved in [7], building on work of Davis, Putnam, and Robinson,
that no algorithm exists which will compute the set of Diophantine equations
possessing solutions, by showing that such an algorithm would allow us to
compute K (and all other c.e. sets).
[Repeating for emphasis]
“This algorithm certainly produces a solution, assuming only that one exists; the difficulty is that if no solution exists, the algorithm will simply run forever, without ever giving an answer.”

SH: That means there is no such thing as an algorithm which will print “no solution” as you previously stated for any particular undecidable Diophantine equations (which means arbitrary).

MS said: ” (If there is a solution S, then the problem is solved by the algorithm print S, and otherwise it is solved by the algorithm print “no solution”.)

I think JB thinks this aspect of the discussion has drifted off-topic so you
may have the last remark if you want.

Posted by: Stephen Harris on October 27, 2009 2:56 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

no algorithm exists which will compute the set of Diophantine equations possessing solutions

That means there is no such thing as an algorithm which will print “no solution” as you previously stated for any particular undecidable Diophantine equations (which means arbitrary).

No, it doesn’t mean that. What I said was that for any particular Diophantine equation, there is an algorithm which will tell you whether it has a solution; namely the trivial algorithm I described. These algorithms are obviously different for different equations. The point that you quoted says the different, also true, statement that there is no single algorithm that works for all equations.

By all means, let’s wrap this up; we both seem to be reduced to repeating ourselves. (-:

Posted by: Mike Shulman on October 27, 2009 4:12 AM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

What I said was that for any particular Diophantine equation, there is an algorithm which will tell you whether it has a solution; namely the trivial algorithm I described. (If there is a solution S, then the problem is solved by the algorithm print S, and otherwise it is solved by the algorithm print “no solution”.)

Explain to me how this “algorithm” halts, or terminates so that it can print “no solution”? An undecidable problem can’t halt or terminate, therefore it can never print “no solution”. What you think of as a possible simple algorithm cannot possibly exist, it is not an effective procedure and it never can be. And if the algorithm does not halt, you cannot infer that it is undecidable. Here is a related idea.

http://www.efgh.com/math/impossible.htm
“Now consider the problem of determining whether a program can print out a specified string S (with or without other output). If you can solve the halting problem, you can solve this problem.”

It would be nice to have another expert opinion. It seems to me that you are trying to say that each and every (all the particular instances) of the Diophantine set are decidable by different algorithms; and only the totality of the set of Diophantine equations are undecidable, and that is because what is meant is just by one algorithm. I think that if each DE is decidable (even by many, many different algorithms) that the answer to Hilbert’s Tenth Problem would be Yes, instead of No. [www.cs.ucr.edu/~jiang/cs215/ravi-new.pdf]


Posted by: Stephen Harris on October 27, 2009 7:13 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

If there is a solution S, then the problem is solved by the algorithm print S, and otherwise it is solved by the algorithm print "no solution".)

Explain to me how this “algorithm” halts, or terminates so that it can print “no solution”?

Go write a program in your favorite programming language that consists of just the one instruction print "no solution". (Plus whatever other cruft is required to make it a valid program.) Run it. You will see that it terminates. Quite quickly, in fact.

Posted by: Mike Shulman on October 27, 2009 7:56 AM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

You claimed that the “algorithm”, IF it found that a particular Diophantine equation had no solution, THEN it would print out “no solution”. Determining whether a particular Diophantine equation is decidable or undecidable (has a solution) is a decision problem. I don’t see how your example qualifies as a decision problem or helps to explain how one would work.

MS: Go write a program in your favorite programming language that consists of just the one instruction print “no solution”. (Plus whatever other cruft is required to make it a valid program.) Run it. You will see that it terminates. Quite quickly, in fact.

Your example leaves out the part where the algorithm has to decide something first before it can terminate with Print “no solution”. That is not in the same category as calling the Print function.

Posted by: Stephen Harris on October 27, 2009 12:09 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

This paper seems relevant to the subject matter of this blog entry — perhaps even more relevant than most of what we’ve been talking about!

It even mentions ‘setoids’, which previously I’d only heard about from Toby. A very important concept!

(I don’t love the Wikipedia definition of a setoid as a set with an equivalence relation, since it seems to miss the point that setoids are mainly interesting as a concept that comes before sets in certain constructive approaches to foundations.)

Posted by: John Baez on October 26, 2009 8:32 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

(I don’t love the Wikipedia definition of a setoid as a set with an equivalence relation, since it seems to miss the point that setoids are mainly interesting as a concept that comes before sets in certain constructive approaches to foundations.)

I don't think that's really correct. IME, the term ‘setoid’ is used only after the term ‘set’ has been introduced, usually in a context where the sets are not sufficiently general (and in particular do not form a pretopos because one cannot take quotient sets). This is fixed by moving from these too specific sets to setoids.

In contrast, if you call these too specific things ‘types’, ‘presets’, or ‘completely presented sets’ (probably other terms are also used), then you never need the term ‘setoid’ at all but can simply say ‘set’ instead. So a ‘setoid’ can always be a ‘set’ with an equivalence relation, whatever a ‘set’ may be.

In fact, Palmgren's use of ‘setoid’ to contrast with ‘iterative set’ somewhat matches (in the contexts of his paper) our contrast of ‘structural set’ with ‘material set’; perhaps he wouldn't have felt the need to use such terminology if he'd had our terminology. As evidence that it is new, note that his ‘type’/’setoid’ terminology (as opposed to ‘type’/’set’ or ‘set’/’setoid’) doesn't match the usages cited on page 9 of this reference (which is also by Palmgren, although page 9 seems to be attributed to Aczel).

However, I'm the one who started the Wikipedia entry with that definition, so keep that in mind when I defend it. (^_^)

Posted by: Toby Bartels on October 26, 2009 6:05 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

I realize now that I’d been mixing up ‘setoids’ with ‘presets’. You’d been telling me about presets! Everything I said here about setoids was supposed to be about presets!

Life is so complicated…

Posted by: John Baez on October 28, 2009 4:03 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Hi, thanks for this reference and to Mike for his explanation for an issue which must seem too basic. I was looking at the FAQ on how to make posts look nice (which I think you wrote) so here goes. Recall that according to the Wikipedia, that the Peano Axioms have a 2nd-order Inductance axiom which becomes the 1st-order “inductance scheme” in Peano Arithmetic.

$$\omega$$-inconsistent theories are formally consistent with the PA axioms, and thus have models by Gödel’s completeness theorem. These are called non-standard models of arithmetic. They basically have all the sets of naturals the ordinary natural numbers have, except they admit more subsets of the naturals – they admit vague sets of natural numbers as well as the old precise sets. Intuitively this is right – when we only had precise sets we got into all sorts of trouble. We couldn’t even talk about the set of large numbers because it didn’t exist; it was a vague set.

What is interesting is that some of these new sets of natural numbers don’t have smallest members. In fact, the set of all non-standard elements is one of these sets, but there are many others. So my suggestion here is that the set of large numbers is one of these non-standard sets of naturals.

How to make sense of all this? Well, the first thing to bear in mind is that the non-standard models of arithmetic are not to be taken too seriously. They show that the view in question is consistent, and are also a good guide to seeing what sentences are in fact true. For example in a non-standard model the second order universally quantified induction axiom is false, since the second order quantifiers range over vague sets, however the induction schema is true, provided it only allows instances of properties definable in the language of arithmetic (this is how the schema is usually stated) since those instances define only precise sets.

Posted by: Stephen Harris on October 26, 2009 6:10 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

ω\omega-consistent theories

This server works oddly. When I hit cancel and before the post is accepted the post isn’t canceled. So I wind up with duplicate posts because the original post doesn’t show up on the server for some time after either, when I check to see if the cancellation took effect.

The FAQ says one can use “markdown with itex to MathML” if you prefer a simple markdown system.

But, I had to use itex to MathML with parbreaks. I think the server should be placed on reduced oil rations.

Posted by: Stephen Harris on October 26, 2009 7:28 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

When I hit cancel and before the post is accepted the post isn’t canceled.

After you click “post,” the comment is usually posted pretty quickly, even though the server takes a while to return to you from that request. (Jacques has said this is probably due to sending a notification email to the poster synchronously.) You shouldn’t expect to be able to cancel your posting after you click “Post,” just like you shouldn’t expect to be able to back out of buying something at Amazon after you click “Purchase”.

But, I had to use itex to MathML with parbreaks.

Can you be more specific? Why did you “have to” use it?

Posted by: Mike Shulman on October 26, 2009 7:40 PM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

I couldn’t produce both the TeX for “omega” and a blockquoted text with “Markdown with itex to MathMl without errors. Using the identical content but changing the text filter to: itex with MathMl and parbreaks there were no errors. Originally on this post effort, I tried to include the errors, but this created additional errors. Maybe I should have put them inside quotes. However, the server then aborted completely with my next attempt, which means I don’t know whether my first reply on this topic is going to appear or not. So I’m going to try to leave the text filter permanently set on itex to MathMl with parbreaks if the remember cookie button works.

Posted by: Stephen Harris on October 26, 2009 9:32 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

I think the problem was here,
itex to MathML which uses,
Raw HTML (you need to insert all your left/right arrow p tags by hand), but with
embedded itex equations.

I thought that Markdown would take care of the insertion of the paragraph tags, as in Markdown with itext to MathML. I thought that Markdown did the par tags automatically.

But, apparently, this is only automatic with itex to MathML with parbreaks.

Posted by: Stephen Harris on October 26, 2009 9:56 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Markdown does insert paragraph tags at blank lines. Were you maybe trying to use explicit <blockquote> tags? Markdown just wants you to use > at the beginning of each blockquoted paragraph.

Posted by: Mike Shulman on October 26, 2009 10:07 PM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Yes

Posted by: Stephen Harris on October 27, 2009 11:00 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Mike wrote:

Why did you “have to” use it?

If you go into the back door of the Café and look at the source code of his comment you’ll see what he did. Among other things, he wrote

$$\omega$$-inconsistent theories

and got

$$\omega$$-inconsistent theories

instead of

ω\omega-inconsistent theories.

Why would he want double dollar signs here instead of single ones? Only God knows! Personally I would prefer to write

$\omega$-inconsistent theories

and get

ω\omega-inconsistent theories

But that’s just me.

He claims this happened when he was using ‘Markdown with itex to MathMl’ — unfortunately it’s impossible to verify this (it’s easy to screw up), and since I never use this particular text filter I don’t know how it’s supposed to work. I tried to fix his comment as soon as I saw it, but I didn’t know how.

Posted by: John Baez on October 28, 2009 4:15 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

All those constructivists use the word “set” to mean “type”. It’s confusing to classical mathematicians (who expect “set” to mean ZFC-style sets), but it’s an standard usage.

Translating the Wikipedia article to standard mathematician, “A setoid is a type equipped with an equivalence relation.” When you’re doing theorem proving in dependent type theory, they show up quite frequently, in irritatingly non-esoteric cases.

Posted by: Neel Krishnaswami on October 26, 2009 6:12 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

All those constructivists use the word “set” to mean “type”. It’s confusing to classical mathematicians (who expect “set” to mean ZFC-style sets), but it’s an standard usage.

And it's confusing to categorially-minded mathematicians, who don't expect sets to be ZFC-style but do expect them to form a pretopos.

But I take issue with ‘All those constructivists’. Bishop, in particular, does not use ‘set’ to mean type; he uses ‘preset’ for that. His sets do form a pretopos and are comprehensible as such to categorially-minded structuralists.

Posted by: Toby Bartels on October 26, 2009 6:34 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

But I take issue with ‘All those constructivists’. Bishop, in particular, does not use ‘set’ to mean type; he uses ‘preset’ for that. His sets do form a pretopos and are comprehensible as such to categorially-minded structuralists.

This is a fair criticism – I was thinking of systems like Coq and Agda, which do use this terminology. They inherit this usage from Martin-Loef, who does call types “sets”, even though his philosophy is as vigorously predicative as you might want.

Posted by: Neel Krishnaswami on October 27, 2009 11:27 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

All those constructivists use the word “set” to mean “type”. It’s confusing to classical mathematicians (who expect “set” to mean ZFC-style sets), but it’s a standard usage.

And it’s confusing to categorially-minded mathematicians, who don’t expect sets to be ZFC-style but do expect them to form a pretopos.

I would go so far as to say that most mathematicians (most of whom are “classical”) use “set” to mean “type,” even if they are not aware of it. When questioned about what a “set” is, they may fall back on a ZFC-style conception since that’s what they’ve been taught, but if you look at how they use sets, they never make use of a global membership predicate. That is, they treat sets “structurally,” which is to say they treat them like types. Hence why I believe structural set theories like ETCS and SEAR are justified in using the word “set” (rather than “type”) even though their sets don’t have a global membership predicate like ZFC-style sets do.

Posted by: Mike Shulman on October 26, 2009 7:04 PM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

I would go so far as to say that most mathematicians (most of whom are “classical”) use “set” to mean “type,”

Where of course if you don’t believe in quotient types, you should replace “type” by “setoid” or whatever other word you want to use for “a quotient of a type”.

Posted by: Mike Shulman on October 26, 2009 7:37 PM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

I think the most characteristically set-theoretic operation is powerset. I imagine that most mathematicians would be more shocked if something like the Knaster-Tarski theorem failed to hold, than by strictures about type structure (which I agree they mostly already observe). But powerset turns sets into objects, which is alien to type theory – you never compute with types.

You can have data representing types, and you can have hierarchies of judgements to classify types, and types can occur in terms, but a type itself is never an object classified by a type. In fact, Girard showed that the most natural way of adding this to type theory was inconsistent.

You could argue, and I would probably agree, that what we really want is not powerset per se, but impredicativity. But I don’t think anyone has a satisfying (ie, both normalizing and with decidable type checking) proof theory of impredicative type theory extended with the necessary parametricity axioms. This is not a problem for consistency, but I don’t think you really have a respectable type theory until you can show cut-elimination/normalization for it.

Posted by: Neel Krishnaswami on October 27, 2009 11:16 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

a type itself is never an object classified by a type

I agree that this is more set-theoretic than type-theoretic, and bad for proof theory, and I’m not a type theorist, but I think “never” is too strong a word. I’ve definitely seen type theories that contain powertypes, which classify subtypes, and universes, which classify “small” types.

Toby and I had a discussion about the difference between type theory and set theory starting here.

Posted by: Mike Shulman on October 27, 2009 5:52 PM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Yeah, I used the word “never” precisely because it is so strong – and those examples you mention were precisely what I was thinking of when I qualifieds that claim. A type theory with a genuine, consistent powerset operation would be a fascinating beast.

That link is pretty interesting. As an aside, it’s fairly easy to give consistent type theories where you *can* quantify over all types. The second-order lambda calculus (Girard’s System F) is an example, and the Calculus of Constructions is an example of one more suitable for use for developing mathematics. The interesting thing about them is that while you can use their impredicativity to *define* things like the naturals, you can’t easily show uniqueness. (I think the categorical way of saying this is something like “you can only construct a weak natural numbers object in the internal language”.)

Personally, I regard the consistency of these impredicative theories as something of a mathematical miracle, since (with the addition of axioms to handle uniqueness) they capture every bit of impredicative reasoning I have ever felt the slightest urge to do, while still blocking all the paradoxes. In fact, I feel the absence of those quantifiers in ordinary mathematics quite keenly.

But I’m not sure: did you mean something else by “quantification over all types”?

Posted by: Neel Krishnaswami on October 28, 2009 10:44 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

A type theory with a genuine, consistent powerset operation would be a fascinating beast.

I don’t understand what you’re after here. The higher-order type theory that describes the internal language of a topos has a powerset operation and is presumably consistent. Is it not “genuine”?

it’s fairly easy to give consistent type theories where you can quantify over all types. The second-order lambda calculus (Girard’s System F) is an example, and the Calculus of Constructions is an example of one more suitable for use for developing mathematics.

Thanks! I will have to learn more about those (I don’t even know enough about them to answer your question). As I said, I’m not a type theorist.

I’m curious: would you call SEAR a type theory, as Toby would?

Posted by: Mike Shulman on October 28, 2009 3:04 PM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Sorry, my language was confused: I think all of the examples you give are type theories. The extra principle that I want is for a type theory to additionally satisfy the propositions-as-types principle.

Toposes/higher-order-logic don’t meet this requirement: they give us a distinguished type of propositions, and a set of axioms about that type. That is, they give us a distinguished truth value object (the subobject classifier), and the logical connectives are (generalized) elements of the subobject classifier, with logical deductions showing up as facts about the equality of these elements. This feels to me a bit like building upon an algebraic/lattice-theoretic view of logic and truth – provability is the fundamental notion, and proofs have no independent existence.

However, props-as-types asks that all the types of our languages be interpretable as propositions, and all the terms be intepretable as the proofs of those. E.g., the simply-typed lambda calculus is a model of intuitionistic propositional logic, and each type is also readable as a proposition. Its categorical models are CCCs, so that (e.g.) conjunction is interpreted as products and disjunction as coproducts, implication as exponentials, and so on. Furthermore, the morphisms represent the proofs, so this is a semantics where proofs do have a life of their own.

So what I want is something similar, only extended to a full mathematical vocabulary. I would not be terribly surprised to learn that this has already been done, or that I need to shift my perspective to understand it – my knowledge of categorical logic isn’t remotely comprehensive.

Posted by: Neel Krishnaswami on October 28, 2009 4:49 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Can you say precisely what you mean by “satisfy the propositions-as-types principle”? I mean, in any type theory at all I can interpret any type AA as the proposition “AA is inhabited” and any term a:Aa:A as an existence proof that AA is, indeed, inhabited. But clearly you have something else in mind.

Posted by: Mike Shulman on October 29, 2009 1:56 AM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

MS: But clearly you have something else in mind.

Yes, I’m guessing he has the propositions as types principle in mind.

Folks who are interested in some Diabolic Variations on this Theme might want to check out my fugue-ative canon, still in search of title — Phantom of the Operators is already taken, so I’m currently contemplating The Floating Operator.

Posted by: Jon Awbrey on October 29, 2009 2:02 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

So, running with your examples, suppose we have particular types AA and BB, which we interpret as the propositions “AA is inhabited” and “BB is inhabited”. Then, we can take a:Aa : A and b:Bb : B as existence proofs of these relationships.

Now, let’s ask the question: is there a type corresponding to the particular proposition “AA is inhabited and BB is inhabited”?

Unless there is a type-former corresponding to logical and, then the answer to that question is no! A logic over a type theory has the propositions-as-types property if each of the logical connectives we’re interested in is given by means of a type constructor. (For example, we could interpret conjunction as pairing, and introduce a type A×BA \times B, whose introduction terms are pairs (a,b)(a,b).)

Then, the terms of the type theory will constitute the proofs of the propositions in our logic. Therefore a model of the type theory will automatically give you a model for the proofs of our logic, too.

To contrast, let’s compare this to a topos model of logic. There, the closed sentences of the logic are interpreted as elements of the suboject classifier, and two sentences are logically equivalent if they are the same element. So the sentence “AA is inhabited and BB is inhabited” is interpreted as some element of the truth value object Ω\Omega, and it is true when it is equal to the element for true. So a model of the type theory doesn’t contain a model of the proofs of the logic – it contains a model of truth for the logic (e.g., Ω\Omega will be a Heyting algebra for intuitionistic logic).

As an aside, thanks for your patience – explaining what I think has been very helpful for helping me understand semantics better. (Mostly I know structural proof theory, and the little categorical logic I’ve picked up has been out of guilt for having an unbalanced view of the subject.)

Posted by: Neel Krishnaswami on October 29, 2009 2:48 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

I’m still confused—the type theory of a topos does have a product type operator, so what’s the problem? Why can’t I interpret “AA is inhabited and BB is inhabited” as “A×BA\times B is inhabited”?

You haven’t specified a type theory that does satisfy what you want, but I’m imagining something like Martin-Lof dependent type theory. However, I think I can get the internal type theory of a topos from Martin-Lof type theory just by adding something, namely the subobject classifier. How does simply making the language more expressive destroy the property you’re interested in?

Posted by: Mike Shulman on October 29, 2009 3:46 PM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Why can’t I interpret “A is inhabited and B is inhabited” as “A×B is inhabited”?

You can, but that’s just not what conjunction means for the usual interpretation of higher-order logic in topos theory. In that intepretation, conjunction is the meet in the algebra of truth values Ω\Omega gives you.

If you choose to interpret the logical connectives by means of type operators, then you aren’t using the subobject classifier to interpret the logic. It’s a different interpretation, with different properties.

Since we have two different interpretations, we can’t move between them without a theorem telling us how.

This all seems obvious, so my wording must be flawed in some maximally opaque way….

How does simply making the language more expressive destroy the property you’re interested in?

I hope I’ve answered your first question already, but this question is one of my personal hobby horses. :)

I think this happens all the time, all over mathematics. As a concrete example, think of data abstraction in programming languages.

Suppose we have an abstract type of natural numbers, which we might implement either in unary form, or as binary sequences. If we only expose the arithmetic operations to the client programmer, then we expect he won’t be able to tell which representation you choise (ignoring questions of runtime). However, if we add to the language the ability to ask a term what its representation is, then the client programmer can tell the difference between the two representations, and write programs which do things differently depending on the choice.

In other words, we lost a desirable property (data abstraction) when we made the language more expressive (by adding the ability to examine representations).

(The analogy to global membership should be irresistible!)

Posted by: Neel Krishnaswami on October 29, 2009 5:03 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

The interpretation of higher-order logic in topos theory that I’m most familiar with interprets a proposition φ(x)\varphi(x) with a variable x:Ax:A of type AA as a subobject [φ]A[\varphi] \hookrightarrow A. In particular, a variable-free proposition is interpreted by a subterminal object (or type), and the assertion that this proposition is “true” is the assertion that the corresponding subterminal object (or type) is inhabited.

Now of course, subobjects of AA are in bijection with morphisms AΩA\to \Omega, and in particular subterminal objects are in bijection with global elements 1Ω1\to \Omega. But it doesn’t seem that essential properties of the interpretation should be changed by passage back and forth across such a bijection.

I do sort of grasp the intuition that there is some desirable feature that the higher-order logic of a topos lacks, but for the above reason it doesn’t seem to me like “it is possible to interpret propositions as inhabitation of types” is quite it.

How does simply making the language more expressive destroy the property you’re interested in?

I think this happens all the time, all over mathematics.

Yes, you’re certainly right; my phrasing was overly general. I still don’t understand, however, how exactly it is that making this particular language more expressive ends up destroying the particular property that you’re interested in—probably mainly because I still don’t really grasp what exactly the property is that you’re interested in.

As an aside, thanks for your patience

It’s I who should be thanking you! (-: This is an aspect of type theory that I’ve never really been able to wrap my head around. Perhaps I am more dense than most and you can straighten me out.

Posted by: Mike Shulman on October 30, 2009 2:32 AM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

The interpretation of higher-order logic in topos theory that I’m most familiar with interprets a proposition φ(x) with a variable x:A of type A as a subobject [φ]↪A. In particular, a variable-free proposition is interpreted by a subterminal object (or type), and the assertion that this proposition is “true” is the assertion that the corresponding subterminal object (or type) is inhabited.

If I correctly understand what “subterminal” means, I think it exactly pinpoints the difficulty. That is, a subterminal object is one in which all morphisms into it are equal. Now, as of course you know, we interpret the closed terms showing inhabitation of a type AA as the morphisms from 1A1 \to A.

Putting these two facts together, this means if we interpret a proposition by means of a subterminal object, then there can be at most one closed term in the syntax having that type. This means that we cannot take the proofs of a proposition to be the terms of that proposition’s type – e.g., even if we have the arrow from 11 to the object for the proposition ABA \vee B in our hand, we don’t know which of AA or BB was used to prove the disjunction. The term we have is a witness only to the statement’s truth, and is not an actual proof that we can examine and get one of AA or BB out of.

This is the sense in which a topos gives us a semantics of provability, rather a semantics of proofs.

Posted by: Neel Krishnaswami on October 30, 2009 6:12 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Putting these two facts together, this means if we interpret a proposition by means of a subterminal object, then there can be at most one closed term in the syntax having that type.

It sounds like the property you really want is that “it is possible to interpret propositions as inhabitation of types that can be inhabited by more than one unequal term”?

even if we have the arrow from 1 to the object for the proposition ABA\vee B in our hand, we don’t know which of AA or BB was used to prove the disjunction.

I don’t expect this is quite what you meant, but I believe the higher-order logic of a topos does have the “disjunction property” that if ABA\vee B is provable, then either AA or BB is provable.

This is the sense in which a topos gives us a semantics of provability, rather [than] a semantics of proofs.

Perhaps this will betray my fundamental lack of understanding of the mentality of type theorists, but I don’t know what is meant by a “semantics of proofs.” I have always been taught that proofs belong to the domain of syntax, while models belong to the domain of semantics. Maybe if you can explain this, or point me to a reference which explains it, I will understand better what you’re getting at.

Posted by: Mike Shulman on October 30, 2009 7:08 PM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Hi Mike,

I can give it a try. I’ll try and tell a story of proof-theoretic semantics, and to do so I’ll use the philosophy of intuitionism to give it a narrative arc. But as is normal a) the philosophical viewpoint yields a bunch of theorems and technical devices that practical-minded people can use without thinking about philosophy, and b) there are yet other philosophically-minded people who agree with the technical program but reject the philosophy in favor of a different one.

As you probably know, intuitionistic logic can be understood using the BHK interpretation of the connectives. That is, we take the assertion “AA is true” to mean “it is possible to construct a canonical proof of AA”, where by canonical proof we mean the following things:

  • A canonical proof of \top is the datum ()().
  • A canonical proof of ABA \wedge B is a pair (a,b)(a, b) where aa is a canonical proof of AA and bb is a canonical proof of BB.
  • There is no canonical proof of \bot.
  • A canonical proof of ABA \vee B is a pair (i,p)(i, p), where ii is either the symbol 00 or 11, and if ii is 00, then pp is a canonical proof of AA, and if ii is 11, then pp is a canonical proof of BB.
  • A canonical proof of ABA \implies B is a procedure, which given a canonical proof of AA as an input, constructs a canonical proof BB as an output.
  • A canonical proof of x.P(x)\forall x.\; P(x) is a procedure, which given a numeral nn as an input, constructs a canonical proof of P(n)P(n) as an output.
  • A canonical proof of x.P(x)\exists x.\; P(x) is a pair (n,p)(n, p), where nn is a numeral and pp is a canonical proof of P(n)P(n).
  • A canonical proof of an atomic proposition PP is whatever primitive notion of verification is suitable for it.

Now, the first thing to note is that nobody ever writes canonical proofs. If you do a proof, whether informally on paper or formally in a type-theory like ML type theory, then you will make heavy use of lemmas. And this means that the proof you write down will probably not be a canonical proof. E.g., to prove P(17)Q(18)P(17) \vee Q(18), we might prove x.P(x)Q(x+1)\forall x.\; P(x) \vee Q(x + 1), and then instantiate the quantifier with 1717. Observe that this doesn’t tell us immediately whether the left or the right branch of the disjunction holds. So we have to ask, on what grounds do we believe that our non-canonical proofs actually give us truths? Obviously, we must have reason to believe that it’s always possible to do some work to reconstruct a canonical proof from the non-canonical one.

This is basically what a cut-elimination proof gives you: it tells you that within a particular formal system, it’s always possible to get rid of lemmas and put the proof into canonical form. Now, the second thing to note is that cut elimination induces a notion of equivalence of proofs, since we can say two proofs are equivalent if they share the same canonical form. (You might also naturally ask if all cut-elimination proofs give equivalent normal forms for the same non-canonical proofs. The answer is yes*^* for well-structured logics, with the asterisk indicating an omitted Lakatosian digression explaining how to change the question so the answer is affirmative.)

At this point we have a bunch of syntax for propositions and proofs, along with a non-trivial notion of equivalence of proofs, arising purely from proof-theoretic considerations. Now, for simple systems, you can prove the theorems you want just by doing an induction over the syntax. But working directly with syntax rapidly grows painful: inductive proofs with a number of case distinctions quadratic in the number of rules are not unheard of. So it’s natural to seek models of propositions and proofs to manage (and hopefully avoid) the book-keeping, by making visible and exploiting the combinatorial and topological structure latent in the syntax and its dynamics.

However, traditional model theory, like you find in intro logic textbooks, doesn’t do the job. The reason is that cut-elimination, due to its awareness of proofs, is a more restrictive notion than simple consistency. Cut-elimination implies consistency (since there is no canonical proof of falsehood), but the converse fails, since we can add consistent axioms to a system which destroy cut-elimination. For example, if we add an axiom PQP \vee Q, then it’s no longer necessarily the case that every disjunctive proof has a canonical form of either a left- or right-injection, since we can prove disjunctions using the axiom. (This stringency is one of the great virtues of structural proof theory, since it gives very powerful guidance about how to design logical systems. It’s why type theorists put up with exhausting combinatorial analyses that would drive other mathematicians to drink.)

So we need more sophisticated semantics that permit us to model the different proofs, and not just truth. That’s basically what categorical proof theory brings to the party: it shows how to give models in which the interpretation of each proposition is inhabited by the interpretations of the canonical proofs. E.g., bicartesian closed categories let us interpret the propositions and the proofs of intuitionistic propositional logic as the objects and morphisms of the category. Note that this view actually generalizes the standard Heyting algebra semantics for IPL, since we can view a Heyting algebra as a CCC in which all arrows between two objects happen to be equal (with the consequence that we model provability rather than proofs) – so categorical proof theory also suggests a way to marry proof-theoretic approaches with the model-theoretic view.

Posted by: Neel Krishnaswami on November 2, 2009 2:51 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

There is something skew about this whole discussion. The metaphorical associations of words like “inhabit” and “populate” may be the cause. It makes sense to say that a term inhabits a type — the sense being that the denotation of the term as an individual name falls under the denotation of the type as a collective name — but it’s forcing the sense of words a bit too far to say that proofs inhabit propositions. It is not the sense of the analogy to obscure the actual nature of the relation between proofs and propositions.

I hope you all appreciate how difficult it was for me to avoid introducing the term “co-habit” into the above remarks.

Posted by: Jon Awbrey on November 2, 2009 3:44 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

It makes sense to say that a term inhabits a type — the sense being that the denotation of the term as an individual name falls under the denotation of the type as a collective name — but it’s forcing the sense of words a bit too far to say that proofs inhabit propositions.

It's certainly forcing the sense of words; we may disagree about whether it goes too far. But I think that it is exactly this sort of forcing that characterises the interpretation of propositions as types, rather than merely seeing the analogy between propositions and types.

Posted by: Toby Bartels on November 2, 2009 4:31 PM | Permalink | Reply to this

Re: Such Tricks Hath Strong Imagination

Dont’ Get Me Wrong, it my habitual practice to force the sense of words to the breaking point, but I do that in a purely experimental spirit, for the sake of gaining information about the corresponding concept or lack thereof.

The experimental question is always, “What is really going on here?” — and the test of the experiment is whether it gives us any information about that.

In the case of the propositions as types analogy, my sense is that containment metaphors are leading us astray from the forms of things we’d know.

Posted by: Jon Awbrey on November 2, 2009 5:08 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Wow, thank you! That is the first explanation I’ve ever read that really made me care about cut elimination. And about logic that’s weaker than topos logic. How would you feel about adding it to the nLab at, say, proof-theoretic semantics, or maybe as an additional subheading at type theory?

Are there any introductory books I can read that take the perspective you just described?

Posted by: Mike Shulman on November 3, 2009 6:58 PM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Jumping back down some tree levels after your very nice explanation.

the simply-typed lambda calculus is a model of intuitionistic propositional logic…. Its categorical models are CCCs, so that (e.g.) conjunction is interpreted as products and disjunction as coproducts, implication as exponentials, and so on. Furthermore, the morphisms represent the proofs, so this is a semantics where proofs do have a life of their own.

According to my current understanding, Martin-Löf dependent type theory, and its models in locally cartesian closed categories, seems like it should be what you want for “intuitionistic predicate logic”? By contrast with the “semantics of provability” where we interpret a proposition ϕ(x)\phi(x) with a free variable of type x:Ax:A as a subobject of [A][A], for a “semantics of proofs” we can instead interpret ϕ(x)\phi(x) as just an object over [A][A]. Since each slice category is a CCC, we have a local propositional semantics of proofs, and the left and right adjoints to pullback in a LCCC give us quantification. Is that right?

So what I want is something similar, only extended to a full mathematical vocabulary.

Are you referring to powersets and impredicativity here?

Also, earlier you said:

I feel the absence of those quantifiers in ordinary mathematics quite keenly.

It’s not clear to me what “those quantifiers” was referring to here, can you explain?

Posted by: Mike Shulman on November 3, 2009 7:12 PM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Yeah, the LCCC models are just about what I want for predicate logic. However, there’s no type theory that is an entirely satisfactory realization of this semantics! MLTT comes in two varieties. The “extensional” version supports proving that two functions are equal if they are equal at all points, but suffers from an undecidable typechecking problem. On the other hand, the “intensional” version only permits showing functions are equal if they are equal up to beta-equality, but has decidable typechecking. It would be really nice to have both properties….

-*-*-*-

The quantifiers I’m referring to are impredicative quantification over types, ie, where you can give a type α:type.αα\forall \alpha:type. \alpha \to \alpha. This is what is found in Girard and Reynolds’ System F, or in the Calculus of Constructions.

This is very pleasant to use, because it gives you enough impredicativity to do things like define (weak) natural numbers by typing the Church encodings. (By weak naturals, I mean that you can prove the existence but not the universality properties internally in the type theory.)

On the flip side, you can only define impredicative things in very uniform ways: polymorphic functions must act uniformly (aka parametrically) at all types. This lets you beef up (by external reasoning) the weak existence properties of objects given by the Church encodings into the corresponding strong existence properties. For example, you can show externally that all inhabitants of the type α:type.αα\forall \alpha:type. \alpha \to \alpha are observably equivalent to the identity function.

It’s a very tough question how to internalize reasoning about parametricity into the type theory, though if you can, I think you could do things like soundly formalize things like Dedekind’s (unsound in sets) construction of the natural numbers. (Of course, I haven’t proved this last point yet, since there is as yet no such type theory to do the construction in.)

Posted by: Neel Krishnaswami on November 4, 2009 4:24 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

The “extensional” version supports proving that two functions are equal if they are equal at all points, but suffers from an undecidable typechecking problem. On the other hand, the “intensional” version only permits showing functions are equal if they are equal up to beta-equality, but has decidable typechecking.

It sounds like you want the (still hypothetical) higher categorical model of DTT! The two-dimensional version of LCCCs can be shown to model the syntax of DTT with extensionality of equality only for identity types and extensionality for functions only up to identity, and I gather that the hope is that in the ω\omega-categorical case extensionality can be eliminated entirely.

Posted by: Mike Shulman on November 4, 2009 4:43 PM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

The quantifiers I’m referring to are impredicative quantification over types, ie, where you can give a type ∀α:type.α→α. This is what is found in Girard and Reynolds’ System F, or in the Calculus of Constructions.

Interesting. What sort of semantics do those systems have?

This is very pleasant to use, because it gives you enough impredicativity to do things like define (weak) natural numbers by typing the Church encodings.

I don’t think I’ve ever felt the need to use any such quantification. I’m generally happy to define natural numbers using the ordinary universal property in SetSet; what’s the advantage of the Church encodings?

Posted by: Mike Shulman on November 4, 2009 4:44 PM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

You can prove that F has no ordinary set-theoretic models in classical logic (ie, there is no model in which $type$ is a universe of sets, and function space is interpreted by set-theoretic space and quantification is interpreted by indexing over the universe). However, it does have such models in intuitionistic set theory.

As a result, its models tend to be quite complex – though I am told all known models can be be factored into a topos construction (to give you an intuitionistic set theory) followed by a construction of an appropriate universe in that set theory. (I haven’t checked this myself, since I quickly run out of categorical horsepower. I get farther with each try, though.)

In addition, it’s often the case that you want your models of F to only include parametric values, so that you can prove that the NNO definable in F really is an NNO. So there’s a second layer of complexity where you have to ensure that the model only includes things that are “uniform enough”.

Now, to some extent the advantage to this story is precisely its weakness: we don’t know how to properly internalize the desired proofs of universality in impredicative type theory. We’ve got models that validate the expected principles, and these models are based on the most sophisticated understanding of how data abstraction works that programming language researchers possess. Internalizing it will force us to understand much more deeply the mathematical machinery underlying extensionality and its interplay with equality. That seems potentially very valuable to me.


A second benefit is that this is simply a case where impredicativity appears in a different form than it does in set theory – these theories have nothing equivalent to the powerset axiom, but still allow free universal and existential quantification over the whole universe. So we might be able to use the differences to separate accident from essence, if you will.


Finally, a short introduction to structural proof theory is Girard’s lecture notes “Proofs and Types”. It’s available online, and he’s so opinionated you can’t help but imbibe some philosophy along the way. If you just want the philosophy, Per Martin-Lof’s Siena lectures (also online) are a good place to look.

Posted by: Neel Krishnaswami on November 4, 2009 9:39 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

You can prove that F has no ordinary set-theoretic models in classical logic

I certainly wouldn’t expect it to. I would expect or hope its “natural” semantics to be some sort of category, as you described perhaps with a universe.

these theories have nothing equivalent to the powerset axiom, but still allow free universal and existential quantification over the whole universe.

I have to keep reminding myself that when you say “quantification” you really mean “construction of function spaces.” Right? The type α:type.αα\forall \alpha:type.\, \alpha\to\alpha is what I would write as Πα:type.αα\Pi \alpha:type.\, \alpha\to\alpha, meaning the type of functions that take a type as input and output an endomorphism of that type. For me, and I believe for most non-type-theorists, \forall is an operation on propositions not on types. I know that you want to interpret propositions as types, but this use of language still confuses me.

Not that I expect you to be able to do anything about it, I’m just complaining. (-: And maybe trying to alleviate any similar confusion that others might be feeling.

Finally, a short introduction to structural proof theory is Girard’s lecture notes “Proofs and Types”.

Thanks!

Posted by: Mike Shulman on November 5, 2009 3:06 PM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

The “extensional” version supports proving that two functions are equal if they are equal at all points, but suffers from an undecidable typechecking problem. On the other hand, the “intensional” version only permits showing functions are equal if they are equal up to beta-equality, but has decidable typechecking. It would be really nice to have both properties….

I can't really get into the debate between intensional and extensional equality, because I have accepted Bishop's philosophy that equality should be defined. So we have a notion of beta-convertibility (which need not even be symmetrised, as far as I can see) that comes from the syntax, but we shouldn't expect that to be the intended notion of equality of functions (or of anything else, for that matter).

I know how this works in a full-fledged set theory whose model is a locally cartesian closed pretopos, although I don't understand it as well for a theory without sum types and quotient types.

Posted by: Toby Bartels on November 4, 2009 9:56 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

I can’t really get into the debate between intensional and extensional equality, because I have accepted Bishop’s philosophy that equality should be defined.

It seems like the real issue is whether “types” are expected to be more like “sets” or more like presets. If types are like sets, then you should have quotient types, which in turn means that you can encode stuff of arbitrary complexity in an equality relation, so that you can’t reasonably expect anything that depends on equality to be decidable. This sort of “extensional” type theory has a nice semantics in Π\Pi-pretoposes.

On the other hand, if types are like presets, then you can expect their “pre-equality” or “syntactic equality” to be decidable, but in order to get a good semantics for them one should expect to have to define a “set” to be a type with an equivalence relation. In other words, the semantics of “sets” should be the exact completion of the semantics of “presets.” So the direct semantics of such an “intensional” type theory should be valued in some category whose exact completion is a Π\Pi-pretopos. A natural choice for such a category is a lextensive category with weak dependent products, i.e. a local construction of “weak exponentials” that satisfy the existence but not uniqueness part of the usual universal property. It’s known that exact completion takes lextensive categories to pretoposes, and categories with weak dependent products to locally cartesian closed categories.

Posted by: Mike Shulman on November 5, 2009 2:42 PM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Ah, so we should be satisfied if the semantics for intensional Martin-Löf type theory are given by a locally cartesian weakly-closed category, that is a category with all pullbacks and with pullback-stable weak dependent products. I should check whether that is true ….

Incidentally, even extensional Martin-Löf type theory does not have all quotient types. And then there is my theory, which I should finish writing down some day, which is not only intensional and also lacks identity types (no undefined equality) and whose models accordingly don't even have all pullbacks.

Posted by: Toby Bartels on November 5, 2009 6:58 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

At the NovemberFest this weekend, Pino Rosolini advocated the following three-step procedure for constructions like “type\rightsquigarrowsetoid”. We start with a suitable fibered poset p:EAp\colon E\to A (I’m not sure of the precise conditions required). Then:

  1. Construct the fibered poset p¯:Vert(p)E\bar{p}\colon Vert(p) \to E, which “freely adds comprehension” to pp.
  2. Construct the allegory of relations in this fibred poset, Rel p¯(E)Rel_{\bar{p}}(E), and its category of maps Map(Rel p¯(E))Map(Rel_{\bar{p}}(E)), which is a regular category. This is the “free regular completion of a category with a factorization system.”
  3. Construct the ex/reg completion of Map(Rel p¯(E))Map(Rel_{\bar{p}}(E)).

We then have a series of pullback squares, implying that the logic of the original fibration is faithfully represented in the logic of the resulting exact category:

E Vert(p) Sub(Map(Rel p¯(E))) Sub(Map(Rel p¯(E)) ex/reg) A E Map(Rel p¯(E)) Map(Rel p¯(E)) ex/reg.\array{E & \to & Vert(p) & \to & Sub(Map(Rel_{\bar{p}}(E))) & \to & Sub(Map(Rel_{\bar{p}}(E))_{ex/reg})\\ \downarrow && \downarrow && \downarrow&& \downarrow\\ A & \to & E & \to & Map(Rel_{\bar{p}}(E)) & \to & Map(Rel_{\bar{p}}(E))_{ex/reg}. }

Examples include:

  • If we start from the fibration of types and propositions in a logic over a type theory, we obtain the syntactic category of the type theory.
  • If we start from the fibration of “assemblies” over sets, we obtain the effective topos. (In this case EE is already regular, so the second step acts as the identity.)
  • If we start from the fibration arr(M) poMarr(M)_{po}\to M, where MM is the category of types in some type theory and arr(M) poarr(M)_{po} means the fiberwise poset reflection of the arrow category of MM, then we obtain the ex/lex completion of MM.

I think he said that in the last case, MM needs to only have finite products and weak equalizers. Of course, the ex/lex completion requires only weak finite limits, although it simplifies noticeably if you have actual finite products. The fibration arr(M)Marr(M)\to M requires actual pullbacks for its definition, of course, but arr(M) poMarr(M)_{po}\to M requires only weak pullbacks.

It seems like in a theory with no equality at all, you can’t expect to have even weak pullbacks or weak equalizers. But actually, in a theory with no equality at all, I don’t even know how you can get a category; how would you state the identity and associativity axioms?

Posted by: Mike Shulman on November 16, 2009 3:55 PM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

ex/reg+completion

That should be

ex/reg+completion

It seems like in a theory with no equality at all, you can’t expect to have even weak pullbacks or weak equalizers. But actually, in a theory with no equality at all, I don’t even know how you can get a category; how would you state the identity and associativity axioms?

Although they may be no internal notion of equality, there's still an external one. So while there may be no category of presets as such, there's still a category of contexts for any theory of presets. That's not quite satisfying, however.

And no, there are no pullbacks in general (nor weak pullbacks as far as I can see). There are, however, pullbacks of display morphisms.

Posted by: Toby Bartels on November 17, 2009 7:07 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

There are, however, pullbacks of display morphisms.

Okay, so then we should start from the fibration DispMDisp \to M of display maps, and probably obtain the correct exact category of sets.

Posted by: Mike Shulman on November 17, 2009 6:09 PM | Permalink | PGP Sig | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Neel wrote:

I imagine that most mathematicians would be more shocked if something like the Knaster-Tarski theorem failed to hold…

I imagine many mathematicians would be shocked to discover that logicians expected them to have heard of the Knaster–Tarski theorem. At least I was. What’s the Knaster–Tarski theorem?

(I could look it up, but I prefer to wallow in my ignorance a little longer.)

Posted by: John Baez on October 28, 2009 4:24 PM | Permalink | Reply to this

Don’t get your Knasters in a knot; Re: Syntax, Semantics, and Structuralism, I


In the mathematical areas of order and lattice theory, the Knaster–Tarski theorem, named after Bronisław Knaster and Alfred Tarski, states the following:

Let L be a complete lattice and let
f : L → L be an order-preserving function. Then the set of fixed points of f in L is also a complete lattice.

It was Tarski who stated the result in its most general form,[1] and so the theorem is often known as Tarski’s fixed point theorem. Some time earlier, however, Knaster established the result for the special case where L is the lattice of subsets of a set, the power set lattice.[2]

The theorem has important applications in formal semantics of programming languages.
[1] Alfred Tarski (1955). “A lattice-theoretical fixpoint theorem and its applications”. Pacific Journal of Mathematics 5:2: 285–309. http://projecteuclid.org/Dienst/UI/1.0/Summarize/euclid.pjm/1103044538.
[2] B. Knaster (1928). “Un théorème sur les fonctions d’ensembles”. Ann. Soc. Polon. Math. 6: 133–134.

Posted by: Jonathan Vos Post on October 28, 2009 6:49 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

KT says that if you’ve got a complete lattice LL, and a monotone function f:LLf : L \to L on that lattice, then the set of fixed points of ff is also a complete lattice. (In particular, this means that f has least and greatest fixed points.)

It’s not a complicated theorem, despite having Tarki’s name in it, but it’s used a lot in computer science. It’s a handy way of showing recursive functions terminate – the correctness proofs of most compiler optimizations rely on it. That’s as hard-headed and practical as theoretical computer science gets.

But another easy proof is that powersets form complete lattices. So you can find yourself doing transfinite induction over really huge sets without any special fuss – which is a fact worth making a bit of fuss about!

Posted by: Neel Krishnaswami on October 28, 2009 7:00 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

See Stoy, J.E. (1977), Denotational Semantics : The Scott-Strachey Approach to Programming Language Theory, MIT Press, Cambridge, MA.

See especially “The Snag” (p. 346).

Posted by: Jon Awbrey on October 28, 2009 7:26 PM | Permalink | Reply to this

Categorial version of Axiomatic Set theory

I was away travelling, and am slowly catching up with the discussion…..

This answers Toby Bartel’s post to the earlier thread where we discussed structural set theory (SEAR and ETCS),

AN: I am satisfied if you can translate a typical introduction to axiomatic set theory (until natual numbers and functions are available) and an introduction to logic (until variable substitutions and ZFC are available) into a document starting from scratch with the axioms for a category, at a comparable level of rigor and only need a comparable number of pages.

TB: I reserve the right for lack of time not to complete these, but I’ll try to at least indicate how they would be done. […] All right, I’ll do Section 1.3 and start on Chapter 2 of Mileti, and hopefully the point at which it is clear that I could continue indefinitely will come before the point at which it is tiresome to continue. (^_^) I should be able to look at Rasiowa and Sikorski later this week.

Here is my commented version of your Categorial version of Axiomatic Set theory.

I can see how you proceed, and that it may give an interpretation of everything mathematicians are doing, by allowing a sufficient amount of abuse of language and notation (though far more than with Bourbaki).

But I think that few mathematicians would leave Cantor’s paradise for the vision of a hell as presented in Toby’s Categorial version, full of artificial baggage but lacking the familiar lattice structure of sets.

In view of the remark by John Baez

– “I’m enjoying your discussion immensely: it’s the first time I’ve seen someone mount a sustained attack against `structural foundations’ that’s significantly more intelligent than `categories suck’ or `ZFC was handed down from heaven’.” –

I thought it might be useful to collect the main points of my current (very critical) views on structural set theory in an essay on Foundations of Mathematics.

It is clear to me that I’ll never like your way of putting things. I thoroughly dislike what is here called “structural set theory” (the common structure of SEAR and ETCS), which completely destroys the traditional algebraic structure of ordinary sets.

So it seems appropriate to end here my participation in the discussion about structural foundations of mathematics. Thanks to all who participated and shared their insights.

I’ll continue to discuss other aspects of foundations where agreement seems less remote, in particular aspects that are possibly related to teaching abstract mathematics to the computer.

Posted by: Arnold Neumaier on October 31, 2009 5:40 PM | Permalink | Reply to this

Re: Categorial version of Axiomatic Set theory

Here is my commented version of your Categorial version of Axiomatic Set theory.

Thanks.

I can see how you proceed, and that it may give an interpretation of everything mathematicians are doing, by allowing a sufficient amount of abuse of language and notation (though far more than with Bourbaki).

All right, that's good.

But I think that few mathematicians would leave Cantor’s paradise for the vision of a hell as presented in Toby’s Categorial version, full of artificial baggage but lacking the familiar lattice structure of sets.

Keep in mind, that version isn't my paradise either; when I posted it, I described how I would prefer to do a few things differently. But it's close to paradise. The baggage of adding the technical distinction between a subset (a ‘set’, an element of a power set in my formalism) and its underlying set (its ‘shadow’) to the other technical distinctions that we make, such as between the subset and its characteristic function, is a small price to pay to eliminate the baggage of 121 \in \sqrt{2} as a meaningful (even if undecidable) proposition.

I thought it might be useful to collect the main points of my current (very critical) views on structural set theory in an essay on Foundations of Mathematics.

So it seems appropriate to end here my participation in the discussion about structural foundations of mathematics. Thanks to all who participated and shared their insights.

In that case, I won't reply further to your comments on my Categorial version or to your essay. Although I would like to state for the record that I answer Yes to both questions in your Questionairre; this is because I normally interpret philosophical questions so as to make ordinary language mean what people usually want it to mean. To draw pedantic distinctions would be no longer ‘ohne grosses Nachdenken’.

Posted by: Toby Bartels on October 31, 2009 7:52 PM | Permalink | Reply to this

Re: Categorial version of Axiomatic Set theory

TB: But it’s close to paradise. The baggage of adding the technical distinction between a subset (a ‘set’, an element of a power set in my formalism) and its underlying set (its ‘shadow’) to the other technical distinctions that we make, such as between the subset and its characteristic function, is a small price to pay to eliminate the baggage of 1∈2 as a meaningful (even if undecidable) proposition.

But you sacrifice structure. Traditionally, sets form a lattice under inclusion. This is no longer the case in your `paradise’; neither is it in ETCS or SEAR.

Posted by: Arnold Neumaier on November 1, 2009 12:54 PM | Permalink | Reply to this

Re: Categorial version of Axiomatic Set theory

Yes, that’s true; instead, in structural set theories, we have local lattices (subsets of a given set form a lattice), which is all that’s needed in practice.

In algebraic set theory (à la Joyal and Moerdijk), one can define a model of ZF within a category of “classes” obeying suitable axioms, and that model of course is a lattice, while the ambient superstructure is perfectly structural.

So it seems appropriate to end here my participation in the discussion about structural foundations of mathematics.

Apparently you weren’t done yet!

Posted by: Todd Trimble on November 1, 2009 5:29 PM | Permalink | Reply to this

Re: Categorial version of Axiomatic Set theory

Traditionally, sets form a lattice under inclusion.

I have never needed to use this lattice structure, except in two cases:

  1. when studying the von Neumann hierarchy;
  2. when I really want a bounded lattice (with a top element as well as a bottom).

I can tell that I'm doing (1) because I'm doing set theory itself, not simply doing some other mathematics with a foundation of set theory. When I started learning structural foundations, I wondered what happened to this, until I learnt about the constructions that I tried to describe at pure set. Those do form a lattice (without top).

Most of the time, I'm doing (2), and there material foundations are a little bit off, since the lattice of all sets has no top element. From a material perspective, I'm really looking at the lattice of subsets of some fixed set UU; now UU is the top element of this lattice. And structural foundations can also discuss the lattice of subsets of some fixed set UU.

So (1) is more natural in material foundations, which is no surprise, since there I am studying precisely what material foundations take to be the basic concept. Conversely, studying the category of sets is more natural in category-based structural foundations, again no surprise. But (2) takes a little bit of work either way, and is just as easy structurally as materially. As this includes all of the applications of the lattice structure of sets, the two approaches are equivalent in practice.

In fact, the structural approach seems more honest to me, since the material approach invites one to pretend that one is actually working in the unbounded lattice of all sets until something funny happens when one considers the top of the lattice, while the structural approach insists on getting the correct lattice up front. For an example, consider the proposition that, in any topological space XX, the intersection of finitely many open sets is an open set.

  • If you interpret ‘open set’ here as an element of the von Neumann hierarchy that happens to belong to the collection of open subsets of XX, and if you interpret ‘intersection’ as the meet operation on that hierarchy, then this is not quite correct; 00 is a finite number, but the intersection of no open sets is not a set at all, much less an open one. So the literal interpretation is not valid; there must be an abuse of language here, although it is a subtle one that is easy to miss.
  • Structurally, however, the abuse of language is clear up front; we must interpret ‘open set’ as an element of the power set of XX (that is a subset of the underlying set of XX) that happens to belong to the collection of open subsets of XX, and so we interpret ‘intersection’ as the meet operation on that power set. Then the statement is true, even when the number of open sets is 00. There is nothing subtle about the abuse of language here; it is honest and direct.

In each case, we are really working in the bounded lattice of subsets of XX, which is an example of (2); the difference between this and the unbounded lattice of all pure sets can be subtle materially but is obvious structurally.

Posted by: Toby Bartels on November 1, 2009 6:32 PM | Permalink | Reply to this

Re: Categorial version of Axiomatic Set theory

TB: the material approach invites one to pretend that one is actually working in the unbounded lattice of all sets until something funny happens when one considers the top of the lattice

I don’t see the pretense. One really works in the unbounded lattice, knowing that it has no top. Then nothing funny happens. The intersection of finitely many sets is defined in the model treatise of axiomatic set theory you chose (Mileti) only for a positive integral number of sets. So, by any reasonable standard, it is clear that this is meant when talking about finite intersections.

TB: For an example, consider the proposition that, in any topological space X, the intersection of finitely many open sets is an open set. […] this is not quite correct; 0 is a finite number, […] So the literal interpretation is not valid.

With your amendmend, the literal interpretation is still not valid: minus one or sqrt(2) are also finite numbers. You may say that these numbers are not intended here - but the same holds for the number 0, and no amendment is needed.

As always, one needs to consider the notion of number (or of any other ambiguous concept) appropriate for each context.

TT: instead, in structural set theories, we have local lattices (subsets of a given set form a lattice), which is all that’s needed in practice.

But subsets are no longer sets. So you need to rewrite all occurrences of “set” into “subset”, or accept a heavy abuse of language from the very start.

AN: So it seems appropriate to end here my participation in the discussion about structural foundations of mathematics.

TT: Apparently you weren’t done yet!

Yes, I can’t help defending tradition, even by doing the inappropriate - inappropriate in the sense that it seems to open up no further insights on either side….

Posted by: Arnold Neumaier on November 2, 2009 2:46 PM | Permalink | Reply to this

Re: Categorial version of Axiomatic Set theory

Please recall the theorem that I am trying to interpret:

In any topological space XX, the intersection of finitely many open sets is an open set.

If that seems vague to you, let me rephrase it:

For each topological space XX, for each collection GG of open sets in XX, if GG is finite, then the intersection of GG is an open set in XX.

It seems clear to me that the first sentence means precisely the second sentence, but maybe that is not clear to you. If not, I offer my apologies, and replace my original formulation with the second one above.

Now, the idea that one may have 1-1 or 2\sqrt{2} open sets is not relevant here. However, the idea that one may have 00 sets is perfectly relevant. And the theorem that I have stated is true, as I have stated it, even for this case, when properly interpreted.

You would, I think, want to amend the theorem to this form:

For each topological space XX, for each collection GG of open sets in XX, if GG is finite and inhabited, then the intersection of GG is an open set.

You would do this in order to fit it into the straitjacket of your language. Yet this is a weaker theorem!

Posted by: Toby Bartels on November 2, 2009 4:17 PM | Permalink | Reply to this

Re: Categorial Version of Axiomatic Set theory

It looks like you just rediscovered the concept of a universe of discourse.

The feature that BG(vNK…) and ZF±C have in common is a shift from absolute, ontological interpretations of general terms to relative, contextual interpretations. In other words, to be is to be related.

Posted by: Jon Awbrey on November 2, 2009 3:00 PM | Permalink | Reply to this

Re: Categorial Version of Axiomatic Set theory

It looks like you just rediscovered the concept of a universe of discourse.

I didn't rediscover it! Knowing that concept helps one to see mathematics as naturally typed, which leads to the structural view. So in fact, universes of discourse were very important in bringing me to my current understanding.

Posted by: Toby Bartels on November 2, 2009 4:20 PM | Permalink | Reply to this
Read the post In Praise of Dependent Types
Weblog: The n-Category Café
Excerpt: Dependent types should be everyone's friend!
Tracked: March 3, 2010 6:42 PM

Re: Syntax, Semantics, and Structuralism, I

I agree with your statement that “at the most basic level, what we do when we do mathematics is manipulate symbols according to specified rules.” However I have noticed that when people try to formalize some mathematical theory, they tend do it in a one-dimensional manner. They then introduces laws that have no meaning in the mathematical theory that is formalized.

Let me take a simple example to make clear my point. When people define formally what is a 2-category, they state the interchange law:

(a . b) * (c . d) = (a * c) . (b * d)

My claim is that this law is not a law of 2-categories but is an artifact coming from the formalization in a one-dimensional syntax that gives us two ways of writing down something that is essentially two-dimensional, and thus we have to introduce the interchange law to equate them. If instead we were using a two-dimensional syntax, then we wouldn’t need the interchange law.

I’m sorry if I am not very clear, but I am stuck with a one-dimensional text editor on a computer that has a one-dimensional memory.

Posted by: David Leduc on March 5, 2010 12:41 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

I agree in principle that there’s nothing privileged about 1-dimensional syntax. Although certainly every other sort of syntax can be coded as 1-dimensional, so no real expressivity is lost, while restricting to 1-dimensional syntax is certainly convenient for purposes of writing it down on paper and in computers.

I don’t agree that the interchange law is an artifact of 1-dimensional notation, though, since there are higher-dimensional structures that don’t satisfy the interchange law. For instance, a “sesquicategory” is like a 2-category, but without interchange. Or in a weak 3-category, the interchange law only holds up to an invertible 3-cell. So I think using a 2-dimensional syntax in which the interchange law was a consequence of syntax, rather than an axiom you could state, would be a bad idea. It is, of course, possible to write the interchange law in 2-dimensional syntax; you just have to be willing to use “2-dimensional parentheses.”

This sort of behavior, where the syntax may quietly force some law to hold too strictly unless you’re sufficiently careful, already happens in 1 dimension, cf. for instance the distinction between implicit and explicit substitution.

Posted by: Mike Shulman on March 5, 2010 3:51 PM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

I sympathize with the (tongue-in-cheek) thought. On several occasions, I found myself frustrated by my inability to explain a simple picture “in words” and wanted to blame it on our choice of 1-dimensional language.

For some reason Mike’s comment made me think of Frolicher spaces. Roughly, my whacky thought goes like this…

We can try to make an analogy between “saying something” and “mapping a line to a space”. “Saying something” is (roughly, as in “you know what I mean”) a 1-dimensional process, but by saying something we can explain anything.

Then maybe “hearing something” is analogous to “mapping a space to a line” in that we can communicate anything into a 1-dimensional sentence or memory or thought or whatever.

Anyway, I need a coffee…

Posted by: Eric on March 6, 2010 2:42 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Sure we can code everything in 1-dimensional syntax but then it becomes cumbersome. I am sure you too do not code your commutative diagrams in 1-dimension but keep them in 2-dimensional form. Unfortunately, when encoding those diagrams in Coq or Agda, one has to formalize them in the 1-dimensional dependent type theory of the system.

I do not understand your objection. In the case of a sesquicategory we could use a mechanism like explicit substitution to forbid the syntax to quietly force interchange law.

Posted by: David Leduc on March 6, 2010 8:59 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

So the question is between:

  1. writing in 2-dimensional language where the interchange law is an axiom that you can assume or not assume, versus
  2. writing in 2-dimensional language in which the interchange law is a syntactic equality, except when working with structures which do not satisfy the interchange law, in which case we change the language so as to make it no longer a syntactic equality.

I find the first to be conceptually clearer, but there’s nothing wrong with the second in principle. What I was objecting to was your claim that the interchange law “is not a law of 2-categories but is an artifact coming from the formalization in a one-dimensional syntax,” since the examples of 2-dimensional structures that violate the interchange law means that it really is a law for 2-categories, not an intrinsic aspect of 2-dimensionality.

There’s not really anything specific to 2 dimensions here; consider associativity in 1 dimension. If we write the product of xx and yy without parentheses as xyx \cdot y, then syntactically we literally have an associativity “law” xyz=xyzx \cdot y \cdot z = x \cdot y \cdot z. But of course there are 1-dimensional structures that are not associative, so (I think) it’s more sensible to write the product as (xy)(x\cdot y) and then assume the associativity axiom (x(yz))=((xy)z)(x \cdot (y \cdot z)) = ((x \cdot y)\cdot z) in those cases when it’s satisfied.

Posted by: Mike Shulman on March 7, 2010 4:56 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

Mike said:

But of course there are 11-dimensional structures that are not associative, so (I think) it’s more sensible to write the product as (xy)(x\cdot y) and then assume the associativity axiom (x(yz))=((xy)z)(x\cdot(y\cdot z))=((x\cdot y)\cdot z) in those cases when it’s satisfied.

But in practice people do write xyzx\cdot y\cdot z (or just xyzx y z) when dealing with associative structures, although usually only informally, so the syntax does get adapted to the semantics when this makes things simpler. This sort of thing doesn’t usually get formalised, since it would produce a proliferation of syntaxes, but it’s a normal part of mathematical practice, and perhaps of some philosophical interest from that point of view.

Posted by: Tim Silverman on March 7, 2010 11:54 AM | Permalink | Reply to this

Re: Syntax, Semantics, and Structuralism, I

But in practice people do write xyzx\cdot y\cdot z (or just xyzx y z) when dealing with associative structures

Of course we do. I was just saying that associativity is not a consequence of the 1-dimensionality of syntax; it’s a property that a structure may or may not satisfy.

Posted by: Mike Shulman on March 8, 2010 1:24 AM | Permalink | PGP Sig | Reply to this
Read the post Coalgebraic Tangles
Weblog: The n-Category Café
Excerpt: I'm sinking in a sea of administrative duties at the moment, so for a bit of sanity I thought I'd jot down the glimmer of a thought I had. We spoke back here about the term model for a...
Tracked: January 25, 2011 11:52 AM

Post a New Comment