
September 10, 2009

Towards a Computer-Aided System for Real Mathematics

Posted by John Baez

I’ve known Arnold Neumaier for quite a while, thanks to many discussions on the newsgroup sci.physics.research. Recently he sent me a proposal for a system called FMATHL (Formal Mathematical Language), designed to be:

a formal framework which will allow — when fully implemented in some programming language — the convenient use of and communication of arbitrary mathematics on the computer, in a way close to the actual practice of mathematics, with emphasis on matching this practice closely.

He asked me for comments, and I gave him a few. But I said that some of you have thought about this subject more deeply, so your comments might be more valuable. So he agreed to let me post links to his proposal here.

Here’s a slightly edited version of what Arnold Neumaier sent me:

I am currently working on the creation of an automatic mathematical research system that can support mathematicians in their daily work, providing services for abstract mathematics as easily as LaTeX provides typesetting services, the arXiv provides access to preprints, Google provides web services, Matlab provides numerical services, or Mathematica provides symbolic services.

The mathematical framework (at present just a formal system – a kind of metacategory of all categories) is designed to be a formal framework for mathematics that will allow (some time in the future) the convenient use and communication of arbitrary mathematics on a computer, in a way close to the actual practice of mathematics.

I would like to make the system useful and easy to use for a wide range of scientists, and hence began to ask various people from different backgrounds for feedback.

At the present point where a computer implementation is not yet available (this will take at least two more years, and how useful it will be may well depend on your input), I’d most value:

  • your constructive feedback on how my plans and the part of the work already done should be extended or modified in order to find widespread approval,
  • your present views on what an automatic mathematical research system should be able to do to be most useful.

Here are two pdf files:

You can find more background work on my web page.

Posted at September 10, 2009 9:28 PM UTC

TrackBack URL for this Entry:   http://golem.ph.utexas.edu/cgi-bin/MT-3.0/dxy-tb.fcgi/2056

470 Comments & 1 Trackback

Re: Towards a Computer-Aided System for Real Mathematics

Incredibly ambitious! I dare say even ridiculously ambitious…

Reading this reminded me of a much more modest idea. I’d love a plugin for my web browser and/or PDF previewer that can automagically follow citations like [Theorem 6.5, 17]. This would perhaps work by deciphering what [17] is (by reading the bibliography of the same paper), extracting or locating online a copy of the paper, and searching through it for Theorem 6.5. I’d imagine the user interaction is just right-clicking on a citation and getting a pop-up box showing me the theorem statement, which I could then click on to see the statement in the original paper.

At least for papers on the arXiv, which are generated using hypertex, it shouldn’t be too hard to algorithmically locate “Theorem 6.5”.
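The citation-deciphering step Scott describes can be sketched with ordinary regular expressions. This is only an illustration, not an existing plugin: the citation format, the function names, and the toy bibliography below are all invented for the example.

```python
import re

# Match citations of the (assumed) form "[Theorem 6.5, 17]": an
# environment name, a statement number, and a bibliography key.
CITATION = re.compile(
    r"\[(Theorem|Lemma|Proposition|Corollary)\s+([\d.]+),\s*(\d+)\]"
)

def parse_citations(text):
    """Return (kind, number, bib_key) triples for each citation in text."""
    return [m.groups() for m in CITATION.finditer(text)]

def resolve_bib_key(bibliography, key):
    """Look up entry [key] in a bibliography given as a key -> entry dict."""
    return bibliography.get(key)

text = "By [Theorem 6.5, 17] and [Lemma 2.1, 3], the result follows."
assert parse_citations(text) == [("Theorem", "6.5", "17"),
                                 ("Lemma", "2.1", "3")]
# The remaining steps (locating paper [17], e.g. on the arXiv, and
# scanning its source for "Theorem 6.5") are where the real work lies.
```

The hard parts, of course, are the two steps the sketch leaves out: matching a bibliography entry to an actual paper, and finding the theorem inside it.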

Posted by: Scott Morrison on September 11, 2009 12:02 AM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Incredibly ambitious! I dare say even ridiculously ambitious…

Hear hear. But very exciting, I think. It seems to me that aspects of this project, at least, could be implemented in a finite time frame by dedicated people.

I’m particularly excited by the idea of formalizing mathematics starting in the middle. I’ve recently been playing around with formal proof assistants like HOL, Isabelle, Mizar, etc., and I have definitely been struck by their insistence on building everything from the ground up. The authors of such systems seem fond of invoking Bertrand Russell’s quip:

The method of “postulating” what we want has many advantages; they are the same as the advantages of theft over honest toil.

I’m not sure exactly how Russell intended this (no doubt someone here can set me straight), but it seems to me that the way we really do mathematics is to start by setting up some theory, i.e. “postulating” some structure and axioms, and then we prove things based on that structure and axioms. Isabelle/Isar seems to have a bit of support for this with its “locales” (no relation to pointless topology), but by and large the attitude is different. But formalizing mathematics starting in the middle, not insisting that everything be completely machine-checkably justified at first, with a system that provides other benefits to mathematicians so that we’ll actually use it, seems like it has a good chance of actually building to critical mass.

I also like the idea of a “web of trust,” with theorems “signed” by sources of varying degrees of believability (from “verified with Isabelle” to “found in textbook X” to “I saw Karp in the elevator and he said it was probably true”). Of course it reminds me of the discussions we’ve been having about how to “certify” different research pages on the nLab. And the “semantic wiki” idea is one that I’ve mentioned before in those discussions.

I’ll need to think a bit more before I can come up with constructive feedback and suggestions, however.

Posted by: Mike Shulman on September 11, 2009 1:46 AM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

To see the context of the Russell quotation see here.

Posted by: David Corfield on September 11, 2009 8:28 AM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Could you please summarize the relevant context of the Russell quotation, for the benefit of the European readers who (because of unclarified copyright issues) cannot see any Google book pages?

Posted by: Arnold Neumaier on September 11, 2009 5:56 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

I’ve recently been playing around with formal proof assistants like HOL, Isabelle, Mizar, etc., and I have definitely been struck by their insistance on building everything from the ground up. […] the way we really do mathematics is to start by setting up some theory, i.e. “postulating” some structure and axioms, and then we prove things based on that structure and axioms.

You certainly can do this sort of thing in formal proof assistants (Coq is the one that I know best). Of course you know that you can, since they are Turing-complete, but Coq (at least) has support for doing this naturally, with commands like Axiom and Hypothesis.

On the other hand, Coq comes with its own foundations of mathematics (the Calculus of Inductive and Coinductive Constructions), and if you want to use one that doesn't match theirs, then you not only have to write it yourself (not too hard) but choose terminology that doesn't conflict with what Coq already thinks a ‘Set’ is. If FMathL is designed with more flexibility in mind, then that would be a Good Thing.

Posted by: Toby Bartels on September 11, 2009 5:56 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Toby Bartels wrote:

Coq comes with its own foundations of mathematics (the Calculus of Inductive and Coinductive Constructions), and if you want to use one that doesn’t match theirs, then you not only have to write it yourself (not too hard) but choose terminology that doesn’t conflict with what Coq already thinks a `Set’ is. If FMathL is designed with more flexibility in mind, then that would be a Good Thing.

FMathL has a concept of nested contexts in which reasoning happens; these contexts can be defined, opened, modified, and closed, according to established informal practice, just slightly formalized.

The intention is to take care of all common practices of mathematicians. Knowing that mathematicians freely use the same names for different concepts, depending on what they do (variable name conventions may even vary within the same document), FMathL will allow one to redefine everything (if necessary) by creating appropriate contexts.

The outermost context is always the standard FMathL context with its axioms, but in nested contexts one can override surface conventions (language constructs denoting concepts, relations, etc.) valid in an outer context in a similar way as, in programming, variable names in a subroutine can be chosen independent of variables in the calling routine. Internally, however, all this is disambiguated, and concepts are uniquely named.
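The subroutine analogy can be modelled with a toy scope chain. The following Python sketch is purely illustrative of the lookup discipline (an invention for this discussion, not FMathL’s actual data model): surface names resolve in the innermost open context, while each definition carries its own internally unique concept id.

```python
from collections import ChainMap

# A toy model of nested contexts: each context maps surface names to
# internally unique concept ids.  Opening a context pushes a fresh map,
# closing it restores the outer conventions, and lookup resolves a name
# in the innermost context that defines it.
class Contexts:
    def __init__(self, standard):
        self.chain = ChainMap(dict(standard))   # outermost: standard context

    def open(self):                  # open a nested context
        self.chain = self.chain.new_child()

    def close(self):                 # close it, restoring outer conventions
        self.chain = self.chain.parents

    def define(self, name, concept_id):
        self.chain[name] = concept_id           # may shadow an outer meaning

    def lookup(self, name):
        return self.chain[name]

ctx = Contexts({"norm": "concept:euclidean_norm"})
ctx.open()
ctx.define("norm", "concept:sup_norm")          # same surface name, new concept
assert ctx.lookup("norm") == "concept:sup_norm"
ctx.close()
assert ctx.lookup("norm") == "concept:euclidean_norm"
```

The point of the sketch is the disambiguation: the two uses of “norm” never collide internally, just as variable names in a subroutine are independent of those in the calling routine.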

One must also be able to define one’s own syntax for things, just by saying somewhere in the mathematical text things like “We say that (x,y) sign z word w if formula involving x,y,z,w”, overriding old, conflicting uses of sign and/or word valid in an outer context. (This creates high demands on the parser; we are currently studying how to meet this challenge.)

Such overloading of meanings or syntax is not really recommended, though, in the interest of transparency. Standard mathematics, i.e., what undergraduates should be able to understand without confusion and with limited effort, will be handled with a minimum of such artifices. (There are a number of ambiguities in traditional terminology and notation, which we hope to handle in some “natural” way, though we haven’t yet decided exactly how.)

Posted by: Arnold Neumaier on September 11, 2009 8:31 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

FMathL has a concept of nested contexts in which reasoning happens; these contexts can be defined, opened, modified, and closed, according to established informal practice, just slightly formalized.

OK, Coq has these too (called ‘Sections’); I think that Isabelle’s ‘locales’ are the same idea.

The outermost context is always the standard FMathL context with its axioms,

How strong are these? I would find it very nice if even the basic foundations were highly modular.

but in nested contexts one can override surface conventions (language constructs denoting concepts, relations, etc.) valid in an outer context.

Coq does not allow one to rename or redefine things within a Section, although it does allow one to rename things within a Module, which is basically like a Section except that it's stored in a different file. (The idea is that one introduces a Section for temporary convenience within a single document, but a Module should be reasonably self contained and is intended to be used by many different people.) It might be nice if FMathL is a little more forgiving than Coq about this.

One must also be able to define one’s own syntax for things, just by saying somewhere in the mathematical text things like “We say that (x,y) sign z word w if formula involving x,y,z,w”, overriding old, conflicting uses of sign and/or word valid in an outer context. (This creates high demands on the parser; we are currently studying how to meet this challenge.)

Coq allows this too (I mean symbols, since I've already dealt with redefining the things themselves), but people don't like to use it, since specifying the right order of operations is an annoying technicality. If you can program the parser to figure that out for us, that would be nice … if it's even possible.
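The “annoying technicality” here is the precedence/associativity table that user-defined notation requires. What a parser does with such a table is standard precedence climbing; the Python sketch below (an illustration only, not any system’s actual parser) shows that once each operator carries a level and an associativity, order of operations is resolved mechanically.

```python
import re

# Operator table: symbol -> (precedence level, associativity).
# This is exactly the information a user must declare for new notation.
OPS = {
    "+": (10, "left"), "-": (10, "left"),
    "*": (20, "left"), "^": (30, "right"),
}

def tokenize(src):
    return re.findall(r"\d+|[+\-*^()]", src)

def parse(tokens, min_prec=0):
    """Precedence climbing: build a nested-tuple AST, consuming tokens."""
    tok = tokens.pop(0)
    if tok == "(":
        left = parse(tokens, 0)
        tokens.pop(0)                       # discard the matching ")"
    else:
        left = int(tok)
    while tokens and tokens[0] in OPS:
        prec, assoc = OPS[tokens[0]]
        if prec < min_prec:
            break
        op = tokens.pop(0)
        # Left-associative operators exclude their own level on the right.
        next_min = prec + 1 if assoc == "left" else prec
        left = (op, left, parse(tokens, next_min))
    return left

assert parse(tokenize("1+2*3")) == ("+", 1, ("*", 2, 3))
assert parse(tokenize("2^3^2")) == ("^", 2, ("^", 3, 2))
```

Automating the choice of levels for newly declared notation, rather than demanding them from the user, is the genuinely hard part Toby points at.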

Sorry to say a lot of ‘Yeah, the program that I know can already do all of this.’. The thing is, there are a lot of ways that people have developed to formalise rigorous mathematics on a computer (such as all of the ones that Mike mentioned), but none of them have caught on with practising mathematicians, so if you can create something with a better design, then that's good! There's nothing that anyone can point to and say ‘Instead of developing FMathL, just use that; it's good enough.’, because nothing is good enough yet.

Posted by: Toby Bartels on September 11, 2009 9:10 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

TB: There’s nothing that anyone can point to and say “Instead of developing FMathL, just use that; it’s good enough.”, because nothing is good enough yet.

Look at the links in FMathL - Formal Mathematical Language to see what I had already looked at before realizing the need (and a realistic possibility) to do it all in our Vienna group. Nothing that exists is easy to use, nothing looks like ordinary math, nothing attracts typical mathematicians.

I’d have preferred not to have to develop such a system myself. But it will not come without mathematicians playing a leading role in its development. They do not even do small, easy things, such as How to write a nice, fully formalized proof, which would make things more readable without much effort.

Posted by: Arnold Neumaier on September 11, 2009 10:21 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

One must also be able to define one’s own syntax for things, just by saying somewhere in the mathematical text things like “We say that (x,y) sign z word w if formula involving x,y,z,w”,

Coq allows this too (I mean symbols, since I’ve already dealt with redefining the things themselves)

And Isabelle has something it calls “mixfix,” which seems to be along the same lines.

Posted by: Mike Shulman on September 12, 2009 5:08 AM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

AN: The outermost context is always the standard FMathL context with its axioms

TB: How strong are these? I would find it very nice if even the basic foundations were highly modular.

Like Bourbaki, FMathL assumes classical logic and the global axiom of choice, but, more weakly than Bourbaki, only the ability to form the set of all subsets of the continuum, since this is sufficient to reflect the whole FMathL conception inside itself and prove some natural properties. See Logic in context. More can be added in the standard way by making assumptions.

Since one must be able to use FMathL as a comfortable metalevel, intuitionistic logic is inadequate. Even treatments of intuitionistic logic usually use classical reasoning on the metalevel. (I’d like to know of a book that doesn’t, if such a book exists!)

The FMathL axioms are assumed on the specification level. But one can decide to work in a reflection level (one layer below the specification level), where one can define one’s own axioms and inference rules in a completely free way. Since FMathL will be reflected itself, in a number of partially nested, partially independent contexts, one can just take the part of the FMathL specifications one is happy with and augment it in one’s own way.

TB: specifying the right order of operations is an annoying technicality. If you can program the parser to figure that out for us, that would be nice … if it’s even possible.

There are no intrinsic difficulties; it is just a matter of getting the parser to do it correctly. This means that one needs to automatically generate the new grammar and update the parser accordingly. Thus we will provide a way that is easy for the user, but since we haven’t yet fixed the structure of our grammar (we need a grammar that works well in an incremental mode and can handle ambiguities and attributes), it will take some time before we can consider in detail how.

Posted by: Arnold Neumaier on September 14, 2009 2:56 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

I certainly can’t give constructive feedback on this amount of work, but while reading the (27) axioms I was thinking to myself:

are these axioms fixed and transcendent like in a true axiomatic framework

or

can these axioms be changed within FMathL (once it is working), a flexibility one might want to have when doing mathematics? Is FMathL its own metatheory?

Posted by: Uwe Stroinski on September 11, 2009 8:16 AM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Uwe Stroinski asked:
or can these axioms be changed within FMathL (once it is working), a flexibility one might want to have when doing mathematics? Is FMathL its own metatheory?
—————————————-

SH: I wondered about this. JB provided an edited summary in which he wrote:

“The mathematical framework (at present just a formal system – a kind of metacategory of all categories) is designed to be a formal framework for mathematics that will allow…”
————————————-

Neumaier wrote in “A Semantic Turing Machine.pdf”:
“Besides serving as a theoretical basis of all programming languages (λ-calculus being another), Turing machines have many interesting applications, reaching from problems in logic, e.g., the halting problem c.f. Odifreddi [11], to formal languages (see Cohen [2]).”
———————————-

SH: The halting problem (HP) applies to a formal language such as Lisp, for instance. It seems to me that Neumaier should have explained why the halting problem could have no adverse impact on the USTM he proposes; my first thought was that the USTM would either be incomplete or inconsistent. Since Neumaier provides references regarding the HP, perhaps he has considered this. Another statement which troubled me was this,
…”or in other words, not every STM program, regarded as a function on the context, is Turing computable. For example, external processors might have access to the system clock etc..”

SH: This is a difference in architectures, but I learned that every computable process which runs on a PC is Turing computable, and that has nothing to do with a PC having a system clock. A TM and a PC can compute exactly the same range of calculations (power), except that because the TM is an ideal machine, it could for instance compute more digits of Pi, since there are no physical constraints (time and memory). PCs have the physical computability limitations, but both the TM and the PC compute the same kind of effective procedures, called Turing computability.

Posted by: Stephen Harris on September 11, 2009 11:35 AM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Stephen Harris wrote:

It seems to me that Neumaier should have explained why the halting problem could have no adverse impact on the USTM he proposes.

It poses the same problems as for an ordinary computer. To deserve its name, the USTM (universal semantic Turing machine) halts iff the program it simulates halts. More is not needed.

The halting problem for a semantic Turing machine poses no problems for mathematics done in FMathL, except that some searches for a proof (or other searches) may never terminate. But mathematicians also sometimes search for a proof without getting a result. Of course, FMathL will have, like mathematicians, an option to quit a search early if it seems hopeless.

Stephen Harris:

I learned that every computable process which runs on a PC is Turing computable and that has nothing to do with a PC having a system clock.

This holds only if there is no external input. A clock, or a human being who types in a reply to a query, is not computable, at least not in any well-documented sense. But the result of the program depends on that input and hence is generally not computable either.

Posted by: Arnold Neumaier on September 11, 2009 1:25 PM | Permalink | Reply to this

How many axioms may a foundation have?

US: …while reading the (27) axioms…

This is an advance over ZF, which needs infinitely many to express the existence of sets defined by properties. (NBG is finitely axiomatized, though.)

Posted by: Arnold Neumaier on September 14, 2009 12:19 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Uwe Stroinski wrote:

while reading the (27) axioms I was thinking to myself: are these axioms fixed and transcendent like in a true axiomatic framework, or can these axioms be changed within FMathL (once it is working), a flexibility one might want to have when doing mathematics? Is FMathL its own metatheory?

The framework has fixed axioms, since it defines the common part of the different subject levels. Within the framework, one can do arbitrary mathematics, using the terminology of the framework as a metalevel.

Thus, if you like and it matters to you, you’ll be able to define your own (e.g., intuitionistic) logic, your own version of sets (say Bishop-style), functions (say, terminating algorithms), and real numbers (say, oracles defining one decimal digit after the other), and then reason in the resulting system.

You’ll be able to arrange with a presentation style file that the printed version of your theory does not show a trace of your assumptions but regards them as well-known background knowledge, or that it is outlined, or that it is explained in detail.

For other users of FMathL, this will just look like a particular context that you created, one that those who want to build upon your work can include in a context of their own. You can create as many such contexts as you like, and include in your current context any other context that you want, but you are responsible for maintaining consistency.

But the FMathL axioms were selected in such a way that, for most mathematics outside set theory and mathematical logic, one does not need these private contexts but can just work in the standard context which satisfies the FMathL axioms. Then one adds definitions and results from the desired fields until one has enough to do one’s own work.

At least the usual linear algebra, real and complex analysis, and elementary algebra will be in standard contexts; if you can read German, you may look at

http://www.mat.univie.ac.at/~neum/FMathL/ALA.pdf

to see the possible content of such a standard context.

Posted by: Arnold Neumaier on September 11, 2009 1:07 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

I use and develop mathematical formulae and relationships in computer vision (which obviously has a different flavour from both more “full theory” and “proof based” areas), but I’d say one important element of your new system should be: heed the lesson of internet search (Google, etc.) and design things in such a way that simple, brute-force search is possible. To expand on that, the staggeringly amazing thing about even sophisticated search engines is that the algorithms they use are SO elementary relative to what an actual human would do (although they’re sophisticated in their own way), yet they generally manage to produce search results which quite often help you with your task (even if only by helping you refine the vocabulary you use to express your goal). Likewise, one can do reasonably well at various tasks just using relatively dumb program scripts that churn away on some big database (e.g., I remember something in Scientific American about formalising bioscience paper results just enough to be able to make “brute force” connections between multiple papers’ results to suggest new things to try; unfortunately SA’s paywall stops me finding a reference).

I know your primary focus is unambiguously communicating well-formed results with some level-of-trust certificate in a human-centred way, but don’t design things in such a way that programs can’t get at the contents in non-standard ways. (In particular, definitely make it possible to access individual statements in a “document” in the database directly if they fit some “pattern” (some mathematics-oriented variant of a regular expression), without having to go through everything in the document or having to commit to an axiom scheme, etc.) In one way it’s galling that brute force (rather than careful thought) can actually achieve so much; in another it’s exhilarating that new relationships and hints at entities may be uncovered by such means (in an analogous way to the monstrousness of the discovery of the Monstrous moonshine relationship).

Posted by: bane on September 11, 2009 1:44 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

bane: definitely make it possible to access individual statements in a “document” in the database directly if they fit some “pattern” (some mathematics oriented variant of a regular expression)

What kind of patterns would you like to search for in a math text? How would you like to pose such a query? How insensitive should the search be to details in formulas? Please give some telling examples.

Some sort of search will certainly be possible. But trying Wolfram|Alpha shows that even simple searches for mathematical patterns are difficult for today’s technology.

For search, automatic proof, and other well-studied techniques, FMathL will have to rely on what others can do. We need to concentrate our efforts in order to have real impact. Therefore, FMathL is intended to be innovative mainly in the so far neglected areas of relevance to mathematics on the computer, and otherwise just to interface to known systems.

Thus I don’t know how far we’ll be able to proceed in the direction of structural search.

Posted by: Arnold Neumaier on September 11, 2009 8:51 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Thanks for your earlier clarification of my questions. Peter Schodl provided this rough summary at http://www.mat.univie.ac.at/~schodl/
“Roughly speaking, we want to teach a computer to understand a LaTeX-file well enough to communicate its essential content to other software. Firstly, we concentrate on mathematical text specifying optimization problems.”
————————————–

SH: I think the issue of not all blogging software being LaTeX cross-compatible has come up on this forum. Does FMathL solve this problem or would the blogging software still need to be changed?

Fair use excerpt from my copy of “Introduction to Mathematical Philosophy”
By Bertrand Russell

“From the habit of being influenced by spatial imagination, people have
supposed that series _must_ have limits in cases where it seems odd if
they do not. Thus, perceiving that there was no _rational_ limit to the
ratios whose square is less than 2, they allow themselves to “postulate”
an _irrational_ limit, which was to fill the Dedekind gap. Dedekind, in
the above-mentioned work, set up the axiom that the gap must always be
filled, i.e. that every section must have a boundary. It is for this
reason that series where his axiom is verified are called “Dedekindian.”
But there are an infinite number of series for which it is not verified.

The method of “postulating” what we want has many advantages; they are
the same as the advantages of theft over honest toil. Let us leave them
to others and proceed with our honest toil.”

Posted by: Stephen Harris on September 12, 2009 12:10 AM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

I expressed things badly: what I meant was “definitely don’t make any design decisions that will make it impossible (for others to write software) to search…”, not that your project should actively provide search technology, particularly if that’s not an area of direct interest. I say this because of two “facts of life”:

(1) when designing GUI based software for humans it’s easy to choose programming constructs and data representations so that they are effectively only usable with the GUI.

(2) If something becomes popular it will require a significant functionality increase to displace it, and even if it does get displaced it’s very rare for old “documents” to get properly converted. E.g., the TeX language was “set in stone” in 1983 (AIUI), and most of the core LaTeX “language” by about 1988 (i.e., new user-level commands, not “implementation”). Various people, including Wolfram Research, have tried to displace it and none have gathered significant market share. I don’t expect TeX/LaTeX to be displaced until someone figures out inferring pen-based mathematical writing. Yet LaTeX doesn’t provide ways of indicating even simple semantic information, such as distinguishing eqnarrays with multiple = signs based upon whether they represent cases in a definition or steps of simplification, etc. (I know it could, but it doesn’t, and I doubt such a feature could be “made” to be used by everyone at this stage.)

So basically all I’m saying is: imagine both that your project is technically successful and that it becomes very widespread, so that the world has 20+ years of “documents” in this format. Are there any design decisions that you’d come to regret?

I’ve got to go out now, but I’ll try and think of some concrete search examples and post later.

Posted by: bane on September 12, 2009 3:19 PM | Permalink | Reply to this

design decisions that you’d come to regret.

bane: imagine both that your project is technically successful and that it becomes very widespread and the world will have 20+ years of “documents” in this format, are there any design decisions that you’d come to regret?

If I knew of one today, I’d certainly try to avoid it. In any case, a fully semantic description of the subject matter in a self-reflected environment, such as FMathL is designed to be, has the advantage that upgrading the whole database to a new representation becomes a fully automatic task.

Writing a program for doing so would be much easier than writing a program that upgrades LaTeX (which has an ill-defined semantics) to a different environment.

Posted by: Arnold Neumaier on September 14, 2009 4:09 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

I’m just starting to get into the description of FMathL; here are some initial thoughts.

I think it is misleading, as is done in the introduction to the description of FMathL, to conflate CCAF and ETCS. CCAF (the Category of Categories As a Foundation for mathematics) is, in my opinion, a convoluted setup, since in order to do any mathematics, you need a notion of set, so before you can get anywhere you first have to define sets in terms of categories. ETCS (the Elementary Theory of the Category of Sets), on the other hand, is a set theory, not a “version” of CCAF. The axioms of CCAF are about things called “categories” and “functors,” while the axioms of ETCS are about things called “sets” and “functions” (among those axioms being that sets and functions are the objects and morphisms of a category, the titular category of sets).

I like to state the difference between ETCS and ZFC by saying that ETCS is a “structural” set theory while ZFC is a “material” set theory. In my (biased) opinion, structural set theories solve the interpretation/implementation problems of material set theories, and are also more in line with mathematical practice (for instance, they do not permit nonsensical questions such as whether $1\in\sqrt{2}$ or whether $\pi$ is equal to the Cayley graph of $F_2$). Also, structural set theory is closely related to type theory, which is used by some existing proof assistants like HOL and Isabelle. So my question is, what is the advantage of FMathL over structural set theory or type theory as a foundation?

(By the way, while it is true that Lawvere’s original description of ETCS used the single-sorted definition of a category so that “every set is regarded as a mapping,” this (mis)feature is easily discarded (and usually is, in practice). By contrast, the feature of ZFC by which a function is a particular type of set really seems essential to the development of the theory. So I don’t think it is fair to speak of the two in the same breath as reasons that existing foundations are inadequate.)

Posted by: Mike Shulman on September 12, 2009 6:44 AM | Permalink | Reply to this

CCAF, ETCS and type theories

MS: I think it is misleading, as is done in the introduction to the description of FMathL, to conflate CCAF and ETCS. CCAF (the Category of Categories As a Foundation for mathematics) is, in my opinion, a convoluted setup, since in order to do any mathematics, you need a notion of set, so before you can get anywhere you first have to define sets in terms of categories. ETCS (the Elementary Theory of the Category of Sets), on the other hand, is a set theory, not a “version” of CCAF.

I hadn’t called ETCS a version of CCAF. It is a part of CCAF, and I discussed it as such.

But the situation is not as simple as you describe it.

Any metatheory needs a concept of sets or collections, in order to speak about the objects, properties, and actions it is going to define on the formal level. And a metatheory that may serve as a foundation of mathematics must be able to model itself by reflection.

Thus you cannot have ETCS first without having categories even “firster”. There are four possible scenarios for foundations involving categories:

(I) Start with informal sets, create a formal definition of sets, and from it a formal definition of categories.

(II) Start with informal sets, create a formal definition of categories, and from it a formal definition of sets.

(III) Start with informal categories, create a formal definition of categories, and from it a formal definition of sets.

(IV) Start with informal categories, create a formal definition of sets, and from it a formal definition of categories.

ZFC + inaccessible cardinal + conventional category theory realizes (I). Conventional category theory + ETCS realizes (II). CCAF (with ETCS built in) realizes (III). ETCS + conventional category theory realizes (IV).

The spirit of categorial foundations requires (III), I think, which needs both CCAF and ETCS.

MS: ETCS is a “structural” set theory while ZFC is a “material” set theory. In my (biased) opinion, structural set theories solve the interpretation/implementation problems of material set theories, and are also more in line with mathematical practice (for instance, they do not permit nonsensical questions such as whether 1\in\sqrt{2})

They only replace one interpretation/implementation problem by another. Sets of sets cannot be formed in ETCS, but are very common mathematical practice. Thus categories are too “immaterial” to be useful as background theory.

MS: So my question is, what is the advantage of FMathL over structural set theory or type theory as a foundation?

Mathematics is type-free; so is FMathL. Mathematics does not have a type “element” and a type “set” (as in ETCS). At best there is a very weak typing that interprets membership in an arbitrary set as a type. (The development of type theory goes in this direction, too, e.g., with dependent types; but as it does so, types become less and less distinguishable from sets.)

FMathL models the actual practice, and does not compromise to gain formal conciseness. The elegance of common mathematical language lies in its power to be expressive, short, and yet fairly easily intelligible, which is in stark contrast to the many different type theories I have seen.

MS: “every set is regarded as a mapping,” this (mis)feature is easily discarded (and usually is, in practice).

True, but the existing versions of CCAF still differ a lot, so that the general conclusion about the categorial foundations is justified. The paper was not meant to give a fair discussion of the relative merits of the existing foundations, but just to point out that neither is adequate for the practice of mathematics.

MS: By contrast, the feature of ZFC by which a function is a particular type of set really seems essential to the development of the theory.

In the usual expositions, yes. But not intrinsically. ZFC is essentially equivalent to NBG, which was first formulated by von Neumann (the N in NBG) as a theory in which everything was a function, and sets were special functions. It is easy to create equivalent versions of NBG and ZFC in which functions and sets are both fundamental concepts, related by axioms that turn characteristic functions into sets and graphs of functions into functions.

The possibility of these apparent differences that do not lead to essential differences in the power of the theory shows that these are matters of implementation, and not of essence. FMathL intends to capture the latter.
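The translations mentioned above, between a set and its characteristic function and between a function and its graph, are purely mechanical. A minimal Python sketch over a finite universe (illustrative only, with invented names):

```python
# A finite set and its characteristic function carry the same
# information, as do a function and its graph; the passage back and
# forth is purely mechanical. U is an arbitrary finite universe.

U = [0, 1, 2, 3, 4]

def char(S):
    # set -> characteristic function
    return lambda x: x in S

def set_of(chi):
    # characteristic function -> set
    return frozenset(x for x in U if chi(x))

def graph(f):
    # function -> its graph, a set of pairs
    return frozenset((x, f(x)) for x in U)

def apply_via_graph(G, x):
    # graph -> function (by lookup)
    return next(y for (a, y) in G if a == x)

S = frozenset({1, 3})
assert set_of(char(S)) == S                    # set -> function -> set

square = lambda x: x * x
assert apply_via_graph(graph(square), 3) == 9  # function -> graph -> function
```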

Posted by: Arnold Neumaier on September 14, 2009 4:01 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

I hadn’t called ETCS a version of CCAF.

Sorry, I misinterpreted one of your sentences.

It is a part of CCAF

I disagree. One may regard it as a part of a general universe of “categorial foundations of mathematics,” but I think CCAF has a very specific meaning which is distinct from ETCS (e.g. Lawvere, “The category of categories as a foundation for mathematics”). ETCS does not require (or, usually, have anything to do with) CCAF, while the development of math from CCAF does not necessarily go through ETCS.

Any metatheory needs a concept of sets or collections, in order to speak about the objects, properties, and actions it is going to define on the formal level.

Well, any metatheory at least needs a logic. Most theories such as ZFC, ETCS, CCAF, etc. are formulated in first-order logic. If one then wants to talk about models for that logic, then one needs a “place” in which to consider such models, which will generally involve a notion of set/collection.

The spirit of categorial foundations requires (III), I think, which needs both CCAF and ETCS.

Well, maybe. But if that’s so, then I would not argue in favor of “categorial foundations” (a phrase I generally do not use). What I’m proposing as “structural set theory” is, I think, your (IV). Note, though, that ETCS does not require a prior (formal or informal) definition of category; it can be stated in pure first-order logic.

Mathematics is type-free

When I look at mathematics, I see types everywhere. What is the objection to 1\in\sqrt{2} if not a type mismatch?

By the way, I observe that 1\in\sqrt{2} is in fact a legal statement in FMathL. Isn’t that exactly the sort of “extraneous information” that’s problematic about ZFC? I don’t actually see how FMathL solves any of the problems of material set theory.

Here’s what I see when I look at mathematics as it is done by mathematicians:

  • Mathematical structures (groups, rings, topological spaces, manifolds, vector spaces) are built out of sets and functions/relations between these sets. From the point of view of the general theory of any such structure, the elements of such sets have no internal structure; they are featureless. In particular, the general theory of a type of structure is invariant under isomorphism.

  • Elements of sets can have “internal meaning” relative to elements of the same set or other sets, as specified by functions and relations. For instance, the elements of a cartesian product A\times B are interpreted as pairs (a,b) relative to the sets A and B, with the relationship specified via the projection functions A\times B\to A and A\times B\to B. But once we start thinking about A\times B as an object in its own right without its relationship to A and B, its elements lose their “internal” properties and become featureless, like the elements of any other set.

For instance, when we construct \mathbb{Q} as a set of ordered pairs of integers, we consider a subset of \mathbb{Z}\times\mathbb{Z} and use the interpretation of its elements as pairs to construct operations and properties of it. However, once the construction is finished, we generally forget about it and treat each rational number as an independent entity.

I think this property (that structure on a set comes only from the outside) is precisely what allows us to capture “the essence of mathematical concepts;” otherwise we will always be carrying around baggage about the internal properties of the elements of our sets.

Sets of sets cannot be formed in ETCS, but are very common mathematical practice.

I would also argue that the above interpretation of cartesian products also applies to “sets of sets.” The elements of a power set P A are given meaning as subsets of A only via the “is an element of” relation from A to P A, just as the elements of A\times B are given meaning as ordered pairs only via the projections to A and B. Once we forget that P A was constructed in relation to A, the meaning of its elements as subsets of A vanishes. This is the case, for instance, when we construct the real numbers via Dedekind cuts: \mathbb{R} starts out as a subset of P\mathbb{Q}, just as \mathbb{Q} started out as a subset of \mathbb{Z}\times\mathbb{Z}, but once we have proven enough about it we generally forget about the identification of a real number with a set of rationals and treat it as an independent entity.

It is true that mathematicians usually think of the elements of P A as “being” subsets of A, rather than being merely “associated” to them. But this is perfectly in line with the structural philosophy that anything and everything can be transported along an isomorphism: every subset of A corresponds to a unique element of P A, so we might as well consider elements of P A to “be” subsets of A (as long as we continue to keep the “is an element of” relation in mind). But the structural point of view also allows us to discard the “is an element of” relation, which we can’t do in any system where the elements of P A really are subsets of A.

By the way, this transport along a bijection is, I think, easily hidden from the user of any computer system. After all, what does one do with subsets of A? The basic thing is to talk about which elements of A are elements of them, and this is handled directly by the “is an element of” relation. All other operations and properties on elements of P A are most naturally defined in terms of “is an element of,” so the user won’t ever be aware that the elements of P A “aren’t really” subsets of A.
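This hiding of the representation can be sketched in a few lines of Python: below, the subsets are opaque integer tokens, and every operation goes through the local membership relation elem. All names here are my own invention, purely for illustration:

```python
# Elements of P(A) modelled as opaque tokens: the only access to them
# is the local membership relation elem(a, t), so a user never sees
# what the tokens "really are".

from itertools import combinations

def power_set(A):
    A = list(A)
    subsets = [frozenset(c) for r in range(len(A) + 1)
               for c in combinations(A, r)]
    tokens = list(range(len(subsets)))    # opaque names for subsets
    def elem(a, t):                       # the "is an element of" relation
        return a in subsets[t]
    return tokens, elem

A = {1, 2, 3}
PA, elem = power_set(A)

def subset_token(pred):
    # By extensionality there is exactly one token t with
    # elem(a, t) iff pred(a); find it using elem alone.
    for t in PA:
        if all(elem(a, t) == pred(a) for a in A):
            return t

evens = subset_token(lambda a: a % 2 == 0)
assert elem(2, evens) and not elem(1, evens)
assert len(PA) == 8   # |P(A)| = 2^3
```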

The elegance of common mathematical language lies in its power to be expressive, short, and yet fairly easily intelligible, which is in stark contrast to the many different type theories I have seen.

I think type theory (along with many other kinds of formal logic) should be viewed as like assembly language. Hardly anyone writes code in assembly language, but not because it has been replaced; rather because more expressive, concise, and intelligible languages have been built on top of it. The description of FMathL seems to me like trying to build a CPU that natively runs on Fortran, rather than writing a Fortran compiler. Logicians have put a lot of effort into developing a theory that works very smoothly at a low level and is very flexible; why not build on it instead of pulling it out and replacing it wholesale?

By the way, one of the things I like about Isabelle is that it is designed with a very weak metalogic, on top of which arbitrary other logics can be implemented. People usually use either HOL or ZFC as the “object logic” but in theory one could use any logic that can be expressed via natural deduction rules. I think there are good reasons to build in this sort of modularity at a low level, just as there are good reasons for hiding it from the end user.

the feature of ZFC by which a function is a particular type of set really seems essential to the development of the theory.

In the usual expositions, yes. But not intrinsically.

You’re right.

(Regarding “sets being mappings,” it sounds like your real objection is the absence of “sets of sets,” which I’ve addressed above.)

Posted by: Mike Shulman on September 14, 2009 5:18 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

AN: It [ETCS] is a part of CCAF

MS: I disagree. One may regard it as a part of a general universe of “categorial foundations of mathematics,” but I think CCAF has a very specific meaning which is distinct from ETCS.

Distinct, yes, but including it; I only claimed that. I referenced C. McLarty, Introduction to Categorical Foundations for Mathematics, version of August 14, 2008. On p.55, he states as part of the CCAF axioms: There is a category Set whose objects and arrows satisfy the ETCS axioms.

Without such an axiom, or some replacement (as in Makkai’s version also referenced) there are no sets, and hence no complete categorial foundations for mathematics.

AN: Mathematics is type-free

MS: When I look at mathematics, I see types everywhere.

When I look at mathematics, I see structures everywhere, not types. Trying to see types, I see that typing violations abound, being justified as abuse of notation.

MS: What is the objection to 1\in\sqrt{2} if not a type mismatch?

The objection is that different constructions of the reals (considered equivalent by mathematicians) answer the question differently.
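This implementation-dependence can be made concrete in a toy Python sketch (simplified textbook encodings, nothing from FMathL): under the Dedekind-cut construction the answer to 1\in\sqrt{2} is “yes”, while under a Cauchy-style construction it is “no”:

```python
# Toy versions of two constructions of the reals. Under Dedekind cuts,
# √2 literally is a set of rationals, so 1 ∈ √2 holds; under a
# Cauchy-style construction, the members of √2 are sequences, so the
# same question fails. (Simplified textbook encodings.)

from fractions import Fraction

def in_sqrt2_dedekind(q):
    # the cut for √2: {q ∈ Q : q < 0 or q² < 2}
    return q < 0 or q * q < 2

def in_sqrt2_cauchy(x):
    # crude membership test: members of a Cauchy equivalence class are
    # sequences, modelled here as callables N -> Q
    return callable(x)

one = Fraction(1)
print(in_sqrt2_dedekind(one))  # True:  1 ∈ √2 under Dedekind cuts
print(in_sqrt2_cauchy(one))    # False: 1 ∉ √2 under Cauchy sequences
```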

MS: I observe that 1\in\sqrt{2} is in fact a legal statement in FMathL.

Yes, but it is undecidable. It should be, since people who implement mathematics in ZFC, say, may get different (subjective, implementation-dependent) answers. Thus the formal status of this statement should be the same as that of the continuum hypothesis, say.

MS: Mathematical structures (groups, rings, topological spaces, manifolds, vector spaces) are built out of sets and functions/relations between these sets. From the point of view of the general theory of any such structure, the elements of such sets have no internal structure; they are featureless. In particular, the general theory of a type of structure is invariant under isomorphism.

Earlier in my life I had been working and publishing on finite symmetry groups (E_8, the Leech lattice, etc.). Everyone in this field was obviously regarding any permutation group as a group. Alt(5) acting on 5 points and PSL(2,5) acting on 6 points were nonisomorphic permutation groups, but isomorphic as groups. What was meant by “the same” was context-dependent, taking into account more or less structure, as needed.

FMathL respects this context-dependence in a natural way; the meaning of an asserted equality depends on the context. This feature allows FMathL to remain much closer to actual practice than previous foundations.

In category theory (and in ZFC), permutation groups and groups are completely different objects, related by functors (as you describe). And there is much more of that, which must be handled in the traditional foundations by a pervasive misuse of notation and language. In my view, what is regarded by the purists as “misuse” is the true usage: mathematicians generally think in this allegedly misused language rather than in the clumsy purist way. The functors only appear when one forces standard mathematics into the straitjacket of category theory.

In categorial language, the harmless statement \sqrt{2}\in\mathbb{R} (where \mathbb{R} denotes the ordered field of real numbers) would as much be a type mismatch as the dubious statement 1\in\sqrt{2}. Writing out all the functors needed to match types would make ordinary mathematical language as clumsy to use as proof assistants based on type theory.

MS: I think type theory (along with many other kinds of formal logic) should be viewed as like assembly language. Hardly anyone writes code in assembly language, but not because it has been replaced; rather because more expressive, concise, and intelligible languages have been built on top of it.

I fully agree. This is why FMathL distinguishes between subjective levels (where particular implementations sit, like different assembler programs for the same functionality, written perhaps even in different assembler languages), and the object level, which gives the essence.

MS: The description of FMathL seems to me like trying to build a CPU that natively runs on Fortran, rather than writing a Fortran compiler.

No. The current description of FMathL is that of a language, not of a CPU. (The CPU’s correspond to the subject levels in the FMathL paper.) In your picture, FMathL tries to be a formal, easy-to-use high-level programming language, while traditional foundations are low-level assembler languages.

MS: Logicians have put a lot of effort into developing a theory that works very smoothly at a low level and is very flexible; why not build on it instead of pulling it out and replacing it wholesale?

The full system MathResS (which uses FMathL as its high level semantics) will build upon it. There will be compilers from the high level to various low levels, i.e., interfaces between FMathL and Coq, Isabelle/Isar, or HOLlight, say. (This doesn’t show in the paper on the FMathL mathematical framework, but is amply reflected in the vision document.)

But past work on proof assistants etc. produced only the assembler level, not a high level comparable to Fortran that would make writing formal mathematics easy. The high level currently only exists as informal mathematical language.

Imagine there were no Fortran or C++, and all programs would have to be specified in ordinary language plus formulas, to be translated by specialists directly into assembler. Computer science would be as little accepted by the world at large as proof assistants are by mathematicians.

The purpose of FMathL is to create a Mat-tran for translating high level mathematics that is as easy to use as modern For-tran for translating high level algorithms.

Posted by: Arnold Neumaier on September 15, 2009 11:38 AM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

Arnold Neumaier wrote about 1 \in \sqrt{2}:

Thus the formal status of this statement should be the same as that of the continuum hypothesis, say.

I disagree with that. The generalised continuum hypothesis GCH is a meaningful statement, at least as long as one is doing mathematics with enough power to recursively define families of sets indexed by the ordinals using the power set operation. Then you can prove useful things, such as V = L \;\Rightarrow\; GCH and GCH \;\Rightarrow\; AC.

To make GCH come out true or false, you need to make certain assumptions beyond the standard ones, and these should be regarded as a matter of convention; similarly, to make 1 \in \sqrt{2} come out true or false, you need to adopt some convention beyond the standard ones. But you need to adopt such a convention even to make 1 \in \sqrt{2} meaningful, which is not necessary for the continuum hypothesis, or to prove the theorems in the last paragraph.

If you only want to talk about true statements (and thus also false ones, through their negations), then you can treat these similarly; you will be unable to prove or refute either in the standard context, but you will be able to prove or refute it in some other contexts. But I would like a system to complain of a type mismatch the moment that I write down 1 \in \sqrt{2}, until I've adopted conventions that make it meaningful; I don't want to have the statement accepted, just considered neither proved nor refuted yet.

All that said, I appreciate your point that mathematicians use a high-level language in which a host of standard type conversions are suppressed, and it will be very useful to have a formal system that already knows about all of this. Still, I would like a distinction between ‘meaningless’ and merely ‘undecidable’ statements in any given context.

Posted by: Toby Bartels on September 16, 2009 2:10 AM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

TB: you need to adopt such a convention even to make 1\in\sqrt{2} meaningful, which is not necessary for the continuum hypothesis,

The statement is equally meaningful in ZF with the standard add-ons for numbers as is the continuum hypothesis.

TB: I would like a system to complain of a type mismatch the moment that I write down 1\in\sqrt{2} until I’ve adopted conventions that make it meaningful; I don’t want to have the statement accepted, just considered neither proved nor refuted yet.

You can make the system complain by adding to your standard context the requirement that x\in y is nominal for numbers x and y. Similarly, the user can enforce any desired typing rules by specifying them.

But to have a system in which one can use ZF naturally, one cannot make the typed behavior the default. The default must be as much agnosticism as possible, while there must be simple ways for users to make the system more restrictive, to have it conform to the amount of typedness they want.
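This “agnostic by default, restrictable on demand” behaviour might be sketched as follows (all class and rule names are invented for illustration, not FMathL syntax):

```python
# Agnostic by default: with no typing rules, a membership formula
# between numbers is merely "undecided"; after the user declares that
# membership is nominal between numbers, the same formula is rejected.

class Checker:
    def __init__(self):
        self.rules = []                    # user-supplied typing rules

    def add_rule(self, rule):
        self.rules.append(rule)

    def classify(self, lhs_type, rel, rhs_type):
        for rule in self.rules:
            verdict = rule(lhs_type, rel, rhs_type)
            if verdict is not None:
                return verdict
        return "undecided"                 # maximal agnosticism

def numbers_nominal(lhs, rel, rhs):
    # the user's extra convention: x ∈ y is nominal for numbers x, y
    if rel == "in" and lhs == "number" and rhs == "number":
        return "type mismatch"
    return None

c = Checker()
print(c.classify("number", "in", "number"))   # undecided
c.add_rule(numbers_nominal)
print(c.classify("number", "in", "number"))   # type mismatch
```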

Posted by: Arnold Neumaier on September 16, 2009 8:18 AM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

The statement is equally meaningful in ZF with the standard add-ons for numbers as is the continuum hypothesis.

Yes, but in other systems it is meaningless, whereas the continuum hypothesis is meaningful in any system in which you have the syntax to write it down.

The default must be as much agnosticism as possible

I agree whole-heartedly, but I do not agree that this is what you are doing.

Agnosticism, to me, includes considering syntax meaningless unless it has been given some meaning. Phrases like ‘_\pi(=^2(’ and ‘1 \in \sqrt{2}’ are meaningless to me until somebody explains to me what they mean.

In order of increasing knowledge: meaningless, undecidable, true. Or rather, it should be: meaningless so far, undecided so far, known to be true. The ‘so far’s are because additional assumptions/conventions can push things farther along in the list; the change from ‘undecidable’ to ‘undecided’ is because I know that your system, however great it might be (^_^), can't calculate whether something is undecidable in an arbitrary context.

You also, by default, include other assumptions that should be optional. All this, I suppose, to make ‘a system in which one can use ZF naturally’ out of the box. But if I think that \mathbf{ZF} is a kludge, then that's not a feature for me. By all means, include a \mathbf{ZF} option as a standard module, but making it the default is not ‘as much agnosticism as possible’.

there must be simple ways for users to make the system more restrictive

That is, more restrictive in typing, or less restrictive in assumptions made and conventions adopted. You do this through reflection, right? I'm inclined so far to prefer Coq (or Isabelle), which also allows reflection but makes fewer assumptions up front. On the other hand, if you make this more user-friendly, then that would be a Good Thing.

Posted by: Toby Bartels on September 16, 2009 7:07 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

Arnold Neumaier wrote

Earlier in my life I had been working and publishing on finite symmetry groups (E_8, the Leech lattice, etc.). Everyone in this field was obviously regarding any permutation group as a group. Alt(5) acting on 5 points and PSL(2,5) acting on 6 points were nonisomorphic permutation groups, but isomorphic as groups. What was meant by “the same” was context-dependent, taking into account more or less structure, as needed.

FMathL respects this context-dependence in a natural way; the meaning of an asserted equality depends on the context. This feature allows FMathL to remain much closer to actual practice than previous foundations.

In category theory (and in ZFC), permutation groups and groups are completely different objects, related by functors (as you describe). And there is much more of that, which must be handled in the traditional foundations by a pervasive misuse of notation and language. In my view, what is regarded by the purists as “misuse” is the true usage: mathematicians generally think in this allegedly misused language rather than in the clumsy purist way. The functors only appear when one forces standard mathematics into the straitjacket of category theory.

In categorial language, the harmless statement \sqrt{2} \in \mathbb{R} (where \mathbb{R} denotes the ordered field of real numbers) would as much be a type mismatch as the dubious statement 1 \in \sqrt{2}. Writing out all the functors needed to match types would make ordinary mathematical language as clumsy to use as proof assistants based on type theory.

It seems to me that a structural set theory such as ETCS also upholds the principle of context-dependence as fundamental.

Let’s take for example “\sqrt{2}”. In a structural set theory, it’s quite true that there is no object (taken in isolation) called \sqrt{2}. Instead, it is part and parcel of such an approach that we declare the ambient context in which \sqrt{2} is embedded: \sqrt{2} as real number, \sqrt{2} as complex number, etc., by writing down for example

\sqrt{2}: 1 \to \mathbb{R}

Having declared the context, it is then meaningful in such a theory to ask, given a subset such as \mathbb{Q} \subseteq \mathbb{R}, a question like: is \sqrt{2} \in \mathbb{Q}? The question is equivalent to asking whether the point \sqrt{2}: 1 \to \mathbb{R} factors (evidently uniquely) through the given inclusion \mathbb{Q} \hookrightarrow \mathbb{R}. Or, whether the pair

(\sqrt{2}, [\mathbb{Q}]): 1 \to \mathbb{R} \times P(\mathbb{R})

factors through the local membership relation

\in_{\mathbb{R}} \hookrightarrow \mathbb{R} \times P(\mathbb{R})

Thus, once such contexts have been declared, the proposition \sqrt{2} \in_{\mathbb{R}} \mathbb{Q} becomes completely meaningful in a structural set theory like ETCS, and reflective of how mathematicians posit questions in actual practice.

Thus, the symbol \in is not formalized in a structural set theory as a global relation on objects; it too is contextualized (or localized) by referring to a specified domain, as in the case \in_{\mathbb{R}}.

Continuing this train of thought: where a mathematician might traditionally write

\forall_{x \in \mathbb{R}} \exists_{y \in \mathbb{R}}\; x + y \in \mathbb{Q}

a “purist” working in a formal setting like ETCS would disambiguate the different senses of the symbol \in, where the instances appearing below the quantifiers are interpreted as declaring the types or contexts of the variables, and the one in the predicate being quantified refers to a local membership relation (such as \in_{\mathbb{R}}) pertaining to that type. But I hardly feel like this detail is cumbersome: writing the expression

\forall_{x: \mathbb{R}} \exists_{y: \mathbb{R}}\; x + y \in_{\mathbb{R}} \mathbb{Q}

is certainly no worse than how it would appear in a fully formal expression in ZFC, and IMO comes closer to expressing how people think (which we seem to agree is heavily context-dependent).

Regarding phrases such as “straitjacket of category theory” and the final sentence of the material quoted above: these strike me as bald assertions. What is the evidence behind them?

Posted by: Todd Trimble on September 16, 2009 1:51 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

\forall_{x\in\mathbb{R}}\exists_{y\in\mathbb{R}}\; x+y\in\mathbb{Q}

vs

\forall_{x:\mathbb{R}}\exists_{y:\mathbb{R}}\; x+_{\mathbb{R}}y\in_{\mathbb{R}}\mathbb{Q}

You forgot the subscript on ‘+’. (^_^)

But there is need to abolish the first form in favour of the second. If one adopts structural foundations, then it is possible to algorithmically decode the first as meaning the second, as long as it appears in a context where \mathbb{R} has been declared as a set equipped with an operation +, \mathbb{Q} has been declared as a subset of \mathbb{R} (which includes declaring it as a set and equipping it with an injection to \mathbb{R}, as you would be likely to do if you had earlier defined \mathbb{R} in terms of \mathbb{Q}), the quantifiers have their usual meanings, and x and y are free to be introduced as variables. I would argue that the default context in ordinary mathematics has these features, so the first expression is unambiguous.
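This decoding step is indeed algorithmic. A toy Python elaborator for exactly this example (the context format and all names are my own, not any real system’s syntax):

```python
# Given a context declaring R as a set with an operation + and Q as a
# subset of R, the bare formula "x + y in Q" is decoded into the fully
# annotated "x +_R y in_R Q".

context = {
    "R": {"kind": "set", "ops": {"+"}},
    "Q": {"kind": "subset", "of": "R"},
}

def elaborate(formula, var_types):
    # var_types, e.g. {"x": "R", "y": "R"}, comes from the quantifier
    # prefix "forall x:R. exists y:R."
    lhs, rhs = (s.strip() for s in formula.split(" in "))
    # infer the carrier set from the declared types of the variables
    carriers = {var_types[v] for v in var_types if v in lhs.split()}
    assert len(carriers) == 1, "ambiguous carrier for the left-hand side"
    carrier = carriers.pop()
    assert context[rhs]["of"] == carrier, "subset lives over another set"
    annotated_lhs = lhs.replace("+", f"+_{carrier}")
    return f"{annotated_lhs} in_{carrier} {rhs}"

print(elaborate("x + y in Q", {"x": "R", "y": "R"}))
# x +_R y in_R Q
```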

Posted by: Toby Bartels on September 16, 2009 7:18 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

And I’d agree with your so arguing, particularly with the fact that translation from the traditional notation to the “purist” one is a routine algorithm. But not everyone has thought about these nuances, so it’s perhaps just as well to spell them out on occasion.

You’ve actually further strengthened the argument for the viability of structural foundations as not being at all cumbersome.

Posted by: Todd Trimble on September 16, 2009 7:41 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

But there is need to abolish the first form in favour of the second.

Whoops!, that should be ‘no need to abolish’.

Hopefully that's clear from what I wrote afterwards, but I still should watch out for missing negations.

Posted by: Toby Bartels on September 16, 2009 7:55 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

Actually, this disambiguation algorithm is implemented in modern proof assistants.
E.g. Coq would use type classes for this.

Posted by: Bas Spitters on September 25, 2009 8:02 AM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

AN: The functors only appear when one forces standard mathematics into the straitjacket of category theory. […] Writing out all the functors needed to match types would make ordinary mathematical language as clumsy to use as proof assistants based on type theory.

TT: Regarding phrases such as “straitjacket of category theory” and the final sentence of the material quoted above: these strike me as bald assertions. What is the evidence behind them?

These assertions were to be taken with a grain of salt. The statement on proof assistants was referring to the overhead incurred when one writes a piece of math in Coq, say, vs. in LaTeX: roughly a factor of 10 in time (this factor is Freek Wiedijk’s estimate, not my exaggeration). Adding all the categorial annotations needed to be able to rigorously say the things one likes to say, and making sure everything is correctly in place, is not quite as expensive but still a significant nuisance, a straitjacket that encumbers writing and reading math.

TT: translation from the traditional notation to the “purist” one is a routine algorithm

Routine, but tedious, not much different from converting a lemma and its proof from LaTeX to Coq. It makes the difference between being widely accepted and being used only by aficionados.

I’ll answer the remainder of your post once I know how to reproduce the formulas without recomposing them myself….

Posted by: Arnold Neumaier on September 16, 2009 8:56 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

Routine, but tedious, not much different from converting a lemma and its proof from LaTeX to Coq. It makes the difference between being widely accepted and being used only by aficionados.

I am fully in agreement with this. However, what I’m trying to say is that I think the best solution is not to discard typing information altogether, but rather to automate the process of inferring and adding type annotations. We should be able to omit this type information when writing mathematics, and a computer should be able to infer it just as a human reader infers it, but the information is nevertheless there and should be modeled by the formal language.

Automating this conversion/inference sounds like much the same thing that you’ve said, regarding FMathL as a “higher level language” which can be “compiled” to Coq, Isabelle/Isar, or HOL. But I don’t see how an untyped language can be compiled to a typed one in a meaningful way. When an FMathL user types 1\in\sqrt{2}, what does that get compiled to in a “lower level” typed language?

Posted by: Mike Shulman on September 17, 2009 2:05 AM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

MS: Automating this conversion/inference sounds like much the same thing that you’ve said, regarding FMathL as a “higher level language” which can be “compiled” to Coq, Isabelle/Isar, or HOL. But I don’t see how an untyped language can be compiled to a typed one in a meaningful way.

In the same way that one can pose to HOL any query formulated in the untyped first-order language of ZF. It is well known that systems like HOLlight can prove much of ZF set theory; so where is the problem?
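For illustration, here is a sketch (in Lean syntax, purely illustrative; HOLlight’s actual encoding differs) of untyped ZF hosted at a single type:

```lean
-- ZF as a one-sorted theory inside a typed metalanguage: a single type V
-- of "sets" and one membership relation. Every ZF statement then lives
-- at this one type, which is how a typed prover can host untyped ZF.
axiom V : Type
axiom mem : V → V → Prop

infix:50 " ∈' " => mem

-- Two sample axioms, stated over the single sort V:
axiom extensionality : ∀ x y : V, (∀ z : V, z ∈' x ↔ z ∈' y) → x = y
axiom pairing : ∀ x y : V, ∃ p : V, ∀ z : V, z ∈' p ↔ (z = x ∨ z = y)
```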

MS: When an FMathL user types 1\in\sqrt{2}, what does that get compiled to in a “lower level” typed language?

It gets compiled into a precise specification format that adds all relevant contextual information, in this case, that the user seems to have intended an interpretation of numbers as a set.

Most of real math involves such guessing of intentions, which is maintained until strange conclusions are automatically derived. In interactive mode, the formula would be found suspicious, and a query window would pop up to ask the user to confirm the interpretation, to correct the formula, or to supply additional context.

From the specification format, a dedicated spec2HOL translator would map the part of the problem to be checked for correctness into an appropriate query in HOL.
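A toy sketch of such an interpretation check (illustrative Python only; none of this is actual FMathL machinery, and the term representation is invented for the example):

```python
# Toy illustration: inspect a formula "x ∈ t", guess a type for the
# right-hand side, and raise a query when the formula has no standard
# membership reading (as with 1 ∈ √2).

def infer_type(term):
    """Very crude type guesser for a toy term language."""
    if isinstance(term, (int, float)):
        return "number"
    if isinstance(term, (set, frozenset)):
        return "set"
    if isinstance(term, tuple) and term and term[0] == "sqrt":
        return "number"  # sqrt of a number is again a number
    return "unknown"

def check_membership(lhs, rhs):
    """Return 'ok' if lhs ∈ rhs has a standard reading, else a query string."""
    rhs_type = infer_type(rhs)
    if rhs_type == "set":
        return "ok"
    # No rules for ∈ between numbers: raise a query instead of failing hard.
    return (f"query: right-hand side has type '{rhs_type}', not 'set'; "
            f"confirm the intended interpretation or supply context")

print(check_membership(1, {0, 1, 2}))    # prints "ok"
print(check_membership(1, ("sqrt", 2)))  # 1 ∈ √2 triggers a user query
```

In interactive mode the query string would feed a pop-up window; in noninteractive mode it would go to the logbook described below.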

Posted by: Arnold Neumaier on September 17, 2009 3:11 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

In interactive mode, the formula would be found suspicious, and a query window would pop up to ask the user to confirm the interpretation, to correct the formula, or to supply additional context.

Oh, that makes me happy! (^_^)

And I take it that if, after developing a theory of ordinals, I were to write down the generalised continuum hypothesis (that the \aleph series equals the \beth series), then no such window would pop up?

Posted by: Toby Bartels on September 17, 2009 10:52 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

TB: if, after developing a theory of ordinals, I were to write down the generalised continuum hypothesis (…), then no such window would pop up?

Then FMathL would just take notice that you added an assumption to your context. It would make a few checks to see whether one can rewrite the formula in a useful way (which cannot be done with 1\in\sqrt{2}, since there are no rules for manipulating \in between numbers) and conclude that things make enough sense not to bother the user with a query.

FMathL might become suspicious, though, when you subsequently add that the simple continuum hypothesis is violated, and would ask whether you intended to close the previous context with the GCH, since otherwise the context becomes trivially inconsistent.

In noninteractive mode, it would write all the queries into a logbook that can be inspected after compilation, and proceed on a tentative basis. (As mathematicians do when they are not sure about the meaning of a cryptic passage.)

Posted by: Arnold Neumaier on September 18, 2009 12:06 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

FMathL might become suspicious, though, when you subsequently add that the simple continuum hypothesis is violated, and would ask whether you intended to close the previous context with the GCH, since otherwise the context becomes trivially inconsistent.

That's good.

(Of course, we can have no guarantee that FMathL will notice that I'm in an inconsistent context, especially if I write CH as a statement about subsets of \mathbb{R} instead of as a statement about \alephs and \beths. But the more contradictions that it can find in a reasonable amount of time, the better.)

Posted by: Toby Bartels on September 18, 2009 9:55 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

TB: Of course, we can have no guarantee that FMathL will notice that I’m in an inconsistent context, especially if I write CH as a statement about subsets of ℝ

Yes. In particular, we might all be working in an inconsistent context without knowing it, like Cantor was for 25 years.

If FMathL discovers this and raises a query, I’d rather think of a bug in FMathL than of one in standard math…

Posted by: Arnold Neumaier on September 18, 2009 10:58 PM | Permalink | Reply to this

Cantor vs Frege

In particular, we might all be working in an inconsistent context without knowing it, like Cantor was for 25 years.

You mean Frege; Cantor knew what he was doing, even though he didn't formalise it.

Posted by: Toby Bartels on September 18, 2009 11:07 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

Do you mean that you interpret an untyped theory like ZF in a typed theory by having just one type? To me that seems unfaithful to how typed systems are intended to be used.

Posted by: Mike Shulman on September 18, 2009 9:22 AM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

MS: Do you mean that you interpret an untyped theory like ZF in a typed theory by having just one type? To me that seems unfaithful to how typed systems are intended to be used.

Do you see a better way? How would you pose a ZF problem to HOLlight, say?

I don’t think that there is any other option. In the CADE ATP System Competition – the world championship for first-order automated theorem proving – people pose lots of untyped problems each year that they want to see solved (or that were already solved) by some theorem prover. Do you think HOLlight should not be allowed to solve these?

On the other hand, if you have a naturally typed problem, the types would form in FMathL a particular family of constructive sets, and a translator to HOLlight would be able to recognize this and then make better use of the typing in the proof assistant.

Posted by: Arnold Neumaier on September 18, 2009 10:09 AM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

McLarty in Learning from Questions on Categorical Foundations says (on p. 51):

“If the major theorems of category theory are proved in set theory, and then I want to axiomatize them, is that not a kind of dependence on set theory? Well in the first place these theorems are not exactly proved in set theory. Their usual naïve versions are incorrect in set theory. They quantify over collections too large to be ZF sets, and manipulate them too freely for Gödel-Bernays classes, and treat them too uniformly for Grothendieck universes. There are many well-known and sufficiently workable set-theoretic fixes for handling these theorems but they are all just that—fixes.”

This suggests to me that there is no known natural way to code category theory in set theory. Is this a reasonable point of view? If it is, what’s a solution to this problem?

Posted by: Eugene Lerman on September 16, 2009 10:32 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

I don't think that I agree with McLarty there, and I'd like to know what in category theory he thinks can't be formulated in \mathbf{ZFC}+GU, where GU is the axiom that every set belongs to some Grothendieck universe, about as easily as anything in ordinary mathematics is axiomatised in \mathbf{ZFC}. Categorists' language even tells you when the Grothendieck universes are coming in: whenever they say ‘small’ (or something that may be defined in terms of smallness, like ‘accessible’ or ‘complete’).

I certainly agree with McLarty's second point (not quoted above), so it doesn't matter for his overall argument. I also believe that formulating mathematics in ZFC\mathbf{ZFC} is generally to perpetrate an act of violence upon it; I just don't see how category theory is particularly special in that regard.

Posted by: Toby Bartels on September 17, 2009 12:59 AM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

Well, I wrote a whole paper about ways to formalize category theory in (mostly material) set theory. (By which I mean, of course, ways to deal with the size distinctions in category theory; the theory of small categories never poses any problem.) Of course, “natural” is in the eye of the beholder, but I don’t think there’s a whole lot to object to, at least in the more effective versions. Any of them can be “structuralized” to eliminate any objections on that score.

I think it’s a bit misleading to say that the theorems of category theory “are not exactly proved in set theory.” One has to choose a set-theoretic formalization, but there are a number which suffice perfectly well. And when it comes to it, hardly any mathematics is actually “proved in set theory”–it’s proved in informal mathematical language which we all trust could be translated into set theory (if we’re the sort of people who believe that all mathematics should be founded on set theory). I don’t think category theory is much different there; the only thing is that it matters a bit what set-theoretic foundation you choose, but that isn’t unique to category theory either.

Of course, set-theoretic foundations are perhaps not philosophically in line with category theory, which may be more along the lines of what McLarty is getting at. For instance, there is the problem of evil: as long as your categories have sets of objects, you’ll be able to talk about equality of those objects. But I don’t see any way to solve that unless your foundational axioms are really about the 2-category of categories; even the 1-category of categories isn’t good enough.

Posted by: Mike Shulman on September 17, 2009 1:58 AM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

Todd and Toby have said most of what I would say as well. I agree that a lot of what is written in everyday mathematics does not naively typecheck, and also that such “abuses” of notation are not “wrong” but are the real usage. However, I don’t think this is a reason to discard types, which are, I still believe, everywhere in mathematics and carry important information. Rather, we should improve the type system.

Consider your example regarding x\in \mathbb{R} not type-checking because \mathbb{R} is not a set but a field. I would argue that a better way to describe this is that the symbol \in is overloaded, in the precise sense of computer science. Thus, what appears on the RHS of \in does not always have to be a set, but can be anything for which an appropriate semantics is defined. One could additionally supply a default semantics: when the RHS is a structure of some sort having only one underlying set, then the meaning of \in should default to referring to that underlying set. This is completely precise and maintains the important type information. Moreover, it is already possible in Isabelle:

record 'a magma =
  elements :: "'a set"
  times :: "'a => 'a => 'a" (infixl "\star\<index>" 70)

definition in_magma :: "'a => 'a magma => bool"
  where "in_magma x M = (x \in elements M)"

notation in_magma ("_ \in _")

axioms closed: "[| x \in M ; y \in M |] ==> (x \star_M y) \in M"

(Of course, this isn’t the way one would actually work in Isabelle, I’m just trying to illustrate overloading of \in with code that typechecks.)
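A parallel sketch in Lean 4 (with illustrative names; a real development would use Lean’s built-in Membership class) makes the same point:

```lean
-- Overload a membership symbol whose right-hand side is a structure,
-- with the default semantics referring to its underlying carrier.
structure Magma (α : Type) where
  carrier : α → Prop          -- the underlying "set" of elements
  op : α → α → α

def Magma.mem {α : Type} (x : α) (M : Magma α) : Prop := M.carrier x

infix:50 " ∈ₘ " => Magma.mem

-- The closure axiom from the Isabelle snippet, restated:
axiom closed {α : Type} (M : Magma α) :
    ∀ x y : α, x ∈ₘ M → y ∈ₘ M → M.op x y ∈ₘ M
```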

Posted by: Mike Shulman on September 16, 2009 8:56 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

Hmm, I guess this is actually more or less the same thing that Toby said above.

Posted by: Mike Shulman on September 16, 2009 9:42 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

MS: Consider your example regarding x\in \mathbb{R} not type-checking because \mathbb{R} is not a set but a field. I would argue that a better way to describe this is that the symbol \in is overloaded, in the precise sense of computer science.

Yes, this is more reasonable. But it doesn’t solve all type problems. For example, Todd Trimble’s suggestion that “2 as a real number” and “2 as a complex number” are different objects with different types is really worrying. This gives a multitude of distinct objects that most mathematicians would consider to be identical, e.g., “2 as an element of (\mathbb{N},succ)”, “2 as an element of (\mathbb{N},+)”, “2 as an element of (\mathbb{N},+,*)”, “2 as an element of (\mathbb{N},\lt)”, “2 as an element of (\mathbb{N},+,\lt)”, “2 as an element of (\mathbb{Z},+)”, “2 as an element of (\mathbb{Q},+,*)”, to name only a few.

One gets a lot of accidental relations when formalizing numbers in ZF, which are avoided in category theory; but instead one gets a lot of accidental duplication. This shows that category theory, like ZF, is a useful mode of viewing mathematics, but that it does not capture its full essence.

Compare this with the way categories are defined in FMathL (Section 2.13). This is not quite the same as the traditional definition, but it is the same from the point of view of essence. In FMathL, objects and morphisms can belong very naturally to several categories.

Posted by: Arnold Neumaier on September 17, 2009 2:29 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

Permit me then to try to assuage your worries! :-)

In ETCS (to give an example of a categories-based set theory, but there are ways of doing structural set theory, including one that Mike Shulman is working out in the nLab, called SEAR), a [global] element is conceived as a map of the form 1 \to X; the X can be thought of as the ‘type’. The context X is given with the element as its codomain; it is a datum of that element. Thus “elements” are not free-floating or given in vacuo; they come already attached to sets.

In just the same way, “objects” are not free-floating either – they come attached to the categories in which they are conceived as objects. Thus we have \mathbb{N} qua set, \mathbb{N} qua structure with 0 and successor, qua ordered abelian group, qua ring, etc. – these are all objects in different categories. But each of those structures you mentioned allows one to define an element 2: 1 \to \mathbb{N} in the underlying set, and insofar as all those structures on \mathbb{N} are standardly defined by exploiting the “Peano postulates” [in which the set \mathbb{N} comes equipped with a 0 and a successor, satisfying a principle of primitive recursion or a universal property as natural numbers object that makes the recursive definitions of each of these structures possible], it’s provably the same element 2 (that is, the same morphism 1 \to \mathbb{N} in Set) we’re referring to in each of those cases.

Even if we have several natural numbers objects, there is an invariant meaning of ‘2’ in each of these, insofar as the unique isomorphism from one natural numbers object to another (as natural numbers objects) takes the element ‘2’ in one to the element ‘2’ in the other. I’m puzzled why any of this should be considered a problem.
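For what it’s worth, this invariance can be checked mechanically; here is a small illustrative sketch in Lean (plain Lean 4, names invented for the example):

```lean
-- Define "2" uniformly from any point-and-successor data by applying the
-- successor twice; each of the standard structures on ℕ then picks out
-- the very same element of the underlying set.
def two (X : Type) (z : X) (s : X → X) : X := s (s z)

-- Instantiated at the standard structure on ℕ, it is the usual 2:
example : two Nat 0 Nat.succ = 2 := rfl
```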

Posted by: Todd Trimble on September 17, 2009 9:36 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

I meant to say \mathbb{N} as “ordered commutative monoid”, not “ordered abelian group”. :-P

Posted by: Todd Trimble on September 17, 2009 9:42 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

I’m puzzled why any of this should be considered a problem.

Well, it's very different from the way that most mathematicians think about these things.

(And to handle 2 \in \mathbb{Z}, 2 \in \mathbb{C}, etc, you also have to talk about some injections between the sets before you can prove that various guises of 2 are the same underneath.)

So while I agree with you that mathematics is like this underneath, I also agree with Arnold that it would be nice to have a user-friendly system where it doesn't explicitly look like that. I would like a system that does have all of that underneath it, at some level, but hides it from the user, at least until they ask for more details.

Posted by: Toby Bartels on September 17, 2009 10:57 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

And to handle 2 \in \mathbb{Z}, 2 \in \mathbb{C}, etc, you also have to talk about some injections between the sets before you can prove that various guises of 2 are the same underneath.

That goes without saying. In a categories-based set theory, such injections often arise as universal arrows (e.g., \mathbb{N} is the initial rig, affording a unique comparison map \mathbb{N} \to \mathbb{Q} in Rig). As you know, of course.

Could you say a little more what you mean by “it’s very different from the way most mathematicians think about these things”? It seems to me that, now that categorical ideas have seeped into the general consciousness, many mathematicians do in effect “think structurally” – simply put, that context matters – for example that 2 in \mathbb{Q} is different from 2 in \mathbb{Z}, since the first 2 is invertible and the second isn’t, and that 2 in \mathbb{R} is different because it has a square root.
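(Indeed, in a system like Lean with Mathlib this contextual difference is machine-checkable; an illustrative sketch:)

```lean
import Mathlib.Tactic

-- 2 qua rational number has a multiplicative inverse:
example : (2 : ℚ) * (1 / 2) = 1 := by norm_num

-- 2 qua integer has none:
example : ¬ ∃ n : ℤ, 2 * n = 1 := by
  rintro ⟨n, hn⟩
  omega
```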

Posted by: Todd Trimble on September 18, 2009 12:24 AM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

Could you say a little more what you mean by “it's very different from the way most mathematicians think about these things”? It seems to me that, now that categorical ideas have seeped into the general consciousness, many mathematicians do in effect “think structurally” – simply put, that context matters – for example that 2 in \mathbb{Q} is different from 2 in \mathbb{Z} since the first 2 is invertible and the second isn’t, and that 2 in \mathbb{R} is different because it has a square root.

I don't have any statistical evidence to back it up, but my feeling is that this is still a minority. It's one thing to say that 2 behaves differently in \mathbb{Q} from how it behaves in \mathbb{Z}, but another thing to say that 2 in \mathbb{Q} is a different object from 2 in \mathbb{Z}. Like many philosophical differences, there is no practical distinction here, but I think that a lot of people would have difficulty even understanding what you were saying up above, especially the very young (undergraduates who have had little or no experience yet with abstract concepts like groups, metric spaces, and other kinds of structured sets) and the very old (and set in their ways; I can think of some examples from my days at UCR who I'm sure would not have understood what you were saying, although I'd rather not name them here). I think that most of the others would understand it but think it weird; it probably seems unnecessarily complicated.

Again, no scientific evidence; that's my feeling from talking with non-categorially-inclined mathematicians.

Posted by: Toby Bartels on September 18, 2009 9:26 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

TT: it’s provably the same element 2 (that is, the same morphism 1\to\mathbb{N} in Set) we’re referring to in each of those cases. […] I’m puzzled why any of this should be considered a problem.

I don’t understand; I am truly puzzled. How can you even speak of (let alone prove) the same morphism 1\to\mathbb{N} in Set when the \mathbb{N}’s are different (being a set, a set with successor, etc., which are all different things)?

Posted by: Arnold Neumaier on September 18, 2009 12:22 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

Easy: these objects (\mathbb{N} as set, \mathbb{N} as ordered commutative monoid, etc., etc.) are treated as belonging to different categories: Set, OrdCMon, etc. When one applies the appropriate “forgetful functor” [from each of these categories to Set, the functor that forgets or strips off structure] to each of these structured objects, in each case the output is the same: the underlying set \mathbb{N}. (Strictly speaking, I have just said something “evil”, but I am not going to worry about this on first pass.)

Are you worried about the fact that the naturals as ordered rig “is” a sextuple (\mathbb{N}, 0, 1, +, \cdot, \lt), and hence a complicated type of “set”, a set different from (\mathbb{N}, 0, succ) say? Yes, you can shoehorn any mathematical structure to make it a “set” (as one might do if ZF were one’s religion), but that’s not how mathematicians would normally think of it – they just bear in mind whatever structure (e.g., an ordered rig is a set equipped with certain operations and a binary relation) is relevant to the discussion at hand. And, in ever-increasing numbers, they think of the collection of structures of given species or signature as belonging to a category in its own right, with different types of structure belonging to different categories. So the set \mathbb{N} can come equipped with different types of structure, giving rise to objects in different categories, but one can always forget the structure and come back to the same underlying set \mathbb{N}.

Posted by: Todd Trimble on September 18, 2009 1:46 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

TT: Are you worried about the fact that the naturals as ordered rig “is” a sextuple, and hence a complicated type of “set”, a set different from (\mathbb{N},0,succ) say?

For me, as I believe for most mathematicians, the naturals have the maximal structure, and therefore are simultaneously a set, a Peano structure, a free monoid, a semiring, etc. This view is simple, harmless, consistent with actual practice, and minimizes the amount of trivial add-on needed to formalize it.

But to understand your position, I tried to walk in your shoes and treat them as different objects. Let’s call them N and N' for simplicity. (iTex has no macros…) I do not mind that N and N' are not sets; the problem lies somewhere else.

1\to N and 1\to N' are both sets, but 1\to N and 1\to N' consist of arrows between different categories. Since an arrow (in what I suppose is the purist view; maybe I am wrong here?) knows its origin and destination as part of its identity, it is impossible that the arrows 2\in(1\to N) and 2'\in(1\to N') are the same, as you said is provable. They can be identified only after applying an unspoken isomorphism.

But mathematicians generally differentiate between “same” and “isomorphic”; one really needs two words for these, and confusing them is truly evil (except perhaps when taking extreme care in how one formulates things).

This is what puzzles me. In your version of the foundation, there are too many copies of everything, and trivial functors that semi-identify them again. One gets a myriad of different objects for what is naturally the same object, and this propagates into everything constructed from these objects. One can pass the buck to a myriad of unspoken isomorphisms, but one cannot remove the myriad of things needed to make everything really precise.

But that is needed for a machine that efficiently handles all of undergraduate mathematics (say) in a single, consistent implementation.

Doesn’t this look like an ideal opportunity to apply Ockham’s razor?

Purists will still be able to discuss “2 as an element of N” and “2 as an element of N'”, but these would be regarded as objects different from the simple natural number “2”.

That this is the mathematically natural way of regarding matters can already be seen from the way we refer to them in the present discussion: To be understandable we must spell out the full form of the object. This is “2” for the natural number 2, but “2” does not denote an instance of the concept of “a natural number viewed in the context of a particular category”.

Posted by: Arnold Neumaier on September 18, 2009 2:37 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

But to understand your position, I tried to walk in your shoes and treat them as different objects. Let’s call them N and N′ for simplicity. (iTex has no macros…) I do not mind that N and N′ are not sets; the problem lies somewhere else.

1 \to N and 1 \to N′ are both sets, but 1 \to N and 1 \to N′ consist of arrows between different categories.

There seems to be a lot of confusion just in this one sentence. For now I’ll try to carry on this discussion in good faith, but I may run out of steam soon – these are after all very elementary topics we’re spending time on, and frankly I’m beginning to lose patience.

No, those arrows are not sets.

“Consist of arrows between different categories” is not good mathematical English, but I think what you were trying to say is that those arrows are morphisms of different categories. But that’s not right either, and certainly not what I said. For one thing, there may not be any such arrows. The arrow, actually the only arrow I had been talking about, 2: 1 \to \mathbb{N}, is in Set.

Let me give an example to help make this clearer. Let me call C' the category of sets X equipped with a point x: 1 \to X and an endofunction s: X \to X. The set \mathbb{N} equipped with its standard Peano (or Lawvere natural number object) structure is an example. We could call this object \mathbb{N}', if you like. A one-element set 1 carries a unique such structure, where the point is id: 1 \to 1 and the endofunction is id: 1 \to 1, so that’s another example. Does there exist a morphism of the form 1 \to \mathbb{N}', in the category C'? NO!

However, if U': C' \to Set is the evident forgetful functor, so that U'(\mathbb{N}') = \mathbb{N}, the set \mathbb{N}, then there is of course an arrow 2: 1 \to \mathbb{N}. That’s in the category Set. Not in C'. In Set.
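To make the distinction completely concrete, here is an illustrative sketch (plain Lean 4, with invented names) of a miniature C', with a proof that there is no morphism 1 \to \mathbb{N}' in it:

```lean
-- A miniature of the category C′: objects are types with a chosen point
-- and an endofunction; morphisms must preserve both.
structure PtEndo where
  carrier : Type
  pt : carrier
  step : carrier → carrier

structure Hom (A B : PtEndo) where
  map : A.carrier → B.carrier
  map_pt : map A.pt = B.pt
  map_step : ∀ a, map (A.step a) = B.step (map a)

def one : PtEndo := ⟨Unit, (), id⟩
def natPE : PtEndo := ⟨Nat, 0, Nat.succ⟩   -- the ℕ′ of the discussion

-- No C′-morphism 1 → ℕ′ exists: its value at the point would have to be
-- its own successor.
example (h : Hom one natPE) : False := by
  have step : h.map () = Nat.succ (h.map ()) := h.map_step ()
  omega
```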

Each of those standard structures you mentioned a while back ((\mathbb{N}, 0, succ), (\mathbb{N}, 0, 1, +), and so on), sets equipped with prescribed operations (that are themselves arrows in Set), gives us enough structure to pick out an element 2: 1 \to \mathbb{N}, as an arrow in Set. In each case, it’s the same damned 2, provided that the structures we’re talking about are the standard ones. That’s what is provable (for Pete’s sake!). One can, and one often does, think of sets-with-structure as objects of a category in its own right, and I brought that up hoping it would help explain that yes, we can think of \mathbb{N} as bearing many different types of structure (which need to be distinguished in order to have a coherent conversation), and thinking of these as objects in different categories may help us bear such distinctions in mind, but all those structures you mentioned, as sets equipped with operations of various sorts, can be used to define one and the same 2, as a morphism of the form 1 \to \mathbb{N} in Set. Is what I’m saying clear now?

For me, as I believe for most mathematicians, the naturals have the maximal structure, and therefore are simultaneously a set, a Peano structure, a free monoid, a semiring, etc. This view is simple, harmless, consistent with actual practice, and minimizes the amount of trivial add-on needed to formalize it.

I can’t really speak for most mathematicians, and I’m not sure you can either, but I’m not sure what the heck is meant by “the maximal structure”. The maximal structure on \mathbb{N} consists, I guess, of all possible functions \mathbb{N}^n \to \mathbb{N} of arbitrary arity n (finite or infinite), and all possible relations on \mathbb{N} (again of arbitrary arity). Is that how most mathematicians think of \mathbb{N}? I don’t think so. I think what mathematicians do is think of \mathbb{N} however the hell they wish to think of it, referring to as much or as little structure as will suit whatever purpose they have in mind. Of course, a good mathematician will tell us what he has in mind. For example, if he is investigating decidability issues, he has to tell us whether he means \mathbb{N} as monoid or rig or whatever. (I guess if the context is unstated, the usual default is to think of \mathbb{N} as ordered rig, but there are lots of other things people do with \mathbb{N}, e.g., they can think of it as dynamical system or as a group representation in various extremely creative ways. Thus “the maximal structure” doesn’t carry a lot of coherence for me.)

Ultimately, we are on the same side: we would all like flexible, easy-to-use foundations. I entered this conversation hoping to clarify the role of context dependence in structuralist points of view on set theory, but as of this writing it’s not clear to me whether you are honestly confused by what I’m saying but still trying to understand, or just trying to pick faults in what you take my position to be.

Posted by: Todd Trimble on September 18, 2009 5:32 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

TT: it’s not clear to me whether you are honestly confused by what I’m saying but still trying to understand, or just trying to pick faults in what you take my position to be.

I have no time to waste, and discuss here only to learn and to clarify.

I learnt category theory over 30 years ago as a student, mainly as a tool for seeing universal constructions from a unifying point of view.

I never needed to make any use of it in a long career in pure and applied mathematics, until this year, when my vision of the foundations of mathematics was developed enough to merit a comparison with the other foundations that are around. So I looked at the categorial approach, its benefits and its weaknesses. But unlike for you, it is for me a mainly foreign language, one I was exposed to as a youth but never spoke myself, apart from doing the exercises in the course where I learnt it.

The FMathL mathematical framework uses categories on the foundational level just a little, since they provide some nice features. So I had asked John Baez for his opinion, and he moved the discussion here. I’ll keep posting to the discussion as long as I expect to learn something from it.

Maybe this restores your patience.

TT: all those structures you mentioned, as sets equipped with operations of various sorts, can be used to define one and the same 2, as a morphism of the form 1→ℕ in Set. Is what I’m saying clear now?

Yes. So the N here is always the set of natural numbers without structure, and N’ only enters as 1→U’(N’). My confusion came from the fact that before you had talked of “2 in ℚ is different from 2 in ℤ”, and I had thought that your new comment was a commentary on this, except with various forms of ℕ in place of ℚ and ℤ.

But you can see how difficult it is for someone with little practice in category theory to apply the right invisible functors at the right places.

It is definitely not something that belongs to the essence of mathematics, otherwise everyone would have to practice it before being allowed to forget it again.

And I have to find a way to teach the FMathL system to detect and overcome such misunderstandings (which can arise in reading any math in a field one is not fluent in).

TT: I’m not sure what the heck is meant by “the maximal structure”.

This was short for your 6-tuple plus succ, i.e., the union of all the stuff that is usually present in discussions about natural numbers. Depending on the context, a mathematician can add any conservative extension by defined operations to make this maximal structure rich enough. In the usual view one picks up from reading math papers, ℕ doesn’t suddenly stop having a multiplication simply because someone doesn’t need it at the moment.

TT: Ultimately, we are on the same side: we would all like flexible, easy-to-use foundations. I entered this conversation hoping to clarify the role of context dependence in structuralist points of view on set theory,

Yes, and I appreciate that. I learn from this discussion, though perhaps not at the speed you’d like to see. But this doesn’t prevent me from continuing to see all the trivial dead weight a purely categorial approach brings to fully formalized mathematics.

I want a formalization that is as free from artificialities as possible.

Posted by: Arnold Neumaier on September 18, 2009 6:32 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

Arnold:

I have no time to waste, and discuss here only to learn and to clarify.

Excellent. Neither do I, and that’s also why I come here: to learn and help explain when I’m able.

I learnt category theory over 30 years ago as a student, mainly as a tool for seeing universal constructions from a unifying point of view.

Okay, thanks, this is good to know.

I never needed to make any use of it in a long career in pure and applied mathematics, until this year, when my vision of the foundations of mathematics was developed enough to merit a comparison with the other foundations that are around. So I looked at the categorial approach, its benefits and its weaknesses. But unlike for you, it is for me a mainly foreign language, one I was exposed to as a youth but never spoke myself, apart from doing the exercises in the course where I learnt it.

The FMathL mathematical framework uses categories on the foundational level just a little, since they provide some nice features. So I had asked John Baez about his opinion, and he moved the discussion here. I’ll keep posting to the discussion as long as I expect to learn something from it.

Maybe this restores your patience.

It restores it some. But (and I’m not trying to browbeat you here) I have to say in all honesty: based on what you’ve written here and elsewhere, it seems to me you have a pretty hazy understanding of category theory and what it has to say about foundations. I don’t have a problem with that, unless someone in that position then holds forth on categorical foundations, its strengths and weaknesses, what’s impossible and what’s provable, etc. I happen to know a bit about the subject myself. Not as much as some people, but “more than your average bear” as Yogi Bear used to say.

But you can see how difficult it is for someone with little practice in category theory to apply the right invisible functors at the right places.

Excuse me then. Although I did try to write clearly, it’s possible that I was talking to some degree as mathematicians often do, assuming some unstated context and background. I wasn’t aware of what your background in category theory was, perhaps, and explained things too rapidly (?).

It is definitely not something that belongs to the essence of mathematics, otherwise everyone would have to practice it before being allowed to forget it again.

Well, that’s another bald claim, but it’s not the first time I’ve heard that sort of thing.

Let’s forget ETCS and all that, then; the kernel of the complaint seems to be that the categorical approach is hard to get into – even very carefully written texts like Mac Lane and Moerdijk’s Sheaves in Geometry and Logic, a nice introduction to topos theory with a strongly logic-oriented point of view, take some effort to penetrate and master. I’ve made some forays myself into writing up some of the details of ETCS (as reproduced on the nLab), with an eye to eventually being able to explain it smoothly to undergraduates, but that series is still very unfinished and not nearly to my satisfaction. Long story short, you’re right, no one has succeeded yet in making categorical set theory look like a snap.

However, it looks like Mike Shulman has written down a very interesting program of study, different from the categories-based approach (which is extremely hands-on and bottom-up) but which is very smooth and top-down (invoking for example a very powerful comprehension principle) while still being faithful to the structuralist POV. He calls it SEAR (Sets, Elements, and Relations). That may be a more congenial place to start; in fact I strongly recommend it to gain a better appreciation of this POV (easier to read certainly than Lawvere).

If you’d like to learn more however about categorical set theory and “categorical foundations”, it might be a good idea to read some of the category texts written by people with significant formal philosophic training, e.g., Awodey, J.L. Bell, Goldblatt, and McLarty (and do the exercises!). They tend to write at a more accessible level than Johnstone, say, or Lawvere.

This was short for your 6-tuple plus succ, i.e., the union of all the stuff that is usually present in discussions about natural numbers. Depending on the context, a mathematician can add any conservative extension by defined operations to make this maximal structure rich enough. In the usual view one picks up from reading math papers, ℕ doesn’t suddenly stop having a multiplication simply because someone doesn’t need it at the moment.

Obviously I agree with the spirit of the last sentence, and we obviously agree that a mathematician should have the freedom to “call” the multiplication whenever the need is felt, or to pass to a conservative extension rich enough for any desired purpose. What’s to disagree with? But I still don’t believe in maximal structures per se much.

For example, model theorists often consider what are called o-minimal structures, and it is widely believed (although unproven as far as I know) that there exists no maximal o-minimal structure. One can extend, world without end, amen.

But this doesn’t prevent me from continuing to see all the trivial dead weight a pure categorial approach brings to fully formalized mathematics.

It might be a good idea not to be too insistent about that before you develop a better understanding of that approach. There’s a lot going on there, and also a lot has happened in the past thirty years which you might not be fully aware of.

Posted by: Todd Trimble on September 18, 2009 8:55 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

TT: it seems to me you have a pretty hazy understanding of category theory and what it has to say about foundations.

While this may be the case, I have looked at the less formula-intensive discussions of categorial foundations to get a view of what I can expect from the approach. This is what I have done in any field I entered, and if you look at my homepage you can see that I have successfully entered many fields. Time is bounded, so I need a way to see where to concentrate and invest in learning all the details. With category theory the goods to expect were never sufficient to motivate me to practice the formalism. Nevertheless, I can decipher any categorical statement with some effort, and have done so quite a number of times.

But it feels like very occasionally programming in C++ when all the time you program in an easy-to-use language like Matlab: one has to remind oneself each time what rules are applicable when, and where one needs care. (We do even the FMathL prototyping in Matlab rather than in Haskell, say, since Matlab is much easier to use.)

I just lack the practice to express myself easily, and I’ll acquire this practice only if the benefits are high.

The main reason I cannot see why category theory might become a foundation for a system like FMathL (and this is my sole interest in category theory at present) is that a systematic, careful treatment already takes 100 or more pages of abstraction before one can discuss foundational issues formally, i.e., before the theory acquires its first bit of self-reflection capability.

In FMathL, the reflection cycle must be very short, otherwise an implementation of the system is impossible to verify by hand.

TT: I wasn’t aware of what your background in category theory was, perhaps, and explained things too rapidly (?).

Well, since this forum is read by people with all sorts of background knowledge about categories, it pays to add a bit of redundant information to remind those not doing categories every day of some context. Often a few words or an extra phrase is sufficient.

I remember when, at my very first conference, Peter Cameron, one of the big people in finite geometries, started his lecture by reminding the audience (all finite geometers) of the definition of a permutation group, a definition everyone must have known (though not everyone used it on a daily basis). For me, it was an eye-opener for how to communicate well. (You may wish to look at my theoretical physics FAQ to get an idea of how it bore fruit.)

AN: It is definitely not something that belongs to the essence of mathematics

TT: Well, that’s another bald claim, but it’s not the first time I’ve heard that sort of thing.

I wasn’t saying this of category theory, but of treating it as a foundation rather than as a tool, forcing the need for type-matching everything by invisible functors.

TT: SEAR (Sets, Elements, and Relations).

I posted there already, after SEAR had been mentioned here.

TT: But I still don’t believe in maximal structures per se much.

Note that maximal structure is something very constructive: the union of the structure assembled about an object at a given time in a given finite context.

Of course, any context can be indefinitely extended, and as this is done, the meaning of maximal changes in a similar way as the meaning of the largest element in a finite set of numbers may change when augmenting the set.

But this doesn’t invalidate the meaning of the concept of the maximum of a finite set of numbers.

But at any time, it is well-defined. Context management is one of the basic tasks a system like FMathL has to do (besides the ability to precisely specify concepts), and the way this is done decides the feasibility of the whole project.

AN: But this doesn’t prevent me from continuing to see all the trivial dead weight a pure categorial approach brings to fully formalized mathematics.

TT: It might be a good idea not to be too insistent about that before you develop a better understanding of that approach.

I can see the dead weight (100 pages overhead) already with the limited understanding I have now.

Show me a paper that outlines a reasonably short way to formally define all the stuff needed to be able to formally reflect in categorial language a definition that characterizes when an object is a subgroup of a group.

This is not difficult in categorial terms when you are allowed to use the standard informal mathematical metalanguage.

But the way I measure the potential merit of a conceptual framework is by how easy it is to say the same things rigorously without using the metalanguage, in particular, avoiding all the abuses that makes informal mathematics so powerful and short.

Posted by: Arnold Neumaier on September 18, 2009 11:43 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

Show me a paper that outlines a reasonably short way to formally define all the stuff needed to be able to formally reflect in categorial language a definition that characterizes when an object is a subgroup of a group.

Show me that paper for ZFC (or any other foundation that isn't specifically geared towards group theory), and I'll show you the same for ETCS.

Posted by: Toby Bartels on September 19, 2009 12:14 AM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

To make this a serious but fair challenge: Give me a paper whose source is available (say from the arXiv), formalised in ZFC (or whatever), and I'll rewrite it to be formalised in ETCS. (The paper can include its own specification of ZFC too, and mine will include its own specification of ETCS.)

Posted by: Toby Bartels on September 19, 2009 12:20 AM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

Show me that paper for ZFC (or any other foundation that isn’t specifically geared towards group theory), and I’ll show you the same for ETCS.

I’m guessing that the response to this may be “well, ZFC is no better.”

However, I must be misunderstanding the statement, because I don’t see why it is at all difficult. A subgroup of a group is a subset which is closed under the group operations. In ETCS “subset” means “injective function”. But why is that at all hard to formalize?
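
Shulman’s definition (“a subset closed under the group operations”) really is easy to check mechanically. As a hedged illustration only (the helper name `is_subgroup` and the choice of ℤ/4ℤ are mine, not from the thread), here is a minimal sketch for cyclic groups:

```python
# Toy check that a subset of Z/nZ is a subgroup: it must contain the
# identity and be closed under the group operation and under inverses.
def is_subgroup(subset, n):
    """Check whether `subset` is a subgroup of the cyclic group Z/nZ."""
    if 0 not in subset:                      # identity element
        return False
    for a in subset:
        if (-a) % n not in subset:           # closure under inverses
            return False
        for b in subset:
            if (a + b) % n not in subset:    # closure under addition
                return False
    return True

print(is_subgroup({0, 2}, 4))   # True: {0, 2} is a subgroup of Z/4Z
print(is_subgroup({0, 1}, 4))   # False: not closed (e.g. 3 = -1 is missing)
```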

Along the same lines, exactly what 100 pages are you referring to?

Posted by: Mike Shulman on September 19, 2009 3:06 AM | Permalink | PGP Sig | Reply to this

Introduction to Categorical Logic

MS: exactly what 100 pages are you referring to?

I was thinking of the lecture notes Introduction to Categorical Logic by Awodey and Bauer. They need endless preparations (partly moved to the appendix) before they have reflected enough logic to adequately encode, semantically, something like the metalanguage of ETCS (which is not encoded there). Maybe far less suffices, but the way the lecture notes are organized doesn’t make it easy to find out what can be deleted.

I believe that standard axiomatic set theory + first order logic as organized in standard textbooks is much shorter. One only needs the introductory part of axiomatic set theory (until one has functions and natural numbers) since not even the general notion of a cardinal is needed in first order logic.

But of course, these were just estimates from the literature. If you know of a shorter Introduction to Categorical Logic also starting from scratch, I’d appreciate a reference.

Posted by: Arnold Neumaier on September 19, 2009 6:48 PM | Permalink | Reply to this

Re: Introduction to Categorical Logic

I think I’m still failing to understand exactly what you mean by “reflect”. Do you mean being able to give a formal definition, in some theory, of the syntax and semantics of first-order logic? If so, that is just as easy in type theory as in ZF.

All the work in those lecture notes is geared towards describing the categorical semantics for first-order logic. This is hard work no matter what theory you are working in, whereas a simple set-based semantics is easy. I think books on categorical logic usually focus on this difficult job, often assuming that their readers are familiar with the easy version.

Posted by: Mike Shulman on September 19, 2009 8:36 PM | Permalink | PGP Sig | Reply to this

Re: Introduction to Categorical Logic

MS: I think I’m still failing to understand exactly what you mean by “reflect”. Do you mean being able to give a formal definition, in some theory, of the syntax and semantics of first-order logic?

I discussed what I mean in Sections 1.5 and 3.2 of the FMathL mathematical framework paper. It means preparing enough formal conceptual and algorithmic ground to enable one to formally write down everything needed to explain the meaning of an ordinary mathematical text that defines the system in the usual informal way, and, together with the meaning, the algorithmic steps one is allowed to perform.

In particular, for your vision of category theory, the system should be able to know formally what it means to treat the theory in a morally correct way.

MS: If so, that is just as easy in type theory as in ZF.

I think there is a type error in your statement. The left-hand side is a way of structuring the logic; the right-hand side is a way of adding mathematical functionality to it. How can they be compared for easiness?

Foundations consist of two sides - the logic and the axiom system. With respect to (classical) logic, the choice is between FOL and HOL (first and higher order logic) and between typed and type-free theories. With respect to the axiom system, the choice is between some version of set theory and some version of category theory.

I argued mainly against having as basic axiom system that of a category theory. This has nothing to do with types.

Type theories have proved their foundational value. The only complaint I have against type theoretic foundations coupled with some set theory - realized in many theorem provers - is that the cores of these provers are huge, and very hard to check by hand for correctness. But the untyped theorem provers are also far from satisfying in this respect.

Posted by: Arnold Neumaier on September 19, 2009 9:03 PM | Permalink | Reply to this

Re: Introduction to Categorical Logic

When I said “type theory” I meant to refer, not to typed first-order logic in general, but to a specific type theory including type constructors for products, subsets, quotients, powersets, exponentials, etc – a theory such as is usually used for the internal language of a topos. Perhaps this is what you are calling “some version of category theory,” although it is not intrinsically categorial.

Posted by: Mike Shulman on September 20, 2009 6:16 AM | Permalink | PGP Sig | Reply to this

Re: CCAF, ETCS and type theories

But it feels like very occasionally programming in C++ when all the time you program in an easy-to-use language like Matlab: one has to remind oneself each time what rules are applicable when, and where one needs care.

Interesting that you should say that. I was having the same thought that this argument is much like the argument between statically typed and dynamically typed programming languages. Unsurprisingly, I prefer statically typed ones. I actually used to like dynamically typed ones like Perl and Python better, but over time I came to realize that there is a cleanness and elegance to a well-designed typed language (in which category I am hesitant to include C++), and that by forcing you to think through the problem precisely, static typing eliminates many subtle errors and guides you to a conceptual solution. These days, if I could, I would do all my programming in Haskell. But, unfortunately, most of the world does not agree with me. (-:

Anyway, I feel that some of the same considerations may apply to mathematics. In particular, the role of mathematical foundations is not necessarily restricted to a descriptive one. One must, of course, resist the temptation to be overly prescriptive and thus alienate the majority of mathematicians, but I think that a knowledge of and appreciation for formal logic and foundations has the potential to change one’s practice of mathematics for the better, at least in small ways. I also don’t think it’s wrong for a system for computer formalization of mathematics to attempt small, incremental improvements in the way mathematicians write and reason.

Posted by: Mike Shulman on September 19, 2009 4:43 AM | Permalink | PGP Sig | Reply to this

Re: CCAF, ETCS and type theories

MS: over time I came to realize that there is a cleanness and elegance to a well-designed typed language (in which category I am hesitant to include C++), and that by forcing you to think through the problem precisely, static typing eliminates many subtle errors and guides you to a conceptual solution.

The formal specification part of FMathL will in fact have a kind of type system to make internal formal arguing and checking easy. But:

This type system does not look like math but like programming, and hence is not appropriate at the abstract level. An ordinary user should therefore never notice that it exists.

The FMathL concept of a type is quite different from that of a type theory. I’ll talk about it here once the design of the specification level is stable enough to merit discussion. (Maybe around Christmas?)

Posted by: Arnold Neumaier on September 19, 2009 7:15 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

I actually don’t see why that’s any different. If ∈ can be overloaded, why can’t 2 be overloaded? Numeric literals are polymorphic in Haskell.
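
Shulman’s analogy can be made concrete in Python, where both kinds of overloading already exist: the membership test `in` dispatches on the container, and the literal `2` is reinterpreted by each arithmetic context. (This is my own illustration; the `Evens` class is an invented example, not part of any system discussed here.)

```python
from fractions import Fraction

# "∈" overloaded: Python's `in` dispatches on the container's type.
class Evens:
    def __contains__(self, x):
        return isinstance(x, int) and x % 2 == 0

print(2 in Evens())           # True
print(2 in {1, 2, 3})         # True: same symbol, different meaning

# The literal 2 is likewise reinterpreted by each arithmetic context:
print(2 + Fraction(1, 2))     # 5/2: here 2 acts as a rational
print(2 + 0.5)                # 2.5: here 2 acts as a real (float)
print(2 + 1j)                 # (2+1j): here 2 acts as a complex number
```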

Posted by: Mike Shulman on September 18, 2009 5:47 AM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

In FMathL, objects and morphisms can belong very naturally to several categories.

As a category theorist, I find that very worrying! (-:

I expect you are right that many mathematicians think of the real number 2 as “the same” as the natural number 2, but I think there are a fair number who realize that they are, strictly speaking, different. (They are different in material set theory too, of course: whether you define real numbers using Dedekind cuts or Cauchy sequences, in no case will 2 be $\{\emptyset,\{\emptyset\}\}$ or whatever you used for your naturals.) But as I said in my previous post, I think that the usage of the former collection of mathematicians is adequately addressed by notational overloading. (In fact, 2 has a meaning more general than that! It also means 1+1 in the generality at least of any abelian group. And some people use it to mean a 2-element set, or the interval category. You just don’t have any hope of capturing mathematical usage without heaps of notational overloading, so as long as it’s there, why not make full use of it?)

However, as a category theorist, I feel perfectly within my rights to object that morally, nothing should ever be an object of two categories at the same time. If that were possible, then people could start talking about things like taking the “intersection” of two categories, and that would just be all wrong. The set $\mathbb{R}$ is an object of Set, the field $\mathbb{R}$ is an object of Fld, and so on, but these are different objects.

Posted by: Mike Shulman on September 18, 2009 5:58 AM | Permalink | Reply to this

objects and morphisms can belong naturally to several categories

MS: It also seems to me that all the things about real numbers you are saying that FMathL does right, it only does right specifically for real (or complex) numbers, because you have built them into the system by fiat. If FMathL didn’t include axioms specifying that there was a particular set called \mathbb{R} with particular properties

In fact, it doesn’t. The axioms only defined when a complex number is real. To define the set of real numbers, one needs to add

\mathbb{R} := \{x\in\mathbb{C}\mid \overline{x} = x\},

but to be allowed to do that needs later axioms.

AN: In FMathL, objects and morphisms can belong very naturally to several categories.

MS: As a category theorist, I find that very worrying! (-:

But every undergraduate student is very thankful for not having to distinguish between the many incarnations of 2 (and of compound objects that involve 2) in the many different structures it is in! It simplifies life dramatically without sacrificing the slightest bit of rigor!

MS: I think there are a fair number who realize that they are, strictly speaking, different. (They are different in material set theory too, of course: whether you define real numbers using Dedekind cuts or Cauchy sequences, in no case will 2 be $\{\emptyset,\{\emptyset\}\}$ or whatever you used for your naturals.)

This is why I regard both foundations as inadequate.

MS: the usage of the former collection of mathematicians is adequately addressed by notational overloading.

Even defining precisely the details of the process of overloading necessary to make mathematics work as usual is a nightmare (of the same magnitude as that for the needed abuse of notation in ZF-based foundations), and nobody has ever done it.

It is clear that this overloading is not intrinsic to mathematics but only to the attempt to give it a categorial or set-theoretic foundation. After all, real numbers existed long before Dedekind invented the first straitjacket for them!

MS: You just don’t have any hope of capturing mathematical usage without heaps of notational overloading, so as long as it’s there, why not make full use of it?

When talking of “you”, I think you projected your own hopelessness onto me! I not only have hopes but also a good deal of evidence for their realizability within easily formalizable limits - otherwise I wouldn’t have embarked on the FMathL project.

MS: as a category theorist, I feel perfectly within my rights to object that morally, nothing should ever be an object of two categories at the same time.

In FmathL, you can exercise this right by defining your own version of categories, just as those who want to base their mathematics on ZF can define their own version of ZF-numbers.

MS: If that were possible, then people could start talking about things like taking the “intersection” of two categories,

In FMathL you can form it, but it would not automatically be a category, but an object whose only useful property would be the elements it contains.

MS: and that would just be all wrong.

Wrong at best in the current tradition, but traditions can change. Such a change of tradition might have many advantages, once systematically explored: for example, one can naturally express with this (and every mathematician immediately understands it without the need for explanations) that ordered monoids are the objects in the intersection of Order and Monoid satisfying the compatibility relation R (suitably defined), and many other similar constructs.

But please justify your claim of wrongness by giving an example of a major categorial result that becomes wrong when one drops the requirement that different categories may not have common objects.

For I haven’t seen this stated as a formal requirement of the concepts of category theory. It seems to be a mere metarequirement of some category theorists only, and at best of a status like that of the implicit overloading generally employed without saying so explicitly.

For example, Wikipedia says that “Any category C can itself be considered as a new category in a different way: the objects are the same as those in the original category but the arrows are those of the original category reversed. This is called the dual or opposite category”

How does this square with you feeling that morally, nothing should ever be an object of two categories at the same time?

But perhaps “morally” is the qualifier that makes the difference to “in practice”. I am interested in describing mathematics as done in practice, since a good automatic research system must understand the practice, but not necessarily any subjective moral associated with it.

… back to the conversion issue…

It takes a lot of training for a mathematician not already immersed into category theory to believe (and feel happy with) the multitude of trivial conversions needed to state rigorously what you want to consider the moral state of affairs.

This is characteristic of any theory that thinks in constructive terms (like ZF or categories) rather than in specification terms (like FMathL).

Generations of students had to be forced into an unnatural ZF (or ZF-like) straitjacket since, for a long time, that was the only respectable foundation. Traditional categorial foundations only exchange this straitjacket for a different one.

FMathL shows that no such straitjacket is needed since actual mathematical practice can be fully rigorously formalized without any need for accidentals that are not actually used after a concept was defined into existence.

Posted by: Arnold Neumaier on September 18, 2009 11:50 AM | Permalink | Reply to this

Re: overloading

If FMathL didn’t include axioms specifying that there was a particular set called \mathbb{R} with particular properties

In fact, it doesn’t. The axioms only defined when a complex number is real.

Yes, but that is completely irrelevant to the point I was trying to make. The point is that the axioms supply, however they do it, a set \mathbb{R} without specifying how it is constructed.

But every undergraduate student is very thankful for not having to distinguish between the many incarnations of 2

So is every Haskell programmer. But that doesn’t mean that the different incarnations of 2 aren’t different at a fundamental level.

It is clear that this overloading is not intrinsic to mathematics but only to the attempt to give it a clategorial or set-theoretic foundation.

It is clear to me that this overloading is fundamental to the way mathematics is spoken and written by mathematicians. I’m guessing that even you will not claim that the real number 2 is the same as the element of ℤ/pℤ denoted by 2 — especially when p=2 (the integer), because then 2=0 in ℤ/pℤ, which it assuredly does not in ℝ! So any system which can parse mathematics as it is actually written by mathematicians will have to allow overloading, regardless of how nightmarish it might be, and regardless of the foundations one chooses. (And I thought you already agreed that overloading ∈ was reasonable! Why is overloading 2 different?)
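
A quick sketch of Shulman’s example, with a minimal residue-class type (the class `Mod` is my own invention for illustration, not part of any system discussed here):

```python
# Minimal residue-class arithmetic illustrating the point:
# "2" in Z/2Z is 0, while the integer (or real) 2 is not.
class Mod:
    def __init__(self, value, p):
        self.value, self.p = value % p, p
    def __add__(self, other):
        return Mod(self.value + other.value, self.p)
    def __eq__(self, other):
        return (self.value, self.p) == (other.value, other.p)
    def __repr__(self):
        return f"{self.value} (mod {self.p})"

two_mod_2 = Mod(1, 2) + Mod(1, 2)   # "2" = 1+1 computed in Z/2Z
print(two_mod_2)                    # 0 (mod 2): here 2 = 0
print(1 + 1)                        # 2: in Z (or R), 2 is not 0
```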

I not only have hopes but have a good deal of evidence for [capturing mathematical usage without heaps of notational overloading]

I would like to see some of this evidence.

It takes a lot of training for a mathematician not already immersed into category theory to believe (and feel happy with) the multitude of trivial conversions needed to state rigorously what you want to consider the moral state of affairs.

But I thought the whole point of this project is so that the mathematician not already immersed in any sort of foundations doesn’t have to believe in or feel happy with those foundations; they can just write mathematics as they usually do and the system will interpret it correctly.

Posted by: Mike Shulman on September 18, 2009 6:21 PM | Permalink | Reply to this

Re: overloading

MS: even you will not claim that the real number 2 is the same as the element of ℤ/pℤ denoted by 2 — especially when p=2 (the integer) because then 2=0 in ℤ/pℤ, which it assuredly does not in ℝ! So any system which can parse mathematics as it is actually written by mathematicians will have to allow overloading.

I call this context-dependent ambiguity. This has nothing to do with types.

Today I was grading a paper where c was defined as a certain product, and a few lines later there was a formula involving c in the denominator, explained to be the speed of light. And the student went on to say: therefore c is very small and can be neglected. I was very puzzled in the first round of reading until I noticed that there were two different meanings of the symbol c.

I conclude that the existence of context-dependent ambiguity must be accounted for. But it doesn’t give licence to generate more versions of 2 than are absolutely needed.

MS: I would like to see some of this evidence.

It is difficult to convey direct evidence before we have fixed the formal framework for representing mathematical context, but here is the idea:

I mentioned already elsewhere in this discussion that one can avoid the multiple meanings of \mathbb{N} by interpreting it in each context as the richest structure that can be picked up from the context.

This principle works very generally, and is, I believe, consistent with the way mathematicians work (excepting perhaps category theorists).

As an automatic research system must in any case have the capacity to build up context, this accommodates the principle without problems, and without the need for overloading. Each context has its own collection of interpretations.

In some sense, this uses the same idea as the categorial approach to foundations, but to turn each context into a category in order to be able to use the categorial version of this idea for general context changes in mathematics would create another straitjacket…
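
The idea of a context carrying its own collection of interpretations can be sketched in a few lines. This is purely my own toy illustration (the class and method names are invented, and this is not FMathL’s actual design): each context keeps a table of what has been declared about a symbol, and resolving the symbol takes the union of everything in scope, i.e., the “richest structure” available so far.

```python
# Toy sketch of context-dependent interpretation (all names invented):
# each context keeps its own table, and a symbol resolves to the union
# of all structure declared about it up the chain of enclosing contexts.
class Context:
    def __init__(self, parent=None):
        self.parent = parent
        self.interpretations = {}   # symbol -> set of available operations

    def declare(self, symbol, operations):
        ops = self.interpretations.setdefault(symbol, set())
        ops |= set(operations)

    def resolve(self, symbol):
        """Union of everything declared about `symbol` in enclosing contexts."""
        ops, ctx = set(), self
        while ctx is not None:
            ops |= ctx.interpretations.get(symbol, set())
            ctx = ctx.parent
        return ops

paper = Context()
paper.declare("N", {"0", "succ", "+"})
section = Context(parent=paper)
section.declare("N", {"*"})          # this section also uses multiplication
print(sorted(section.resolve("N")))  # ['*', '+', '0', 'succ']
print(sorted(paper.resolve("N")))    # ['+', '0', 'succ']
```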

MS: I thought the whole point of this project is so that the mathematician not already immersed in any sort of foundations doesn’t have to believe in or feel happy with those foundations; they can just write mathematics as they usually do and the system will interpret it correctly.

This is the main point, but not the whole point. The whole point also involves making the system trustworthy in that everyone interested in checking out how the system arrives at its answers should be able to get as much detail as wanted, and also in the simplest possible form. In particular, someone extremely critical should be able to check for himself (if possible without any use of a machine) that the whole system works in a sound way. This means that the basics must be as transparent as possible.

Compare with Coq. I enter a conjecture; Coq works on it for a week and then says: “The conjecture is true, and here are 3749MB of proof text that Coq_V20.57 verified to be correct.”

Nobody is going to check that, except perhaps another machine. Humans must take it on trust. But on which basis? At least you’d want to check the implementation of Coq to get assurance. But Coq is a huge package….

The core of FMathL will have to be a small package, with programs transparent enough to be checked by hand. This is possible only if things are kept as simple as possible.

Posted by: Arnold Neumaier on September 18, 2009 7:30 PM | Permalink | Reply to this

Re: overloading

I’m confused; are you saying that 2 should not be used to mean 1+1 in ℤ/pℤ? I think that is a perfectly justifiable notation and, I believe, so do many other mathematicians.

Posted by: Mike Shulman on September 18, 2009 8:45 PM | Permalink | PGP Sig | Reply to this

Re: overloading

MS: are you saying that 2 should not be used to mean 1+1 in ℤ/pℤ?

No. I am saying that 2 has a context-dependent ambiguous meaning.

In a context where ℤ/pℤ (or another ring with 1) appears as a structure whose elements are discussed, 2 should mean 1+1 in this ring, whereas in the absence of such a context, 2 should be considered as the default: a complex number (and at the same time a real, rational, integral, and natural number).

Thus I accept notational ambiguity; being able to recognize ambiguity and resolve it from context is essential for any automatic math system.

But I do not regard notation ambiguity as something to be captured by a concept of overloading and subtyping, since not every notation ambiguity can be considered as an instance of the latter.

Therefore I accept that “2 in ℤ/pℤ” and “2 as natural number” are different objects. They are rarely used in the same context.

But I do not accept that “2 as natural number” and “2 as complex number” are different objects. Almost everyone will agree with me. (In algebra I was even taught how to achieve this uniqueness of 2 in ZF using a construction called identification.)

Posted by: Arnold Neumaier on September 18, 2009 9:00 PM | Permalink | Reply to this

Re: overloading

But I do not regard notation ambiguity as something to be captured by a concept of overloading and subtyping, since not every notation ambiguity can be considered as an instance of the latter.

Can you give some examples of the sort of notation ambiguity you are thinking of which can’t be captured by overloading and subtyping?

Posted by: Mike Shulman on September 19, 2009 2:28 AM | Permalink | PGP Sig | Reply to this

Re: overloading

MS: Can you give some examples of the sort of notation ambiguity you are thinking of which can’t be captured by overloading and subtyping?

For example, writing x ∘ y * z without having defined priorities for the operations (where different priority rules exist in different traditions), and where both ways to interpret it make sense. The dangling else is a famous instance of that.
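The precedence ambiguity can be made concrete with a toy parser. The function below (illustrative only, not part of FMathL or Mizar; `@` stands in for the composition operator) parses the same token string under two assumed precedence tables and obtains two different parse trees:

```python
# A minimal sketch of the ambiguity: the same token string "x @ y * z"
# parses to different trees depending on which precedence convention
# the reader (or parser) assumes. Operator names are illustrative.
def parse(tokens, prec):
    """Precedence-climbing parser; returns a nested-tuple parse tree."""
    def expr(pos, min_prec):
        node, pos = tokens[pos], pos + 1
        while pos < len(tokens) and prec.get(tokens[pos], -1) >= min_prec:
            op = tokens[pos]
            rhs, pos = expr(pos + 1, prec[op] + 1)  # left-associative
            node = (op, node, rhs)
        return node, pos
    tree, _ = expr(0, 0)
    return tree

tokens = ["x", "@", "y", "*", "z"]
# Convention A: "*" binds tighter than "@"  ->  x @ (y * z)
print(parse(tokens, {"@": 1, "*": 2}))  # ('@', 'x', ('*', 'y', 'z'))
# Convention B: "@" binds tighter than "*"  ->  (x @ y) * z
print(parse(tokens, {"@": 2, "*": 1}))  # ('*', ('@', 'x', 'y'), 'z')
```

Neither parse is wrong in itself; only context (or an explicit convention) decides between them, which is exactly what a formal system must be told.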

Of course, one can write formal papers that avoid any sort of ambiguity. This is what is done in Mizar, but it accounts for the huge overhead that makes Mizar not a very practical tool for the ordinary mathematician.

FMathL must resolve these issues by reasoning, figuring out which version makes sense in the context. Typing is just the simplest way of doing such reasoning in cases where it applies.

Posted by: Arnold Neumaier on September 19, 2009 5:25 PM | Permalink | Reply to this

Re: overloading

Thanks for the examples, now I know what you mean.

Typing is just the simplest way of doing such reasoning in cases where it applies.

Maybe this is the core of our disagreement. I do not see types as “just” a way of resolving notational ambiguity. Rather, types carry important semantic information in their own right.

It seems unlikely that either of us will convince the other, however, since versions of this argument have been raging for years. But I’m very glad to have had / be having the discussion; I think I’ve learned a lot.

Posted by: Mike Shulman on September 19, 2009 6:50 PM | Permalink | PGP Sig | Reply to this

Re: overloading

MS: types carry important semantic information in their own right.

Yes, they do. But the common type systems are far too inflexible to capture all of the semantics. Since a system like FMathL needs a way to represent arbitrary semantics anyway, the more general semantic system will automatically take over the function a typing system would have.

Once one has the semantic system, one can choose to represent things there in any way one likes. It will be easy to create in FMathL contexts that work exactly like HOL and allow you to do everything with overloading if you believe this is the way mathematics should be done on the base level. But the standard context for doing mathematics will most likely be different, since typing introduces a lot of unnecessary representation overhead.

On the other hand, since the semantics of everything is clearly structured, it will not be very difficult to create automatic translators from one sort of mathematical representation to another sort. Indeed, this will be one of the strengths of the FMathL system and its view of many subject levels with a common object level.

Every mathematician can view mathematics in his or her preferred way, and still be sure that everything translates correctly to every other mathematician’s view.

MS: But I’m very glad to have had / be having the discussion; I think I’ve learned a lot.

The same holds for me. It is very interesting and helpful.

Posted by: Arnold Neumaier on September 19, 2009 8:07 PM | Permalink | Reply to this

Re: overloading

But I do not accept that “2 as natural number” and “2 as complex number” are different objects. Almost everyone will agree with me.

I would be very interested to see a wide-ranging survey of professional mathematicians on this point, broken down by field and perhaps age. I’m not saying you’re wrong that “almost everyone” will agree with you, but I don’t really have data to judge one way or the other.

I find it markedly inconsistent and arbitrary to view 2 ∈ ℤ/pℤ and 2 ∈ ℤ as different, but 2 ∈ ℤ and 2 ∈ ℂ as the same. What about 2 ∈ ℚ_p? Or 2 ∈ ℚ̄? What’s the general rule?

Posted by: Mike Shulman on September 19, 2009 6:59 PM | Permalink | PGP Sig | Reply to this

Re: overloading

MS: I find it markedly inconsistent and arbitrary to view 2 ∈ ℤ/pℤ and 2 ∈ ℤ as different, but 2 ∈ ℤ and 2 ∈ ℂ as the same. What about 2 ∈ ℚ_p? Or 2 ∈ ℚ̄? What’s the general rule?

I didn’t invent that. For Bourbaki (whom I am following in this), the rule is that an object x ∈ A remains x even when it is considered as an element of B, where B contains A.

Bourbaki has a general construction called identification that replaces (for example) the pseudo-rationals in the Dedekind reals by the true rationals from which the Dedekind reals were constructed, and adapts the operations in such a way that the resulting field of reals contains the ordinary rationals.

Thus what (in my understanding of) category theory (maybe I am not using the correct term here) is an embedding functor is for Bourbaki the identity mapping.

Once identification has been described at the first occasion where it occurs, it suffices later (when discussing the reals) to say that “by identification, we may take ℚ to be a subset of ℝ.” Somewhere, an abuse of language is introduced to say that identification will be made silently if the embedding is canonical. This establishes the standard mathematical terminology in a completely rigorous way.

Thus the 2 ∈ ℚ_p and the 2 ∈ ℚ̄ are the same as the 2 ∈ ℂ, as all three sets contain ℤ, which in turn contains 2.

Posted by: Arnold Neumaier on September 19, 2009 8:08 PM | Permalink | Reply to this

Re: overloading

After a long conversation with a friend this evening, I feel like I have a better understanding of how and why many/most people may think of 2 ∈ ℤ and 2 ∈ ℝ as identical. Perhaps I have unknowingly trained myself to think of them as different, because that is what the structural approach to mathematics requires (and I think of mathematics structurally because that is what I see when I look at mathematics—although I now accept that you see something different).

Honestly, it just feels really messy to me to think about the category Ring (say) if some elements of some rings might be equal to some elements of other rings, depending on how one constructed them and whether an embedding of one into another is “canonical” or not. But aesthetics certainly differ! (-:

Posted by: Mike Shulman on September 20, 2009 6:26 AM | Permalink | PGP Sig | Reply to this

Re: objects and morphisms can belong naturally to several categories

ordered monoids are the objects in the intersection of Order and Monoid

As far as I can tell, this is not true even in FMathL (and I don’t see how it could ever be true). An order is a set equipped with an order relation, and a monoid is a set equipped with a multiplication and unit. An ordered monoid is a set equipped with both. What is true instead is that ordered monoids are the objects of the pullback of the two forgetful functors from Order and Monoid to Set.
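The pullback description can be displayed in the usual way (standard categorical notation, not taken verbatim from the thread):

```latex
% An ordered monoid is an object of the pullback of the forgetful
% functors U_1 : Order -> Set and U_2 : Monoid -> Set.
\[
\begin{array}{ccc}
  OrdMon & \longrightarrow & Monoid \\
  \big\downarrow & & \big\downarrow{\scriptstyle U_2} \\
  Order & \underset{U_1}{\longrightarrow} & Set
\end{array}
\qquad\text{i.e.}\qquad
OrdMon \;\cong\; Order \times_{Set} Monoid .
\]
```

An object of the pullback is a set together with both structures on it, which is what “a set equipped with both” means formally.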

But please justify your claim of wrongness by giving an example of a major categorial result that becomes wrong when one drops the requirement that different categories may not have common objects.

Mathematics is not just about results. Mathematics, and especially category theory, is also about definitions and concepts. The point is that one object being in two categories is foreign to the practice of mathematics and can only lead to confusion. You’ve supplied me with a case in point above: if it were possible for one object to be in two categories, then I can easily see some people thinking that ordered monoids are the intersection of Order and Monoid, when in fact this is false.

Your example of opposite categories is also a good one. The opposite of the category of frames is the category of locales, but a frame is not the same as a locale. For instance, the terminal frame is very different from the terminal locale. Suppose I wanted to study “localic frames,” i.e. the point-free version of “topological frames,” which in turn would be frames whose frame operations are continuous. If someone has been told that categories can be intersected, and maybe is still laboring under the misapprehension that ordered monoids are the intersection of Order and Monoid, they might immediately try to define localic frames as the intersection of Locale and Frame. But if Locale is defined as Frame^op and opposite pairs of categories “have the same objects,” then Locale ∩ Frame would consist just of the frames/locales, a far cry from “localic frames.” Trying to intersect categories should be a type error.

Posted by: Mike Shulman on September 18, 2009 6:38 PM | Permalink | PGP Sig | Reply to this

Re: objects and morphisms can belong naturally to several categories

AN: Such a change of tradition might have many advantages […] that ordered monoids are the objects in the intersection of Order and Monoid satisfying the compatibility relation R (suitably defined),

MS: As far as I can tell, this is not true even in FMathL (and I don’t see how it could ever be true). An order is a set equipped with an order relation, and a monoid is a set equipped with a multiplication and unit.

I was speaking in subjunctive mode (and clearly not about the meaning of the terms in existing foundations), since this is not a reality but only a possibility that might (or might not) work out. (I thought the café is also about exploring possibilities, not only about presenting truths.) Since the current FMathL framework does not have a formal way to say what a monoid is, nobody knows yet whether this could be true in FMathL. This brings us to semantical questions that need to be resolved on the next layer of FMathL, which we are designing at the moment but which is not yet ready:

What does it formally mean to speak of “a set equipped with a relation”?

In ZF, it means to have a pair (S,R) where S is a set and R is a relation, and this is easily formalized in ZF itself.

In category theory, its formal meaning must be something different; I don’t know exactly what. How would this be formally expressed, using only terms intrinsic to the language of categories?

FMathL is neither ZF nor category theory, hence can give it yet another meaning. (What it will be will be decided by the end of the year, I guess.)

Probably (though I haven’t checked the details) one can give the intersection of categories in FMathL a meaning that makes intersections of categories of two algebraic structures inherit the properties of both structures, in such a way that it is consistent with Axiom A10 governing intersection.

MS: Your example of opposite categories is also a good one. The opposite of the category of frames is the category of locales, but a frame is not the same as a locale.

There must be something wrong either with your statement or with the definition of opposites in Wikipedia.

Wikipedia said explicitly that the two categories have the same objects. And this seems to be the canonical definition: Adámek et al, Abstract and Concrete Categories say something amounting to the same in Definition 3.5. So does Definition 1.3.2 of Asperti and Longo, Categories, Types and Structures. So says Section 1.4 in Barr and Wells, Toposes, Triples and Theories. So says Section 1.6.1 in Schalk and Simmons, An introduction to Category Theory in four easy movements.

Not being an expert in category theory, I need to rely on trustworthy sources for the accepted meaning of the basic concepts. There seems to be full agreement in the literature (at least that available online) that at least certain distinct categories have the same objects.

MS: But if Locale is defined as Frame^op and opposite pairs of categories “have the same objects,” then Locale ∩ Frame would consist just of the frames/locales, a far cry from “localic frames.”

I agree. But backed by the orthodox definition of opposites, and with the usual moral of a mathematician, I’d draw the conclusion that your statement “The opposite of the category of frames is the category of locales, but a frame is not the same as a locale” is incompatible with the definition of opposites when taken literally. (FMathL would raise here a popup window and ask for support or correction or clarifying context.)

But perhaps there is something else wrong with my moral of reading category theory texts besides what Todd Trimble pointed out. (Just as I learn through such a discussion, FMathL would have an internal moral code that learns from the feedback from popup windows.)

Posted by: Arnold Neumaier on September 18, 2009 8:46 PM | Permalink | Reply to this

can objects and morphisms belong to several categories?

I think that making ordered monoids the intersection of Order and Monoid, in any formal system, would be a very bad idea. (I also have my doubts about whether it is possible in a consistent way.) Everywhere in mathematics that I have seen “intersection” used, it refers to adding properties, rather than structure as in this case. Furthermore, a category should only be considered as defined up to equivalence, and this sort of “intersection” seems unlikely to be invariant under equivalence.

What does it formally mean to speak of “a set equipped with a relation”?

In ETCS, a set equipped with a relation consists of an object A, an object R, and a monomorphism R → A×A.

There must be something wrong either with your statement or with the definition of opposites

Very interesting! I didn’t realize that the basic texts of category theory could be misinterpreted in this way by someone unfamiliar with our way of thinking.

I think that all category theorists, as well as most of the mathematicians they talk a lot to (algebraists, topologists, etc.), are so immersed in a structural way of thinking that an object of a category only has meaning as an object of that category. When you construct one category from another, you might use the “same” set of objects, but once you’ve constructed it, there is no relationship between the objects, because after all any category is only defined up to equivalence.

This phenomenon isn’t special to categories. For instance, any group has an opposite group with “the same elements” obtained by reversing the order of multiplication, but I don’t think any group theorist would then consider it meaningful to take the intersection of a group and its opposite group. Using “the same elements” is a construction of the opposite, with the same status as Dedekind cuts or Cauchy sequences—once the construction is performed, the fact that you used “the same” objects is discarded. And plenty of other constructions of the opposite are possible.

In fact, this same principle applies to basically all constructions I am familiar with in mathematics. The quotient of one group by another, the polynomial algebra of a ring, the Postnikov tower of a topological space—in each case you may give a specific construction in terms of the input (e.g. maybe an element of R[x] “is” a function from {0,…,n} to R, thought of as the coefficients of a polynomial), but in each case once the construction is performed, its details are forgotten.

I always assumed, without really thinking much about it, that all modern mathematicians thought in this way, except for maybe ZF-theorists (on some days of the week). But apparently not!

Posted by: Mike Shulman on September 18, 2009 9:18 PM | Permalink | PGP Sig | Reply to this

Re: can objects and morphisms belong to several categories?

MS: In ETCS, a set equipped with a relation consists of an object A, an object R, and a monomorphism R→A×A.

This only goes halfway towards answering my question. You reduced it to another informal construct.

What does it formally mean that something consists of three typed things?

Posted by: Arnold Neumaier on September 18, 2009 10:02 PM | Permalink | Reply to this

what are structures, structurally?

“A set equipped with a binary relation” is not a single object in the discourse of structural set theory the way a pair (A,R) is a single object in the discourse of material set theory. But that doesn’t make it informal. If I want to make a statement about all sets equipped with binary relations, I can construct a formal sentence of the form

∀A. ∀R : P(A×A). …

Posted by: Mike Shulman on September 19, 2009 5:03 AM | Permalink | PGP Sig | Reply to this

Re: can objects and morphisms belong to several categories?

MS: a category should only be considered as defined up to equivalence, and this sort of “intersection” seems unlikely to be invariant under equivalence.

This appears to be part of the unspoken moral code of categorists. But all categories they write down are defined as particular categories, and equivalence of categories is a concept that doesn’t appear on page 1, where it should figure if categories are really defined only up to equivalence.

Let’s not further discuss intersection; maybe this is still half-baked. But clarifying the moral code so that it is intelligible to an outsider who learns through self-study (ultimately I am thinking of the FMathL system as that outsider) seems quite important!

AN: There must be something wrong either with your statement or with the definition of opposites

MS: Very interesting! I didn’t realize that the basic texts of category theory could be misinterpreted in this way by someone unfamiliar with our way of thinking.

I have acquired an eye for all these subtleties (which an automatic system would stumble over) because I’ve been closely observing, over the last few years, how I read math, in order to be able to teach it to the FMathL system.

I can’t see how the definitions can be read in any other way by someone reared in the tradition of Bourbaki. If you take ZF as the metatheory then none of the usual rules for identifying abuses of terminology give you the slightest clue that something else could have been intended!

MS: I think that all category theorists, as well as most of the mathematicians they talk a lot to (algebraists, topologists, etc.), are so immersed in a structural way of thinking …

… that they have lost contact with those mathematicians whose daily work is a bit less abstract?

I was taught structural thinking, but in the way of Bourbaki rather than in the categorial way. There one clearly distinguishes between equality and isomorphism, although one knows that in many cases only the isomorphism-invariant properties are relevant. But being able to distinguish the two modes has lots of advantages; in particular, there is no problem of evil.

One knows that Alt(5) and PSL(2,5) are isomorphic groups, but as individuals the two groups are naturally distinguishable by their construction. One distinguishes clearly between the group Alt(5) with its intrinsic action on 5 elements (though we allowed abuse of notation to label these elements arbitrarily; if pressed, we’d have undone that) and a group Alt(5), which is just an arbitrary group isomorphic to Alt(5). Thus one could say that “Alt(6) contains the groups Alt(5) and PSL(2,5), and therefore two conjugacy classes of alternating groups of 5 elements”, with a perfectly clear meaning.

I know that this can be reformulated in categorial terms, but if we want to maintain the same degree of precision, it becomes clumsy.

MS: Using “the same elements” is a construction of the opposite, with the same status as Dedekind cuts or Cauchy sequences - once the construction is performed, the fact that you used “the same” objects is discarded.

I find this a weakness of the constructive approach to math…

MS: And plenty of other constructions of the opposite are possible.

It is interesting, though, that all books use the same construction.

MS: I always assumed, without really thinking much about it, that all modern mathematicians thought in this way, except for maybe ZF-theorists (on some days of the week). But apparently not!

Maybe I am not a modern mathematician. I was still taught the old virtues of precise definitions, and the advantage of having two different words (such as “same” and “isomorphic”) for concepts whose confusion causes confusion.

Posted by: Arnold Neumaier on September 18, 2009 10:41 PM | Permalink | Reply to this

Re: can objects and morphisms belong to several categories?

This reminds me of a question about forgetful functors that I’ve had for a while.

The definition of “forgetful functor” is (in every account I’ve seen) specified in terms of removing structure from a set equipped with some structure.

This very much depends on the presentation of the relevant categories. How can we define “forgetful functor” in a way that’s invariant under equivalence?

Posted by: Tom on September 18, 2009 11:21 PM | Permalink | Reply to this

Re: can objects and morphisms belong to several categories?

How can we define “forgetful functor” in a way that’s invariant under equivalence?

I'd say (and Mike did say, at [[forgetful functor]]) that any functor can be a forgetful functor, depending on your point of view; calling it that simply establishes a (perhaps temporary) point of view.

Posted by: Toby Bartels on September 19, 2009 12:14 AM | Permalink | Reply to this

Re: can objects and morphisms belong to several categories?

I have acquired an eye for all these subtleties (which an automatic system would stumble over)

I would humbly suggest that an automatic system need only stumble upon them if it had been designed by a human mathematician who was unconversant with structural ways of thinking.

Thus one could say that “Alt(6) contains the groups Alt(5) and PSL(2,5), and therefore two conjugacy classes of alternating groups of 5 elements”, with a perfectly clear meaning.

Not very clear to me; perhaps our linguistic conventions are equally mutually unintelligible. Do you mean that Alt(6), considered together with its canonical action on a 6-element set, contains the group Alt(5) with its canonical action on a 5-element set and the group PSL(2,5) with its canonical action on some other set? If so, I don’t see why it should become clumsy in typed language; we just overload Alt(5) so that it can mean either “the group Alt(5)” or “the group Alt(5) with its canonical action.”

But being able to distinguish the two modes has lots of advantages; in particular, there is no problem of evil.

I am puzzled by this statement; it seems to me that only if there is a notion of equality, in addition to a notion of isomorphism, can the problem of evil even be posed.

MS: Using “the same elements” is a construction of the opposite, with the same status as Dedekind cuts or Cauchy sequences - once the construction is performed, the fact that you used “the same” objects is discarded.

I find this a weakness of the constructive approach to math…

But this seems to me exactly what FMathL is doing, when you first construct something, then define the class of things isomorphic to it, and redefine that object to be something arbitrarily chosen from that class. You use a particular construction, then you discard its details.

And actually, I don’t know what you mean by the “constructive approach” here, nor what other approach you are contrasting it with.

MS: And plenty of other constructions of the opposite are possible.

It is interesting, though, that all books use the same construction.

Well, of course. It’s the simplest one. I expect that basically all books use the same construction of the rational numbers in terms of the integers.

Maybe I am not a modern mathematician. I was still taught the old virtues of precise definitions

I’m having trouble not being insulted by that. Everything we’ve said here is perfectly precise.

Posted by: Mike Shulman on September 19, 2009 3:02 AM | Permalink | PGP Sig | Reply to this

Precise definitions

AN: Maybe I am not a modern mathematician. I was still taught the old virtues of precise definitions

MS: I’m having trouble not being insulted by that. Everything we’ve said here is perfectly precise.

The first was not intended; I was just explaining my training, not making a comparison with yours. (Please do not take anything I say personally.) The second is simply false.

I was specifically referring to the fact that all authoritative definitions I could find on the web on the definition of opposite categories say explicitly that they have the same objects, while you said equally explicitly (and now even supposedly perfectly precisely) that “morally, nothing should ever be an object of two categories at the same time”.

Please give me the perfectly precise meaning of the term “morally” that you had in mind when writing “Everything we’ve said here is perfectly precise.” It is the only point of dispute here. For without the moral part, my interpretation is correct.

Not a single word in the definitions of Wikipedia (or any of the other sources quoted earlier) tells me that two different categories must have disjoint object classes. If this were really part of category theory, it would be trivial to state it as part of the definition of categories, and surely some author would have cared enough about precision to do so.

But no author I know of has done it, and for good reasons. For it would make the standard definition contradictory, since it is easy to construct two different categories in the sense of the standard definition that share some objects.

Posted by: Arnold Neumaier on September 19, 2009 6:10 PM | Permalink | Reply to this

Re: Precise definitions

How about this, Arnold: two categories may have an object in common, but you should never use that fact. The construction that shows that opposite categories exist (inside a model of set theory as foundation) uses the same objects as the original category (so we’re sure that there are “enough” objects around) but from that point we never actually use that fact.

Here’s an analogous situation not using categories: the von Neumann construction showing that a model of the Peano axioms exists within set theory defines 4 as the set {0,1,2,3}. Thus within this construction it happens that 3 is an element of 4, but from this point we never use this fact, since numbers being elements of other numbers is a property of the model, not of the structure. Technically you can make this statement within the model, but “morally” you shouldn’t.
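This point about the von Neumann model can be run directly. The sketch below (the function name `von_neumann` is mine) builds numerals as nested frozensets; “3 is an element of 4” then holds, but only as an artifact of this particular construction, not of the Peano structure:

```python
# Sketch: in the von Neumann construction, n is the set {0, 1, ..., n-1},
# so "3 is an element of 4" is literally true -- a property of the model,
# not of the abstract number structure.
def von_neumann(n):
    """Build the von Neumann numeral n as a frozenset."""
    current = frozenset()               # 0 = {}
    for _ in range(n):
        current = current | {current}   # successor: n+1 = n ∪ {n}
    return current

three, four = von_neumann(3), von_neumann(4)
print(three in four)   # True: an accident of this construction...
print(len(four))       # 4: ...while the cardinality is structural
```

Any other construction of the naturals (e.g. Zermelo’s, where n+1 = {n}) makes the same structural statements true but gives different answers to membership questions, which is why those questions are “morally” off-limits.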

Similarly, it’s easy to come up with an equivalent category which also behaves like the opposite category, but which shares no components at all with the original category. But it really doesn’t matter, since equality of components of two distinct categories is not part of the structure.

Posted by: John Armstrong on September 19, 2009 7:15 PM | Permalink | Reply to this

Re: Precise definitions

Thanks, that’s about what I meant.

I wasn’t including statements prefixed by “morally” in my analysis of what is precise and what isn’t; perhaps I should have said something like “every mathematical definition we’ve given has been precise.”

Posted by: Mike Shulman on September 19, 2009 7:54 PM | Permalink | PGP Sig | Reply to this

Re: Precise definitions

JA: two categories may have an object in common, but you should never use that fact. The construction that shows that opposite categories exist (inside a model of set theory as foundation) uses the same objects as the original category (so we’re sure that there are “enough” objects around) but from that point we never actually use that fact.

You are contradicting yourself.

The construction that shows that opposite categories exist must use the fact that two categories may have an object in common, otherwise it loses its simplicity.

The same holds in many other instances in current category theory texts.

The point is that in constructions, one freely uses (and needs) the definition of categories in the Bourbaki sense (i.e., as ordinary algebraic structures, allowing categories to share objects).

But after one has constructed an instance of the desired category, one forms its isomorphism class and chooses an anonymous element from it (in the sense of Bourbaki’s, actually Hilbert’s, choice operator, also used in FMathL) to get rid of the accidentals of the construction.

Thus one needs both interpretations of categories, the concrete one (an instance) and the generic one (an anonymous instance).

FMathL naturally provides both views, and does not need to impose on the user a moral (“you should not”) that mathematics never had, beyond sticking to what is set down in the axioms.

Posted by: Arnold Neumaier on September 19, 2009 8:35 PM | Permalink | Reply to this

Re: Precise definitions

The construction that shows that opposite categories exist must use the fact that two categories may have an object in common.

No more than the von Neumann construction shows that the number 3 must be an element of the number four. The usual construction of the opposite category is one of many, but the nature of the opposite category is not defined by this construction or any artifacts of this construction.

And I even said that it’s possible to construct another version of the opposite category whose components are completely disjoint from those of the original category. Please go back and read what I wrote.

On another note, when did mathematics not regard the syntactic validity of statements like “3 is an element of 4” as problematic at best? You can ask “is 3 an element of 4?”, or “is this object from one category ‘the same as’ that object from another category?”, or “does a baseball shortstop have red hair?”, but these questions are all meaningless because they’re not part of the relevant structures. That’s what “moral” means here. You can ask whether two categories share objects all you like, but the question is as completely beside the point as either of the other two.

Posted by: John Armstrong on September 19, 2009 11:08 PM | Permalink | Reply to this

Re: Precise definitions

JA: two categories may have an object in common, but you should never use that fact.

AN: The construction that shows that opposite categories exist must use the fact that two categories may have an object in common.

JA: No more than the von Neumann construction shows that the number 3 must be an element of the number four.

If you define the number 3 as a von Neumann numeral, it must have this property. If you define the number 3 in a different way, it doesn’t. It depends on how you formulate your definition.

I was referring to the definition as found in any of the standard sources quoted.

I think a “should” has no place in mathematics, apart from the requirement that one should take as true only what axioms, definitions, and proved theorems say.

Thus any should in mathematics must be formalized in such terms. This is the only way to keep the semantics of mathematics precise.

JA: You can ask “is 3 an element of 4?”, […] but these questions are all meaningless because they’re not part of the relevant structures. That’s what “moral” means here.

I am insisting on giving this a formal meaning since otherwise it is not possible to teach it to an automatic system. But I don’t see any way to make the moral you want to impose formally precise in any system at all without violating this morality somewhere.

Posted by: Arnold Neumaier on September 20, 2009 12:12 PM | Permalink | Reply to this

Re: Precise definitions

I am insisting on giving this a formal meaning since otherwise it is not possible to teach it to an automatic system.

You have this precisely backwards. It’s perfectly simple to teach a formal system not to ask whether one number “is an element of” another number. Just don’t define “is an element of” to have any meaning for numbers.
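John’s point can be made concrete in any typed proof assistant. Here is a minimal sketch in Lean 4 syntax (my own illustration, not from the discussion): membership is only defined where an instance of the relevant type class is declared, so “is 3 an element of 4?” is not false, it is ill-typed.

```lean
-- Membership in Lean 4 is governed by the `Membership` type class.
-- No instance `Membership Nat Nat` is declared, so the question
-- "is 3 an element of 4?" cannot even be stated:
--
--   #check (3 : Nat) ∈ (4 : Nat)
--   -- error: failed to synthesize `Membership Nat Nat`
--
-- Membership of a number in a *list* of numbers, however, is declared
-- in the core library, so this elaborates to an ordinary proposition:
#check 3 ∈ [1, 2, 3]   -- : Prop
```

Nothing has to be “taught” to the system beyond not declaring the instance; the type checker does the rest.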

Posted by: John Armstrong on September 20, 2009 3:36 PM | Permalink | Reply to this

Re: Precise definitions

JA: two categories may have an object in common, but you should never use that fact.

JA: You can ask “is 3 an element of 4?”, […] but these questions are all meaningless because they’re not part of the relevant structures. That’s what “moral” means here.

AN: I am insisting on giving this a formal meaning since otherwise it is not possible to teach it to an automatic system.

JA: You have this precisely backwards. It’s perfectly simple to teach a formal system not to ask whether one number “is an element of” another number. Just don’t define “is an element of” to have any meaning for numbers.

My “this” referred to “the categorial moral”, not to this specific example.

In a typed system, the example already has the precise meaning of “undefined”. But the moral also contains your statement at the top of this message.

How do you make this precise enough that an automatic system does not feel entitled, after having seen the definition of a category (Definition 1.1.1 in Asperti and Longo), to make Definition 1.3.1 (which uses \subseteq, which is defined only in terms of equality between objects)?

And why should one follow your injunction when the standard textbooks don’t follow it? (Definition 1.3.1 is completely standard.)

Posted by: Arnold Neumaier on September 21, 2009 2:46 PM | Permalink | Reply to this

Re: Precise definitions

And why should one follow your injunction when the standard textbooks don’t follow it? (Definition 1.3.1 is completely standard.)

You’re completely (intentionally?) missing the distinction I drew between a construction demonstrating the existence of a model of a structure and the subsequent use of the properties of a structure. As I said before, “moral” (which was someone else’s term) refers to the latter segment, not the former.

Von Neumann numerals are also “completely standard”, and in their construction some numbers are elements of other numbers, but once we’ve constructed this model (to show that a NNO exists) we never again use that fact because it’s a property of the model and not of the structure.

I’m done here.

Posted by: John Armstrong on September 21, 2009 3:51 PM | Permalink | Reply to this

Re: Precise definitions

I think a “should” has no place in mathematics, apart from the requirement that one should take as true only what axioms, definitions, and proved theorems say.

That’s very different from my philosophy, and from my experience doing mathematics and talking about it with other mathematicians. Some “should”s that I can think of off the top of my head: one should use enough generality but not too much, one should not use confusing variable names (even if they are formally correct), one should not use redundant or unnecessary axioms, one should choose the names of defined terms in a consistent way, and one should not invent and study concepts that have no motivation or relation to the rest of mathematics. Of course, as always, not everyone agrees on what one should and shouldn’t do, but I think mathematics is rife with normative judgements beyond truth and falsity.

Posted by: Mike Shulman on September 21, 2009 5:37 AM | Permalink | PGP Sig | Reply to this

Re: Precise definitions

AN: I think a “should” has no place in mathematics, apart from the requirement that one should take as true only what axioms, definitions, and proved theorems say.

MS: That’s very different from my philosophy, and from my experience doing mathematics and talking about it with other mathematicians. Some “should”s that I can think of off the top of my head: one should use enough generality but not too much […]

I don’t think these shoulds make a difference to mathematical understanding, only to the quality of the resulting mathematics. Of course good mathematics is governed by lots of shoulds.

But mathematics itself is not. Shoulds have no place in the interpretation of the meaning of a well-formed piece of (even irrelevant, too abstract, too special, or too longwinded) mathematical text, or in what a mathematician is allowed to do with it without leaving the realm of the theory.

In our present discussion you had wondered about how much misunderstanding is possible by not knowing the shoulds.

I still think that I interpreted the definitions in an impeccable way, and indeed in the way the definitions are used (at least at times) by category theorists. I even believe that they cannot be interpreted in any other way from a strictly formal perspective. The moral seems to lie only in the labeling of some of it as evil or unnatural.

Anyway, I see signs of slow convergence towards a consensus!

Posted by: Arnold Neumaier on September 21, 2009 3:20 PM | Permalink | Reply to this

Re: can objects and morphisms belong to several categories?

MS: I would humbly suggest that an automatic system need only stumble upon them if it had been designed by a human mathematician who was unconversant with structural ways of thinking.

I have not the slightest idea how to construct a system that, without any prior notion of category theory, could be exposed to the Wikipedia article quoted and yet be unable to construct categories, satisfying the axioms, that have common objects.

MS: I don’t see why it should become clumsy in typed language; we just overload Alt(5) so that it can mean either “the group Alt(5)” or “the group Alt(5) with its canonical action.”

Yes, of course. I didn’t say that one can’t solve many of these problems by overloading. I just explained the way such things were treated in the tradition I grew up with, and that it worked well.

MS: it seems to me that only if there is a notion of equality, in addition to a notion of isomorphism, can the problem of evil even be posed.

But there must be such a notion of equality intrinsic in any definition of category theory (as opposed to categories). One cannot purge this sort of evil.

One must know that if you take two objects A, B from a category C, any two mentions of A are identical objects (not only the same up to isomorphism), while a mention of A and a mention of B are only possibly identical. Otherwise one cannot embed category theory into standard logic, and structural concepts such as automorphisms would not make sense.

One also needs it in order to be able to define opposite categories in the standard way, even if one subsequently forgets how they were constructed.

MS: this seems to me exactly what FMathL is doing, when you first construct something, then define the class of things isomorphic to it, and redefine that object to be something arbitrarily chosen from that class. You use a particular construction, then you discard its details.

Of course, this was frequently done in mathematics, already before categories were born. But the point is that FMathL can choose when to do it, and indicates it (as Bourbaki would have done), while in category theory, one is forced to do it, even when one does not want to do it.

For example, if one treats posets arising in practical programming as categories, one almost always needs them in their concretely defined form, and not only up to isomorphism. And it is clear that categories, as defined everywhere, may have this concrete form.

Thus FMathL preserves an important freedom of mathematicians that an equality-free categorial approach tries to forbid for purist reasons. But it also accommodates a purist categorial approach as you propose it, just because of your observation.

MS: I don’t know what you mean by the “constructive approach” here, nor what other approach you are contrasting it with.

I had contrasted the constructive approach that defines quaternions via a construction and the specification approach that defines quaternions via a characterization. You had then mentioned that initially, a characterization might not be available, and I had replied why this does not make the FMathL approach unattractive.

MS: I expect that basically all books use the same construction of the rational numbers in terms of the integers.

Here is an exception: Arnold Neumaier, Analysis und lineare Algebra, Lecture Notes in German

Posted by: Arnold Neumaier on September 19, 2009 7:13 PM | Permalink | Reply to this

Re: can objects and morphisms belong to several categories?

I have not the slightest idea how a system should be constructed that, without having any prior notion of category theory, being exposed to the Wikipedia article quoted would not be able to construct categories satisfying the axioms that have common objects.

How about a system which is structural, and therefore which regards the question of whether two different structures (such as two different categories) have common elements as a type error?

I didn’t say that one can’t solve many of these problems by overloading.

No, but what you did say was:

I know that this can be reformulated in categorial terms, but if we want to maintain the same degree of precision, it becomes clumsy.

I was pointing out that we can maintain the same degree of precision in a structural theory with overloading, without it becoming clumsy.

One must know that if you take two objects A,B from a category C that any two mentions of A are identical objects (not only the same up to isomorphy)

Actually, it’s good enough if any two mentions of A are connected by a specified isomorphism in a coherent way. But there is a distinction between naming a given object and asking whether two given objects are identical.

in category theory, one is forced to do it, even when one does not want to do it.

This is not true. You can always keep the extra structure around which you used to construct the gadget; you don’t have to forget about it. If you construct the reals as a subset of P(\mathbb{Q}) using Dedekind cuts, you don’t have to then forget about the \in relation relating them to \mathbb{Q}.
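Mike’s point, that the structure used in a construction can be kept around rather than forgotten, can be sketched in Lean-style notation (names like `DedekindReal` and `belowOf` are mine, for illustration only): the cut itself remains part of the data, so the relation between a real and the rationals below it stays available after the construction.

```lean
-- A real number as a (lower) Dedekind cut of the rationals.
-- The cut is kept as part of the structure, not discarded.
structure DedekindReal where
  lower      : Rat → Prop                            -- the lower set of the cut
  nonempty   : ∃ q, lower q                          -- some rational lies below
  bounded    : ∃ q, ¬ lower q                        -- some rational does not
  downClosed : ∀ q r, r < q → lower q → lower r      -- closed downwards
  noMax      : ∀ q, lower q → ∃ r, q < r ∧ lower r   -- no greatest element

-- The relation to ℚ is still there, by definition:
def belowOf (q : Rat) (x : DedekindReal) : Prop := x.lower q
```

Whether one then chooses to work only with the interface of a complete ordered field, or also with `belowOf`, is up to the user, which is exactly the freedom under discussion.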

Posted by: Mike Shulman on September 19, 2009 8:03 PM | Permalink | PGP Sig | Reply to this

Re: can objects and morphisms belong to several categories?

MS: How about a system which is structural, and therefore which regards the question of whether two different structures (such as two different categories) have common elements as a type error?

It would report a type error in the standard definition of the opposite category, if written down formally.

MS: I was pointing out that we can maintain the same degree of precision in a structural theory with overloading, without it becoming clumsy.

Not on the surface, because of the overloading, but inside the system, which must track all that overloading. One needs to consider both aspects: efficiency for the user and efficiency for the system.

MS: Actually, it’s good enough if any two mentions of A are connected by a specified isomorphism in a coherent way

So with n mentions, you created an internal overhead of O(n^2). In a lengthy proof, n can be large. Thus an efficient implementation, at least, cannot pretend that it adheres to the categorial moral.

But a solid foundation must also be able to describe what happens on the implementation level.

MS: But there is a distinction between naming a given object and asking whether two given objects are identical.

I don’t think this is good enough. For often one may want to derive results that hold for general A, B, and later one may want to use this result for the special case where A and B are the same.

MS: If you construct the reals as a subset of P(\mathbb{Q}) using Dedekind cuts, you don’t have to then forget about the \in relation relating them to \mathbb{Q}.

How do I then refer to the reals with \in as opposed to the reals without \in? They are no longer the same objects, although I constructed the latter to be the former.

Such a schizophrenic state of affairs is avoided when using Bourbaki’s choice operator.

Posted by: Arnold Neumaier on September 19, 2009 8:38 PM | Permalink | Reply to this

Re: can objects and morphisms belong to several categories?

It would report a type error in the standard definition of the opposite category, if written down formally.

It wouldn't, because we never ask (then or afterwards) if an object of C is the same as an object of C^{op}.

In formalising mathematics without a fundamental global equality, it's important to distinguish the external judgement that two terms are syntactically identical from the internal proposition that two terms refer to the same object. If A is an object of C, then you may interpret A as an object of C^{op}, without even an abuse of language. (I'm not sure that the last clause is correct in Mike's favourite foundations, but it is in mine, which are more type-theoretically oriented.) But you can't introduce A as an object of C and B as an object of C^{op} and ask whether A = B, or even whether A \cong B; that's a type error.
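Toby's claim can be illustrated with a Lean-style sketch (my own names, with the unit and associativity laws omitted): the objects of the opposite category are, by definition, the objects of the original one, so reinterpreting A needs no coercion; but objects of two unrelated categories inhabit different types, and comparing them is rejected by the type checker.

```lean
-- A bare-bones category: objects and morphisms only; laws omitted.
structure Cat where
  Obj  : Type
  Hom  : Obj → Obj → Type
  id   : (a : Obj) → Hom a a
  comp : {a b c : Obj} → Hom a b → Hom b c → Hom a c

-- The opposite category: the *same* type of objects, arrows reversed.
def op (C : Cat) : Cat where
  Obj  := C.Obj
  Hom  := fun a b => C.Hom b a
  id   := fun a => C.id a
  comp := fun f g => C.comp g f

-- An object of C *is* an object of op C: `(op C).Obj` unfolds to `C.Obj`,
-- so this typechecks with no coercion and no abuse of language:
def reinterpret (C : Cat) (A : C.Obj) : (op C).Obj := A

-- By contrast, for unrelated categories C and D, a statement like
--   (A : C.Obj) = (B : D.Obj)
-- is a type error: `Eq` needs both sides to live in one type.
```

Note that the construction of `op` never asks whether two categories share objects; sharing the type of objects is simply how the opposite category is defined here.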

[…] often one may want to derive results that hold for general A, B, and later one may want to use this result for the special case where A and B are the same.

I would handle this by substitution, just as I would if I wanted the result for the special case where A is x^2 + 2 and B is x + y - \Sigma (in a context where those terms make sense and have the right type). I know that people write ‘If A = B, then […]’, but I take this as abuse of language (or syntactic sugar) for ‘Setting A to B, […]’ or ‘Setting B to A, […]’ (depending on which symbol is used in the sequel).

Posted by: Toby Bartels on September 19, 2009 9:22 PM | Permalink | Reply to this

Re: can objects and morphisms belong to several categories?

Thanks, Toby, that’s exactly what I meant.

If A is an object of C, then you may interpret A as an object of C^{op}, without even an abuse of language. (I’m not sure that the last clause is correct in Mike’s favourite foundations

It is. At least, insofar as I have a favorite foundation.

Posted by: Mike Shulman on September 20, 2009 5:42 AM | Permalink | PGP Sig | Reply to this

Re: can objects and morphisms belong to several categories?

TB (in an earlier mail): I’d also like to see a formalisation that takes both their definition, and their statement that begins the Exercise in that section (using the previous Definition 1.3.1), as literally true! It seems doubtful to me; I know what they mean, but I need to translate it.

I’d like to see a formulation that takes both their definition, and their statement that begins the Exercise in that section (using the previous Definition 1.3.1), and rewrites it in a form that can be taken as literally true. The moral of mathematics requires that a definition can be read as literally true in the sense that any abuses of language are explained somewhere in sufficient detail that they can be undone.

Looking at other treatises, I can see nothing in Definitions 1.1.1 and 1.3.1 that is not standard. (At least two of the other sources I had quoted have identical requirements.)

So please provide a reading that has no unexplained abuses of language.

MS: How about a system which is structural, and therefore which regards the question of whether two different structures (such as two different categories) have common elements as a type error?

AN: It would report a type error in the standard definition of the opposite category, if written down formally.

TB: It wouldn’t, because we never ask (then or afterwards) if an object of C is the same as an object of C^{op}. […] If A is an object of C, then you may interpret A as an object of C^{op}, without even an abuse of language. But you can’t introduce A as an object of C and B as an object of C^{op} and ask whether A = B, or even whether A \cong B; that’s a type error.

On the surface, this looks like a solution. Your suggestion amounts to having equality on the metalevel but forbidding it on the object level. But I think this does not hold water. I don’t think your suggestion can be implemented consistently in a fully formalized way without producing contradictions.

For to say that A is an object of C is formalized as A \in Ob_C, and to say that B is an object of C^{op} is formalized as B \in Ob_{C^{op}}. Now Definition 1.3.2 in Asperti and Longo, Categories, Types and Structures implies, using Definition 1.1.1, that Ob_C = Ob_{C^{op}} =: X, say. Now X is a collection, and (at least when identifying collections with certain ETCS-sets or SEAR-sets, say) one can compare elements from X for equality.

But A and B are elements from X, so they can be compared for equality. A formal theorem explorer has no way to avoid this conclusion. Once it draws this conclusion it produces a type error, and exits, not being able to continue to explore the current context.

One cannot escape here to a metalevel since there is no way to feed a theorem explorer unformalized stuff.

The same problem appears with Definition 1.3.1 of a subcategory. Once you have Ob_D \subseteq Ob_C, nothing forbids comparing an element of Ob_D with an element of Ob_C without an inconsistency.

Note that this book was written for readers not exposed to categories before. It is difficult for any such reader who takes these definitions seriously to arrive at any other conclusion.

Posted by: Arnold Neumaier on September 20, 2009 11:35 AM | Permalink | Reply to this

Re: can objects and morphisms belong to several categories?

I'd also like to see a formalisation that takes both their definition, and their statement that begins the Exercise in that section (using the previous Definition 1.3.1), as literally true! It seems doubtful to me; I know what they mean, but I need to translate it.

I’d like to see a formulation that takes both their definition, and their statement that begins the Exercise in that section (using the previous Definition 1.3.1), and rewrites it in a form that it can be taken as literally true.

Yes, I see that this is your quest. If FMathL can do that with this passage, then that would show its power.

So please provide a reading that has no unexplained abuses of language.

Honestly, I think that (especially since this is an introductory text) the phrasing of the Exercise was probably a mistake. They might just try this:

\mathbf{Set} is a subcategory of \mathbf{Rel}. Is it a full subcategory?

Of course, that's not the same exercise, and it should go immediately after Definition 1.3.1.

The real meaning of Exercise 1.3.2 is

\mathbf{Set}^{op} is obviously equivalent to a subcategory of \mathbf{Rel}. Is it a full subcategory?

But since they haven't defined equivalence of categories yet, this is an inappropriate exercise at that point. (Actually, the relation of \mathbf{Set}^{op} to the desired subcategory of \mathbf{Rel} is stricter than equivalence, but still not anything that they've defined yet.)

Another way out is to interpret Definition 1.3.1, particularly the requirement that \mathbf{D}[a,b] \subseteq \mathbf{C}[a,b], in a structural way to mean that \mathbf{D}[a,b] is equipped with an injection to \mathbf{C}[a,b]. I don't think that this is how they intended it, since in an introductory book you ought to explain that sort of thing. But if the authors are deep into the structural framework, then they might have been thinking this without realising it.

I'm interested in how you interpret the claim that \mathbf{Set}^{op} is a subcategory of \mathbf{Rel}. Is it easy to understand what it means, and what exactly does it mean? Should FMathL accept it, and how should it interpret it?


If A is an object of C, then you may interpret A as an object of C^{op}, without even an abuse of language. But you can't introduce A as an object of C and B as an object of C^{op} and ask whether A = B, or even whether A \cong B; that's a type error.

On the surface, this looks like a solution. Your suggestion amounts to having equality on the metalevel but forbidding it on the object level. But I think this does not hold water. I don’t think your suggestion can be implemented consistently in a fully formalized way without producing contradictions.

You're right; what I've said here is contradictory. If (as I said) you may interpret an object A of C as an object of C^{op}, then you can compare it to the object B of C^{op}, since any two objects of C^{op} may be compared (for isomorphism at least). I wasn't thinking about it carefully enough, and I apologise.

(I stand by what I said about distinguishing identity judgements from equality propositions, although this does not appear to be a place that it applies. In fact, I think that it must be irrelevant to what we're discussing, since it's a criticism of ETCS as much as of anything else. So never mind.)

Since we're worrying here about typing errors when comparing two objects of two categories, maybe I should go back to the beginning and say what I think about that, without suggesting that Mike or anybody else would agree with me. (I know that there are differences between Mike's and my philosophy, and we are getting close to some of them.)

Given two arbitrary types (where a type might be the type of elements of a set, or the type of objects of a category, or something else) X and Y, and given A of type X and B of type Y, it doesn't normally make sense to ask whether A = B. However, it is not good design for a mathematical formaliser to throw up an error whenever anybody writes A = B in this context. First it should try to reduce the expressions for X and Y (especially if this is something that can always be efficiently strongly normalised) to see if they come out the same. Even if that fails (and especially if reduction is not confluent or was not completed), then it should give the user an opportunity to specify a type Z (which might be either X or Y) and operations to Z from X and Y respectively. If this works, then A and B may now be interpreted as having the same type, and A = B presumably makes sense.

In particular, if G and H have just been introduced as two groups in a context appropriate for group theory, with X and Y the types of elements of G and H respectively, then there is no way to avoid the type error. But if instead Y is the type of elements of G^{op}, then it is easy to avoid the type error; probably the system can do it automatically (in fact, probably one has Y defined directly as X).
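Toby's design, comparing A and B only after mapping them into a user-specified common type Z, can be sketched as follows (Lean-style, with `eqVia` a hypothetical name of mine): the formaliser would instantiate it with whatever maps the user supplies, and for G versus G^{op} both maps are the identity.

```lean
-- Cross-type equality, mediated by user-supplied maps into a common type Z.
def eqVia {X Y Z : Type} (f : X → Z) (g : Y → Z) (a : X) (b : Y) : Prop :=
  f a = g b

-- For two unrelated types there is nothing sensible to instantiate Z with,
-- and a bare `a = b` stays a type error. But if Y was *defined* as X, as
-- with the elements of G and of Gᵒᵖ, both maps can be taken to be `id`:
example {X : Type} (a b : X) : Prop := eqVia id id a b   -- reduces to a = b
```

The design choice here is that the burden of proof sits with the user: the system never invents an identification, it only checks one that is supplied.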

I have used groups here instead of categories to avoid the evil of asking whether two objects in a single given category are equal, which is a different issue (related to that stuff about identity judgements). Of course, the elements of a group correspond to the morphisms of a category, but I don't think that this makes a difference here.

Posted by: Toby Bartels on September 20, 2009 7:42 PM | Permalink | Reply to this

Re: can objects and morphisms belong to several categories?

TB: If FMathL can do that with this passage, then that would show its power.

To teach FMathL this power, I first need to understand what “should” be understood after having read Definition 1.1.1 and what after Definition 1.3.1.

You explained only how then the exercise should be understood, in terms of concepts not yet introduced.

How can an automatic system understand things at the very introduction of a theory when the intentions are formulated so poorly?

This is why I had asked. So let me rephrase my request:

I’d like to see a few pages of text that introduce in a formally precise way the full supposed content of what the trained category theorist understands that these two definitions and the exercise should have conveyed to the reader, including any moral an automatic system should follow in interpreting the remainder of category theory, and explaining any abuse of notation or language that is apparent from the presentation of these two definitions in the standard textbooks.

If I get such a description, and if it is logically consistent, I will guarantee that FMathL will be provided with a generic mechanism for interpreting the definitions as written by Asperti and Longo in the correct way, and for identifying a meaningful interpretation of the exercise (together with raising a flag for having discovered a sloppiness).

But first I need to understand myself clearly enough what you read into the text morally although it is not written there formally.

Posted by: Arnold Neumaier on September 21, 2009 2:43 PM | Permalink | Reply to this

Re: can objects and morphisms belong to several categories?

I first need to understand what “should” be understood after having read Definition 1.1.1 and what after Definition 1.3.1.

I think that what “should” be understood at this point is that the authors made a mistake in stating the exercise. Seasoned categorists can guess what the authors might have meant, but I would not expect an undergraduate without experience in category theory to be able to.

Posted by: Mike Shulman on September 21, 2009 6:03 PM | Permalink | PGP Sig | Reply to this

Equality between objects of different types

TB: given A of type X and B of type Y, it doesn’t normally make sense to ask whether A=B. […] it should give the user an opportunity to specify a type Z (which might be either X or Y) and operations to Z from X and Y respectively. If this works, then A and B may now be interpreted as having the same type, and A=B presumably makes sense.

I think something like that is feasible in FMathL. I’ll keep it in mind in its design.

In this connection, would you consider each category to be a separate type? Each object of a category? Each Homset? (I am not sure whether all three simultaneously may be required consistently.)

Posted by: Arnold Neumaier on September 21, 2009 2:52 PM | Permalink | Reply to this

Re: Equality between objects of different types

I would consider the class of objects of each category to be a separate type, and each homset in each category to be a separate type. In general, I don’t think of a single object of a category as a type (in general, there’s no way for it to have “elements”), although for some particular categories such as Set they can be interpreted that way.

Posted by: Mike Shulman on September 21, 2009 6:13 PM | Permalink | PGP Sig | Reply to this

Re: Equality between objects of different types

In general, I don’t think of a single object of a category as a type (in general, there’s no way for it to have “elements”), although for some particular categories such as Set they can be interpreted that way.

For students and curious lurkers, see [[concrete category]].

Posted by: Eric Forgy on September 21, 2009 7:54 PM | Permalink | Reply to this

Reflection

TB: I should have asked if there was an easy user-friendly way to import it.

What can be more user-friendly than typing “import file.con”, or dragging an icon for file.con into the current context window? If you accept everything the imported context is regulating, this is enough. Otherwise you need to edit file.con to suit your needs before importing it. Just deleting something is easy; other things depend on what you want to change, and how.

TB: I’m worried about an analogue of the mismatch between the internal and external languages of a topos that is not well-pointed.

I can’t tell exactly what you are aiming at, but there is always a kind of mismatch between a subject level (probably your external language) and the object level (probably your internal language).

Because of Goedel’s theorem, a subject level is always strictly stronger in proving power than the object level (unless both are inconsistent).

However, there is no analogue of Goedel’s theorem for the descriptive power of a system. A system with weak proving power can still have a descriptive power sufficient to represent all mathematics including their proofs. The reason is that while finding proofs is undecidable in general, checking proofs is constructively possible under quite modest assumptions about the logic.

Thus even though the current FMathL framework supports only a truncated set theory, having power sets only for sets of size up to the continuum, the reflected level (in which all formal reasoning happens) can check all proofs in axiomatic set theory, even those involving inaccessible cardinals. It is only required that you specify the latter in the FMathL specification language, which is based on the truncated set theory.

And it can check all proofs presented in Bishop-type constructive mathematics, if you specify the latter in the FMathL specification language, although the logic in which the specification language is defined is classical.

TB: By Excluded Middle, a set is either inhabited or not; an uninhabited set is (by definition) empty; hence a nonempty set is inhabited. So if there is a global choice function for inhabited sets, then there is one for uninhabited sets.

I see. So one would have to restrict the choice somehow, and change that part of the context. I am not very familiar with the various ways of defining restricted choice, though; maybe you can help me in saying how much choice you want to allow. In any case, I’ll give it thought.

To teach the system the moral of categories, it seems to me necessary and probably sufficient to have choice for elements of equivalence classes.

TB: Give me a paper whose source is available (say from the arXiv), formalised in ZFC (or whatever), and I’ll rewrite it to be formalised in ETCS.

AN: I do not have the patience to undo all the abuses of notation traditionally used in the context of ZFC.

TB: I don’t understand what you’re asking of categorial foundations, then, if no other foundation does it.

You were asking for something formalized in ZFC. I can’t supply that, since I regard abuses of notation as lack of formalization. I mentioned here what I consider a sufficient level of formality.

Thus I am satisfied if you can translate a typical introduction to axiomatic set theory (until natural numbers and functions are available) and an introduction to logic (until variable substitutions and ZFC are available) into a document starting from scratch with the axioms for a category, at a comparable level of rigor, needing only a comparable number of pages.

TB: Or do you claim that FMathL does this? Can I take http://www.mat.univie.ac.at/~neum/ms/fmathl.pdf as my text?

The reflection part is only promised and outlined there, not realized. We are working on a prototype version, and hope to have one by the end of the year. At least, FMathL will not have abuses of notation, since whatever remains of them will be valid notation rather than abuse. Thus the level of rigor will be higher than in a typical textbook.

TB: we still don’t need a full-fledged set theory (or other foundation of all mathematics), just a way to talk about recursively enumerable sets of natural numbers.

Yes, something like this is sufficient. But if you think that this saves a lot of work, you are mistaken. You can relax the axiom of power sets and delete the axiom of foundations, but you need equivalents of all the rest to be able to define recursively enumerable sets of natural numbers. And you need to build up quite a lot of conceptual machinery before you have all the concepts and properties one is using there without thinking.

The complexity of complete formal foundations does not become apparent until one has tried to find some!

TB: In particular, the requirement of a countable set of variables can be replaced with a single variable x and the requirement that X’ is a variable whenever X is; there is no need to say Cantor’s word “countable”.

True. But then you must stick to that convention later on in whatever you do, since you do not have anything else. This will make your foundation nearly incomprehensible to a human reader. For example, if you agree with the typing paradigm, it will make type checking extremely tedious. But foundations should be easily checkable by hand and teachable in the classroom!

Posted by: Arnold Neumaier on September 19, 2009 9:53 PM | Permalink | Reply to this

Re: Reflection

So one would have to restrict the choice somehow, and change that part of the context. I am not very familiar with the various ways of defining restricted choice, though; maybe you can help me in saying how much choice you want to allow.

I really only want (as the default, until I choose something stronger) the axiom of unique choice: When A is an inhabited set such that any two elements of A are equal, then we have an element Choice(A) of A. Of course, this isn't going to work for the uses to which you put the global choice operator.
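In symbols (notation mine, following the conventions used elsewhere in this thread), the axiom of unique choice reads:

```latex
% Axiom of unique choice: if A is inhabited and any two of its
% elements are equal, then Choice(A) is an element of A.
\[
\Bigl( \exists x\, (x \in A) \;\wedge\;
       \forall x\, \forall y\, \bigl( x \in A \wedge y \in A
         \Rightarrow x = y \bigr) \Bigr)
\;\Longrightarrow\; \mathrm{Choice}(A) \in A
\]
```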

Another possibility is to refuse to allow, from the hypothesis that A = B, the conclusion that Choice(A) = Choice(B) (even given that A and B are inhabited). This should prevent the proof of Excluded Middle (in a framework with intuitionistic logic), while still allowing a definition like Choice(CompOrdFld) for \mathbb{R}. I don't think that you should ever need to deduce, say, Choice(CompOrdFld) = Choice(CompArchFld) from CompOrdFld = CompArchFld (if you ever even want to say the latter).

Posted by: Toby Bartels on September 19, 2009 11:04 PM | Permalink | Reply to this

Re: Reflection

TB: I really only want (as the default, until I choose something stronger) the axiom of unique choice: Of course, this isn’t going to work for the uses to which you put the global choice operator.

Yes, this is too weak for FMathL purposes. But note that there is no default on the reflection level. The user decides which parts of the FMathL framework reflection should be used.

The FMathL default defined in the framework paper only controls the meaning of the specification language in which everything formal is then represented.

TB: Another possibility is to refuse to allow, from the hypothesis that A=B, the conclusion that Choice(A)=Choice(B)

This is against the basic principle of FMathL that substitution of equals is unrestricted.

This means that someone would have to write a UniqueChoice context and create workarounds for all the uses of standard Choice in the FMathL reflection.

Once this is done, working formally in your setting is as easy as working in the standard context.

But the library of theorems in that context would have to be recreated from the standard library by checking which proofs still hold in the new context, and by trying to repair the others.

However, this is no more difficult than what anyone has to do who changes something in established foundations. In FMathL it is probably even easier, since the precise semantics will help in automatic translation and checking.

Posted by: Arnold Neumaier on September 20, 2009 11:55 AM | Permalink | Reply to this

Re: Reflection

Thus I am satisfied if you can translate a typical introduction to axiomatic set theory (until natural numbers and functions are available) and an introduction to logic (until variable substitutions and ZFC are available) into a document starting from scratch with the axioms for a category, at a comparable level of rigor and in a comparable number of pages.

As I feel that I’ve said numerous times, a structural/type-theoretic foundation does not need to start with the axioms for a category.

I feel sure that I could do this, but unfortunately I don’t have anywhere near the time it would take at the moment. (-: Sorry if that sounds like a cop-out, but actually I don’t really even have the time to be engaging in this discussion…

Posted by: Mike Shulman on September 20, 2009 6:26 AM | Permalink | PGP Sig | Reply to this

Re: Reflection

MS: a structural/type-theoretic foundation does not need to start with the axioms for a category.

Yes, and then there are fewer problems. I’ll look at SEAR from this point of view, once I have more time to study it more deeply.

MS: I feel sure that I could do this, but unfortunately I don’t have anywhere near the time it would take at the moment. (-: Sorry if that sounds like a cop-out, but actually I don’t really even have the time to be engaging in this discussion…

I understand. This rhetorical request was intended as an explication of what would make me revise my current judgment on the overhead of categorial foundations, rather than as a request that you or TB should actually do this; especially since I know that you prefer structural non-categorial foundations.

Posted by: Arnold Neumaier on September 20, 2009 12:22 PM | Permalink | Reply to this

Re: Reflection

I am satisfied if you can translate a typical introduction to axiomatic set theory (until natural numbers and functions are available) and an introduction to logic (until variable substitutions and ZFC are available) into a document starting from scratch with the axioms for a category, at a comparable level of rigor and in a comparable number of pages.

For the first one, how about Sections 1.3 and 2.1–5 of http://www.math.uchicago.edu/~mileti/teaching/math278/settheory.pdf, which I found by Googling "axiomatic set theory"? For the second one, I'm not sure what kind of work you're thinking of, so an example would help.

I reserve the right for lack of time not to complete these, but I'll try to at least indicate how they would be done.

Posted by: Toby Bartels on September 20, 2009 7:44 PM | Permalink | Reply to this

Re: Reflection

AN: a typical introduction to axiomatic set theory (until natural numbers and functions are available) and an introduction to logic (until variable substitutions and ZFC are available)

TB: For the first one, how about Sections 1.3 and 2.1–5 of http://www.math.uchicago.edu/~mileti/teaching/math278/settheory.pdf?

I think that to be able to reflect the notion of an expression one also needs finite products and recursive definitions; thus you’d add Sections 2.7 and 2.8. Section 2.9.1 is also used in the standard descriptions of logic.

Thus 17 pages comprising elementary axiomatic set theory are sufficient to reflect logic.

TB: For the second one, I’m not sure what kind of work you’re thinking of, so an example would help.

I have no good online reference (although there may well be one; I didn't search thoroughly). But Sections 1–13 of Chapter 5 of the book “The mathematics of metamathematics” by Rasiowa and Sikorski are a suitable template. They present in about 50 verbose pages (one may skip Sections 13A–D) intuition and formality about reflecting mathematical theories, and in particular reflect ZF in Section 13E, using on the informal level informal notions of sets, functions, and sequences (i.e., what axiomatic set theory tells us can be encoded as such).

Thus 50 pages of elementary logic are sufficient to reflect axiomatic set theory.

Therefore, self-reflective (and hence fully self-explaining) foundations of traditional set-based mathematics can be developed in about 70 pages of mathematical text, at the usual level of mathematical rigor and in a language easily understood by anyone who passed the undergraduate math stage.

Posted by: Arnold Neumaier on September 21, 2009 11:01 AM | Permalink | Reply to this

Re: Reflection

All right, I'll do Section 1.3 and start on Chapter 2 of Mileti, and hopefully the point at which it is clear that I could continue indefinitely will come before the point at which it is tiresome to continue. (^_^)

I should be able to look at Rasiowa & Sikorski later this week. (It's been recommended to me before, in other contexts, so it's about time that I did!) Hopefully, that will just be a matter of checking that the informal notions used have already been successfully formalised in the translation of Mileti to categorial foundations.

Posted by: Toby Bartels on September 21, 2009 11:00 PM | Permalink | Reply to this

Re: Reflection

Temporarily at http://tobybartels.name/settheory.ps does this through Section 2.5. The first parts are completely redone, but afterwards it becomes almost a matter of cutting and pasting. The rest (except Section 2.6, not in the assignment) would be even more like that, although I could do it if there are still questions as to how it would go.

I tried to make the development as tight as possible, although I couldn't help but point out a few things, like the universal property of the Cartesian product and which proofs require classical logic. As Mike did with SEAR, I took axioms that focus on elements rather than arbitrary functions. However, since the assignment was to start with the elementary axioms of a category, I defined an element to be a certain sort of function, as in ETCS.

Personally, I would start with something more like SEAR (or SEPS) and take a leisurely pace, pointing out all of the sights along the way. But that would not meet the assignment.

Posted by: Toby Bartels on October 15, 2009 11:45 AM | Permalink | Reply to this

Re: can objects and morphisms belong to several categories?

One needs to consider both aspects - efficiency for the user and efficiency for the system.

That’s a fair point, although I feel that types are important enough that if I were designing a system, I would want to put in whatever extra effort is necessary to include them.

BTW, I didn’t mean my mention of the possibility of connecting occurrences of A by isomorphisms as a serious suggestion that one implement it; my preferred solution is what Toby said in response.

How do I then refer to the reals with ∈ as opposed to the reals without ∈? They are no longer the same objects

They are the same objects. \in is extra structure on the same set of elements.

Posted by: Mike Shulman on September 20, 2009 6:13 AM | Permalink | PGP Sig | Reply to this

Re: can objects and morphisms belong to several categories?

AN: How do I then refer to the reals with ∈ as opposed to the reals without ∈? They are no longer the same objects

MS: They are the same objects. ∈ is extra structure on the same set of elements.

This makes sense only if you agree that two categories based on the same set of objects with different extra structure on it have the same objects. But I thought this is something you wanted to avoid!

The problem is that an automatic system must translate these sorts of statements into a well-defined fully formalized statement that can be fed to a theorem prover for checking proofs involving them.

I still don’t see how this can be done, given the standard axioms for category theory and its derived conceptual basis as given in standard works.

Posted by: Arnold Neumaier on September 20, 2009 8:51 AM | Permalink | Reply to this

Re: can objects and morphisms belong to several categories?

They are the same objects. \in is extra structure on the same set of elements.

This sounds so strange to me that I'm sure that there must be a misunderstanding here, either of Mike or by Mike.

We have the complete ordered field \mathbb{R} of real numbers, and we have \mathbb{R} equipped with a binary relation \in from \mathbb{Q}. (Presumably, \in is either \lt, \gt, \leq, or \geq, but we haven't specified.) These are different things, not in the sense that we would put \ne between them (which would be a typing error), but in the sense that we would not accept = between them (which would also be a typing error).

However, there is also an obvious forgetful operation (I won't say ‘functor’ since I haven't specified categories, although it would be easy to do that) from the latter to the former. A user-friendly system for mathematics should even be able to detect this and put it in automatically wherever it's needed.
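In tuple notation, the distinction might be sketched as follows (the signature shown is illustrative, not one that either system actually fixes):

```latex
% The reals as a complete ordered field, versus the reals equipped
% with an additional binary relation \in from \mathbb{Q}:
\[
\mathbb{R} = (R, +, \cdot, 0, 1, \leq),
\qquad
\mathbb{R}_{\in} = (R, +, \cdot, 0, 1, \leq, \in),
\]
% with the evident forgetful operation that drops the extra relation:
\[
U(\mathbb{R}_{\in}) = \mathbb{R}.
\]
```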

Posted by: Toby Bartels on September 20, 2009 6:18 PM | Permalink | Reply to this

Re: objects and morphisms can belong naturally to several categories

Definition 1.3.2 of Asperti and Longo, Categories, Types and Structures

I'd also like to see a formalisation that takes both their definition, and their statement that begins the Exercise in that section (using the previous Definition 1.3.1), as literally true! It seems doubtful to me; I know what they mean, but I need to translate it.

Posted by: Toby Bartels on September 18, 2009 10:32 PM | Permalink | Reply to this

Re: objects and morphisms can belong naturally to several categories

I’d also like to see a formalisation that takes both their definition, and their statement that begins the Exercise in that section (using the previous Definition 1.3.1), as literally true!

This is a really good example of the sort of mistakes you can end up making when you do anything with categories non-structurally. Category theory wants to be purely structural with all its heart ♥. (-:

Posted by: Mike Shulman on September 19, 2009 2:33 AM | Permalink | PGP Sig | Reply to this

Re: objects and morphisms can belong naturally to several categories

ordered monoids are the objects in the intersection of Order and Monoid satisfying the compatibility relation R (suitably defined)

I would like to see you formalise this! Or the (related but perhaps simpler) statement that Group is a subcategory of Set. I think that a lot of mathematicians do think that way, but it is difficult to formalise; the difficulties have to do with the difference between structure and properties. I usually see it as a hallmark of sophisticated understanding to drop these ideas, but sometimes I also wonder how far they can be maintained.

Posted by: Toby Bartels on September 18, 2009 9:55 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

It also seems to me that all the things about real numbers you are saying that FMathL does right, it only does right specifically for real (or complex) numbers, because you have built them into the system by fiat. If FMathL didn’t include axioms specifying that there was a particular set called \mathbb{R} with particular properties, then you’d have to construct it just like in any other set theory, and in particular you’d have to choose an implementation (thereby making 1\in\sqrt{2} either true or false), and it would also no longer be true that 2\in\mathbb{N} and 2\in\mathbb{R} were the same thing.

But \mathbb{R} and \mathbb{C} are by no means the only mathematical objects that have such problems! For example, suppose I want to study the quaternions \mathbb{H} in FMathL. I could define them as ordered pairs of complex numbers, or as 2\times 2 complex matrices, or as 4\times 4 real matrices, and in each case different “accidental” things would be true about them just as in ZF, and in no case would \mathbb{C} actually be a subset of \mathbb{H}. And so on.

Posted by: Mike Shulman on September 18, 2009 9:24 AM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

MS: you’d have to choose an implementation (thereby making 1\in\sqrt{2} either true or false)

No. I can define the domain COF of all complete ordered fields, show by some construction that COF is not empty, and then specify \mathbb{R} as Choice(COF). This selects a unique, implementation-dependent copy of the real numbers without any accidental properties. (Maybe this will be the form of the actual later implementation. These things will be decided only after we have done enough experiments.)

MS: But \mathbb{R} and \mathbb{C} are by no means the only mathematical object that has such problems! For example, suppose I want to study the quaternions \mathbb{H} in FMathL. I could define them as ordered pairs of complex numbers, or as 2×2 complex matrices, or as 4×4 real matrices, and in each case different “accidental” things would be true about them just as in ZF, and in no case would \mathbb{C} actually be a subset of \mathbb{H}.

Indeed, if you define them in this way, this is what happens. But I would call this not definitions but constructions, and constructions usually have accidental properties.

To get things right without any artifacts, one needs to think more categorially, and define \mathbb{H} as Choice(X), where X denotes the domain of all skew fields that contain \mathbb{C} as a subfield of index 2. Any of the constructions you gave shows that X is nonempty, so that this recipe defines a unique existing object whose only decidable properties are those that one can derive from the assumptions made. (It has in addition lots of undecidable, implementation-dependent properties, though.)
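Schematically (my notation; the text leaves the exact form of domains open), the recipe is:

```latex
% Definition by global choice from an isomorphism-closed domain:
\[
X = \{\, K \mid K \text{ is a skew field containing } \mathbb{C}
        \text{ as a subfield of index } 2 \,\},
\qquad
\mathbb{H} := \mathrm{Choice}(X),
\]
% where any one of the concrete constructions (pairs of complex
% numbers, complex 2x2 matrices, real 4x4 matrices) witnesses
% that X is nonempty.
```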

The use of Bourbaki’s global choice operator is essential for this golden road to the essence of mathematics.

Posted by: Arnold Neumaier on September 18, 2009 1:10 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

It sounds like you’re saying that in order to construct anything in FMathL without extraneous details I need to find a way to describe it in terms that characterize it uniquely. That seems to me like a mighty tight straightjacket!! Suppose I’m Hamilton inventing the quaternions, and maybe I’m ahead of my time and I’ve realized that they could be constructed from the complex numbers in several different ways, but I don’t yet have any idea how to characterize them uniquely. I didn’t set out to study skew fields containing \mathbb{C} as a subfield of index 2, I set out to study a particular thing, which could be constructed in several ways, and only later discovered that it was the unique skew field containing \mathbb{C} with index 2. It seems to me that uniqueness theorems such as these generally come later, after a new object has been studied in its own right for a while and its essential properties isolated.

Here’s a more modern example: what about the stable homotopy category? It has lots of different constructions; you can start with lots of different kinds of point-set-level spectra. But although all these constructions give the same result, off the top of my head I’m not at all sure how to characterize the result uniquely without referring to the specific constructions of it.

Posted by: Mike Shulman on September 18, 2009 6:17 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

MS: It sounds like you’re saying that in order to construct anything in FMathL without extraneous details I need to find a way to describe it in terms that characterize it uniquely.

No. Of course one can construct in FMathL a skew field via complex 2×2 matrices and call it the quaternions. Then one can construct another skew field via real 4×4 matrices and call it the tetranions. Then one discovers the theorem that the tetranions are isomorphic to the quaternions. At this point, it makes sense to revise notation (a common process in mathematics), consider the domain X of all skew fields isomorphic to these particular skew fields, and to define the quaternions as Choice(X). When later the characterization theorem is discovered, it just gives a simpler description of X, but the definition is already stable once you know that you want to abstract from the accidentals of the construction. One can do this even if only a single construction exists (e.g., for the Monster simple group).

Posted by: Arnold Neumaier on September 18, 2009 6:57 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

one can avoid the multiple meanings of \mathbb{N} by interpreting it in each context as the richest structure that can be picked up from the context.

This isn’t “avoiding” the multiple meanings of \mathbb{N}, it is merely inferring which meaning is meant, which I am all in favor of. Category theorists do this all the time just like other mathematicians. But the fact that one has to do an interpretation means that the multiple meanings still exist.

The fact that \mathbb{N} can indicate different levels of structure depending on the context is precisely what I mean by “overloading.”

this uses the same idea as the categorial approach to foundations, but to turn each context into a category

I don’t think I ever advocated turning every context into a category. And, as I’ve said, I think that calling this the “categorial” approach to foundations is misleading; it doesn’t necessarily have anything to do with categories. The point I’m trying to make is that mathematics is typed, and that’s just as true in type theory and non-categorial structural set theory as it is in ETCS or CCAF.

Posted by: Mike Shulman on September 18, 2009 8:52 PM | Permalink | PGP Sig | Reply to this

Re: CCAF, ETCS and type theories

MS: The point I’m trying to make is that mathematics is typed, and that’s just as true in type theory and non-categorial structural set theory as it is in ETCS or CCAF.

I made some first comments on your SEAR page, but need to look at the concepts more thoroughly.

The point I’m trying to make is that typing does not solve many of the disambiguation problems that an automatic math system must be able to handle; it solves only some of them. Since a more flexible disambiguation system is needed anyway, it can just as well replace the typing. By doing so in an FMathL-like fashion, it automatically eliminates the multiplicity of typed instances of the same thing.

Posted by: Arnold Neumaier on September 18, 2009 10:43 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

The use of Bourbaki’s global choice operator is essential for this golden road to the essence of mathematics.

That's too bad, because I would like to do mathematics in which the axiom of choice is optional, without having to code it all myself. I think that I can work well in a system where \mathbb{N} has a maximal structure (at least in an ever-growing, potential infinity sort of way), even though that's not exactly how I would normally think of \mathbb{N}, but I really won't find it useful if choice is essential.

I hope that it isn't really essential for what you're doing here. After all, you don't care which particular object Choice(X) is, and the user won't even have access to those details. So I hope that there is a way to implement this that avoids anything that proves the axiom of choice.

Posted by: Toby Bartels on September 18, 2009 10:07 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

AN: The use of Bourbaki’s global choice operator is essential for this golden road to the essence of mathematics.

TB: That’s too bad, because I would like to do mathematics in which the axiom of choice is optional, without having to code it all myself.

The way FMathL will be implemented is fully reflective. The current paper essentially serves to define a common metalevel, within which one can objectively (i.e., with the same meaning in all subjective implementations) talk about a constructive description of the FMathL implementation. The latter is what is actually carried out. Thus one must trust the axiom of choice to trust the system, but what is proved inside the system is fully configurable, since you can simply build your context without importing all the modules needed for a full reflection of FMathL. Thus you’d only have to create your own constructive restricted version of Choice (if this hasn’t been done already by someone else), most likely that Choice is defined only for inhabited sets rather than for all nonempty sets, and things will work as before. You need some construction to know that you can choose, but you have that in a constructive approach anyway.

Actually, after we have successfully reflected the whole FMathL framework, we’ll take stock to see what is really needed for a minimal part that can already reflect the whole framework. Then we redefine this as the core, and there is a possibility that this will be constructive. Only the core needs to be trusted and checked for correctness, since the remainder will be definable in terms of the core.

Of course, consistency of the core does not imply consistency of the theories built later with the help of the core and user specifications. Thus, even with a weak but fully reflective core it will be possible to specify ZFC with all sorts of inaccessible cardinals, say, but their consistency is left to the user.

AN: ordered monoids are the objects in the intersection of Order and Monoid satisfying the compatibility relation R (suitably defined)

TB: I would like to see you formalise this! Or the (related but perhaps simpler) statement that Group is a subcategory of Set.

I’ll try to do that sooner or later, but in my experience this may take days or weeks to crystallize into something practical (if it is at all possible). So don’t expect a quick reply.

Posted by: Arnold Neumaier on September 18, 2009 11:22 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

Thus you'd only have to create your own constructive restricted version of Choice […] and things will work as before.

And I can import everything that's already been defined your way? That might work.

most likely that Choice is defined only for inhabited sets rather than for all nonempty sets,

For the record, that won't work. (At least, if it did, then you'd have Excluded Middle implies Choice, which I wouldn't want either!)

Actually, after we have successfully reflected the whole FMathL framework, we’ll take stock to see what is really needed for a minimal part that can already reflect the whole framework. Then we redefine this as the core, and there is a possibility that this will be constructive.

That sounds good!

I would like to see you formalise this! Or the (related but perhaps simpler) statement that Group is a subcategory of Set.

I’ll try to do that sooner or later, but in my experience this may take days or weeks to crystallize into something practical (if it is at all possible). So don’t expect a quick reply.

I understand. But I would be very interested to see it.

Posted by: Toby Bartels on September 19, 2009 12:15 AM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

I reply here to several of your mails.

TB: And I can import everything that’s already been defined your way?

Yes, since that is what reflection is all about. A self-reflective foundation (and only that is a real foundation) can explain formally all the stuff it's talking about informally. To explain it formally means to have a module that defines its syntax and semantics. Of course, in a well-designed package, you have access to that and can use, combine, and modify it in any way you like. (In the latter case, of course, the trust certificates will be reset to trusted by you only.)

AN: most likely that Choice is defined only for inhabited sets rather than for all nonempty sets,

TB: For the record, that won’t work.

Actually, I just noticed that Axiom A19 is already formulated as an axiom of global constructive choice.

TB: At least, if it did, then you’d have Excluded Middle implies Choice, which I wouldn’t want either!

I don’t understand; please indicate the argument.

However, it is well-known that Choice implies Excluded Middle, and this remains true in FMathL; see Section 3.1. If you don’t like this, you’d have to relax the axioms for sets (Section 2.14) in a way that makes this proof invalid.

TB: Give me a paper whose source is available (say from the arXiv), formalised in ZFC (or whatever), and I’ll rewrite it to be formalised in ETCS. (The paper can include its own specification of ZFC too, and mine will include its own specification of ETCS.)

I will not be able to meet that challenge since I do not have the patience to undo all the abuses of notation traditionally used in the context of ZFC. I argue that precisely this should not be demanded from a really good foundation of mathematics.

However, Bourbaki’s Elements of Mathematics at least clarify each form of abuse of notation on first use, so that an automatic system going through it in linear order will pick up all the language updates along the way. So I believe that, in principle, Bourbaki meets your request, though not in a minimally short way, since they aimed at completeness, not at minimal reflection.

TB: Are you telling me that even ZFC needs a set theory to serve as its foundation? I can’t think of any way to interpret this to make it true.

Yes, of course. This is done in books on logic. They start with an informal set theory, then introduce the machinery to formally talk about first order logic, and then produce at some stage the definition of ZFC.

To interpret it without circularity, logicians distinguish between the metalevel (the informal set theory) and the object level (the formal theory), and then build hierarchies for reflection. They build them upwards, to metametalevels, etc.

FMathL rather reflects downwards, to object-object levels, etc., which is more appropriate from an implementation point of view.

TB: You have to say something like this to reflect first-order logic in some mathematical foundation. But not to do first-order logic.

To do it, you just need a mind in which the logic is already implemented.

But to communicate objectively what you do, you need to reflect it: you need to understand what it means that you can choose any indexed letter as a variable, what it means to have a substitution algorithm, etc. Once you try to communicate this to a novice who doesn't already have an equivalent implementation, you find yourself teaching him informal concepts of sets and functions with their properties.

A foundation is just supposed to have that done rigorously.

Posted by: Arnold Neumaier on September 19, 2009 5:23 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

I reply here to several of your mails.

That's fine.

And I can import everything that’s already been defined your way?

Yes, since that is what reflection is all about. A self-reflective foundation (and only that is a real foundation) can explain formally all the stuff it's talking about informally. To explain it formally means to have a module that defines its syntax and semantics. Of course, in a well-designed package, you have access to that and can use, combine, and modify it in any way you like. (In the latter case, of course, the trust certificates will be reset to trusted by you only.)

I should have asked if there was an easy user-friendly way to import it. Also I should learn more about the practical problems of reflection, because I have another question, which I don't know how to ask; but I'm worried about an analogue of the mismatch between the internal and external languages of a topos that is not well-pointed. And I'm concerned about the trust certificates; since I'm reflecting in order to use a weaker foundation (that is, fewer assumptions, stricter requirements), I would like FMathL to verify for me those results that go through.

most likely that Choice is defined only for inhabited sets rather than for all nonempty sets,

For the record, that won't work. At least, if it did, then you'd have Excluded Middle implies Choice, which I wouldn't want either!

I don’t understand; please indicate the argument.

By Excluded Middle, a set is either inhabited or not; an uninhabited set is (by definition) empty; hence a nonempty set is inhabited. So if there is a global choice function for inhabited sets, then there is one for all nonempty sets. Conversely, the argument that Choice implies Excluded Middle needs only the version of choice for inhabited sets (in fact, only for quotient sets of {0,1}). I would like this to be optional.
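Spelled out in symbols (my own paraphrase, not from the thread), the first half of the argument is:

```latex
% Under Excluded Middle, "nonempty" and "inhabited" coincide:
\begin{align*}
\text{EM:}\quad & \forall S.\ (\exists x.\ x \in S) \ \lor\ \lnot(\exists x.\ x \in S) \\
\text{by definition:}\quad & \lnot(\exists x.\ x \in S) \ \Leftrightarrow\ S = \emptyset \\
\text{hence:}\quad & S \neq \emptyset \ \Rightarrow\ \exists x.\ x \in S
\end{align*}
```

So any choice function defined on all inhabited sets is, under EM, already defined on all nonempty sets.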

Give me a paper whose source is available (say from the arXiv), formalised in ZFC (or whatever), and I’ll rewrite it to be formalised in ETCS.

I will not be able to meet that challenge since I do not have the patience to undo all the abuses of notation traditionally used in the context of ZFC. I argue that precisely this should not be demanded from a really good foundation of mathematics.

I don't understand what you're asking of categorial foundations, then, if no other foundation does it. Or do you claim that FMathL does this? Can I take http://www.mat.univie.ac.at/~neum/ms/fmathl.pdf as my text?

Are you telling me that even ZFC needs a set theory to serve as its foundation? I can't think of any way to interpret this to make it true.

Yes, of course. This is done in books on logic. They start with an informal set theory, then introduce the machinery to formally talk about first order logic, and then produce at some stage the definition of ZFC.

I would like to see a book on logic (syntactic logic, not model theory) that does this. To describe the logic, we need a metalanguage (as you say), but we still don't need a full-fledged set theory (or other foundation of all mathematics), just a way to talk about recursively enumerable sets of natural numbers. (But I think that it's more common to talk about finite lists from a fixed finite language than natural numbers.)

In particular, the requirement of a countable set of variables can be replaced with a single variable x and the requirement that X′ is a variable whenever X is; there is no need to say Cantor’s word ‘countable’.
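Toby’s priming scheme is easy to make concrete. The sketch below (my illustration, not from the comment; the class name `Var` is invented) generates arbitrarily many distinct variables from the single base variable x without ever invoking the word ‘countable’:

```python
class Var:
    """A variable: the base symbol x, primed some finite number of times."""

    def __init__(self, primes=0):
        self.primes = primes

    def prime(self):
        # X' is a variable whenever X is
        return Var(self.primes + 1)

    def __repr__(self):
        return "x" + "'" * self.primes

    def __eq__(self, other):
        return isinstance(other, Var) and self.primes == other.primes


x = Var()                 # the single postulated variable
fresh = x.prime().prime() # x'' -- as many fresh variables as needed
```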

Posted by: Toby Bartels on September 19, 2009 8:58 PM | Permalink | Reply to this

Re: CCAF vs ETCS

I (perhaps wrongly) assumed that “CCAF” meant the same thing as Lawvere originally meant by it, and I don’t think this included ETCS. But I can’t find my copy of Lawvere’s original paper at the moment, so I could be wrong.

Posted by: Mike Shulman on September 16, 2009 9:01 PM | Permalink | Reply to this

Re: CCAF vs ETCS

http://138.73.27.39/tac/reprints/articles/11/tr11abs.html Longer version of the 1964 paper

“Philosophers and logicians to this day often contrast “categorical” foundations for mathematics with “set-theoretic” foundations as if the two were opposites. Yet the second categorical foundation ever worked out, and the first in print, was a set theory—Lawvere’s axioms for the category of sets, called ETCS, (Lawvere 1964). These axioms were written soon after Lawvere’s dissertation sketched the category of categories as a foundation, CCAF, (Lawvere 1963). They appeared in the PNAS two years before axioms for CCAF were published (Lawvere 1966). The present longer version was available since April 1965 in the Lecture Notes Series of the University of Chicago Department of Mathematics. It gives the same definitions and theorems, with the same numbering as the 5 page PNAS version, but with fuller proofs and explications.”

Posted by: Stephen Harris on September 17, 2009 1:01 AM | Permalink | Reply to this

Re: CCAF vs ETCS

The open question was: Does his original version of CCAF have an axiom saying that there is a category of sets satisfying ETCS?

Posted by: Arnold Neumaier on September 17, 2009 4:04 PM | Permalink | Reply to this

Re: CCAF vs ETCS

Posted by: Arnold Neumaier
“The open question was: Does his original version of CCAF have an axiom saying that there is a category of sets satisfying ETCS?”
————————————

SH: Mike said “original paper”, which I wasn’t sure meant Lawvere’s thesis, which was also published 40 years later. Strictly speaking, Lawvere doesn’t present formalized axioms; he proceeds more informally.

http://138.73.27.39/tac/reprints/index.html [tr5]
From the Author’s Comments on his (Lawvere’s) 1963 PhD thesis, describing January 1960:
————————
“My dream, that direct axiomatization of the category of categories would help in overcoming alleged set-theoretic difficulties, was naturally met with skepticism by Professor Eilenberg when I arrived (and also by Professor Mac Lane when he visited Columbia).”
————————–

From the thesis Introduction
“One so inclined could of course view all mathematical assertions of Chapter I as axioms.” …
“Since all these notions turn out to have first-order characterizations (i.e. characterizations solely in terms of the domain, codomain, and composition predicates and the logical constants =, ∀, ∃, ⇒, ∧, ∨, ¬), it becomes possible to adjoin these characterizations as new axioms together with certain other axioms, such as the axiom of choice, to the usual first-order theory of categories (i.e. the one whose only axioms are associativity, etc.) to obtain the first-order theory of the category of categories. Apparently a great deal of mathematics (for example this paper) can be derived within the latter theory. We content ourselves here with an intuitively adequate description of the basic operations and special objects in the category of categories, *leaving the full formal axioms to a later paper*. We assert that all that we do can be interpreted in the theory ZF3, and hence is consistent if ZF3 is consistent. By ZF3 we mean the theory obtained by adjoining to ordinary Zermelo-Fraenkel set theory…”
—————————

SH: I think the formal axioms were presented later in
Lawvere, F. William (1966)*, The category of categories as a foundation for mathematics, in S.Eilenberg et al., eds, ‘Proceedings of the Conference on Categorical Algebra, La Jolla, 1965’, Springer-Verlag, pp. 1–21.

Colin McLarty said [tr11], “Lawvere’s axioms for the category of sets, called ETCS, (Lawvere 1964). These axioms were written soon *after Lawvere’s dissertation sketched the category of categories as a foundation, CCAF, (Lawvere 1963). They appeared in the PNAS two years before axioms for CCAF were published (Lawvere 1966)*.” [cited above]

SH: So in my inexpert opinion, the original version of CCAF would be Lawvere’s dissertation (1963) and there are no formal axioms of ETCS so perhaps they could be called assertions, although that is not how Lawvere thought about them.

For other non-experts: I’m including some of my notes from the thesis which include ideas tantamount to informal axioms.

Seven ideas introduced in the 1963 thesis
(1) The category of categories is an accurate and useful framework for algebra, geometry, analysis, and logic, therefore its key features need to be made explicit.
(2) The construction of the category whose objects are maps from a value of one given functor to a value of another given functor makes possible an elementary treatment of adjointness free of smallness concerns and also helps to make explicit both the existence theorem for adjoints and the calculation of the specific class of adjoints known as Kan extensions.
(3)* Algebras (and other structures, models, etc.) are actually functors
to a background category from a category which abstractly concentrates the essence of a certain general concept of algebra, and indeed homomorphisms are nothing but natural transformations between such functors. Categories of algebras are very special, and explicit *axiomatic characterizations of them can be found, thus providing a general guide to the special features of construction in algebra.
(4) The Kan extensions themselves are the key ingredient in the unification of a large class of universal constructions in algebra (as in [Chevalley, 1956]).
(5) The dialectical contrast between presentations of abstract concepts and the abstract concepts themselves, as also the contrast between word problems and groups, polynomial calculations and rings, etc. can be expressed as an explicit construction of a new adjoint functor out of any given adjoint functor. Since in practice many abstract concepts (and algebras) arise by means other than presentations, it is more accurate to apply the term “theory”, not to the presentations as had become traditional in formalist logic, but rather to the more invariant abstract concepts themselves which serve a pivotal role, both in their connection with the syntax of presentations, as well as with the semantics of representations.
(6) The leap from particular phenomenon to general concept, as in the leap from cohomology functors on spaces to the concept of cohomology operations, can be analyzed as a procedure meaningful in a great variety of contexts and involving functorality and naturality, a procedure actually determined as the adjoint to semantics and called extraction of “structure” (in the general rather than the particular sense of the word).
(7) The tools implicit in (1)–(6) constitute a “universal algebra” which should not only be polished for its own sake but more importantly should be applied both to constructing more pedagogically effective unifications of ongoing developments of classical algebra, and to guiding of future mathematical research.
In 1968 the idea summarized in (7) was elaborated in a list of solved and unsolved problems, which is also being reproduced here.”
——————————-

“My stay in Berkeley tempered the naive presumption that an important
preparation for work in the foundations of continuum mechanics would
be to join the community whose stated goal was the foundations of
mathematics.”
——————————-

I read the Preface to Yves Bertot’s book on Coq; Coq took about 20 years to develop, which makes me think that your time frame of 5 years isn’t very long.

Posted by: Stephen Harris on September 18, 2009 1:37 AM | Permalink | Reply to this

Re: CCAF vs ETCS

SH: I read the Preface to Yves Bertot’s book on Coq; Coq took about 20 years to develop, which makes me think that your time frame of 5 years isn’t very long.

Fortunately, I can build upon all this previous work rather than having to develop it all again. I have it easier thanks to the lessons that Coq had to learn the hard way.

Nevertheless, the 5-year estimate is based on the assumption that 10 people work full-time on it for those 5 years. I am trying to get financial support to achieve this, but it isn’t easy. At the moment I have only 2 people for the next two years, and one more for half a year.

The 1964 paper you address in your next email only talks about ETCS, not about CCAF.

SH: It seems Mike is right about ETCS and CCAF being quite distinct.

This is undisputed. But my claim was that CCAF contains ETCS, whereas his claim was that CCAF is completely independent of ETCS. (If it were so, how could CCAF be a foundation of mathematics, where sets are necessary?)

Posted by: Arnold Neumaier on September 18, 2009 9:48 AM | Permalink | Reply to this

Re: CCAF vs ETCS

his claim was that CCAF is completely independent of ETCS. (If it were so, how could CCAF be a foundation of mathematics, where sets are necessary?)

I don't understand this. How can ETCS be a foundation of mathematics when it's independent of ZFC?

Posted by: Toby Bartels on September 18, 2009 9:39 PM | Permalink | Reply to this

Re: CCAF vs ETCS

AN: his claim was that CCAF is completely independent of ETCS. (If it were so, how could CCAF be a foundation of mathematics, where sets are necessary?)

TB: I don’t understand this. How can ETCS be a foundation of mathematics when it’s independent of ZFC?

I don’t understand how your question relates to my remark, which had no reference to ZFC.

ETCS is a set theory based on category theory, which is based on a set theory. Thus one can claim that ETCS is a different set-theoretic foundation.

CCAF is a category theory based on category theory. It needs some way to create sets, otherwise it cannot serve as its own metatheory. (For example, it needs to be able to formalize statements such as “there is a countable set of variables” before it can talk about predicate logic.) McLarty’s version of CCAF does this by explicitly requiring that CCAF contains a copy of ETCS.

I see no way to avoid something like this, but Mike Shulman had proposed from memory that Lawvere’s CCAF had no ETCS inside it. I have no access to Lawvere’s CCAF paper, so I can’t check.

Posted by: Arnold Neumaier on September 18, 2009 10:54 PM | Permalink | Reply to this

Re: CCAF vs ETCS

I don’t understand how your question relates to my remark, which had no reference to ZFC.

Sorry, I didn't express myself very well. I meant that the question that you asked me makes no more sense in that context (to me) than the question that I asked you.

ETCS is a set theory based on category theory, which is based on a set theory.

And now I don't understand the second clause of this sentence! Or rather, I think that I understand, but if so, then it's wrong. The category theory that ETCS is based on is not based on a set theory; it's elementary.

Posted by: Toby Bartels on September 18, 2009 11:02 PM | Permalink | Reply to this

Re: CCAF vs ETCS

TB: The category theory that ETCS is based on is not based on a set theory; it’s elementary.

Even elementary stuff expressed in first order logic needs a set theory to be able to serve as foundation. For example, it needs to be able to formalize statements such as “there is a countable set of variables” before it can talk about predicate logic.

TB: I should clarify further that CCAF certainly has a set theory in it, in the same way that any set theory has a category theory in it. That is, you can define sets in terms of categories, and go on from there.

This is precisely what the ETCS inside McLarty’s CCAF does. And I’d be surprised if Lawvere had taken a different road.

Posted by: Arnold Neumaier on September 18, 2009 11:33 PM | Permalink | Reply to this

Re: CCAF vs ETCS

Even elementary stuff expressed in first order logic needs a set theory to be able to serve as foundation.

What??? Are you telling me that even ZFC needs a set theory to serve as its foundation? I can't think of any way to interpret this to make it true.

“there is a countable set of variables”

You have to say something like this to reflect first-order logic in some mathematical foundation. But not to do first-order logic.

Posted by: Toby Bartels on September 19, 2009 12:26 AM | Permalink | Reply to this

Re: CCAF vs ETCS

Even elementary stuff expressed in first order logic needs a set theory to be able to serve as foundation.

Everything needs to be based on set theory, therefore no matter what alternate foundations you consider, they must be based on set theory, therefore the only foundation for mathematics is set theory.

Wait, what?

Posted by: John Armstrong on September 19, 2009 2:32 AM | Permalink | Reply to this

Re: CCAF vs ETCS

It seems that, again, the notion needed is some kind of reflection (or maybe simulation?); since Set Theory untidily encapsulates “all” of mathematics, any competing foundational system ought to allow a reasonable reflection or simulation of Set Theory.

Maybe we need a category of foundational systems, simulations/reflections as morphisms between them, (higher-dimensional stuff?) … ?

Posted by: some guy on the street on September 19, 2009 5:38 AM | Permalink | Reply to this

Re: CCAF vs ETCS

TB: I should clarify further that CCAF certainly has a set theory in it, in the same way that any set theory has a category theory in it. That is, you can define sets in terms of categories, and go on from there.

AN: This is precisely what the ETCS inside McLarty’s CCAF does. And I’d be surprised if Lawvere had taken a different road.
—————————————

Category theory used to have a set-theoretic background. Lawvere provided an alternative. The category Set is fundamental. But I don’t think that ETCS is fundamental to CCAF, because ETCS is just one formulation of the category Set.

http://plato.stanford.edu/entries/category-theory/

“An alternative approach, that of Lawvere (1963, 1966), begins by characterizing the category of categories, and then stipulates that a category is an object of that universe.

Identity, morphisms, and composition satisfy two axioms:
Associativity xxx…
Identity xxx…
This is the definition one finds in most textbooks of category theory. As such it explicitly relies on a set theoretical background and language.

An alternative, suggested by Lawvere in the early sixties, is to develop an adequate language and background framework for a category of categories.”

SH: The Cat named Set is just one category in the category of categories. I don’t think that Set has to be ETCS, and so ETCS is not foundational to CCAF. I don’t think the category Set is rigid in the axioms that it allows, so it is not limited to ETCS. After all, ETCS was just invented for the benefit of a two-semester course, and later the axioms in it were simplified.

Posted by: Stephen Harris on September 19, 2009 9:13 AM | Permalink | Reply to this

Re: CCAF vs ETCS

I should clarify further that CCAF certainly has a set theory in it, in the same way that any set theory has a category theory in it. That is, you can define sets in terms of categories, and go on from there. Depending on exactly how CCAF works, you would define a set as a discrete category, or perhaps a discrete skeletal category, or something like that.

Posted by: Toby Bartels on September 18, 2009 11:12 PM | Permalink | Reply to this

Re: CCAF vs ETCS

AN wrote: “I see no way to avoid something like this, but Mike Shulman had proposed from memory that Lawvere’s CCAF had no ETCS inside it. I have no access to Lawvere’s CCAF paper, so I can’t check.” ———————

SH: From the quote below you will see that Lawvere’s version of category theory does not require a set-theoretical background. I think this is assumed on this forum. But the category of categories, CCAF, can have the category Set as one of its objects. That doesn’t make set theory foundational to CT in Lawvere’s approach. The axioms in ETCS might well qualify as axioms for the category Set. Remember that Lawvere invented ETCS as a simplification of set theory for a one-year class. Later, he simplified the axioms which ETCS contained. I don’t think ETCS is intrinsic to CCAF, but it is (I think) one of the possible axiomatic formulations for the category Set, which is intrinsic to CCAF in the same sense that all possible categories in the category of categories are in some sense intrinsic, at least after being identified. Even if Lawvere had used ETCS as his example for the category Set, in his 1963 thesis or 1966 paper, I think other axiomatic set theories could have replaced it; I don’t think the category Set has a rigid definition in terms of axioms.

If I’m wrong about this last part, Toby, be sure to tell me! ;-) So I think you, AN, have overvalued ETCS because you thought it was foundational to a category theory which must have a set-theoretic foundation. But the Café mostly uses Lawvere’s not-set-theoretic foundation, AFAIK.

http://plato.stanford.edu/entries/category-theory/
“An alternative approach, that of Lawvere (1963, 1966), begins by characterizing the category of categories, and then stipulates that a category is an object of that universe.

Identity, morphisms, and composition satisfy two axioms:
Associativity xxx…
Identity xxx…
“This is the definition one finds in most textbooks of category theory. As such it explicitly relies on a set theoretical background and language. An alternative, suggested by Lawvere in the early sixties, is to develop an adequate language and background framework for a category of categories.”

Posted by: Stephen Harris on September 19, 2009 8:57 AM | Permalink | Reply to this

Re: CCAF vs ETCS

SH: Even if Lawvere had used ETCS as his example for the Cat = Set, in his 1963 thesis or 1966 paper, I think other set axiomatic systems could have replaced it

I agree. But some notion of set is needed in any categorial foundation of mathematics deserving the name. I took ETCS as the standard notion, just as I took ZF as the standard notion of set, although people have successfully used alternative set theories as foundations, too.

Posted by: Arnold Neumaier on September 19, 2009 7:33 PM | Permalink | Reply to this

Re: CCAF vs ETCS

AN wrote: “This is undisputed. But my claim was that CCAF contains ETCS, whereas his claim was that CCAF is completely independent of ETCS. (If it were so, how could CCAF be a foundation of mathematics, where sets are necessary?)” ——————-

SH: I haven’t noticed a rapid meeting of the minds yet, so I thought I would look into some outside opinion. I too thought that maybe category theory might not be such a good basis for FMathL. I now think my opinion is wrong. FMathL will be an algorithm, and this idea seems to be a key point: “This means that we can view category theory as a collection of algorithms.” This post is my research and notes.

TT wrote:
“However, it looks like Mike Shulman has written down a very interesting program
of study, different to the categories-based approach (which is extremely hands-on and bottom-up) but which is very smooth and top-down (invoking for example a very powerful comprehension principle) while still being faithful to the structuralist POV. He calls it SEAR (Sets, Elements, and Relations).”

AN wrote:
“The main reason I cannot see why category theory might become a foundation for a system like FMathL (and this is my sole interest in category theory at present) is that a systematic, careful treatment already takes 100 or more pages of abstraction before one can discuss foundational issues formally, i.e., before they acquire the first bit of self-reflection capabilities. …
Show me a paper that outlines a reasonably short way to formally define
all the stuff needed to be able to formally reflect in categorial language
a definition that characterizes when an object is a subgroup of a group.”

SH: This paper seems to fit the bill?
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.29.9846
“Herein we formalize a segment of category theory using the implementation of Calculus of Inductive Construction in Coq. Adopting the axiomatization proposed by Huet and Saibi we start by presenting basic concepts, examples and results of category theory in Coq. Next we define adjunction and cocartesian lifting and establish some results using the Coq proof assistant.”

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.29.9846
“This paper aims to bring the benefits of the use of Category Theory to the field of Semantic Web, where the coexistence of intrinsically different models of local knowledge makes difficult the exchanging of information. The paper uses categorical limit and colimit to define operations of breaking and composing ontologies, formalizing usual concepts in ontologies (alignment, merge, integration, matching) and proposing a new operation (the hide operation). The presented set of operations form a useful framework that makes easier the manipulation and reuse of ontologies.”

“Computational Category Theory”
“Another reason why computer scientists might be interested in category theory is that it is largely constructive. Theorems asserting the existence of objects are proven by explicit construction. *This means that we can view category theory as a collection of algorithms*. These algorithms have a generality beyond that normally encountered in programming in that they are parameterized over an arbitrary category and so can be specialized to different data structures.

One writes expressions to denote mathematical entities rather than defining the transitions of an abstract machine. ML also provides types which make a program much more intelligible and prevent some programming mistakes. ML has polymorphic types which allow us to express in programs something of the generality of category theory.
However, the type system of ML is not sufficiently sophisticated to prevent the illegal composition of two arrows whose respective source and target do not match.”
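The mismatch the quote describes can be illustrated directly. In the sketch below (my own illustration, not from the book; all names are invented), arrows carry explicit source and target types, and composition checks that they match at run time, the check that, as the quote notes, ML’s type system cannot express:

```python
class Arrow:
    """An arrow with an explicit source and target (objects named by strings)."""

    def __init__(self, source, target, fn):
        self.source, self.target, self.fn = source, target, fn

    def __call__(self, value):
        return self.fn(value)

    def compose(self, other):
        """self . other: first apply other, then self."""
        if other.target != self.source:
            # the illegal composition ML's types would silently allow
            raise TypeError("illegal composition: target/source mismatch")
        return Arrow(other.source, self.target,
                     lambda v: self.fn(other.fn(v)))


double = Arrow("Int", "Int", lambda n: 2 * n)
show = Arrow("Int", "Str", str)
composite = show.compose(double)  # Int -> Str, legal: targets and sources match
```

Composing in the other order, `double.compose(show)`, raises `TypeError`, since the target `"Str"` does not match the source `"Int"`.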

Posted by: Stephen Harris on September 19, 2009 5:45 AM | Permalink | Reply to this

Re: CCAF vs ETCS

AN: Show me a paper that outlines a reasonably short way to formally define all the stuff needed to be able to formally reflect in categorial language

SH: This paper seems to fit the bill?

It fits half of the bill. The missing half is to give Coq a categorial foundation. Only then is the reflection complete.

SH: FMathL is an algorithm and this idea seems to be a key point: “This means that we can view category theory as a collection of algorithms.”

I don’t see how a theory can be an algorithm. A theory consists of concepts, their semantic interpretation, and algorithms for manipulating the concepts consistent with that interpretation. All three parts are needed.

If the world consists of algorithms only, they perform meaningless tasks.

Posted by: Arnold Neumaier on September 19, 2009 7:31 PM | Permalink | Reply to this

Re: CCAF vs ETCS

SH: FMathL is an algorithm and this idea seems to be a key point:
“This means that we can view category theory as a collection of algorithms.”
———————————-
AN replied: “I don’t see how a theory can be an algorithm. A theory consists of concepts, their semantic interpretation, and algorithms for manipulating the concepts consistent with that interpretation. All three parts are needed.

If the world consists of algorithms only, they perform meaningless tasks.”

SH: I provided the full context of the basis for my quote below. I think the analogy is very precise. As far as meaning goes, the difference is between human “original intentionality” and a program which is considered to have “derived intentionality”. That means that the programmer provides the meaning that the program carries for other observers or users. Likewise, the mathematician defines or constructs some category and the purpose or meaning of that category that is communicated to other mathematicians. It flows from the mind of the mathematician into an abstract symbolic language which means something to the reader. This is pretty much the same definition of natural language which operates by a shared, agreed upon meaning.
“An alternative approach, that of Lawvere (1963, 1966), begins by characterizing the category of categories, and then stipulates that a category is an object of that universe.”
“The point is that the category of categories is not just a category, but what is known as a 2-category; that is, its arrows are functors, but two functors between the same two categories in turn form a category, the arrows being natural transformations of functors. Thus there are 1-arrows (functors) between objects (categories), but there are also 2-arrows (natural transformations) between 1-arrows.”

Encyclopedia of Computer Science and Technology By Allen Kent, James G. Williams
Computational Aspects
“In this section we indicate something of the computational nature of category theory that has attracted the interest of computer scientists and led to applications which we describe in the Applications section.

Observation one is that category theory, like logic, operates on the same level of generality as computer programming. It is *not tied to specific structures like sets or numbers* but as in programming, where we may define types to represent a wide range of structures, so in category theory, objects may range over many kinds of structures. This generality is exploited in describing the semantics of computation and can also be used to write highly generic codes.

Being based upon arrows and their compositions, category theory is an abstract theory of typed functions, with objects corresponding to types and arrows to functions. Notice that composition rather than application is the primitive operation. Through this identification, features of functional programming find categorical analogues. Type constructors
correspond to maps between categories (called functors), higher order functions are described in categories with additional structure (cartesian closed categories) and polymorphic types similarly. This correspondence between programming constructs and category theory formalizes structural properties of programs in an elementary equational language. Models of programming languages are then categories with suitable internal structure.

There is something more going on here. This correspondence translates functional language (like typed lambda-calculus) into an arrow-theoretic language: that is, translates a language with variables, where we can substitute values for names, into a variable-free combinator language.
Languages with variables seem more appropriate for programming and other
descriptive activities, whereas combinator languages are more suited to
algebraic manipulation and possibly more efficient evaluation. These ideas have been used to build abstract machines for implementing functional languages based upon categorical combinators.

Somewhat belying the abstraction of category theory is the fact that it is largely constructive: theorems asserting existence are proven by explicit construction. These constructions provide algorithms which can be coded as computer programs, programs with an unusual degree of generality. They are abstracted over categories and so apply to a range of different data types, the same program performing analogous operations on types such as sets, graphs, and automata. In a sense, the core of category theory is just a *collection of algorithms*.”

Posted by: Stephen Harris on September 21, 2009 5:40 AM | Permalink | Reply to this

Re: CCAF vs ETCS

I haven’t read the following paper so I’m not certain what it includes about axioms.
Lawvere, F. William (1966), The category of categories as a foundation for mathematics, in S.Eilenberg et al., eds, ‘Proceedings of the Conference on Categorical Algebra, La Jolla, 1965’, Springer-Verlag, pp. 1–21.
—————————-

SH: I don’t think the formal axioms were presented in Lawvere’s 1963 thesis. I think they can be found in this 1964 paper
www.pubmedcentral.nih.gov/articlerender.fcgi?artid=300477

“We adjoin eight first-order axioms to the usual first-order theory of an abstract Eilenberg-Mac Lane category to obtain an elementary theory with the following properties:
(a) There is essentially only one category which satisfies these eight axioms together with the additional (non-elementary) axiom of completeness, namely, the category of sets and mappings. Thus our theory distinguishes 𝒮 structurally from other complete categories, such as those of topological spaces, groups, rings, partially ordered sets, etc.”

This paper is also online with extended commentaries (long version) added later.
http://138.73.27.39/tac/reprints/articles/11/tr11abs.html

Posted by: Stephen Harris on September 18, 2009 2:37 AM | Permalink | Reply to this

Re: CCAF vs ETCS

It seems Mike is right about ETCS and CCAF being quite distinct. However, Lawvere doesn’t appear to see them as being quite so disconnected as Mike apparently does.
———————-

Colin McLarty wrote:
“Yet the second categorical foundation ever worked out, and the first in print,
was a set theory —- Lawvere’s axioms for the category of sets, called ETCS,
(Lawvere 1964).”

Lawvere, ETCS paper, page 34, wrote:
“However, it is the author’s feeling that when one wishes to go substantially
beyond what can be done in the theory [ETCS] presented here, a much more
satisfactory foundation for practical purposes will be provided by a theory
of the category of categories.”

“Part of the summer of 1963 was devoted to designing a course based on the
axiomatics of Zermelo-Fraenkel set theory (even though I had already before
concluded that the category of categories is the best setting for “advanced”
mathematics).”

“This elementary theory of the category of sets arose from a purely practical
educational need. …
But I soon realized that even an entire semester would not be adequate for
explaining all the (for a beginner bizarre) membership-theoretic definitions
and results, then translating them into operations usable in algebra and
analysis, then using that framework to construct a basis for the material I
planned to present in the second semester on metric spaces.
However I found a way out of the ZF impasse and the able Reed students could
indeed be led to take advantage of the second semester that I had planned.
The way was to present in a couple of months an explicit axiomatic theory of
the mathematical operations and concepts (composition, functionals, etc.) as
actually needed in the development of the mathematics.”

Posted by: Stephen Harris on September 18, 2009 7:14 AM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

When we write maths on a blackboard what do we actually do? This has to be analysed satisfactorily if we are ever to have decent graphical mathematical editors - a necessity if we want the communication of maths by computer to approximate the ease of a blackboard and coloured chalk.

What we do has two components, one visible and the other invisible. The visible component is a tree-structured graphic. The nodes of the tree represent the meaningful subexpressions. The editor must be able to ‘explode’ this tree, for example by a perspective representation, or by using a separate window to show different levels, so that the user can select subtrees for editing, dragging and dropping into other windows, etc.
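Such a tree of meaningful subexpressions is easy to model; here is a minimal sketch (the class and names are invented purely for illustration):

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """A node of the expression tree; every node is a meaningful subexpression."""
    label: str
    children: list = field(default_factory=list)

    def subexpressions(self):
        """Enumerate every subtree, i.e. every selectable subexpression,
        in the order an editor might 'explode' them."""
        yield self
        for child in self.children:
            yield from child.subexpressions()

# The expression a*(b+c) as a tree: the editor can select any subtree.
tree = Node("*", [Node("a"), Node("+", [Node("b"), Node("c")])])
labels = [node.label for node in tree.subexpressions()]
```

Selecting a subtree for dragging and dropping then amounts to picking one element of `subexpressions()`.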

The invisible component is the semantic content of the tree, rather than its visual display. This is the formalization that is in the back of the user’s mind. It may not be a complete formalization, so this is the tricky part. What formalization is appropriate for the computer? I suspect that there may be many answers to this.

The visible component needs software that may possibly already exist, but I do not know of it. The !Draw application for RISC OS is a structured vector-graphics editor, that has been around for a quarter of a century, and it has some of the features of what is required.

Posted by: Gavin Wraith on September 12, 2009 7:56 AM | Permalink | Reply to this

Ideocosm puzzles; Re: Towards a Computer-Aided System for Real Mathematics

“The invisible component is the semantic content of the tree, rather than its visual display. This is the formalization that is in the back of the user’s mind.”

Yes, but that’s where it gets infinitely tricky.

As I’ve said before, on other threads, we don’t know much at all about the Topology of the Ideocosm – the space of all possible ideas.

Here we admit that we don’t know how to formalize, illustrate, or automate the subspace of all possible “Mathematical ideas.”

I’m not sure we can even define it, given that Mathematics is at any time partly instantiated by what is in the heads of all Mathematicians, a society that changes over time both locally and globally.

I am not clear on how we might even define a hyperplane that separates the “Mathematical” ideas from the “nonmathematical” ideas within the Ideocosm.

Nor, for that matter, how we might even define a hyperplane that separates the “Physical” ideas from the “Nonphysical” ideas within the subspace of the Ideocosm of Theories of Mathematical Physics.

Posted by: Jonathan Vos Post on September 12, 2009 4:23 PM | Permalink | Reply to this

Re: Ideocosm puzzles

GW: The invisible component is the semantic content of the tree, rather than its visual display. This is the formalization that is in the back of the user’s mind.

JVP: Yes, but that’s where it gets infinitely tricky. As I’ve said before, on other threads, we don’t know much at all about the Topology of the Ideocosm - the space of all possible ideas.

It is just what can be represented on the semantic web. What is not known is only the part of it that is potentially useful. But the subspace of well-defined mathematical statements can be delineated up to semantic equivalence. It will just be what can be processed by FMathL, since the latter is designed to be able to process all mathematics. (Already Coq and Isabelle/Isar can do that, though not really conveniently.)

Posted by: Arnold Neumaier on September 14, 2009 4:10 PM | Permalink | Reply to this

Godel-numbering to “game” the metasystem; Re: Ideocosm puzzles

“the subspace of well-defined mathematical statements can be delineated up to semantic equivalence.”

I understand. But in a dynamic metasystem, where “new” mathematical ideas can be introduced coherently, as Category Theory historically came after Set Theory, it is not obvious to me that the semantics and pragmatics (“potentially useful”) are guaranteed always to be well-defined, once gadgets such as Godel-numbering are used to “game” the metasystem. But I’m eager to know more.

Posted by: Jonathan Vos Post on September 14, 2009 8:09 PM | Permalink | Reply to this

Re: Godel-numbering to “game” the metasystem; Re: Ideocosm puzzles

JVP: in a dynamic metasystem, where “new” mathematical ideas can be introduced coherently, as Category Theory historically came after Set Theory, it is not obvious to me that the semantics and pragmatics (“potentially useful”) are guaranteed always to be well-defined

Web ontology languages like RDF/OWL have a very general representation for semantics that can handle the way arbitrary concepts were looked at from antiquity till today, and hence probably far into the future. Semantical content is represented as a collection of triples of names.

For FMathL, it turned out to be more convenient to consider only collections of triples where the first two entries determine the third, leading to a partial binary dot operation that associates to certain pairs of objects a third one:

f . is_continuous = true

customer_1147 . first_name = Otto

etc. As is easy to see, this still can hold arbitrary semantical relations between objects. The operation table of the dot operation is an infinite matrix that we call a semantic matrix.

The collective knowledge about mathematics can be considered as a huge and growing semantic matrix, of which the FMathL system is to capture the most important part.
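A toy sketch of such a semantic matrix, storing the partial dot operation as a sparse mapping from (object, attribute) pairs to values (the Python encoding is my assumption for illustration, not the FMathL design):

```python
# Sparse semantic matrix: the triple (x, a, v) with "x . a = v" is stored
# under the key (x, a); the first two entries determine the third.
semantic_matrix = {}

def assign(obj, attr, value):
    """Record obj . attr = value; reassignment overwrites, since the
    pair (obj, attr) determines the value uniquely."""
    semantic_matrix[(obj, attr)] = value

def dot(obj, attr):
    """The partial binary dot operation; None where it is undefined."""
    return semantic_matrix.get((obj, attr))

# The two examples above:
assign("f", "is_continuous", True)
assign("customer_1147", "first_name", "Otto")
```

The "infinite matrix" is then just the (mostly undefined) table of `dot` over all pairs of objects.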

JVP: … well-defined, once gadgets such as Godel-numbering are used to “game” the metasystem.

Consistency depends on the context, and is maintained in the usual way. Goedel’s results put limits on what is achievable constructively, but do not threaten well-definedness.

Posted by: Arnold Neumaier on September 15, 2009 12:23 PM | Permalink | Reply to this

What mathematicians carry around in their heads; Re: Godel-numbering to “game” the metasystem; Re: Ideocosm puzzles

Excellent answer. This suggests to me how carefully you’ve thought through your system, metasystem, metametasystem…

“The collective knowledge about mathematics can be considered as a huge and growing semantic matrix…”

Cf.:

Programming Language and Logic Links.

“Most mathematicians can go through their entire careers without learning anything about proof theory and intuitionistic logic, and I think the reason is that both undermine the naive model of mathematical foundations that most mathematicians carry around in their heads. Mathematicians hate thinking about foundations.”

Posted by: Jonathan Vos Post on September 16, 2009 6:25 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

GW: What we do has two components, one visible and the other invisible.

This not only holds for the blackboard case (which I leave to the OCR people, who don’t do too badly in turning formulas into LaTeX) but also for the typed case.

Thus the visible component gives the expression tree (syntax, e.g., Presentation MathML), while the invisible component gives the intentions (e.g., Content MathML). The latter is the more difficult part, and the one poorly understood at the formal level.

Of course, when we write math on the blackboard, there are also our voice and gestures that convey helpful information for the semantical interpretation, e.g., a broad smile while saying “Let ε < 0”.

On the other hand, for the syntactical side - do you really think that a visual view of the syntax tree of a formula such as a(b+c) = ab+ac is more helpful than the equation itself? The longer the formula the less intelligible is the tree. (Consider a sequence of equalities A = B = C = D + E = F with long expressions A, …, F that are the same apart from successive substitutions.)

Posted by: Arnold Neumaier on September 20, 2009 12:52 PM | Permalink | Reply to this

Herding cats

I’m someone who spends a fair amount of time trying to tease as much semantics as possible out of the LaTeX that people type.

“Presentational” MathML contains far more semantics than one typically finds in the LaTeX people actually generate, “in the wild.”

As a consequence, the output of itex2MML generally falls short of what would be possible in hand-crafted Presentational MathML (let alone Content MathML or the fancy-schmancy system discussed above).

Take as an example, the expression

f(a+b)^2

Of course, the exponent is not applied to the right parenthesis – even though that is literally how the expression is written.

Rather, what the author probably means is either

{ f(a+b) }^2

or, perhaps,

f { (a+b) }^2

Which it is depends on what the expression

f(a+b)

is supposed to mean. In MathML, there are entities, &InvisibleTimes; and &ApplyFunction;, which would normally be placed between the <mi>f</mi> and the <mo>(</mo> to indicate whether we mean “the variable f times (a+b)” or “the function f applied to a+b.”

But, of course, there’s nothing like that in LaTeX.

What an ordinary LaTeX user can do is use brace brackets, as above, to — at least — indicate the desired grouping.

When I use itex (in the comments here on golem, or in Instiki), I try to always use braces to indicate grouping. That gets translated into the placement of <mrow> elements, which produces the right semantics (and not merely the right visual appearance) in the resulting MathML.

As far as I can tell, I am the only person who does that.

The problem of getting people to insert semantic information, into the stuff they write, was eloquently shown to be hopeless, long ago, in a famous essay by Cory Doctorow.

Posted by: Jacques Distler on September 14, 2009 4:25 AM | Permalink | PGP Sig | Reply to this

Re: Herding cats

Just a short technical followup, for those unfamiliar with LaTeX and/or MathML.

In LaTeX, grouping is indicated by brace brackets and the matching of left- and right-braces is strictly enforced. Parentheses do not indicate grouping (and the matching of left- and right-parentheses is not enforced).

In Presentational MathML, grouping is indicated by the <mrow> element. itex2MML translates matched pairs of braces into <mrow>s.
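The brace-to-<mrow> translation can be sketched as a tiny recursive pass (an illustration of the idea only, not the actual itex2MML implementation):

```python
def braces_to_mrow(s, i=0, top=True):
    """Translate each matched {...} group in a LaTeX-ish string into an
    <mrow> element; brace matching is strictly enforced, and everything
    else passes through unchanged. Returns (mathml_fragment, index_after)."""
    out = []
    while i < len(s):
        ch = s[i]
        if ch == "{":
            # Recurse into the group; the close brace ends the recursion.
            inner, i = braces_to_mrow(s, i + 1, top=False)
            out.append("<mrow>" + inner + "</mrow>")
        elif ch == "}":
            if top:
                raise ValueError("unmatched }")
            return "".join(out), i + 1
        else:
            out.append(ch)
            i += 1
    if not top:
        raise ValueError("unmatched {")
    return "".join(out), i

fragment, _ = braces_to_mrow("{f(a+b)}^2")  # "<mrow>f(a+b)</mrow>^2"
```

Nested groups nest the `<mrow>`s accordingly, and unbalanced braces are rejected, mirroring the strict enforcement described above.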

Posted by: Jacques Distler on September 14, 2009 5:27 AM | Permalink | PGP Sig | Reply to this

Re: Herding cats

Here you hit a problem in “worldwide inference”: one feature of brace pairs is that they turn TeX mathops into normal characters (without the wider mathop spacing), and years of having to deal with two-column proceedings styles have trained me to stick braces around any plus signs, etc., in all displayed equations to increase the likelihood of an equation fitting on one line (due to tight page limits). (Stylists will say I shouldn’t do this, but I’ve never received a referee report indicating that they’ve noticed, let alone objected.) I don’t know what the itex parser would infer about the equations from this habit :-)

I’ve read a lot of the papers that Knuth wrote about the design and implementation of TeX and, whilst what he writes makes it clear he cares deeply about “exact” reproducibility of typesetting both in different installations and years later, I can’t recall any statements about the direct electronic use of TeX documents, particularly by algorithms. So it’s a reasonable hypothesis that he just didn’t think this would be relevant to documents produced in the lifetime of the system. But the TeX family has clearly lived far longer than expected and issues in it’s design are starting to affect new uses for the documents.

Posted by: bane on September 14, 2009 9:56 AM | Permalink | Reply to this

Re: Use of braces

Wouldn’t that effect be better achieved with

\everydisplay={\mathcode`\+="002B}

(forcing + to be a mathord instead of a mathbin in displayed equations)?

Posted by: Mike Shulman on September 14, 2009 8:04 PM | Permalink | Reply to this

Re: Use of braces

That’s probably a higher-level way to do it, although it obviously needs extending to all the other mathematical operations and relations I tend to use in displayed equations. (In case anyone’s wondering, there’s a greater tendency to have “word” variable names and subscripts in CS, which, combined with two-column layout, means most displayed equations take just over a line in the “natural” spacing.)

My reason for doing it this way is just that my knowledge came from the TeXbook, with a bit of LaTeX knowledge bolted on.

The bigger point was that curly braces sometimes do affect appearance, and hence using them to denote structure will run into some corner cases.

Posted by: bane on September 15, 2009 10:09 AM | Permalink | Reply to this

Re: Herding cats

itex2MML translates matched pairs of braces into <mrow>s.

I didn’t know that; where is it documented? Now that I know it, maybe I’ll make an effort.

However, the way I read the proposal, the idea was for a system that would be able to infer this sort of missing information from the context. It seems to me that in many cases, such as your example, this is just a matter of type-checking. If a, b, and f are all variables representing numbers, then f(a+b) can only mean f times (a+b), whereas if a and b are numbers and f is a function, then f(a+b) probably means f applied to (a+b) (unless you’re multiplying functions by numbers pointwise, but that’s often written only with the number on the left). In cases where more than one interpretation type-checks, it seems plausible to me that a computer could still sometimes infer the probable intent from the context, just as a human does. For example, if later on one encounters the statement g(f(a+b)) = (g∘f)(a+b), it is a good bet that f(a+b) meant function application and not pointwise multiplication.
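This kind of type-checking disambiguation can be sketched with a symbol table (a toy illustration; the type names and rules are my assumptions, not a proposed design):

```python
def interpret(head, arg, types):
    """Disambiguate the notation head(arg) using declared types:
    a function head reads as application, a numeric head as
    multiplication; anything else stays ambiguous."""
    t = types.get(head)
    if t == "function":
        return ("apply", head, arg)
    if t == "number":
        return ("times", head, arg)
    return ("ambiguous", head, arg)

# If f is declared a number, f(a+b) can only be multiplication...
types = {"a": "number", "b": "number", "f": "number"}
reading1 = interpret("f", "a+b", types)

# ...but if f is a function, application is the probable reading.
types["f"] = "function"
reading2 = interpret("f", "a+b", types)
```

When more than one interpretation type-checks, later context (such as the g∘f example above) could be used to rank the ambiguous readings.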

Posted by: Mike Shulman on September 14, 2009 6:39 AM | Permalink | Reply to this

Re: Herding cats

itex2MML translates matched pairs of braces into <mrow>s. I didn’t know that; where is it documented?

Alas, there isn’t any technical documentation on how itex2MML is implemented.

You might guess that this is how it works, based on the description of how the \color command works. But, really, that would only occur to you if you knew what an <mrow> element was in the first place. And that would put you in a very small minority indeed.

Now that I know it, maybe I’ll make an effort.

Great! Welcome to a very elite club.

Next thing you know, you’ll be using the \tensor{}{} and \multiscripts{}{}{} commands.

(Jason Blevins and I worked quite hard to write the LaTeX macros to implement those commands. You can see them in the TeX export in Instiki.)

However, the way I read the proposal, the idea was for a system that would be able to infer this sort of missing information from the context.

If you have sufficient context, you may be able to guess (humans, after all, manage to). Otherwise, you have to rely on people entering (correct!) metadata about what all the symbols mean (e.g., whether ff is a function or a variable).

That’s when you run into (some of) the problems mentioned in Cory Doctorow’s article.

Posted by: Jacques Distler on September 14, 2009 7:23 AM | Permalink | PGP Sig | Reply to this

Re: Herding cats

MS: However, the way I read the proposal, the idea was for a system that would be able to infer this sort of missing information from the context.

JD: If you have sufficient context, you may be able to guess (humans, after all, manage to). Otherwise, you have to rely on people entering (correct!) metadata about what all the symbols mean (e.g., whether f is a function or a variable).

A typical mathematical document together with the background of a trained reader contains everything needed to understand the paper. FMathL is therefore supposed to guess the interpretation from the context and from past experience, just as any mathematician does.

If this is not enough, a mathematician decides that the formula (or sentence, or article, or book) is too poorly written to merit understanding, and skips to the next formula (or sentence, or article, or book), perhaps coming back later, when the context has become richer. FMathL will be taught to do the same.

But since you seem to know MathML well, I wonder what you say to our study Limitations in Content MathML.

Posted by: Arnold Neumaier on September 14, 2009 4:25 PM | Permalink | Reply to this

Re: Herding cats

A typical mathematical document together with the background of a trained reader contains everything needed to understand the paper. FMathL is therefore supposed to guess the interpretation from the context and from past experience, just as any mathematician does.

Sounds like you’re trying (among other things) to develop a knowledge representation for mathematics.

Good luck with that!

But since you seem to know MathML well, I wonder what you say to our study Limitations in Content MathML.

I’ll take a look. But, off the top of my head, I’d say that one’s view of the (in)adequacy of CMML, depends on what you think its purpose is.

I see the primary use of CMML as a common data-interchange format between symbolic manipulation programs. For that purpose, I think it works passably well.

If you have some fancier use-case in mind, your answer may be different …

Posted by: Jacques Distler on September 14, 2009 5:18 PM | Permalink | PGP Sig | Reply to this

Re: Herding cats

JD: But, off the top of my head, I’d say that one’s view of the (in)adequacy of Content MathML depends on what you think its purpose is.

We were looking for what we could use to support FMathL activities (in particular, the representation of common formulas in mathematics, including block matrices in linear algebra and index notation for tensors) and simply recorded our failure to find it in Content MathML.

The Content MathML specification, MathML2, of course takes a much more modest view of what it wants to achieve:

“It would be an enormous job to systematically codify most of mathematics - a task that can never be complete. Instead, MathML makes explicit a relatively small number of commonplace mathematical constructs, chosen carefully to be sufficient in a large number of applications. In addition, it provides a mechanism for associating semantics with new notational constructs. In this way, mathematical concepts that are not in the base collection of elements can still be encoded.”

Unfortunately, the mechanism provided turned out to be almost useless.

Fortunately, the outlook is not as pessimistic as this disclaimer suggests, and we are close to a good solution (but not using MathML).

JD: I see the primary use of CMML as a common data-interchange format between symbolic manipulation programs. For that purpose, I think it works passably well.

I never tried to use automatic symbolic manipulation involving the definition of a covariant derivative in index notation. But if there is a differential geometry package that can do that, it will not be able to use Content MathML.

Posted by: Arnold Neumaier on September 15, 2009 12:04 PM | Permalink | Reply to this

Re: Herding cats

{ f(a+b) }^2

f { (a+b) }^2

From a strictly presentational point of view (which is, in this case, the point of view of LaTeX), these are wrong, since they put the superscript on the wrong element (the group instead of simply the right parenthesis).

The difference in these cases is tiny, but it exists; replace a with \sum_{i=1}^n a_i in a displayed equation to see better how it works. Of course, in that case, you probably want to use larger parentheses, so go ahead and use \left and \right; the spacing works differently. (But now the effect will be tiny again and in fact too subtle for iTeX2MathML.)

The problem is that grouping has meaning for TeX that may or may not match the semantic meaning that you intend to convey to MathML. It would be better (at least theoretically) to have a grouping command that is ignored by TeX but interpreted in MathML.

{(a+b)}^2_2

(a+b)^2_2

{(\sum_{i=1}^n a_i+b)}^2_2

(\sum_{i=1}^n a_i+b)^2_2

{\left(\sum_{i=1}^n a_i+b\right)}^2_2

\left(\sum_{i=1}^n a_i+b\right)^2_2


Posted by: Toby Bartels on September 14, 2009 10:23 AM | Permalink | Reply to this

Re: Herding cats

Here’s a screenshot of the same examples in TeX.

three equations, with and without braces

To me, the first and second and the fifth and sixth examples look nearly identical.

The third and fourth, of course, look radically different. But I’d say that’s because the parentheses are too small, and neither one looks “right” to me.

In each case, at least in my browser, the MathML, generated by itex2MML, looks pretty darned close to the TeX output.

Posted by: Jacques Distler on September 14, 2009 3:26 PM | Permalink | PGP Sig | Reply to this

Re: Herding cats

To me, the first and second and the fifth and sixth examples look nearly identical.

To tell the difference between the fifth and the sixth, I have to stack two of one on top of one of the other and look carefully at the vertical positions where the subscript of one line comes near the superscript of the next line. I can tell the difference between the first and second simply by looking at the gap between the multiscripts, but I still agree that they look nearly identical.

If you buy TeX's philosophy about how mathematical typesetting should be built out of boxes, then one is technically right and the other technically wrong, despite the small size of the practical effect. But if you think that CMML or something like it is the wave of the future, then this shouldn't matter to you, and putting grouping in iTeX is a good idea, even if it produces technically incorrect TeX. Someday we should be able to print MathML just as nicely as we can now print TeX (and it's already close enough for the screen, at least when MathML supports everything), and then there will never be a need to use the TeX export.

Now here's a more practical consideration. I don't like the size of the delimiters produced by \left and \right; I've written macros that replace them with something slightly smaller. To keep things simple (even though it produces something slightly smaller yet than what I would like), let's do it with \bigg:

{\bigg(\sum_{i=1}^n a_i+b\bigg)}^2_2

\bigg(\sum_{i=1}^n a_i+b\bigg)^2_2


Actually, the difference in the MathML here is still pretty subtle, at least in my browser. I can see it better in actual TeX (with displayed equations).

If MathML is the future, then fudging the heights of the delimiters like this would be a job for a stylesheet. I have no idea how to do such a thing, however.

Posted by: Toby Bartels on September 14, 2009 9:45 PM | Permalink | Reply to this

Re: Cat On A Hot Tin Roof

Category Theory began — I’m talking about Aristotle here — with the observation that signs, symbols, syntax, and so on … are inherently equivocal, which means that we must refer them to the right categories of interpretation if we want to resolve their ambiguities.

In other words — to borrow a word that Peirce borrowed from Aristotle — there is an inescapably abductive element to the task of interpretation.

See, for example, Interpretation as Action : The Risk of Inquiry

Posted by: Jon Awbrey on September 14, 2009 6:30 PM | Permalink | Reply to this

Re: Cat On A Hot Tin Roof

Category Theory began — I’m talking about Aristotle here — […]

Although our term ‘category’ does come from Aristotle (via Kant), I would say that what Aristotle discusses is more type theory than category theory. He's still correct, however!

Posted by: Toby Bartels on September 14, 2009 9:44 PM | Permalink | Reply to this

Re: Cat On A Hot Tin Roof

Although our term ‘category’ does come from Aristotle (via Kant), I would say that what Aristotle discusses is more type theory than category theory.

It might be nice to have a paragraph with discussion of this terminology issue at category theory.

Unless we have already…

Posted by: Urs Schreiber on September 15, 2009 7:36 AM | Permalink | Reply to this

Re: Cat On A Hot Tin Roof

We’ve already aired the Kant connection.

Posted by: David Corfield on September 15, 2009 8:51 AM | Permalink | Reply to this

Philosophical Excavations

I cited my favorite locus classicus from Aristotle here:

Peirce’s first cut — it’s the deepest — is here:

Posted by: Jon Awbrey on September 15, 2009 2:12 PM | Permalink | Reply to this

Re: Cat On A Hot Tin Roof

I wrote:

It might be nice to have a paragraph with discussion of this terminology issue [the two meanings of “category”] at category theory.

David reacted:

We’ve already aired the Kant connection.

I don’t see any of this at nLab: category theory

At least the Mac Lane quote that Jon points to should be copied there.

I’ll take care of that now. But I won’t try to talk about Aristotle et al. That’s not my job.

Posted by: Urs Schreiber on September 15, 2009 5:52 PM | Permalink | Reply to this

Technical questions on the input interface

I have some technical questions on the blogging software. Maybe some things can be improved and/or explained.

1. When I use the text filter “itex to MathML with parbreaks”, how do I quote a piece of text from a previous mail? Trying to copy it with the mouse produces unintelligible output.

2. The interface provides the possibility to “Remember personal info”, but why can’t it remember the text filter used last time? I regularly forget to set it in the first preview to what I want.

3. It would be nice if the switch between “view chronologically” and “view threaded”, at present at the bottom of the whole page, would appear at the bottom of each message.

4. Why are the “Previous Comments and Trackbacks” repeated in each response window? It only makes navigating in the window more difficult (tiny motions have large consequences for long discussions like this one). I’d prefer to have a larger comment window.

5. The options are visible only before the first preview, which I found a bit of a nuisance. Also, it would be nice if they (and the name information) appeared after the comment window rather than before it, since this saves scrolling in the first round.

Posted by: Arnold Neumaier on September 16, 2009 8:34 PM | Permalink | Reply to this

Re: Technical questions on the input interface

When I use the text filter “itex to MathML with parbreaks”, how do I quote a piece of text from a previous mail? Trying to copy it with the mouse produces unintelligible output.

Copy it with the mouse and put a > character in front of it. This doesn’t work for math symbols, however. Several of us have been complaining about this for a while, but no one has fixed it yet.

The interface provides the possibility to “Remember personal info”, but why can’t it remember the text filter used last time?

I’ve complained about this a couple times also, but never got any answers.

Posted by: Mike Shulman on September 16, 2009 8:45 PM | Permalink | Reply to this

Re: Technical questions on the input interface

When I use the text filter “itex to MathML with parbreaks”, how do I quote a piece of text from a previous mail? Trying to copy it with the mouse produces unintelligible output.

Copy it with the mouse and put a > character in front of it.

No, that doesn't work with that filter; instead see the stuff about ‘blockquote’ at this FAQ.

Or try using the ‘Markdown with itex to MathML’ filter; it's a lot more powerful and includes this feature.

Posted by: Toby Bartels on September 16, 2009 9:52 PM | Permalink | Reply to this

Re: Technical questions on the input interface

There's a thread for this stuff … not that you'll find answers to your questions there.

Posted by: Toby Bartels on September 16, 2009 9:47 PM | Permalink | Reply to this

Re: Technical questions on the input interface

There's a thread for this stuff …

I've copied it there.

Posted by: Toby Bartels on September 16, 2009 10:16 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Just a note from the sidelines…

As a layman, I’m thoroughly enjoying this conversation. I’m starting to see an image of language itself being explained by category theory, which is mind boggling but makes perfect sense.

I always half-jokingly described mathematics as a “foreign language” similar to the way some organizations recognize fluency in a programming language as a “foreign language”.

Mathematics (maybe at the undergraduate level?) then seems like a perfect progression in the attempt to understand language and communication “arrow theoretically”.

I obviously don’t know what I’m talking about, but that is the hazy picture beginning to form in my mind.

By the way, since the goal seems to be to formalize the language of mathematics and eventually implement a computer system, then it also seems like it would make sense to develop the most fundamental mathematics concepts using the most fundamental computational concepts. Think about how computers encode information. Bits. Packets.

Now I’m just thinking out loud…

Posted by: Eric Forgy on September 18, 2009 3:49 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

After posting this, my mind started racing and I remembered that probably the most primordial object in category theory is the (-2)-category “True”. There are two (-1)-categories “True” and “False”.

Is it a coincidence that the most fundamental concept in a computer is the bit?

It would be fun to trace the development of information content via bits on a computer against the development of information content via category theory, beginning with the (-2)-category True.

Posted by: Eric Forgy on September 18, 2009 4:42 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Eric wrote “Is it a coincidence that the most fundamental concept in a computer is the bit?”

Just to note that modern digital computers are not the only kinds of computational devices. E.g., there were the old-time analogue computers, there are “multi-layer perceptrons”, and there are cellular automata (which, whilst having discrete states, don’t make binary states seem specially nice). So, whilst there’s a very good case to be made that binary digital circuitry is the fundamental idea of computation, it’s not completely obvious that this is the case.

(And that’s without discussing if quantum mechanics changes things.)

Posted by: bane on September 18, 2009 5:19 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Let me change my question to:

Is it a coincidence that the most fundamental concept in a digital computer is the bit?

I find it somehow compelling that the building block of the periodic table is also the building block of the digital computer.

Maybe I am giddy because it is Friday, but that somehow seems profound to me.

Posted by: Eric Forgy on September 18, 2009 5:42 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

I wasn’t trying to dismiss the concept of a connection; indeed, given that digital computers are created by human beings who are very fond of True and False, there’s almost certainly a deep connection. I was just pointing out that it’s currently unclear whether the “fundamental human approach to computation” is close to the “fundamental approach to computation”.

In terms of other connections, there are obvious relations to information theory: the most basic questions you can ask are ones with True or False as answers. But that’s another connection that is difficult to formalise.

Posted by: bane on September 18, 2009 6:21 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Eric, I think you would enjoy this paper, which packs a lot of thinking into 15 pages. I like the idea of category theory describing structures, and then structures within those structures, as an organization of mathematics rather than a foundation for it.

philosophy.ucdavis.edu/landry/2CategoryTheoryTheLanguage.pdf
CATEGORY THEORY: THE LANGUAGE OF MATHEMATICS by Elaine Landry

“Rather, ‘like’ in the sense that just as mathematics, in virtue of its ability to classify empirical and/or scientific objects according to their structure, presents us with those generalized structures which can be variously interpreted. Likewise, then, a specific category, in virtue of its ability to classify mathematical concepts and their relations according to their structure, presents us with those frameworks which can be variously interpreted.7 It is in this sense that specific categories act as “linguistic frameworks” for concepts: they allow us to organize our talk of the content of various theories in terms of structure, because “[i]n this description of a category, one can regard “object,” “morphism,” “domain,” “codomain,” and “composites” as undefined terms or predicates” (Mac Lane 1968, 287).

In like manner, a general category, in virtue of its ability to classify mathematical theories and their relations according to their shared structure, presents us with those frameworks which can be variously interpreted.8 It is in this sense that general categories act as “linguistic frameworks” for theories: they allow us to organize our talk of the common structure of various theories in terms of structure, because in this description of a category one can regard “object,” “functor,” etc., as undefined terms or predicates. That is, general categories allow us to organize our talk of the structure of various theories in the same manner in which the various theories of mathematics are used to talk about the structure of their objects, viz., as “positions in structures.”

We say that category theory is the language of mathematical concepts and relations because it allows us to talk about their specific structure in various interpretations, that is, independently of any particular interpretation. Likewise, our talk of the relationship between mathematical theories and their relations is represented by general categories. We say that category theory is the language of mathematical theories and their relations because it allows us to talk about their general structure in terms of “objects” and “functors,” wherein such terms are likewise taken as ‘syntactic assemblages waiting for an interpretation of the appropriate sort to give their formulas meaning’.

Posted by: Stephen Harris on September 22, 2009 2:26 AM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Thanks for the reference Stephen! I’ve been traveling lately, hence the slow response, but managed to read this on a plane.

You certainly don’t owe me a coffee, but I’ll take it as an invitation and if you ever find yourself near LA with some free time, let me know as well :)

The thing that I find fascinating about this whole conversation is the glimmers of “arrow theoretic” formulation of communication itself. For example, what is “w” arrow theoretically?

Then a close cousin (or ancestor) of communication is information. How does category theory encode information? Can that be quantified?

In principle, it would seem to be possible to formulate a complete “arrow theoretic” means of communication.

Posted by: Eric Forgy on October 1, 2009 7:09 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

I’m starting a new thread because this is a reply to several different comments and because the nesting level in threaded view is getting ridiculous.

I think there is one fundamental problem that is cropping up repeatedly in several guises in this discussion, which I mentioned up here. It appears to have led me astray in a few places as well (as it’s done before in the past; argh!). Namely, in unaugmented structural set theory (including ETCS and SEAR and type theory), a structured set is not a single object in the domain of discourse. For instance, in none of these cases does the formal language allow one to talk about a group. A group in SEAR (to be specific) consists of a set G and an element e ∈ G and a function m : G × G → G, such that certain axioms are satisfied. By contrast, in ZF, a group can be defined to consist of an ordered triple (G, e, m) such that e ∈ G, m is a function G × G → G, and certain other axioms are satisfied.

This is what I meant when I said here that the relation ∈ between rationals and (Dedekind) reals is extra structure on the same set of reals. The ordered field of reals consists of the set ℝ, some elements 0, 1 ∈ ℝ, some functions +, · : ℝ × ℝ → ℝ, a relation (≤) : ℝ ↬ ℝ, etc. The relation (∈) : ℚ ↬ ℝ is one more piece of structure on the same set ℝ, which you can use or not use as you please.

Likewise, this is what I think is going on with opposites. If I’m working with a group as above, then I can construct its opposite group to consist of the same set G, the same element e, and a reversed function m. Now of course it makes sense to ask whether an element of G and an element of G^op are equal, since they are elements of the same set. On the other hand, if I am just given two groups G and H, it makes no sense to ask whether an element of G is equal to an element of H. So what I said here is not quite right, and I apologize: what I should have said is that it would be a type error to compare elements of (sets underlying) two different structures unless those structures are built on the same underlying set(s).

(If you’re going to object to the notion of sets being “the same,” I think the answer was provided by Toby: we mean the external judgment that two terms are syntactically equal, rather than a (disallowed) internal proposition that two terms refer to the same object.)

So I think it is misleading to speak about two different groups “having elements in common”—either we are talking about two different group structures on the same set, in which case the two have exactly the same elements by definition, or we are talking about group structures on different sets, in which case asking whether an element of one is equal to an element of the other is a type error. Therefore, I was also not quite right when I said that a structural system would not be able to construct categories having common objects: it can construct pairs of categories (such as a category and its opposite) that have the same collection of objects, but no more.
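This bundled-versus-unbundled distinction, and the opposite-group example, can be made concrete in a dependently typed language. The following is a minimal Lean 4 sketch of my own (the thread fixes no particular system, and the name `MyGroup` is invented):

```lean
-- Lean 4 sketch (my own illustration, not from the thread).
-- A bundled group: carrier, unit, and multiplication in one single object.
structure MyGroup where
  carrier : Type
  unit : carrier
  mul : carrier → carrier → carrier
  -- group axioms omitted for brevity

-- The opposite group reuses the same carrier, with multiplication reversed:
def MyGroup.op (G : MyGroup) : MyGroup :=
  { carrier := G.carrier, unit := G.unit, mul := fun x y => G.mul y x }

-- Elements of G and G.op inhabit the same type, so equality is well-formed:
example (G : MyGroup) (x : G.carrier) (y : G.op.carrier) : Prop := x = y

-- For two arbitrary groups, the same comparison is rejected as a type error:
-- example (G H : MyGroup) (x : G.carrier) (y : H.carrier) : Prop := x = y
```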

Now going back to the original subject of intersections, in a structural theory the operation of “intersection” does not apply to arbitrary pairs of sets, but rather to pairs of subsets of the same fixed ambient set. (One might argue that the intersection of a set with itself should be defined (and equal to itself), but I think this probably derives from a misconception that distinct sets in structural set theory are “disjoint,” rather than it just not being meaningful to ask whether their elements are equal.) In particular, it is not meaningful to speak of the intersection of the sets of objects of two categories.
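The typed view of intersection can be sketched the same way. In this hypothetical Lean 4 fragment (my own illustration), subsets of an ambient set X are modelled as predicates on X, so intersection can only be stated relative to a fixed ambient X:

```lean
-- Lean 4 sketch (my own illustration): intersection is an operation on
-- subsets of one fixed ambient set X, here modelled as predicates on X.
def inter {X : Type} (A B : X → Prop) : X → Prop :=
  fun x => A x ∧ B x

-- There is no way to even write `inter A B` for A : X → Prop and
-- B : Y → Prop with distinct X and Y: the ambient set is part of the type.
example {X : Type} (A B : X → Prop) (x : X) (h : A x) (h' : B x) :
    inter A B x := ⟨h, h'⟩
```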


It never really occurred to me before to consider this peculiarity (that structured sets are not single things) as a problem, since “for any group, …” can always be interpreted as shorthand for “for any set G, element e ∈ G, and function m : G × G → G such that <blah>, …”, and similar sorts of interpretations happen in other foundations like ZF. But I guess that this sort of implicit interpretation is part of the “compilation from high-level language to assembly language,” and what you want is a formalization of the higher-level language that (among other things) includes “a group” as a fundamental object of study.

I admit that this is definitely something I have found frustrating about existing proof assistants: they do not seem to really understand that “a group” is a thing. But somehow I never really isolated the source of my frustration before.

One way to deal with this (which I believe is adopted by many type theorists?) is to introduce one set called a “universe” U whose elements are (interpreted as) sets in some way. Then a triple of sets can be modeled by an element of U × U × U, and so on. But a really structural approach would insist that sets are not the elements of any set, but rather the objects of a category—so what we really need is structural category theory, which doesn’t quite exist yet. Possibly this is what Lawvere was thinking of when he said that “when one wishes to go substantially beyond [ETCS], a much more satisfactory foundation… will be provided by a theory of the category of categories”—although I’ve always felt that one should really be thinking about the 2-category of categories.

Regardless, it does seem that existing structural frameworks fall short here. Now I feel all fired up to improve them! But that’s a whole nother kettle of worms.

I’m also feeling bad about hijacking this discussion with a long branching argument about the merits of structural/categorial set theory. If people have the energy, let’s also go back to FMathL and your larger proposal and see what other constructive things we can discuss about it. As I said way back at the beginning, I am really excited about the overall idea—which is perhaps what is driving me to be overly critical of FMathL, since I would like such a system to “get it right” (and for better or for worse, I usually seem to think I know what’s “right”). But as I’ve said, I feel like I understand a little better now where you are coming from and what problems you are trying to solve.

Posted by: Mike Shulman on September 21, 2009 7:26 AM | Permalink | PGP Sig | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Namely, in unaugmented structural set theory (including ETCS and SEAR and type theory), a structured set is not a single object in the domain of discourse. For instance, in none of these cases does the formal language allow one to talk about a group.

One reason why material (ZFC-like) foundations want to package things up into a tuple is that one might want to make these tuples elements of some other set. In structural foundations, you couldn't do that anyway; if you want a family of groups, then you need that to be parametrised by some index set I, and you have a group G_k for every element k of I. You could do that in material foundations too, of course, but if instead you want to allow a family to ‘parametrise itself’, then you need each of the objects in the family (groups, in this case) to formally be single objects that can be elements of a set.
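A minimal Lean 4 sketch of this point (my own illustration, with an invented `MyGroup` type): once groups are bundled as single objects, a family of groups over an index set I is just a function out of I, with no need to collect groups as elements of a set:

```lean
-- Lean 4 sketch (my own illustration, not from the thread).
structure MyGroup where
  carrier : Type
  unit : carrier
  mul : carrier → carrier → carrier

-- A family of groups indexed by a set I is a function from I:
def GroupFamily (I : Type) := I → MyGroup

-- The group G_k for an element k of I is simply application:
def component {I : Type} (F : GroupFamily I) (k : I) : MyGroup := F k
```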

I admit that this is definitely something I have found frustrating about existing proof assistants: they do not seem to really understand that “a group” is a thing.

In Coq, you can do this with Records. Formally, this is based on having a Type of Sets and all that (Records are just user-friendly sugar), but you never need to use anything about that type (such as the equality predicate on its elements, which would be evil).

But a really structural approach would insist that sets are not the elements of any set, but rather the objects of a category—so what we really need is structural category theory, which doesn’t quite exist yet.

I'd like to see structural ∞-groupoid theory; I think that I could do a lot with that, possibly everything that I want. Of course, there are already ∞-groupoids hidden in ordinary intensional type theory, but it's not clear that there are enough; I would really want types that are explicit ∞-groupoids, and nothing more.

Posted by: Toby Bartels on September 21, 2009 7:59 AM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

MS: it would be a type error to compare elements of (sets underlying) two different structures unless those structures are built on the same underlying set(s).

You need to make more exceptions to account for subcategories. There may be a category with object set C and two subsets A and B such that there are subcategories with object sets A and B. In this case, one must be able to compare their elements, too, because of the definition of a subcategory. With ZF or NBG as metalanguage (as usual), this already implies that one can compare objects from any two categories.

With SEAR as metatheory it might be different, since I do not yet have a good intuition about SEAR. (But I added a number of comments on the SEAR page that make me doubt that it is a mature enough theory.)

MS: let’s also go back to FMathL and your larger proposal and see what other constructive things we can discuss about it. As I said way back at the beginning, I am really excited about the overall idea

Yes, I’d appreciate that.

Posted by: Arnold Neumaier on September 21, 2009 3:25 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

You need to make more exceptions to account for subcategories. There may be a category with object set C and two subsets A and B such that there are subcategories with object sets A and B. In this case, one must be able to compare their elements, too, because of the definition of a subcategory.

Structurally, a “subcategory” of C means a category equipped with an injective functor to C, just like a “subset” means a set equipped with an injective function (except in SEAR, where subsets are technically distinguished from their tabulations—but even there, it is only the tabulation which is itself a “set” and therefore can be the set of objects of a category). Therefore, if A is a subcategory of C, then objects of A cannot be compared directly to objects of C, but only after applying the inclusion functor.
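This structural notion of subobject can be sketched in Lean 4, in the simplest case of sets rather than categories (my own illustration; the names are invented):

```lean
-- Lean 4 sketch (my own illustration): a "subset" of X in the structural
-- sense is a set S together with an injective map into X.
structure SubsetOf (X : Type) where
  S : Type
  incl : S → X
  inj : ∀ a b : S, incl a = incl b → a = b

-- Elements of the subset are compared with elements of X only after
-- applying `incl`; membership of x : X means factoring through `incl`:
def memVia {X : Type} (A : SubsetOf X) (x : X) : Prop :=
  ∃ a : A.S, A.incl a = x
```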

Posted by: Mike Shulman on September 21, 2009 5:56 PM | Permalink | PGP Sig | Reply to this

still on objects common to different categories

MS: if A is a subcategory of C, then objects of A cannot be compared directly to objects of C, but only after applying the inclusion functor.

Then please tell me why, formally, the following reasoning is faulty.

I am using Definitions 1.1.1 (category) and 1.1.3 (subcategory) taken from Asperti and Longo, modified to take account of your statement above. If you think these are faulty, please give me a reliable replacement that I may take as authoritative.

But I do not accept any moral injunction unless it is presented as a formal restriction to what a theorem prover would be allowed to do.

Let C_{abcd} be the category whose objects are the symbols a, b, c, d, with exactly one morphism between any two objects, composing in the only consistent way. Let the categories C_{abc} and C_{abd} be defined similarly. Clearly, these are both subcategories of C_{abcd}, with the identity as the inclusion functor. But I can compare their objects for equality.

[related snippets of other mails]

MS: I would consider the class of objects of each category to be a separate type,

Would this work consistently in the above example?

AN: I first need to understand what “should” be understood after having read Definition 1.1.1 and what after Definition 1.3.1.

MS: I think that what “should” be understood at this point is that the authors made a mistake in stating the exercise.

This requires having also read the exercise. But my example above seems to indicate that something nontrivial and unstated should already be understood at the point these two definitions have been read.

What we do in this whole discussion is in fact the typical process by which mathematicians who grew up in different traditions learn to align their language so that they may speak with each other without generating (permanent) misunderstandings. As a result, the language and awareness of all participating parties are sharpened, and become applicable to a wider range of mathematical documents.

Posted by: Arnold Neumaier on September 22, 2009 2:39 PM | Permalink | Reply to this

Re: still on objects common to different categories

if A is a subcategory of C, then objects of A cannot be compared directly to objects of C, but only after applying the inclusion functor.

Then please tell me why

I am not a mathematician and I think this has nothing to do with category theory and even with mathematics:

You assume that you can talk about objects (in the lay meaning of the word, not the categorical one) from A and C without taking care to define the proper “universe of discourse” in which the denotations for objects in A and objects in C can validly appear in the same sentence.
This is a platonist stance: the objects “exist” independently of the discourse about them.
I would say, not so; Platonism has been poisoning mathematics and philosophy for more than two millennia.

Posted by: J-L Delatre on September 22, 2009 8:49 PM | Permalink | Reply to this

Re: still on objects common to different categories

Let C_{abcd} be the category whose objects are the symbols a, b, c, d, with exactly one morphism between any two objects, composing in the only consistent way. Let the categories C_{abc} and C_{abd} be defined similarly. Clearly, these are both subcategories of C_{abcd}, with the identity as the inclusion functor. But I can compare their objects for equality.

Yes, this is like the example of opposite categories. You started with the four objects a, b, c, d and constructed things out of them. It should be possible to force you to use only copies of these four objects when you construct new things out of them, so that equality between them would not make literal sense, but I don't see the point in doing so. The constructions will still come equipped with operations to the type {a, b, c, d} from the types of objects of the various categories, and we could compare them for equality along those operations. So why not formalise those operations as identity? That's how I would do it.

I don't know if that's how Mike would do it. One could not do that in SEAR (although one could for the example of opposite categories).

I think that it was John Armstrong who first wrote

But it really doesn’t matter, since equality of components of two distinct categories is not part of the structure.

I would not put it quite that way. I would say that equality of objects (or morphisms, etc.) of two arbitrary categories is not meaningful; that is, in a context where all that is said about two categories C and D is that they are categories, equality of their objects is not meaningful. However, if C and D are given to us in some more complicated way (such as D := C^op, or even D := C for that matter), then that might give some meaning to equality of their objects.

There is nothing particularly special about categories in this respect; the same goes for elements of groups, for example.

Posted by: Toby Bartels on September 22, 2009 8:59 PM | Permalink | Reply to this

Re: still on objects common to different categories

TB: I would say that equality of objects (or morphisms, etc) of two arbitrary categories is not meaningful; that is, in a context where all that is said about two categories C and D is that they are categories, then equality of their objects is not meaningful. However, if C and D are given to us in some more complicated way, then that might give some meaning to equality of their objects.

In the standard interpretation of mathematical language that everyone learns as an undergraduate, the standard definition of a category implies the following:

Equality between two objects of two arbitrary categories is undecidable, while that of two categories given by some explicit construction may be decidable. (Something analogous holds for elements of groups, etc. in place of objects of categories.)

With this modification, I agree with you. Indeed, this is what you have in FMathL. But undecidable and meaningless are different notions - the first says that you cannot assign any definite truth value to it, the second says that it is not well-formed. With your formulation (using meaningless), one does not get something consistent.

Posted by: Arnold Neumaier on September 23, 2009 10:20 AM | Permalink | Reply to this

Re: still on objects common to different categories

In the standard interpretation of mathematical language that everyone learns as an undergraduate,

And which I unlearnt as a graduate (^_^)

But undecidable and meaningless are different notions - the first says that you cannot assign any definite truth value to it, the second says that it is not well-formed.

Right.

With your formulation (using meaningless), one does not get something consistent.

I can't imagine what you mean by this. What is inconsistent?

Posted by: Toby Bartels on September 25, 2009 6:50 PM | Permalink | Reply to this

Re: still on objects common to different categories

From another point of view, the construction {a, b, c, …} ↦ C_{a,b,c,…} is a functor from the Category of Sets to the Category of Categories. The Category of Categories doesn’t itself have a privileged functor from C_{a,b,c} to C_{a,b,c,d}. The one that looks natural comes from the natural-looking map from {a, b, c} to {a, b, c, d} — which is precisely the absent thing in SEAR or (i.i.r.c.) ETCS.

But it gets worse (or better!): we often want to consider the 2-category of categories, functors, and natural transformations (or of groupoids, functors, and natural (automatically) isomorphisms). And in this setting, there isn’t much reason to privilege the natural-looking functor from C_{a,b,c} to C_{a,b,c,d} over the functor that sends each object in {a, b, c} to d! It’s a naturally isomorphic functor, and in this case all of them are equivalences, anyways.

But you asked about what’s wrong “formally”. Since I haven’t read all of what you mean by “formal”, I similarly don’t know if this addresses that question at all.

Posted by: some guy on the street on September 23, 2009 6:43 AM | Permalink | Reply to this

Re: still on objects common to different categories

sg: But you asked about what’s wrong “formally”. Since I haven’t read all of what you mean by “formal”, I similarly don’t know if this addresses that question at all.

Formally = in a way that it is clear how to teach it to an automatic system like FMathL or Coq.

How do you prevent such a system from drawing the conclusions I draw when the only context given are a definition of a category and of a subcategory, but the system already knows how to handle the language of naive set theory (with {x in A | property} but not {x | property}) and of elementary algebra as taught in undergraduate courses?

None of the three answers given so far resolves this. My conclusions are perfectly allowed according to the usual conventions of reading mathematics.

Thus, in order to define the intended meaning unambiguously, one needs to specify (without using the concept of a category, since this is not yet born) either a different way of interpreting the same wording, or a different wording for the standard definitions.

Posted by: Arnold Neumaier on September 23, 2009 9:42 AM | Permalink | Reply to this

Re: still on objects common to different categories

How do you prevent such a system from drawing the conclusions I draw when the only context given are a definition of a category and of a subcategory, but the system already knows how to handle the language of naive set theory (with {x in A | property} but not {x | property}) and of elementary algebra as taught in undergraduate courses?

A strongly typed system would not conclude that there exists an x such that x is an object of C_{a,b,c} and x is an object of C_{a,b,c,d}, because none of that can be expressed in the language. Quantification requires a domain (which we can fix here), and being an object of some category is not a predicate (which we can't fix without changing what it says a bit).

I imagine that a less strongly typed system might be able to conclude something like that, while still rejecting 1 ∈ √2 as meaningless (at least in the default context). You would probably do this through subtyping. But I don't have much experience with subtyping.

A strongly typed system can still handle {x in A | property}. Assuming that property uses the variable x only where an element of A makes sense, then the system should accept this as specifying a subset of A (however the notion of subset is formalised).

So if you start with C_{a,b,c,d}, then you can construct the set of objects of C_{a,b,c} as [the underlying set of] the subset {x in A | property}, where A is the set of objects of C_{a,b,c,d} and property states that an element of A equals a, b, or c. Then you can continue to get the entire category C_{a,b,c}. You can also give C_{a,b,c} the structure of a subcategory of C_{a,b,c,d}. A system that knows about category theory could helpfully construct all of this for us as soon as we write down property and ask it to construct the corresponding full subcategory.

Now, any system (if it's any good for category theory) should be able to conclude this: there exists an object x of C_{a,b,c,d} that belongs to the subcategory C_{a,b,c}. (There is a formal distinction between C_{a,b,c} as a subcategory of C_{a,b,c,d} and C_{a,b,c} as a category in its own right, which we normally ignore by abuse of language.) You already know about the distinction between being an element of a set —a typing declaration— and belonging to a subset —a relation between elements and subsets of a given set—; the same holds for being an object of a category and belonging to a subcategory.
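The subset construction described here can be sketched in Lean 4 (my own illustration, using an invented 4-element inductive type for the symbols; the thread itself fixes no particular system):

```lean
-- Lean 4 sketch (my own illustration): the four symbols as a 4-element
-- inductive type, standing in for the set of objects of C_{a,b,c,d}.
inductive Sym where
  | a | b | c | d

-- The property defining the subset {x in A | x = a or x = b or x = c}:
def property : Sym → Prop :=
  fun x => x = Sym.a ∨ x = Sym.b ∨ x = Sym.c

-- The subset as a type in its own right, with its canonical inclusion:
def Sub := { x : Sym // property x }
def incl : Sub → Sym := Subtype.val

-- Membership of an element of Sym in the subset amounts to satisfying
-- the property, i.e. to factoring through `incl`:
example : property Sym.a := Or.inl rfl
```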

Posted by: Toby Bartels on September 25, 2009 8:04 PM | Permalink | Reply to this

Re: still on objects common to different categories

From another point of view, the construction {a, b, c, …} ↦ C_{a,b,c,…} is a functor from the Category of Sets to the Category of Categories. The Category of Categories doesn’t itself have a privileged functor from C_{a,b,c} to C_{a,b,c,d}. The one that looks natural comes from the natural-looking map from {a, b, c} to {a, b, c, d} — which is precisely the absent thing in SEAR or (i.i.r.c.) ETCS.

I’m not sure what this is supposed to mean. The desired inclusion {a, b, c} ↪ {a, b, c, d} in Set is constructed in ETCS by interpreting these sets as a 3-fold and 4-fold coproduct of copies of a terminal set 1 and invoking universal properties of coproducts. So I have no idea what is meant by saying it’s “absent” from ETCS (or SEAR for that matter).

There’s a meta-theorem that ETCS and Bounded Zermelo set theory with Choice are bi-interpretable in one another, so that anything you can express in one is expressible in the other. This might be helpful in realizing what can and cannot be said in ETCS. Mike has also written down bi-interpretability statements in this vein in the SEAR article.

Posted by: Todd Trimble on September 23, 2009 1:35 PM | Permalink | Reply to this

Re: still on objects common to different categories

OK, I stand corrected.

Posted by: some guy on the street on September 23, 2009 4:09 PM | Permalink | Reply to this

Re: still on objects common to different categories

To be fair, though, I didn’t say quite what I should have. Better would have been: in ETCS, given a 4-fold coproduct of copies of 1, whose four coproduct inclusions 1 → 1 + 1 + 1 + 1 are given names a, b, c, d, we can think of those inclusions as providing subsets, and then construct the union of the subsets a, b, c. If we give the 4-element set the name “{a, b, c, d}”, this union gives a subset which interprets what is standardly meant by the inclusion {a, b, c} ↪ {a, b, c, d} in naive set-theoretic language.

Continuing the bridge between naive language and more formal language (and keeping in mind that in structural set theory, elements of S are defined to be morphisms 1 → S), we go on to define a membership relation between elements of a set like S = {a, b, c, d} and subsets of S, like the one we just named {a, b, c} ↪ {a, b, c, d}: we say an element x of S is a member of a subset T ↪ S if x : 1 → S factors through the subset inclusion. Then the members of the subset {a, b, c} ↪ {a, b, c, d} are indeed the elements of S we called a, b, c, and everything is as it should be. But as you can see, some slight care is needed to give the naive language rigorous meaning in ETCS.

Posted by: Todd Trimble on September 23, 2009 5:05 PM | Permalink | Reply to this

Re: still on objects common to different categories

in structural set theory, elements of S are defined to be morphisms 1 → S

This is true in ETCS, but not in SEAR.

Posted by: Mike Shulman on September 23, 2009 5:18 PM | Permalink | PGP Sig | Reply to this

Re: still on objects common to different categories

Thanks. That’s what I meant.

Posted by: Todd Trimble on September 23, 2009 7:20 PM | Permalink | Reply to this

Re: still on objects common to different categories

Let C_{abcd} be the category whose objects are the symbols a, b, c, d

I’m going to assume we’re talking about small categories, so that we can make formal sense of them in any set theory. As we’ve said repeatedly, there is nothing special about categories here and the presence of evil can muddy the waters, but since you’re insisting on talking about categories instead of, say, groups, let’s go on that way.

In structural set theory you can’t just pull things out of the air and make them into a set. They have to be given to you as elements of some other set. So where are those symbols coming from? I’m guessing that you have in mind some infinite set of symbols, which could be represented by ℕ, so that you can construct its subset {a, b, c, d} (using, for example, an encoding a = 0, b = 1, c = 2, d = 3), which is (or, in SEAR, its tabulation is) another set equipped with a specified injection into ℕ. Now you can of course construct a further subset {a, b, c} with a further injection into {a, b, c, d}. Those injections are then how you compare their elements.

Posted by: Mike Shulman on September 23, 2009 3:41 PM | Permalink | PGP Sig | Reply to this

What is a structured object?

MS: either we are talking about two different group structures on the same set, in which case the two have exactly the same elements by definition, or we are talking about group structures on different sets, in which case asking whether an element of one is equal to an element of the other is a type error.

Are we allowed in SEAR to talk about group structures on different subsets of the same set? Then we can again ask these questions.

Or is this question meaningless? You haven’t defined how to create in SEAR objects such as groups or group structures. Are the latter sets, elements, or relations?

Or are they a new type of formal objects that were not present before? Then how did they come into existence? (If you want to have objects of each category to be of a different type, you better first allow for a countable set of types in SEAR.)

Or are they only metaconcepts without a formal version, just a way of talking? (But on the metalevel, one seems to be able to compare arbitrary semantical constructs. Or do you want to impose restrictions on what qualifies for valid metastatements?)

MS: I guess that this sort of implicit interpretation is part of the “compilation from high-level language to assembly language,” and what you want is a formalization of the higher-level language that (among other things) includes “a group” as a fundamental object of study.

This seems necessary in order that a theorem prover can understand all conventions.

But it seems that in category theory one has “groups” as formal objects (of the category of groups), while “group structure on a set” is a meta-object only, consisting of a set S and a group G with set(G) = S, where set is the forgetful functor that removes the operations.

So part of the problem appears to lie in that you switch between different points of view (formal object, or only a way of speaking that can be formalized only by eliminating the concept) about what a group is.

Asking the system to rewrite all occurrences of groups (and other structural concepts that in ZF would be tuples) by eliminating these concepts on the formal level may create a huge overhead in view of the nested object constructions we often have in mathematics.

I commented on a related issue at the pure set entry of the nLab (under Membership trees). [I just see that the double opening apostrophe led to an unexpected result there. Unfortunately, the nLab editing has no preview facility that would allow one to see things before posting.]

Too much is fuzzy for me to see what you really want to have.

Posted by: Arnold Neumaier on September 22, 2009 3:42 PM | Permalink | Reply to this

Re: What is a structured object?

You haven’t defined how to create in SEAR objects such as groups or group structures. Are the latter sets, elements, or relations?

Given a set G, a group structure on G is a function from G × G to G such that …. Functions and products have already been discussed, and the condition … can be stated in the language of SEAR. I claim, as a partisan of structural set theory, that all definitions and proofs in ordinary mathematics are like this, modulo abuses of language (such as suppressing the inclusion function X → Y when X is defined as a subset of Y) that are no worse than the abuses used with ZFC.

(If you want to have objects of each category to be of a different type, you better first allow for a countable set of types in SEAR.)

This is already present; SEAR is a theory in first-order logic, so it already has a countable set of variables. It is a dependent type theory, and each pair of variables for a set gives a type of relations; the only other type in it is the type of sets. As 1 + 2·ℵ₀ = ℵ₀, that is the number of types. (Of course, this is all on the metalevel.)

Asking the system to rewrite all occurrences of groups (and other structural concepts that in ZF would be tuples) by eliminating these concepts on the formal level may create a huge overhead in view of the nested object constructions we often have in mathematics.

On the contrary, a series of definitions like ‘A group is a set equipped with a function ….’, ‘A ring is a group equipped with a function ….’, and ‘An ordered field is a ring equipped with a relation ….’ leads in the ‹A structure is a tuple.› view to an ordered field being a pair with a pair with a pair (((S,+),·),<); quite a mess! While in the ‹A structure consists of several objects.› view, it simply leads to a set, two functions, and a relation.

On the other hand, one can take the structure as tuple view in a structural foundation, using something like Coq's Records, if one wants to. But this requires a richer ground type theory than SEAR has.

[I just see that the double opening apostrophe led to an unexpected result there. Unfortunately, the nLab editing has no preview facility that would allow one to see things before posting.]

[Yes, many others have asked for a Preview; the downside is that one of the biggest complaints about MediaWiki is that it's too easy to lose your edit since you forgot that the Preview was not a Save! The philosophy in Instiki is that your Submit is a preview; if you don't like what you see, then you edit again, and it counts as only one edit in the history if your Submits are all within 30 minutes of each other and no other editor slips in between. See discussion here.]

Posted by: Toby Bartels on September 22, 2009 9:45 PM | Permalink | Reply to this

Re: What is a structured object?

AN: You haven’t defined how to create in SEAR objects such as groups or group structures. Are the latter sets, elements, or relations?

TB: Given a set G, a group structure on G is a function from G×G to G such that ….

OK. Thus a group structure on a set G is an object of type relation(G×G, G).

Now what is a group? Does one have to eliminate the concept of group in favor of that of a group structure when going from informal SEAR to formal SEAR as a first order logic with dependent types?

AN: Asking the system to rewrite all occurrences of groups (and other structural concepts that in ZF would be tuples) by eliminating these concepts on the formal level may create a huge overhead in view of the nested object constructions we often have in mathematics.

TB: On the contrary, a series of definitions like “A group is a set equipped with a function …” leads in the “A structure is a tuple” view to an ordered field being a pair with a pair with a pair (((S,+),·),<); quite a mess!

At present, every formalization of a piece of mathematics is a mess; this was not the point.

What I was referring to was the overhead in the length of the formalization. With ZF, you can formalize a concept once as a tuple, and then always use the concept on a formal level.

But with a composite thing that exists only as a way of speaking, the formalization must replace this thing in each occurrence by the defining way of speaking. If this happens recursively (and much of mathematics is deep in the sense of data structures), the size of the formal expression may explode to the point of making the automatic verification of simple high-level statements a very complex task.

TB: On the other hand, one can take the structure as tuple view in a structural foundation, using something like Coq’s Records, if one wants to. But this requires a richer ground type theory than SEAR has.

This is what I was aiming at. For reflection purposes, one cannot work in pure SEAR, while one can do that in pure ZF.

TB: The philosophy in Instiki is that your Submit is a preview; if you don’t like what you see, then you edit again, and it counts as only one edit in the history if your Submits are all within 30 minutes of each other and no other editor slips in between.

Good to know.

Posted by: Arnold Neumaier on September 23, 2009 1:58 PM | Permalink | Reply to this

Re: What is a structured object?

Now what is a group? Does one have to eliminate the concept of group in favor of that of a group structure when going from informal SEAR to formal SEAR as a first order logic with dependent types?

I already answered this up here: “A group in SEAR consists of a set G and an element e ∈ G and a function m : G × G → G, such that certain axioms are satisfied.”

A group in SEAR is not a single thing in the universe of discourse. This is not a problem for formalization at a low level, but it may be undesirable when trying to formalize at a higher level, for all the reasons that you’ve given. But it doesn’t prevent SEAR from reflecting on itself formally.

In Isabelle, at least, “a group” is really defined as follows: given a type 'a, one constructs the type 'a group of group structures on 'a, and then defines a group (of type 'a) to be an element of 'a group. I presume this is what Coq’s Records are like as well. I don’t see why this couldn’t be done in SEAR just as well, though: given a set A we can define the set grp(A) of group structures on A as a subset of A^(A×A). Of course just knowing an element of grp(A) doesn’t give you anything unless you remember that grp(A) was constructed from A in a particular way.
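The parametrized construction described here can be sketched in Coq as a record of group structures on a fixed carrier (a hedged illustration only: the record and all field names are invented, and the record stands in for the subset of A^(A×A), not for its literal SEAR encoding):

```coq
(* Sketch: the type of group structures on a fixed carrier A,
   analogous to Isabelle's 'a group.  All names are hypothetical. *)
Record GroupStructure (A : Set) : Type := {
  gop    : A -> A -> A;   (* multiplication *)
  gunit  : A;             (* identity element *)
  gassoc : forall x y z : A,
             gop (gop x y) z = gop x (gop y z);
  gunitl : forall x : A, gop gunit x = x;
  gunitr : forall x : A, gop x gunit = x
}.
```

An inhabitant of GroupStructure A then plays the role of an element of grp(A); as noted in the text, the carrier A must still be remembered separately.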

Posted by: Mike Shulman on September 23, 2009 3:33 PM | Permalink | PGP Sig | Reply to this

Re: What is a structured object?

Arnold Neumaier wrote:

OK. Thus a group structure on a set G is an object of type relation(G×G, G).

Now what is a group? Does one have to eliminate the concept of group in favor of that of a group structure when going from informal SEAR to formal SEAR as a first order logic with dependent types?

In SEAR, yes.

I would not found FMathL directly on SEAR, if I were you. Besides any efficiency problems, it's simply more user-friendly to treat a group as a single object. I would probably give any computer system for abstract mathematics a simple dependent type theory with support for dependent sums (and probably only dependent sums, unless I really want to found the whole thing on type theory) and implement the interface similarly to Coq's Records.

For reflection purposes, one cannot work in pure SEAR, while one can do that in pure ZF.

I don't see what reflection has to do with it. It's a matter of convenience and (if you say so) efficiency. Writing SEAR in a dependent type theory with dependent sums doesn't add any strength to it (as long as you don't add quantification over types), since you can eliminate them (down to the base types).

Mike Shulman wrote in response:

In Isabelle, at least, “a group” is really defined as follows: given a type 'a, one constructs the type 'a group of group structures on 'a, and then defines a group (of type 'a) to be an element of 'a group. I presume this is what Coq’s Records are like as well. I don’t see why this couldn’t be done in SEAR just as well, though: given a set A we can define the set grp(A) of group structures on A as a subset of A^(A×A). Of course just knowing an element of grp(A) doesn’t give you anything unless you remember that grp(A) was constructed from A in a particular way.

What seems to be missing is the concept of just having a group simpliciter, rather than a group of type 'a or an element of a set constructed from A in a particular way.

Here is how you would define the type of groups in Coq:

Record Group: Type := {uGroup: Set; sGroup: GroupStructure uGroup}.

That is, a group consists of a set and a group structure on that set. If you have G of type Group, then uGroup G is of type Set and sGroup G is of type GroupStructure (uGroup G). I'm assuming that we've already defined GroupStructure, since that is not a problem even in SEAR, but it might be more user-friendly not to do this but instead to put everything into the Record:

Record Group: Type := {
  uGroup: Set;
  mGroup: uGroup -> uGroup -> uGroup;
  aGroup: forall (x y z: uGroup),
    mGroup (mGroup x y) z = mGroup x (mGroup y z);

and so on.
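For concreteness, a hedged completion of such a record might look as follows; the field names beyond those quoted above are invented for illustration, and one can of course choose a different (but equivalent) axiom set:

```coq
(* Sketch of the full record; fields after aGroup are hypothetical names. *)
Record Group : Type := {
  uGroup : Set;
  mGroup : uGroup -> uGroup -> uGroup;
  aGroup : forall x y z : uGroup,
             mGroup (mGroup x y) z = mGroup x (mGroup y z);
  eGroup : uGroup;                                  (* identity element *)
  lGroup : forall x : uGroup, mGroup eGroup x = x;  (* left unit law *)
  rGroup : forall x : uGroup, mGroup x eGroup = x;  (* right unit law *)
  iGroup : uGroup -> uGroup;                        (* inverse *)
  vGroup : forall x : uGroup, mGroup (iGroup x) x = eGroup
}.
```

Each later field may depend on the earlier ones, which is exactly the dependent-sum structure discussed above.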

Posted by: Toby Bartels on September 25, 2009 8:11 PM | Permalink | Reply to this

Re: What is a structured object?

Let me say more explicitly the following: I am not asserting (any more) that structural set theory is sufficient for what you want to do. I can’t speak for anyone else, but I accept that structural set theory, whether ETCS or SEAR or whatever, has flaws preventing it from being used for the high-level computerized mathematical tool you want to create. They are different flaws from the flaws of ZF, and they are different from the flaws that it sounded to me like you were ascribing to it in your introduction to FMathL, but they are flaws nonetheless.

I do assert that structural set theory is a sufficient low-level foundation for mathematics on a par with ZF, and I believe that it is closer to the way mathematicians treat sets in everyday practice. (Although the language they use, in particular words like “subset,” tends to evoke material set theory. This can certainly be a barrier to understanding structural set theory; I blame it on the accident of history and the current ascendancy of ZF as a foundation.) Thus, I would like it if a higher-level language could be based on, or at least more in line with the ideas of, structural set theory. You clearly disagree with these assertions as well, and I’m more than happy to continue discussing them, but let’s not confuse them with the difficulty of implementing things on a computer.

Posted by: Mike Shulman on September 23, 2009 4:06 PM | Permalink | PGP Sig | Reply to this

Material vs. structural foundations of mathematics

MS: I do assert that structural set theory is a sufficient low-level foundation for mathematics on a par with ZF, and I believe that it is closer to the way mathematicians treat sets in everyday practice.

I had a few days off to do things neglected while occupied with this time-consuming discussion. Instead of replying individually to the contributions, let me summarize how things look from my perspective.

My main conclusion from the present discussion and from reading the nLab pages on SEAR and pure sets is the following:

In a material theory, structural objects are constructed as anonymous objects chosen from the equivalence classes of mathematical structures of some kind with respect to isomorphism. Then one can do all structural mathematics inside suitable such collections of equivalence classes. However, to do so for nontrivial mathematics requires numerous abuses of language and notation; otherwise the description becomes tedious and virtually incomprehensible.

In a structural theory, material objects are constructed as the rigid objects in some category, with isomorphism as equality. Then one can do all material mathematics inside suitable such collections of rigid objects. However, to do so for nontrivial mathematics requires numerous (but different) abuses of language and notation; otherwise the description becomes tedious and virtually incomprehensible.

Thus from a descriptive point of view, the material interpretation and the structural interpretation form equivalent approaches; neither describes the essence of mathematics, but only straitjackets into which this essence can be forced in a Procrustean way, and which one feels better in is a matter of taste. My taste is that neither of these should be used. I want to be free of straitjackets. Thus I favor a declarative theory similar to FMathL, which accounts for the actual mathematical language and needs no abuses of language.

From a logical point of view, there is the additional question of the proof power of the two views. I’d find it surprising if there were a structural theory with proof strength equivalent to that of ZFC, but I’d find it plausible that there are hierarchies of structural theories and hierarchies of material theories such that each of the former has a proof strength inferior to one of the latter, and conversely. Thus from the logical point of view, the structural and the material approach are still equivalent but in a weaker sense.

So what ultimately counts is the practical point of view. Here the advantage of the material point of view is very clear. After all, we already need a material free monoid to communicate mathematics. Moreover, the material point of view is nearly obvious to any newcomer, making for a simple entrance and plenty of very elementary exercises that lead to mathematical insight, while the structural point of view emerges only after having digested enough of more elementary material mathematics.

On the other hand, for many problems, both the material and the structural perspective offer insights. Therefore a good foundation of mathematics should offer both views.

Thus for me the priorities are clear: Describe mathematics in a declarative way that allows naturally both material and structural constructs, but support it constructively with a material mathematical universe in which the structural realm is constructible in a transparent way that can be easily used.

Posted by: Arnold Neumaier on September 30, 2009 12:34 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

I’d find it surprising if there were a structural theory with proof strength equivalent to that of ZFC

As has already been mentioned several times, there are real theorems here. The book by Mac Lane and Moerdijk proves that ETCS is equiconsistent with a fragment of ZFC (Bounded Zermelo with Choice), but as shown by Colin McLarty, one can easily strengthen ETCS with structural axioms so that ETCS+ is equiconsistent (bi-interpretable) with full ZFC. Mike and others (Toby, David Roberts, and there may be others too) have written down details of similar equiconsistency statements involving SEAR.

In each case, the idea is the same; see Mac Lane–Moerdijk for details. To reflect ZFC in a structural set theory like ETCS+, one reflects material sets using well-founded rooted trees; “elements” of such a tree with root r are the subtrees rooted at the children of r. (There’s also the example of algebraic set theory – see the book by Joyal–Moerdijk, where models for ZFC are constructed as certain types of initial algebras.)
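As an illustrative sketch (not the precise ETCS+ or algebraic-set-theory construction), such well-founded trees can be rendered as a nested inductive type, with the immediate subtrees of the root playing the role of elements; the names here are invented:

```coq
(* Sketch: a "material set" as a rooted tree whose immediate subtrees
   are its elements.  (list and nil/cons are in Coq's prelude.) *)
Inductive Tree : Type :=
  | node : list Tree -> Tree.

(* The empty set has no children; the set {∅} has exactly one. *)
Definition empty_set : Tree := node nil.
Definition singleton_empty : Tree := node (cons empty_set nil).
```

Note that the extensional equality of such trees (bisimulation up to permutation and repetition of children) still has to be defined separately; the inductive type only provides the raw data.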

Certainly in the case of ETCS and algebraic set theory, this material has been well worked over, so I’m curious as to why you express doubt about its validity.

Posted by: Todd Trimble on September 30, 2009 3:37 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

I believe the strengthening of ETCS to a theory equivalent with ZFC actually predates McLarty by quite some time (although McLarty’s axiom is a bit different). One reference (which I wanted to mention earlier but failed to remember) is

  • Osius, Gerhard, “Categorical set theory: a characterization of the category of sets”, JPAA 1974
Posted by: Mike Shulman on September 30, 2009 5:14 PM | Permalink | PGP Sig | Reply to this

Re: Material vs. structural foundations of mathematics

AN: I’d find it surprising if there were a structural theory with proof strength equivalent to that of ZFC

TT: As has already been mentioned several times, there are real theorems here. The book by Mac Lane and Moerdijk proves that ETCS is equiconsistent with a fragment of ZFC (Bounded Zermelo with Choice), but as shown by Colin McLarty, one can easily strengthen ETCS with structural axioms so that ETCS+ is equiconsistent (bi-interpretable) with full ZFC.

I didn’t express doubts but said I’d find it surprising, meaning that it would reveal something to me that would extend my intuition. Clearly, my intuition about structural foundations is more limited than yours, so I can be more easily surprised than you.

Maybe there is a theorem there, but the SEAR material here on the web doesn’t give complete proofs, and ETCS is, as you say, weaker than ZFC.

So I’d like to see Colin McLarty’s stronger version in order to understand what is missing in my intuition. Can you give me a reference to his work?

Posted by: Arnold Neumaier on September 30, 2009 5:33 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

Osius’ paper I cited above is a good one to look at. I think the paper of McLarty’s that we are referring to is “Exploring categorical structuralism” in Philos. Math.

I am also planning to include a more detailed proof, dealing with the non-well-founded case as well as the well-founded one, in my forthcoming paper “Unbounded quantifiers and strong axioms in topos theory,” which I will post about when it is in a state to be read by others.

Posted by: Mike Shulman on September 30, 2009 5:42 PM | Permalink | PGP Sig | Reply to this

Re: Material vs. structural foundations of mathematics

Yes, that was the paper by McLarty that I had in mind. Unfortunately I am not familiar with the paper of Osius, although I’m aware of it.

Posted by: Todd Trimble on September 30, 2009 6:07 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

MS: Osius’ paper I cited above is a good one to look at. I think the paper of McLarty’s that we are referring to is “Exploring categorical structuralism” in Philos. Math.

I got the latter from the web; it refers to Osius for the crucial part. Osius’ paper is not free online, so it will take a while for me to get it and read it.

Posted by: Arnold Neumaier on September 30, 2009 6:15 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

I didn’t express doubts but said I’d find it surprising, meaning that it would reveal something to me that would extend my intuition. Clearly, my intuition about structural foundations is more limited than yours, so I can be more easily surprised than you.

Hmm. Here is what you wrote:

I’d find it surprising if there were a structural theory with proof strength equivalent to that of ZFC, but I’d find it plausible that there are hierarchies of structural theories and hierarchies of material theories such that each of the former has a proof strength inferior to one of the latter, and conversely. Thus from the logical point of view, the structural and the material approach are still equivalent but in a weaker sense.

The last sentence, a straightforward declaration, sure reads like a rejection of the stronger sense. If you didn’t mean it that way, it is seriously misleading.

Posted by: Todd Trimble on September 30, 2009 6:29 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

AN: Thus from the logical point of view, the structural and the material approach are still equivalent but in a weaker sense.

TT: The last sentence, a straightforward declaration, sure reads like a rejection of the stronger sense. If you didn’t mean that way, it is seriously misleading.

I should have written: … but (to my present understanding) in a weaker sense.

But I thought that everything anyone says is to be considered subject to the restriction “according to the writer’s present understanding”.

Posted by: Arnold Neumaier on September 30, 2009 7:08 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

But I thought that everything anyone says is to be considered subject to the restriction “according to the writer’s present understanding”.

Yes of course, but that still doesn’t erase the fact that it’s a declaration of belief. My question was why you believe(d) it.

Posted by: Todd Trimble on September 30, 2009 7:33 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

TT: but that still doesn’t erase the fact that it’s a declaration of belief. My question was why you believe(d) it.

My present intuition tells me that equivalence is unlikely to hold. You tell me otherwise, and evidence of a proof (which I hope to gather by reading the paper by Osius - McLarty doesn’t have the details) may well change my belief.

Posted by: Arnold Neumaier on September 30, 2009 7:59 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

My present intuition tells me that equivalence is unlikely to hold.

Okay, thank you, that’s a good honest positive declaration. But you still haven’t said WHY. Why does your intuition tell you that?

Posted by: Todd Trimble on September 30, 2009 8:18 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

In a structural theory, material objects are constructed as the rigid objects in some category, with isomorphism as equality. Then one can do all material mathematics inside suitable such collections of rigid objects.

I am forced to conclude that you have not understood anything that we have been saying.

One can construct a model of material set theory inside structural set theory by using rigid trees or other models. This may be interesting to do if one doubts that they are equally strong. However, it is irrelevant for mathematical practice, because this is not, not, not how one does mathematics in a structural set theory! A group in structural set theory is a set with a multiplication operation and an identity satisfying the axioms—there is no need to equip this set with the superfluous extra structure of a rigid tree.

After all, we already need a material free monoid to communicate mathematics.

I have no idea what that means.

the material point of view is nearly obvious to any newcomer, making for a simple entrance and plenty of very elementary exercises that lead to mathematical insight, while the structural point of view emerges only after having digested enough of more elementary material mathematics.

My experience in teaching newcomers to mathematics is that even material set theory is fraught with conceptual hurdles. At present, one tends to appreciate the structural point of view only after digesting some abstract mathematics (or, perhaps, never), but there’s no evidence that it has to be that way. I would argue that that’s an artifact of the fact that almost everyone is taught material set theory first, and hardly anyone is ever taught structural set theory.

Posted by: Mike Shulman on September 30, 2009 5:21 PM | Permalink | PGP Sig | Reply to this

Re: Material vs. structural foundations of mathematics

AN: In a structural theory, material objects are constructed as the rigid objects in some category, with isomorphism as equality. Then one can do all material mathematics inside suitable such collections of rigid objects.

MS: I am forced to conclude that you have not understood anything that we have been saying.

Maybe, but then the communication barrier is deeper than we both think.

MS: A group in structural set theory is a set with a multiplication operation and an identity satisfying the axioms—there is no need to equip this set with the superfluous extra structure of a rigid tree.

But as far as this goes there is no difference at all to the material point of view. A group in material set theory is also a set with a multiplication operation and an identity satisfying the axioms.

ZF adds superfluous extra stuff in terms of tuples that are sets of sets of sets, while SEAR adds superfluous extra structure in terms of lots of trivial conversion and embedding functors.

None of this stuff is relevant for doing mathematics as it is done in practice.

But some of it is needed (in different ways) if one wants to force mathematics into either a purely material or a purely structural straitjacket. This is why I like neither of these constructive foundations. I want to avoid both extremes. (But I find the material straitjacket still preferable to the structural one.)

AN: After all, we already need a material free monoid to communicate mathematics.

MS: I have no idea what that means.

The text displayed on the screen where you are reading this is composed of material elements of such a monoid.

Without language there is no communication of mathematics. But language needs a material monoid.

MS: strengthening of ETCS to a theory equivalent with ZFC actually predates McLarty

Thanks for the reference. I’ll try to get it…

Posted by: Arnold Neumaier on September 30, 2009 5:59 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

The text displayed on the screen where you are reading this is composed of material elements of such a monoid.

Yes, I think we knew you were referring to words, but how should we interpret what you mean by ‘material’?

For example, ‘word’ is a 4-tuple. In material set theory, there are various ways of representing 4-tuples, e.g.,

{w, {{w},o}, {{{w},o},r}, {{{{w},o},r},d}}

Is that what you meant by a “material element” of the free monoid? If so, why are you convinced that we need such constructions? If not, then what did you mean?

Posted by: Todd Trimble on September 30, 2009 7:04 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

TT: For example, ‘word’ is a 4-tuple. In material set theory, there are various ways of representing 4-tuples,

ZF and its relatives are not the only material theories.

And ‘word’ is not a 4-tuple; Kuratowski tuples form a monoid only under a very unnatural operation. Instead, it is an element of a free monoid generated by 4 material characters w, o, r, and d.

FMathL takes this into account.

Posted by: Arnold Neumaier on September 30, 2009 7:20 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

You haven’t answered my question.

What, sir, do you mean by “material element of a free monoid”? For I take it you were saying that words (not characters, words) are “material elements of free monoids”.

What about “material” is necessary here?

Posted by: Todd Trimble on September 30, 2009 7:47 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

TT: What, sir, do you mean by “material element of a free monoid”? For I take it you were saying that words (not characters, words) are “material elements of free monoids”.

What about “material” is necessary here?

If w and o are material characters then their product (well-defined in any monoid) is material, too.

At least this is the understanding I gained from the use of material and structural in your community.

But even if this is not what you understand by these terms, it is the meaning I want to give the term (and the way I used it in all my mails), since this is the way it works in FMathL.

Posted by: Arnold Neumaier on September 30, 2009 8:14 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

a free monoid generated by 4 material characters w, o, r, and d

I (at least) have no idea what you mean when you call these characters ‘material’. Surely you don't mean that they are themselves sets, with their own elements? Perhaps you mean that they can be compared for equality with any other mathematical object, but I fail to see how this is needed for anything.

To discuss words, we need a set A, called the alphabet, whose elements are called letters; then a word is an element of the free monoid on A. We only need to test letters for equality with other letters, which is provided by the set A. We need to test words for equality with other words, which the free monoid construction also provides; it even provides a test for equality of composites of tuples of words. This is all perfectly structural. Indeed, the concept of ‘free monoid’ is inherently categorial.
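Under the usual identification of the free monoid on A with finite lists over A, this can be sketched in Coq (the definitions and names are illustrative, not part of the discussion above):

```coq
(* Sketch: words over an alphabet A as the free monoid of lists over A.
   Equality of words reduces to equality of letters, provided by A. *)
Definition Word (A : Set) : Set := list A.

Definition wempty (A : Set) : Word A := nil.
Definition wconcat (A : Set) (u v : Word A) : Word A := app u v.

(* The monoid laws — associativity of wconcat and the unit laws for
   wempty — are provable by straightforward list induction. *)
```

Nothing here depends on the letters of A being sets themselves; only the equality test supplied by A is used.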

Posted by: Toby Bartels on September 30, 2009 7:49 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

TB: I (at least) have no idea what you mean when you call these characters ‘material’. Surely you don’t mean that they are themselves sets, with their own elements? Perhaps you mean that they can be compared for equality with any other mathematical object

No, I mean that they have an identity such that w can be recognized as the letter `w’, and not only as an anonymous element of some set. One needs not only to know that w is different from o but also that w is in fact `w’!

Structurally, there is no difference between any two 4-letter words with distinct letters.

But to know what a word means you need to know the identity of each letter. This is what makes things material in the sense I find most natural to give to this word, not that it is written as a set or that one can compare for equality.

TB: This is all perfectly structural. Indeed, the concept of ‘free monoid’ is inherently categorial.

I don’t think that material and structural are always in opposition.

If they were, it would not be possible to translate reasonably smoothly from a traditionally material view of mathematics to a structural view.

Posted by: Arnold Neumaier on September 30, 2009 8:15 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

But to know what a word means you need to know the identity of each letter.

And so you do. If formalised in $\mathbf{ETCS}$ (for example), each letter is a function from $1$ to $A$, and these have their own identities.

This is what makes things material in the sense I find most natural to give to this word

I no longer remember who suggested to Mike that set theory in the style of $\mathbf{ZFC}$ be called ‘material’, and I don't think that I ever knew the reason. But unless you're claiming that this requires a foundation in the style of $\mathbf{ZFC}$ (in particular, with global equality and a global membership predicate), then I have no disagreement with you … but I don't see the relevance, either.

Posted by: Toby Bartels on September 30, 2009 8:29 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

AN: But to know what a word means you need to know the identity of each letter.

TB: And so you do. If formalised in ETCS (for example), each letter is a function from 1 to A, and these have their own identities.

Then please tell me which function from 1 to A is the letter w.

Posted by: Arnold Neumaier on September 30, 2009 9:10 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

It’s whatever function $1 \to A$ has been named $w$.

It’s no different in principle from telling numbers apart. You could for example specify the subset $A$ of $\mathbb{N}$ consisting of the first 26 elements (with respect to, say, the standard ordering of $\mathbb{N}$), and decide to name the 23rd element $w$. From that point on, you know which function $1 \to A$ is meant by “$w$”.
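This prescription can be sketched in a few lines (an illustrative Python toy only; the names are invented): take the first 26 naturals as the alphabet and attach names by position in the well-ordering.

```python
# Materializing an abstract 26-element alphabet via a well-ordering:
# take the first 26 natural numbers and attach names by position.

import string

A = list(range(26))                          # "the first 26 elements of N"
names = dict(zip(string.ascii_lowercase, A)) # naming map: letter name -> element

w = names["w"]   # "the 23rd element" (index 22 when counting from 0)
assert w == 22
assert A[22] == w
```

Once the well-ordering and the naming map are fixed, “which element is $w$” has a definite answer; forget them, and the elements are again indistinguishable clones.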

I think I can understand what’s behind the question. For example, the complex numbers $i$ and $-i$ behave exactly alike. But of course they are not the same. To deal with that, you can decide to represent the complex numbers as the quotient field $\mathbb{R}[x]/(x^2 + 1)$ and then say “I’ve decided to name the residue class of $x$ ‘$i$’.” You could have named it $-i$ of course, but once you settle on the name, you stick with it.
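The choice involved here can be made concrete in a small sketch (illustrative Python only; each residue class in $\mathbb{R}[x]/(x^2+1)$ is stored by its unique representative $a + bx$ of degree below 2, as a pair `(a, b)`).

```python
# Residue classes in R[x]/(x^2 + 1), stored as pairs (a, b) meaning a + b*x.

def mul(p, q):
    # (a + b*x)(c + d*x) = ac + (ad + bc)x + bd*x^2,
    # and x^2 = -1 modulo the ideal (x^2 + 1).
    (a, b), (c, d) = p, q
    return (a * c - b * d, a * d + b * c)

minus_one = (-1.0, 0.0)
candidate1 = (0.0, 1.0)    # the residue class of  x
candidate2 = (0.0, -1.0)   # the residue class of -x

# Both classes are square roots of -1 ...
assert mul(candidate1, candidate1) == minus_one
assert mul(candidate2, candidate2) == minus_one

# ... so naming one of them 'i' is a genuine choice of extra structure.
i = candidate1
```

The algebra itself cannot tell the two candidates apart; only the act of naming one of them does.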

Posted by: Todd Trimble on September 30, 2009 9:56 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

AN: please tell me which function from 1 to A is the letter w.

TT: It’s whatever function 1→A has been named w. […] You could for example specify the subset A of ℕ consisting of the first 26 elements (with respect to say the standard ordering of ℕ), and decide to name the 23rd element w. From that point on, you know which function 1→A is meant by “w”.

Thus you don’t need just a set A, called the alphabet, but you need a particular well-ordering of the set A before your prescription makes sense. In my view, giving a well-ordering to A is materializing the set A.

Strictly speaking, you also need to name the letters, which is to give a mapping from A to the set of names for the letters, which must be material. Otherwise you cannot tell someone else formally which element represents which symbol.

But I think I understand your point of view, without agreeing that it is any improvement over a more naive material point of view.

Posted by: Arnold Neumaier on September 30, 2009 11:28 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

In my view, giving a well-ordering to A is materializing the set A.

Well, that has very little (or nothing) to do with how I have been using the word “material.” Should we take “giving a well-ordering” as your definition of “materializing”? So that in particular, when you said “we already need a material free monoid to communicate mathematics,” we should have interpreted that as meaning “we need the free monoid on a well-ordered set?” I don’t think I would disagree with that latter assertion, but I don’t think it has anything to do with the material/structural divide in the way we have been using the words.

Posted by: Mike Shulman on October 1, 2009 5:03 AM | Permalink | PGP Sig | Reply to this

Re: Material vs. structural foundations of mathematics

AN: In my view, giving a well-ordering to A is materializing the set A.

MS: Well, that has very little (or nothing) to do with how I have been using the word “material.”

Then please give your definition of how the material and the structural point of view should be recognized.

MS: Should we take “giving a well-ordering” as your definition of “materializing”?

No. A set is materialized if it is given extra structure which makes each of its elements uniquely identifiable by a formal expression.

In particular, a well-ordering of a finite set materializes it, since you can point to each particular element by a formal expression identifying it: “the first element”, “the second element”, etc.

If this is not the meaning of material then I have no clue why you can refer to ZF set theory as a material theory.

Posted by: Arnold Neumaier on October 1, 2009 10:33 AM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

No. A set is materialized if it is given extra structure which makes each of its elements uniquely identifiable by a formal expression.

I’m glad this has finally come out, although we’ve probably been doing an awful lot of talking past each other because we misunderstood how we each intended the word ‘material’.

It reminds me of a Buddhist story I once read. There was a man who worshipped Amitabha, who in traditional iconography is bright red, but the man had misunderstood or mistranslated and thought the color was gray, like ash from the fire. So whenever he meditated on and envisioned Amitabha, it was always a gray Amitabha. Finally, on his deathbed, the man asks his teacher, just to be sure, what color Amitabha is, and on finding out bursts into laughter, saying, “Well, I used to think him the color of ash, and now you tell me he is red,” and dies laughing.

‘Material’ as in “material set theory” is something I’d only heard in the last few months at latest. I just assumed it meant we were talking about a form of set theory founded on a global membership relation, like ZF, Bernays-Gödel, Morse–Kelley, etc. The “material” signified to me that elements had “substance” (I used the phrase ‘internal ontology’ before): could have elements which themselves could have elements, and so on.

Posted by: Todd Trimble on October 1, 2009 12:44 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

I have tried to clarify what I intended to mean by “material set theory” and “structural set theory” at the set theory page on the nlab. This is pretty close to what Todd said. In particular, “material” is a property of a theory, not of a set. When you start talking about giving a set extra structure, that can of course be done structurally just as naturally (as the word suggests).

Posted by: Mike Shulman on October 1, 2009 2:36 PM | Permalink | PGP Sig | Reply to this

Re: Material vs. structural foundations of mathematics

MS: I have tried to clarify what I intended to mean by “material set theory” and “structural set theory” at the set theory page on the nlab. […] In particular, “material” is a property of a theory, not of a set.

I added some remarks there. It doesn’t seem to define when an arbitrary theory is material, and hence does not define a property of a theory. It only defines the compound concept of a “material set theory”, and does this in terms too vague to decide questions such as whether FMathL is or isn’t a material set theory.

I very much prefer the concept of material vs. structural that I presented in a previous mail and extracted from your usage in the present discussion. (There you also used the terms “material foundations”, Todd Trimble and Toby Bartels used “material sets”, TB also used “material framework”; so the term clearly wants to be generalized…)

Posted by: Arnold Neumaier on October 1, 2009 4:13 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

Todd Trimble and Toby Bartels used “material sets”, TB also used “material framework”

For the record, I'll specify what I mean by these.

I use ‘material’ as short for ‘membership-based’, which itself really means ‘featuring a global membership predicate’, which means ‘featuring a binary predicate which, given any two terms for a set, returns a proposition whose intended meaning is that the first set is a member of the other’. This is not a purely syntactic concept; it depends on the intended meaning.

In front of ‘set theory’, ‘foundations’, or ‘framework’, this is exactly what ‘material’ means; but ‘material sets’ really means ‘sets in a material set theory’, which in turn might literally mean ‘terms for sets in a material set theory’ or ‘the intended meaning of terms for sets in a material set theory’.

Posted by: Toby Bartels on October 1, 2009 9:17 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

AN: A set is materialized if it is given extra structure which makes each of its elements uniquely identifiable by a formal expression.

TT: we misunderstood how we each intended the word ‘material’. […] ‘Material’ as in “material set theory” is something I’d only heard in the last few months at latest. […] The “material” signified to me that elements had “substance” (I used the phrase ‘internal ontology’ before): could have elements which themselves could have elements, and so on.

I hadn’t heard the term “material” in this context at all. Judging from a Google search, the term was coined in the n-Lab.

I guessed at the likely meaning from the examples of usage given by those discussing here. Since it is clearly a contrast to “structural”, I was trying to see what sort of meaning I could give it that made sense in my general view of mathematics.

The only natural pair of informal contrasts I could find that matched reasonably was

structural = defined only up to isomorphism, independent of any particular construction

material = given in terms of concrete building blocks.

After having seen how material set theory was constructed within SEAR, I was able to make the second more specific to

material = being able to identify each element uniquely by a formal expression.

This seemed to match, giving a precise meaning to both terms and showing that the two concepts are not in complete opposition but have a common intersection, which explains why both points of view can be taken as foundations and still be equivalent in some sense.

I am still in doubt about the precise nature of this equivalence. You had asked “why?” about my intuition, but I can’t pinpoint it at the moment. Perhaps reading Osius will help me understand my and his intuition.

But I think these terms, with the above meaning, are useful general notions, the endpoints of a continuum of ways of thinking about mathematics.

FMathL is trying to plough a middle ground here.

Posted by: Arnold Neumaier on October 1, 2009 2:59 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

The only natural pair of informal contrasts I could find that matched reasonably was

structural = defined only up to isomorphism, independent of any particular construction

material = given in terms of concrete building blocks.

After having seen how material set theory was constructed within SEAR, I was able to make the second more specific to

material = being able to identify each element uniquely by a formal expression.

There’s a missing ingredient in your (informal) characterization of “structural” which I think is crucial to the discussion, and which actually is very close in spirit to the characterization of “material” quoted at the very end. Properly understood, there is no clash whatsoever between “material mathematics” as I understand your use of the term now, and structural mathematics.

The missing ingredient is that in general, structures defined by means of “universal elements” are defined up to canonical (uniquely determined) isomorphism.

The bit I recently wrote about what we mean precisely in describing $\mathbb{R}[x]$ as ‘the’ “free $\mathbb{R}$-algebra on one generator” should suffice to illustrate what I mean. There can be many such structures (many realizations of such structure), but given any two of them, say $(A, a: 1 \to U(A))$ and $(B, b: 1 \to U(B))$, there is exactly one homomorphism $f: A \to B$ such that $U(f)(a) = b$. By a famous argument, this homomorphism must be an isomorphism. It is the (unique) canonical isomorphism between these two universal structures.
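The “famous argument” can be spelled out in a few lines (keeping the notation of the comment):

```latex
% Universality of (A, a) gives a unique f: A -> B with U(f)(a) = b;
% universality of (B, b) gives a unique g: B -> A with U(g)(b) = a.
U(g \circ f)(a) \;=\; U(g)\bigl(U(f)(a)\bigr) \;=\; U(g)(b) \;=\; a .
% But the identity also satisfies U(\mathrm{id}_A)(a) = a, so the
% uniqueness clause of the universal property forces
g \circ f = \mathrm{id}_A , \qquad\text{and symmetrically}\qquad f \circ g = \mathrm{id}_B .
% Hence f is an isomorphism, and the unique one carrying a to b.
```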

In particular, the only structure-preserving automorphism from $(A, a: 1 \to U(A))$ to itself is the identity, and once this structure is given, we can uniquely specify elements therein by means of formal expressions. For instance, we are given a specified (explicitly named) formal generator $a$, and other elements are uniquely and formally specified by applying algebra operations in recursive fashion, starting with that $a$.

Of course, this is just standard practice of mathematicians; structural mathematicians shouldn’t be seen as doing anything different.

Posted by: Todd Trimble on October 1, 2009 6:05 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

‘Material’ as in “material set theory” is something I’d only heard in the last few months at latest. I just assumed it meant we were talking about a form of set theory founded on a global membership relation, like ZF, Bernays-Gödel, Morse–Kelley, etc.

Mike introduced the term to the discussion here. That is exactly what it means.

Posted by: Toby Bartels on October 1, 2009 8:11 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

Arnold wrote:

Thus you don’t need just a set A, called the alphabet, but you need a particular well-ordering of the set A before your prescription makes sense. In my view, giving a well-ordering to A is materializing the set A.

Well, that’s just one prescription, just something simple off the top of my head. It doesn’t have to be a well-ordering, but yes, the naming is definitely an additional structure, just as in the parable of $i$ and $-i$.

For example, in ETCS, if $[n]$ represents the coproduct of $n$ copies of a chosen terminal object $1$, then there are exactly $n$ elements $1 \to [n]$; they are all coproduct inclusions, and certainly they are all distinct. It may help to think of $[26]$ as a blob of twenty-six distinct points. The points are clearly distinct, but they look exactly alike; they are clones, if you will.

Then, you may assign them names however you please, writing next to them (or on their identical red shirts) ‘A’, ‘B’, …, ‘Z’, say. If you choose to close your eyes and they take off their shirts (in other words, if you forget the naming) and they permute among themselves, you obviously can’t retrieve the original naming. But, as long as the names are firmly attached, as long as you bear in mind the naming structure, you are free to use it, knowing for example where Mr. P went under some specified mapping $f: [26] \to \Delta$.

This sort of thing happens at the formal level too. For example, part of the structure of $[2]$ as the so-called “subobject classifier” is a given element $1 \to [2]$ which is traditionally called “true”. Such an element is considered part of the structure of the subobject classifier as such. With that structure firmly attached, you are then in a position to set up a well-defined bijective correspondence between functions $f: X \to [2]$ and subsets of $X$, by considering $f^{-1}(\mathrm{true}) \subseteq X$. You could have chosen the other element of $[2]$ as your “true”, of course, but whichever element you chose, you stick to it and remember it for future reference.
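A finite toy model of this correspondence can be written down directly (illustrative Python only; function and variable names are invented, and of course ETCS itself is not about Python dictionaries):

```python
# Toy model of the subobject-classifier bijection for finite sets:
# subsets of X correspond to functions X -> [2] via the preimage of 'true'.

TRUE, FALSE = 1, 0   # the chosen element 'true' of [2] = {0, 1}

def characteristic(subset, X):
    """The function chi: X -> [2] classifying the subset."""
    return {x: (TRUE if x in subset else FALSE) for x in X}

def preimage_of_true(chi):
    """Recover the subset as chi^{-1}(true)."""
    return {x for x, v in chi.items() if v == TRUE}

X = {1, 2, 3, 4}
S = {2, 4}
# Round trip: subset -> characteristic function -> subset.
assert preimage_of_true(characteristic(S, X)) == S
```

Had we chosen the other element of `[2]` as `TRUE`, the correspondence would work just as well; the point is only that one choice is fixed and remembered.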

Posted by: Todd Trimble on October 1, 2009 2:13 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

TT: there are exactly n elements 1→[n]; they are all coproduct inclusions, and certainly they are all distinct. It may help to think of [26] as a blob of twenty six distinct points.

Yes, this is a different way of creating materially a set of 26 nameable elements, and again, it is not a pure set but a set with additional structure. Mathematicians very rarely use pure sets!

This reminds me of the $C_{abcd}$ problem, which still puzzles me. I’d like to know your answer to my query:

Let $C_{abcd}$ be the category whose objects are the symbols a, b, c, d, with exactly one morphism between any two objects, composing in the only consistent way. Let the categories $C_{abc}$ and $C_{abd}$ be defined similarly. Clearly, these are both subcategories of $C_{abcd}$, with the identity as the inclusion functor. But I can compare their objects for equality.

Do you agree that from the material point of view (e.g., with categories modelled inside ZF, as in Lang’s book), this reasoning is correct?

If not, what is contrary to the axioms?

And if my reasoning is right from the material point of view, which extra axioms (in addition to what is in Wikipedia, or Lang, or Asperti and Longo) characterize the permitted ways of structural reasoning?

Posted by: Arnold Neumaier on October 1, 2009 3:20 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

Todd’s description of the complex numbers reminds me of debates I heard seven or so years ago about the structuralism then popular in the philosophy of mathematics, which said that mathematical entities are patterns, and that all that matters about elements of a pattern is their properties invariant under isomorphism. The idea here was to explain how $2$ is merely a place in a pattern, however it is realised set-theoretically.

Someone pointed out that this would entail identifying $i$ and $-i$ in the complex numbers, since nothing distinguishes them according to their place within the structure of the complex numbers. After discussion with John, I realised that we are often not careful saying what we mean by $\mathbb{C}$. There’s a difference between the field $\mathbb{R}[x]/(x^2 + 1)$ and the same field with the extra structure of a choice of a residue class to be designated $i$. They belong to different categories.

In the first case, there are two automorphisms on the object; in the second case, only one, but there’s another object with the same image under the functor which forgets the structure.

Posted by: David Corfield on October 1, 2009 11:06 AM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

DC: There’s a difference between the field $\mathbb{R}[x]/(x^2+1)$ and the same field with the extra structure of a choice of a residue class to be designated $i$. They belong to different categories.

Does choosing a notation really change the category an object belongs to? This would make the conversion headache in the structural approach even worse.

Does the monoid $\mathbb{N}$ of natural numbers under addition no longer belong to the category of monoids if I add the conservative definition $2 := 1+1$?

Similarly, why can’t I put $i := x \bmod (x^2+1)$ to define the imaginary unit in $\mathbb{R}[x]/(x^2+1)$ without changing the category the latter object belongs to?

This does not affect the existence of the automorphism induced by $i \mapsto -i$.

Or do you hold that each definition changes the type of an algebraic structure? This would make things extremely unworkable formally!

Posted by: Arnold Neumaier on October 1, 2009 3:33 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

Does the monoid $\mathbb{N}$ of natural numbers under addition no longer belong to the category of monoids if I add the conservative definition $2 := 1+1$?

Similarly, why can’t I put $i := x \bmod (x^2+1)$ to define the imaginary unit in $\mathbb{R}[x]/(x^2+1)$ without changing the category the latter object belongs to?

This does not affect the existence of the automorphism induced by $i \mapsto -i$.

I think David said it right, but it’s slightly subtle. The way to reconcile it with the point you’re making is by recognizing that, considering $\mathbb{R}[x]$ as an abstract $\mathbb{R}$-algebra, it’s not clear which element is $x$ until you say so. Thus, there’s an automorphism on $\mathbb{R}[x]$ which sends $x$ to $-x$, and either (or indeed any $ax + b$ with $a \neq 0$) could be considered a distinguished generator of the polynomial algebra. Giving a generator $1 \to U(\mathbb{R}[x])$ (here $U$ denotes the appropriate underlying-set functor) is thus adding some extra structure to the algebra.

A typical categorical response to all this is to define $\mathbb{R}[x]$ to be the free $\mathbb{R}$-algebra on one generator, which has a materializing or concretizing effect. More explicitly, this involves a universal property: when we say “free algebra on one generator”, we mean (to be precise) that there is given a function $i: 1 \to U(\mathbb{R}[x])$, traditionally called ‘$x$’, such that for every function $f: 1 \to U(A)$ into the underlying set of an $\mathbb{R}$-algebra $A$, there exists a unique $\mathbb{R}$-algebra homomorphism $\phi: \mathbb{R}[x] \to A$ such that $f = U(\phi) \circ i$. And there: this formulation involving the universal function $i$ gives you a distinguished element which people usually call $x$.

Also note that $\mathbb{R}[x]$ equipped with this distinguished element $i: 1 \to U(\mathbb{R}[x])$ has no non-trivial automorphisms. This is just an instance of a feature holding true for general universal properties.

(People often also say “free” to refer to a property: there exists a distinguished element such that… rather than giving the element at the outset as extra structure. Caveat lector.)

Posted by: Todd Trimble on October 1, 2009 5:05 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

AN: why can’t I put $i := x \bmod (x^2+1)$ to define the imaginary unit in $\mathbb{R}[x]/(x^2+1)$ without changing the category the latter object belongs to?

TT: indeed any $ax+b$ with $a \neq 0$ could be considered a distinguished generator of the polynomial algebra.

I don’t understand:

This should not matter in a purely structural view. If you change the generator, you also change the ideal and hence the resulting field, but in any case, the $i$ so defined will be the distinguished square root of $-1$ of this field. Since structurally everything is defined anyway only up to isomorphism, this gives exactly the right result, with a canonical $i$ that changes with the field considered.

TT: ℝ[x] equipped with this distinguished element i:1→U(ℝ[x]) has no non-trivial automorphisms.

This is true if you require that $i$ is preserved, but this is another reason why I find a purely structural point of view awkward.

I find it unacceptable that the concept of an automorphism changes simply by labeling an element. The world of pure structure is a strange world, not the world of the average mathematician.

The complex numbers as mathematicians generally use them have complex conjugation as an automorphism, although $i$ is distinguished but not preserved by this automorphism.

TT: The missing ingredient is that in general, structures defined by means of “universal elements” are defined up to canonical (uniquely determined) isomorphism.

Again I do not understand:

Two instances of the field of complex numbers (without a distinguished imaginary unit) are not structurally the same since there is no canonical isomorphism? This would be very strange indeed.

Posted by: Arnold Neumaier on October 1, 2009 6:59 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

Two instances of the field of complex numbers (without a distinguished imaginary unit) are not structurally the same since there is no canonical isomorphism? This would be very strange indeed.

I wouldn't say that they are not, in some sense, the same just because there is more than one isomorphism. (There is always a sense in which they are not the same, if they are represented differently syntactically. But that is not itself a question for mathematics.) I would say this: It is not only important whether things are isomorphic, but also in how many ways they are isomorphic; after all, $Iso(A,B)$ is not just a truth value, but a set (a meta-set, although usually also realisable internally as a set). In higher category theory, we even have $Equiv(A,B)$ as (in general) an $\infty$-groupoid!

Of course, there is more to say than just the cardinality of $Iso(A,B)$, such as the action on it by the monoid $Hom(B,B)$ and so on. But when $Iso(A,B)$ is a singleton, then things become much simpler, to the point that simply writing $A = B$ is an abuse of language that is easy to handle. If $Iso(A,B)$ is inhabited but (possibly) not a singleton, then writing $A = B$ is a little more dangerous; the danger only really comes to fruition, however, when you get loops $A = B = C = A$, since the composite isomorphism $A \to B \to C \to A$ might not be the identity.

Posted by: Toby Bartels on October 1, 2009 9:48 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

I am afraid, Arnold, that you did not attend carefully to what I wrote. I hope at least it was clear that I was trying to build a bridge of understanding between what you wrote and what David wrote. But, as I said, the mathematical point involved was slightly subtle, so I ask you to read again, with care.

Let me try again.

AN: why can’t I put $i := x \bmod (x^2+1)$ to define the imaginary unit in $\mathbb{R}[x]/(x^2+1)$ without changing the category the latter object belongs to?

TT: indeed any $ax + b$ with $a \neq 0$ could be considered a distinguished generator of the polynomial algebra.

I don’t understand:

This should not matter in a purely structural view. If you change the generator, you also change the ideal and hence the resulting field, but in any case, the $i$ so defined will be the distinguished square root of $-1$ of this field. Since structurally everything is defined anyway only up to isomorphism, this gives exactly the right result, with a canonical $i$ that changes with the field considered.

First of all, the part of mine that you quoted was lifted from between a pair of parentheses, where it was indeed a parenthetical aside. I now regret that aside, because it seems to have distracted you from the point I was trying to make.

Second, please note that the ideal generated by $x^2 + 1$ does not change if you replace $x$ by $-x$. That’s the point! There are two candidates in the polynomial algebra whose residue classes modulo this ideal yield square roots of $-1$, but these residue classes are different square roots of $-1$. It follows that if you haven’t chosen a candidate to work with, you haven’t uniquely specified which so-called canonical square root of $-1$ in this model you intended to label $i$! (And if you’ll recall, unique specification was what this discussion was originally about.)

You may think, “well, clearly I meant to choose $x$”, but knowledge of which element that is, is not encoded within the polynomial algebra structure; hence it is an extra piece of information in addition to the algebra structure.

TT: $\mathbb{R}[x]$ equipped with this distinguished element $i: 1 \to U(\mathbb{R}[x])$ has no non-trivial automorphisms.

This is true if you require that $i$ is preserved, but this is another reason why I find a purely structural point of view awkward.

I find it unacceptable that the concept of an automorphism changes simply by labeling an element. The world of pure structure is a strange world, not the world of the average mathematician.

The complex numbers as mathematicians generally use them have complex conjugation as an automorphism, although $i$ is distinguished but not preserved by this automorphism.

We are not “simply labeling an element”, we are also choosing an element to label. This is important for the purpose of making unique specifications, which are important for ‘material’ constructions according to the sense “material = being able to identify each element uniquely by a formal expression.”

Since we are not simply assigning a label but choosing an element to label, and since this choice is an extra datum or structure, it is logical for this discussion (which was to elucidate a point David made, not to discuss the behavior of “average mathematicians”) that we consider automorphisms which “remember” (respect) this extra structure.

What categories average mathematicians choose to work in is their business. It’s fine if they want their morphisms to ignore preservation of the chosen “ii”. Me: I’m flexible – I’ll work in whatever category is best suited to the discussion I’m having.

(With the little polemical dig “strange world”, I can’t resist adding my own: category theory in fact teaches one great flexibility in thinking. But this point is perhaps lost on someone who often whines about categorical straitjackets, on rather thin and not terribly well-informed evidence.)

I’ll also add, for what it’s worth, that this category, the one whose objects are pairs $(A, a: 1 \to U(A))$ consisting of algebras and elements in their underlying sets, and whose morphisms are algebra homomorphisms that preserve the elements thus distinguished, is an example of what we category theorists call a comma category, a very important tool. Comma categories are extremely relevant to discussions in which adjoint pairs of functors crop up (just about everywhere, in case you didn’t know), including in particular free functors which are adjoint to forgetful functors, and more particularly the polynomial algebra functor which is left adjoint to the forgetful functor from algebras to sets, which I touched upon over here.

TT: The missing ingredient is that in general, structures defined by means of “universal elements” are defined up to canonical (uniquely determined) isomorphism.

Again I do not understand:

Two instances of the field of complex numbers (without a distinguished imaginary unit) are not structurally the same since there is no canonical isomorphism? This would be very strange indeed.

Your quotation is taken from another comment, here. But please attend closely to what I said: I said structures defined by means of universal elements. The main example from that comment was the polynomial algebra $\mathbb{R}[x]$ equipped with a universal element $i: 1 \to U(\mathbb{R}[x])$. Did I speak of the complex numbers there? No, I did not. But could I speak of canonical isomorphisms if $\mathbb{C}$ is considered as also coming equipped with an element $i: 1 \to \mathbb{C}$ which is universal among $\mathbb{R}$-algebras equipped with a chosen square root of $-1$? Yes, I could.

Please observe as well that I added that missing ingredient because I thought the little sound-bite you gave for “structural” was a bit thin and needed more. That ingredient I consider particularly relevant for building a bridge between ‘structural’ and your sense of ‘material’. But please also note that I said “a missing ingredient” – I wasn’t pretending to exhaust the meaning of ‘structural’.

As to your question, though, Toby has given an informed reply. There is rather more to be said than can be encapsulated within a brief aphorism.

Posted by: Todd Trimble on October 2, 2009 5:44 AM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

Then please tell me which function from 1 to A is the letter w.

The letter ‘w’, of course.

Are you suggesting that you have another way to answer the question, which letter is the letter w?

Posted by: Toby Bartels on October 1, 2009 9:10 AM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

AN: Then please tell me which function from 1 to A is the letter w.

TB: The letter ‘w’, of course.

This is like answering ‘the expression A_5’ in response to ‘Which group is A_5?’. It doesn’t explain anything. You are simply pushing things you don’t like to the metalevel, as if this would solve the problem.

TB: Are you suggesting that you have another way to answer the question, which letter is the letter w?

I didn’t ask which letter is the letter w but which function from 1 to A is the letter w.

In a material set theory with urelements, you have A={a,…,w,x,y,z}, and w is a well-defined urelement.

The point is that there must be a way to tell a computer what is meant by w, and this can only be done on a formal level involving material objects.

Posted by: Arnold Neumaier on October 1, 2009 10:25 AM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

I didn’t ask which letter is the letter w but which function from 1 to A is the letter w.

Yes, and I defined a letter to be (following the framework of ETCS) a function from 1 to A.

In a material set theory with urelements, you have A={a,…,w,x,y,z}, and w is a well-defined urelement.

But which urelement is w?

Really, I have not the faintest idea what your question is asking! Please, can you answer it for me in your framework, so that I can answer it for you in mine?

Posted by: Toby Bartels on October 1, 2009 7:29 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

Arnold writes:

So what ultimately counts is the practical point of view. Here the advantage of the material point of view is very clear. After all, we already need a material free monoid to communicate mathematics. Then, the material point of view is nearly obvious to any newcomer, making for a simple entrance and plenty of very elementary exercises that lead to mathematical insight, while the structural point of view emerges only after having digested enough of more elementary material mathematics.

I don’t understand what you mean by “we already need a material free monoid to communicate mathematics.” Please explain.

The advantage of material set theory from a practical point of view will not be at all clear to some of us here; quite the contrary. In fact, I argued here that there are strong practical advantages of structural set theory – “practical” in the sense of being faithful to working practice of contemporary mathematics. In particular, I argued that a categories-based set theory, by focusing on the relevancy of universal properties, is at a formal level very directly concerned with mathematical essence – getting at the heart of what contemporary mathematicians need sets for and what they do with them – while at the same time eliminating extraneous and irrelevant features which manifest themselves in material set theory.

The basic argument you seem to be making is that structural set theory is harder to learn than material set theory. I think Mike, with his SEAR, makes a good case that that need not be true. Thus, I reject

the structural point of view emerges only after having digested enough of more elementary material mathematics

as mere assertion. Clearly the real test of the pedagogical viability of structural set theory is in the classroom. I’m happy to say that I’ve incorporated structural ways of thinking into undergraduate courses I’ve taught, and Toby says the same for himself. So these are not completely idle claims.

Posted by: Todd Trimble on September 30, 2009 5:55 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

There is an asymmetry that is still missing.

In a material theory, structural objects are constructed as […]. Then one can do all structural mathematics inside suitable such […]. However, to do so for nontrivial mathematics requires numerous abuses of language and notation; otherwise the description becomes tedious and virtually incomprehensible.

In a structural theory, material objects are constructed as […]. Then one can do all material mathematics inside suitable such […]. However, to do so for nontrivial mathematics requires numerous (but different) abuses of language and notation; otherwise the description becomes tedious and virtually incomprehensible.

The asymmetry is this: While one can construct material sets as you described, still for the purposes of normal mathematics there is no reason whatsoever to do so. When we structuralists hear an ordinary mathematician describe something about, say, Lie algebras, we immediately turn it into our own language (where, as Goethe would say, it might mean something completely different) and think about it that way. We definitely do not construct material pure sets and think of a Lie algebra as a Kuratowski pair. And we find that we have no difficulty in communicating with the Lie algebraist this way; they can't even tell that we are doing this.

Even when we hear set theorists talk about large cardinals, we still don't bother to construct material sets; if we only care about sets up to cardinality, then we're still talking about objects of the category Set of structural sets. (Now sometimes the set theorists can tell if we're using categorial model theory, but that's perfectly valid in a material framework too.) Only if the set theorists bring up the von Neumann hierarchy do we need to construct material sets.

To be fair, the material set theorist doesn't really have to construct structural objects either, at least not in the formal way that you describe. But they still have to deal with certain categories and ignore the membership structure of the objects and morphisms in these categories; they know intuitively what to ignore (which is why anything that they say can be translated so readily into our language), but for us it is automatic.

Thus I favor a declarative theory similar to FMathL, which accounts for the actual mathematical language and needs no abuses of language.

I still want to see how you will interpret ‘An ordered monoid is a set that is both ordered and a monoid, such that ….’ with no abuse of language. Good for you if you can do it! But if abuses of language are unavoidable, and one must work to formalise their meaning rather than to define everything in such a way that they are already literally valid, then I'm just as happy to add one more for ‘a function on A’ when A was declared to be a subset.

From a logical point of view, there is the additional question of proof power of the two views. I’d find it surprising if there were a structural theory with proof strength equivalent to that of ZFC

Notice that the only reason that anyone ever linked to pure set was to indicate (very roughly, of course) how such an equivalence would be proved. (If you don't accept that we've put in enough details to establish that, very well; even I am relying more on my intuition and Mike's judgement than a careful check of Mike's argument about Collection.) But this is not necessary to understand how ordinary mathematics may be formalised in structural set theory (especially since ordinary mathematics doesn't even need high-powered set-theoretic axioms like Collection).

After all, we already need a material free monoid to communicate mathematics.

I don't understand what you mean by this. Aren't the elements of this free monoid simply words? (properly, strings of characters). Why do words need to have material elements??? (Of course, they need to have letters, but those are different.)

Then, the material point of view is nearly obvious to any newcomer,

I dispute this too.

The newcomer will think that they know what a set is, until you tell them that everything is a set, which they will find odd. You can avoid this, at least at first, with Urelemente, but eventually you'll do something like construct the set of real numbers, and then they will learn that a real number is a set, which is odd. Meanwhile, the structural set theorist, whose sets all have anonymous elements, has all along said that a set is merely a way to encode or describe certain things; we have never pretended that the elements of the set are those things, and so it is no surprise when a real number may be encoded as or described by, say, a set of rational numbers.

And at some point you must tell them that they are not allowed to take the set of all sets (or if they are, that they are at any rate not allowed to take the set of all sets that do not belong to themselves), which is no worse than telling them that they are not allowed to compare elements of two sets without some explicit way (such as a bijection between the sets) of comparing them. At least I know how to motivate the latter (but to be fair, the former also has to be explained, although perhaps later, by our group).

On the other hand, for many problems, both the material and the structural perspective offer insights. Therefore a good foundation of mathematics should offer both views.

I agree with this (well, at least the second sentence). But you've already agreed that either perspective allows one to formalise the other.

Posted by: Toby Bartels on September 30, 2009 6:42 PM | Permalink | Reply to this

Re: What is a structured object?

So part of the problem appears to lie in that you switch between different points of view (formal object, or only a way of speaking that can be formalized only by eliminating the concept) about what a group is.

This is a fair criticism; I think we’ve been a bit sloppy about this in the foregoing discussion. The problem is that category theory which deals with large categories is hard to formalize in any kind of set theory. Neither ZF nor SEAR nor ETCS has an intrinsic object called “a large category.” In ZF, one “defines” a “proper class” to be specified by a first-order formula, and then a “large category” to be a “meta-category” whose objects and arrows are proper classes.

In structural set theory, one way to “define” a “large category” is to give a finite graph D_C together with a couple of first-order formulas obj_C and arr_C with free variables labeled by the vertices and edges of D_C. “An object” of this category is then a diagram of shape D_C in Set (hence, a collection of sets and functions) such that obj_C holds with the appropriate variables substituted, and likewise for “a morphism”.

Neither of these situations is really completely satisfactory. In ZF one can extend the theory to NBG or MK or add universes, and redefine “large category” to mean “category whose set of objects is not necessarily an element of the universe.” One structural counterpart of this is algebraic set theory in which classes, rather than sets, are the objects of the basic category under consideration, and there is a notion of “smallness” such that “sets” are the small classes. I feel that a more structural version of this considers a 2-category of large categories, rather than a category of classes, since in practice one rarely cares about the objects of a proper class up to more than isomorphism; I have some axioms for such a 2-category written down but haven’t put them up anywhere yet.

So, although when talking informally about category theory, I tend to think “2-structurally,” I’m not sure whether there yet exists a formal system which really captures what I mean by this. Thus there are really two questions here: the suitability of structural set theory for “small mathematics,” and its potential extensions to a “structural category theory” or “structural class theory” adequate for dealing with large categories (and which could hopefully be extended to treat extra-large 2-categories, XXL 3-categories, etc.).

Posted by: Mike Shulman on September 23, 2009 4:16 PM | Permalink | PGP Sig | Reply to this

Re: What is a structured object?

BTW, if we want to keep talking about the structural viewpoint on large categories and we want a formal setting in which to do it, universes in structural set theory should be perfectly adequate. Their main flaw is that they permit evil, but this shouldn’t be essential for understanding the issues in a structural viewpoint. David Roberts and I have been working on the axioms for universes in SEAR.

I don’t have time right now to explain how one goes about constructing a category of small sets, and thence a category of small groups, from a universe, but maybe someone else can.

Posted by: Mike Shulman on September 24, 2009 5:41 AM | Permalink | PGP Sig | Reply to this

Re: What is a structured object?

MS: A group in SEAR consists of a set G and an element e∈G and a function m:G×G→G, such that certain axioms are satisfied. A group in SEAR is not a single thing in the universe of discourse.

AN: So part of the problem appears to lie in that you switch between different points of view (formal object or only a way of speaking that can be formalized only by eliminating the concept) about what a group is.

MS: This is a fair criticism; I think we’ve been a bit sloppy about this in the foregoing discussion. The problem is that category theory which deals with large categories is hard to formalize in any kind of set theory.

This has nothing at all to do with large categories. Consider the category of finite groups. Its objects are finite groups, not group structures on a finite set. Thus you need to have the concept of finite group as an object rather than as a metaobject that cannot be formalized except by eliminating it from the formal representation.
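For concreteness, the distinction at issue, between a group structure on a fixed carrier and a group as a single bundled object, can be sketched in Lean-style notation (the names GroupStr and Grp are mine, not part of SEAR or ZF):

```lean
-- Unbundled view: "a group structure on G" is data and axioms about
-- a fixed carrier G; the group is not itself a single object here.
structure GroupStr (G : Type) where
  e     : G
  m     : G → G → G
  assoc : ∀ a b c, m (m a b) c = m a (m b c)
  idL   : ∀ a, m e a = a
  invEx : ∀ a, ∃ b, m a b = e ∧ m b a = e

-- Bundled view: "a group" as a single object, packaging the carrier
-- together with its structure; a category of groups would take
-- these bundled objects as its objects.
structure Grp where
  carrier : Type
  str     : GroupStr carrier
```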

MS: This is not a problem for formalization at a low level, but it may be undesirable when trying to formalize at a higher level, for all the reasons that you’ve given. But it doesn’t prevent SEAR from reflecting on itself formally.

It does. Reflection means being able to define a copy of SEAR inside SEAR, including all the language used to define this copy. (This is independent of any computer implementation. The latter, of course, must in addition care about efficiency, which causes some additional problems for theories where important concepts like that of a group are not a single thing in the universe of discourse.)

This means you need to start by calling the elements of certain SEAR sets characters, then creating a SEAR model of text, then creating a SEAR model of context-free languages to express formulas and phrases, then creating a model of variables, type declarations, axioms, definitions, assertions, and proofs, and then stating in this language the SEAR axiom system together with the definitions and assertions needed to explain the terminology used in the axioms.

Then, and only then, you can speak of having SEAR as a foundation.

MS: I do assert that structural set theory is a sufficient low-level foundation for mathematics on a par with ZF, and I believe that it is closer to the way mathematicians treat sets in everyday practice.

With ZF in place of SEAR, all of the above has been done at various levels of detail, and one can find for each step literature expanding on it in fairly detailed ways.

But I do not see how you can do this consistently with SEAR.

Apparently, you cannot even define formally the concept of a category while avoiding the problems with the category C_{1234} I had mentioned. (For definiteness, here I specialize a, b, c, d to elements of the natural numbers inside SEAR.)

And there are ZF-based texts, like Bourbaki and Lang, which introduce each permitted abuse of language before using it. SEAR abuses the language without any excuse, and without saying how to undo the abuses if one wants to be more careful.

I know that it is a time-consuming task to repeat this for new foundations, and I neither expect that you do this quickly nor that a single person should be expected to do it.

But I’d expect that you don’t assert something you are so far from having achieved.

Posted by: Arnold Neumaier on September 24, 2009 9:14 AM | Permalink | Reply to this

Re: What is a structured object?

AN: Apparently, you cannot even define formally the concept of a category while avoiding the problems with the category C_{1234} I had mentioned.

Just to clarify: I didn’t mean obstacles related to large cardinals. The problem arises even for categories all of whose objects are finite sets equipped with extra structure.

Posted by: Arnold Neumaier on September 24, 2009 11:04 AM | Permalink | Reply to this

Re: What is a structured object?

I’ve enjoyed reading this vigorous exchange.

In “Introduction to higher order categorical logic” by Joachim Lambek and P. J. Scott, I noticed that they recommend type theory as a foundation for mathematics rather than either category theory or set theory.

Bertot and Casteran, Interactive Theorem Proving (Coq)
“Amokrane Saibi showed that a notion of subtype with inheritance and implicit coercions could be used to develop modular proofs in universal algebra, and most notably, to express elegantly the main notions in category theory.”

http://pauillac.inria.fr/~saibi/Cat.ps by Amokrane Saibi (Coq)

“We then construct the Functor Category, with the natural definition of natural transformations. We then show the Interchange Law, which exhibits the 2-categorical structure of the Functor Category. We end this paper by giving a corollary to Yoneda’s lemma.
This incursion in Constructive Category Theory shows that Type Theory is adequate to represent faithfully categorical reasoning. Three ingredients are essential: \Sigma-types, to represent structures, dependent types, so that arrows are indexed with their domains and codomains, and a hierarchy of universes, in order to escape the foundational difficulties. Some amount of type reconstruction is necessary, in order to write equations between arrows without having to indicate their type other than at their binder, and notational abbreviations, allowing e.g. infix notation, are necessary to offer the formal mathematician a language close to the ordinary informal categorical notation.”

SH: Perhaps this is interesting.

Posted by: Stephen Harris on September 24, 2009 2:29 PM | Permalink | Reply to this

Re: What is a structured object?

Type theory is, indeed, a very nice foundation for mathematics, which is very closely related to structural set theory. In fact, Bounded SEAR is nearly indistinguishable from type theory, and ETCS is also basically equivalent to it. However, my opinion (and this is only my opinion) is that type theory is harder for mathematicians without training in logic to understand, whereas they are quite used to thinking in terms of sets, relations, and functions. Perhaps this is only a relic of the ascendancy of material set theory as a foundation for so many years. Perhaps it is an artifact of the viewpoint taken by most textbooks on type theory.

Posted by: Mike Shulman on September 24, 2009 6:02 PM | Permalink | PGP Sig | Reply to this

Re: What is a structured object?

Perhaps it is an artifact of the viewpoint taken by most textbooks on type theory.

I blame this. Most books on ‘type theory’ are about logic; most books on ‘set theory’ (even if structural) are about mathematics. But I see ‘type’ and ‘set’ as nearly interchangeable, although ‘type’ can also be used in a broader context (for which there are other words if I want to be more specific, such as ‘preset’, ‘class’, or even —conjecturally for me— ‘\infty-groupoid’).

Posted by: Toby Bartels on September 25, 2009 8:28 PM | Permalink | Reply to this

Re: What is a structured object?

But I see ‘type’ and ‘set’ as nearly interchangeable

Here are some differences in the way I think of them:

  • The elements of a set are always equipped with a notion of equality, while the elements of a type need not be.
  • In type theory, one cannot quantify over all types (although one can fake it with universes), whereas in set theory one (potentially) can.
  • The previous point is perhaps a consequence of a “level” distinction. Constructions on sets are either specified by operations or by axioms which are part of the theory. But I think type constructors are usually viewed as syntactic judgements external to any theory. (Probably I’m not using the buzzwords correctly here, but hopefully you get my meaning.)
  • Type theory can be more flexible, e.g. it can be interpreted in fibered preorders rather than in categories. I’m not sure how to do that with set theory.

Admittedly, these are all subtle distinctions.
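The first bullet, that the elements of a set come with a notion of equality while those of a type need not, can be sketched in Lean-style notation (the name Setoid' is mine; Lean's own Setoid is similar):

```lean
-- A "set" in this sense: a carrier type packaged with an
-- equivalence relation serving as its equality.
structure Setoid' where
  carrier : Type
  eqv     : carrier → carrier → Prop
  refl    : ∀ a, eqv a a
  symm    : ∀ {a b}, eqv a b → eqv b a
  trans   : ∀ {a b c}, eqv a b → eqv b c → eqv a c

-- Example: Nat with its ordinary propositional equality.
def natSetoid : Setoid' where
  carrier := Nat
  eqv     := Eq
  refl    := fun _ => rfl
  symm    := fun h => h.symm
  trans   := fun h₁ h₂ => h₁.trans h₂
```

A bare type, by contrast, carries no chosen eqv; any equality on it must be supplied separately.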

Posted by: Mike Shulman on September 25, 2009 10:29 PM | Permalink | PGP Sig | Reply to this

Re: What is a structured object?

I wrote:

‘type’ can also be used in a broader context

and Mike wrote:

Type theory can be more flexible

I would say that not every type theory is a set theory, far from it; but every set theory is a type theory. Types can (and usually do, in my experience) have equality predicates but (as you note) need not; in Martin-Löf's original ‘impredicative’ Intuitionistic Type Theory (the one that turned out inconsistent by Burali-Forti), you can quantify over all types, so the term ‘set’ doesn't have a monopoly on that idea. I don't see type constructors as external to type theory; I don't know what you're trying to say there.

And I wouldn't be too averse to somebody's using the term ‘set’ more flexibly either. It's not that different from our use of ‘set’ to mean, basically, a structured set with all of the extra structure removed, which AN correctly objects is inconsistent with its use, by Cantor and the material set theorists who followed him, to mean a part of some universe (originally the real line, eventually the von Neumann hierarchy). We can respond to AN that there is now substantial literature that uses the term in this way (and a vast literature in which it is easily interpreted in this way), which this hypothetical more flexible person may not have; but if we discover some group of mathematicians that does use ‘set’ for, say, something without an equality predicate, then I wouldn't have any standing to complain (even though I would rather call that particular sort of thing ‘preset’ myself).

Posted by: Toby Bartels on September 26, 2009 12:53 AM | Permalink | Reply to this

Re: What is a structured object?

in Martin-Löf’s original ‘impredicative’ Intuitionistic Type Theory (the one that turned out inconsistent by Burali-Forti), you can quantify over all types, so the term ‘set’ doesn’t have a monopoly on that idea.

Is there a consistent type theory in which you can quantify over all types? The way I think of type theory, quantifiers are tied to quantifying over elements of some type.

I don’t see type constructors as external to type theory; I don’t know what you’re trying to say there.

I think what I mean is that where type theory has type constructors, which are operations on types, set theory often has existence axioms about sets. Admittedly the distinction is not always possible to see.

Posted by: Mike Shulman on September 26, 2009 4:57 AM | Permalink | PGP Sig | Reply to this

Sets vs types

Is there a consistent type theory in which you can quantify over all types?

Sure, \mathbf{SEAR} for example.

I know, you call \mathbf{SEAR} a ‘set theory’ instead of a ‘type theory’, but if that's only because it allows quantification over all types, then the argument is circular. Meanwhile, we've got Arnold Neumaier objecting that \mathbf{SEAR} is not a set theory because it's not material; membership in sets should be a predicate, and making it a typing declaration is a give-away that you've really got a type theory (although AN said ‘copies of cardinal numbers’, and later ‘universes’, instead of ‘types’). There is a historical basis for either distinction.

I don't see type constructors as external to type theory; I don't know what you're trying to say there.

I think what I mean is that where type theory has type constructors, which are operations on types, set theory often has existence axioms about sets. Admittedly the distinction is not always possible to see.

No wonder you didn't want me to introduce Cartesian products as an operation in \mathbf{SEPS}; you were trying to build a set theory rather than a type theory! Of course, if one's type theory sticks to propositions-as-types, then it really can't tell the difference between these. On the other hand, even material set theory can be written down using operations; I can't think of a reference now, but 10 years ago I was working out how to eliminate existential quantifiers from the \mathbf{ZF} axioms entirely.

Posted by: Toby Bartels on September 27, 2009 1:37 AM | Permalink | Reply to this

Re: Sets vs types

You seem to have a more expansive notion of what “type theory” means than I’ve encountered anywhere else. In part D of the Elephant, or in Jacobs’ Categorical Logic and Type Theory, type theory is given a specific meaning: there are types, function symbols, terms, type constructors (such as products and sums, possibly dependent), and so on. If we allow a logic on top of the type theory (or fake it with propositions-as-types), then there are relation symbols and formula constructors as well, such as \wedge, \vee, \Rightarrow, \exists, etc., with inference judgements such as “if \phi is a formula containing a free variable x of type A, then \exists x:A.\phi is a formula without such a free variable.” Type theory together with logic might also be called “typed first-order logic.”

By contrast, SEAR is formulated in a typed first-order logic, but the types involved are “set”, “relation”, and “element.” Just as ZF is formulated in a single-sorted first-order logic, where the elements of the single sort are called “sets”. SEAR looks kind of like type theory because when A has the type “set,” the dependent type “element of A” looks a lot like calling A itself a type. But in type theory as I have learned it from the references above, one cannot write something like “for all types A”, since every variable must have a type and there is no type of all types (at least, not if you want to avoid paradoxes). But perhaps I have learned too narrow a meaning of “type theory”; can you point me to any references that use it more expansively?

Posted by: Mike Shulman on September 27, 2009 9:22 PM | Permalink | PGP Sig | Reply to this

Re: Sets vs types

By contrast, SEAR is formulated in a typed first-order logic, but the types involved are “set”, “relation”, and “element.”

Yes, but type theory itself is also formulated in a typed first-order logic, where the types involved are ‘type’, ‘term’, ‘proposition’, and the like. There is, in my opinion, a significant difference between a type theory such as that which underlies \mathbf{SEAR}, in which all of the types are listed up front once and for all, and a type theory such as Martin-Löf's, in which enough generic type constructors are given that one can formalise all of ordinary mathematics. In fact, I would say this difference is greater than that between the second kind of type theory and structural set theory, and the difference between material and structural set theory is not really smaller.

But in type theory as I have learned it from the references above, one cannot write something like “for all types AA”, since every variable must have a type and there is no type of all types (at least, not if you want to avoid paradoxes). But perhaps I have learned too narrow a meaning of “type theory;” can you point me to any references that use it more expansively?

I think that the problem is that type theorists never invented a word analogous to ‘class’ in set theory; if they had, then nobody would say that every variable must have a ‘type’, since they would use this new word instead. But suppose that material set theory had developed differently, never inventing the word ‘class’, but instead always using ‘set’ for the general notion and ‘small set’ for the more restrictive case. Then the axioms of separation and collection (to keep their meaning the same as they have now in \mathbf{ZFC}) would only apply to formulas whose variables are all bounded by some set, and while one can write down other formulas, one cannot actually do anything with them; all that we have done is to develop \mathbf{NBG} in a different language.

As I said, Martin-Löf wrote down a theory in which one can say ‘for all types AA’, but it was inconsistent. One can make a consistent version as follows: replace the word ‘type’ everywhere by ‘small type’, except in the phrase ‘type of all types’, where only the second ‘type’ is replaced; this would be perfectly analogous to the use of ‘small set’ above. I would now like to cite that Martin-Löf did just this, but he did not; instead, he developed a stronger theory with a hierarchy of universes, in each of which all type constructors may be used. But it seems to me that if type theory without universes is ‘type theory’ and type theory with a hierarchy of universes is ‘type theory’, then type theory with a single fixed universe of small types, in between these two, is also ‘type theory’.
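The hierarchy-of-universes design survives in modern proof assistants; a Lean-style sketch of the idea (illustrative only, with myId a name of my own choosing):

```lean
universe u

-- There is no type of all types; instead there is a cumulative
-- hierarchy: Type 0 : Type 1, Type 1 : Type 2, and so on.
#check (Type 0 : Type 1)
#check (Type 1 : Type 2)

-- "Quantifying over all types" is really quantification over the
-- types in one fixed universe of "small" types:
def myId : ∀ (A : Type u), A → A := fun _ a => a
```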

Some people (I think Beeson, and since I'm already going to look up something else in that for you, I'll try to check this too) distinguish ‘set’ and ‘type’ by whether the theory is material or structural; to them, \mathbf{SEAR} is, like \mathbf{ETCS}, already a ‘type’ theory. (For what it's worth, that's how I used the words before you convinced me that there was no reason to do this.) You seem to distinguish them by whether one can quantify over all of them when defining one of them, which is also reasonable but not the only way to do things (and then \mathbf{ETCS} is still a ‘type’ theory). Another way to distinguish them is to say that a ‘set’ has an arbitrary equality relation, while a ‘type’ has none (or has only syntactic identity); that is done here for example (although using ‘preset’ is probably a more precise way to do this). There are many distinctions that can be made in one's style of foundations, but I don't see any of them as an essential or universal distinction between these two words, nor do I see the need for such a distinction.

Posted by: Toby Bartels on September 28, 2009 1:57 AM | Permalink | Reply to this

Re: Sets vs types

I fully agree that there is a continuum of theories, and it is by no means a priori clear where to draw the line between “type theory” and “set theory.” But we have to have words that mean something, or we’ll never know what we’re talking about!

I had a lengthy email exchange with Thomas Streicher several months ago about more or less this question. We did a lot of not understanding what each other was saying, and we got especially confused because we were also talking about interpretability of theories internal to a non-well-pointed topos. The metric of quantifiers over all sets/types to distinguish “set theory” from “type theory,” which I’ve been adhering to here, is what came out of that discussion as a convention we could both agree on. (BTW, I don’t agree that ETCS is a type theory by that metric—the question is not whether quantifiers over sets are allowed in the separation axiom, but whether they exist at all in the language.)

It is certainly true that for many people, “set theory” means “material set theory,” so perhaps we structural-set-theorists should have just stuck with “type” instead of “set.” (Thomas also mentioned that when he was first learning topos theory, the use of “set theory” for the internal logic of a topos confused him because it was clear that set theory was stronger than type theory—another possible axis along which one could distinguish.) I do of course feel that there is something important to be gained by calling structural set theory “set theory” rather than “type theory”; in particular, it points out that this (and not material set theory) is really how sets are used by mathematicians (although apparently this can be harder to convince people of than I realized, pace AN!).

And I still think there is a difference between structural set theory and type theory.

By contrast, SEAR is formulated in a typed first-order logic, but the types involved are “set”, “relation”, and “element.”

Yes, but type theory itself is also formulated in a typed first-order logic, where the types involved are “type”, “term”, “proposition”, and the like.

I agree that type theory can be formulated in such a way, but it can also stand alone as such a theory itself. To borrow the metaphor of programming languages, type theory is a part of logic, which is the machine language of mathematics. You can write an interpreter for machine language in machine language (and you might want to, in order to run it on some other architecture), but you can also run it directly on the machine it was written for. But SEAR must be compiled/interpreted into type theory/logic; it is not the machine language of any machine.

Posted by: Mike Shulman on September 28, 2009 4:20 AM | Permalink | PGP Sig | Reply to this

Re: Sets vs types

Here is a contentful and important mathematical consequence of that difference. Type theories (in the sense that I am using the word) have a term model. That is, you can construct a topos (or a category with less structure, if your theory doesn’t require as much) which is the free topos containing an internal model of that theory. In particular, applying this to “IHOL” (the type theory corresponding to an ordinary topos) there is a free topos.

This is not true (at least, not as far as I can tell) for SEAR and other “structural set theories” which allow quantifiers over sets in their axioms. (You might have seen a draft of my UQ&SA paper in which I claimed that it was, but now I believe that is incorrect.)

In both cases you can also interpret the logic as happening “one level up,” as you suggested, and now in both cases there is a free model. But this sort of free model looks very different: now instead of a category whose individual objects represent the individual types/sets, we have a category containing a single “object of types” and a single “object of elements.”

What we get in this latter case can be thought of as a “free category of classes.” The category of small objects in a category of classes is a topos—but even if the category of classes satisfies its version of the stronger axioms like unbounded separation and collection, it does not in general follow that its category of small objects satisfies its version of them. All we can say is that the internal category of small objects satisfies these axioms in the internal logic of the category of classes.

Posted by: Mike Shulman on September 29, 2009 3:26 PM | Permalink | PGP Sig | Reply to this

Re: Sets vs types

I wrote:

Some people (I think Beeson, and since I’m already going to look up something else in that for you, I'll try to check this too) distinguish ‘set’ and ‘type’ by whether the theory is material or structural

Nothing so clear cut as that. Actually, Beeson seems to be confused; in Chapter II (Informal Foundations of Constructive Mathematics), he claims (Section II.3) to use Bishop's concept of set (which is definitely structural) and even notes that x = y is not globally meaningful. But then (Section II.9) he defines x ∈ Y whenever x ∈ X and X ⊆ Y, calling this a ‘difference in use of language’ from Bishop. And so it is, but it's not clearly explained.

All of the formal ‘set theories’ in Beeson are both material and based on first-order logic, while the only ‘type theories’ are those of Martin-Löf, so that doesn't help. The same is true in other references that I've just checked.

Posted by: Toby Bartels on October 3, 2009 12:58 AM | Permalink | Reply to this

Re: What is a structured object?

This has nothing at all to do with large categories. Consider the category of finite groups.

Ah, okay, I misunderstood your complaint.

The way to deal with this is the same as the way to deal with any sort of family of objects in structural set theory. A small category in structural set theory consists of a set C_0 of objects, a set C_1 of morphisms, functions s, t: C_1 → C_0, i: C_0 → C_1, and c: C_1 ×_{C_0} C_1 → C_1, with axioms as defined for instance here. If you want to consider the objects of such a category as “being” sets with structure, then you simply consider a C_0-indexed family of sets with structure and a C_1-indexed family of morphisms between them.
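This data is concrete enough to render directly. Below is a minimal Python sketch, under my own choice of representation (all field and example names are hypothetical, chosen only to match the letters in the definition above): a small category as sets C0 and C1 with structure maps s, t, i, c. Only the source/target, pullback, and unit axioms are checked here; associativity is omitted to keep the sketch short.

```python
# A hypothetical rendering of the data of a small category: sets C0, C1
# and structure maps s, t, i, c, with some of the axioms checked.
from dataclasses import dataclass

@dataclass
class SmallCategory:
    C0: frozenset    # set of objects
    C1: frozenset    # set of morphisms
    s: dict          # source map C1 -> C0
    t: dict          # target map C1 -> C0
    i: dict          # identity map C0 -> C1
    c: dict          # composition, defined on composable pairs

    def check(self):
        # s and t land in C0; identities have the right source and target
        assert all(self.s[f] in self.C0 and self.t[f] in self.C0 for f in self.C1)
        assert all(self.s[self.i[x]] == x == self.t[self.i[x]] for x in self.C0)
        # c is defined exactly on the pullback C1 x_C0 C1: pairs (f, g) with t(f) = s(g)
        assert set(self.c) == {(f, g) for f in self.C1 for g in self.C1
                               if self.t[f] == self.s[g]}
        # unit laws
        assert all(self.c[self.i[self.s[f]], f] == f == self.c[f, self.i[self.t[f]]]
                   for f in self.C1)
        return True

# Example: the "walking arrow" category, with one non-identity morphism u: a -> b.
walking_arrow = SmallCategory(
    C0=frozenset({"a", "b"}),
    C1=frozenset({"id_a", "id_b", "u"}),
    s={"id_a": "a", "id_b": "b", "u": "a"},
    t={"id_a": "a", "id_b": "b", "u": "b"},
    i={"a": "id_a", "b": "id_b"},
    c={("id_a", "id_a"): "id_a", ("id_b", "id_b"): "id_b",
       ("id_a", "u"): "u", ("u", "id_b"): "u"},
)
walking_arrow.check()
```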

(A small equivalent of) the category of finite groups, for instance, would be a category as above equipped with a C_0-indexed family of finite groups G and a C_1-indexed family of morphisms H between them, such that any morphism between groups in G occurs exactly once in H, and such that any finite group is isomorphic to one in G.

Unfortunately I don’t have time to explain in more detail right now exactly what is meant by “family” in all these cases, but it is not hard.

This is not a problem for formalization at a low level, but it may be undesirable when trying to formalize at a higher level, for all the reasons that you’ve given. But it doesn’t prevent SEAR from reflecting on itself formally.

It does. Reflection means being able to define a copy of SEAR inside SEAR, including all the language used to define this copy.

That is in fact what reflection means, but you haven’t explained why not having “a group” as a single object in the domain of discourse prevents it.

Then, and only then, you can speak of having SEAR as a foundation.

I don’t understand why reflection should be the defining test of a foundation. To me, saying that something is a foundation for mathematics means that it can be used to formalize all (or a substantial part) of mathematics. Logic is, indeed, an important part of mathematics, but only a part. Being able to compile its own compiler is an important test of a (compiled) programming language, but it is not the defining feature that enables us to call something a “programming language.” My impression is that generally by the time that a language is able to compile its own compiler, it is fairly well-accepted that it is, in fact, a programming language.

Regardless, if formalizing logic is what you want, I claim that logic, just like most of the rest of mathematics, is already written in an essentially structural way. For example, suppose one chooses to code logical sentences as natural numbers. This never depends on the specific definition of natural numbers as finite von Neumann ordinals or what-have-you; it only depends on the fact that they satisfy the induction property. Well, so do the natural numbers in SEAR or ETCS. Consider for simplicity a one-sorted theory with n binary function symbols, which we code by the natural numbers 0, 1, …, (n−1), and m binary relation symbols, coded similarly. We can then use the separation property to define a subset F of ℕ consisting of those natural numbers that code well-formed formulas in this language. A logical theory then consists of a subset of F, the axioms. A structure for this language is a set M, together with a function {0, 1, …, (n−1)} × M × M → M coding the function operations and a subset of {0, 1, …, (m−1)} × M × M coding the relation symbols. (Here, of course, {0, 1, …, (n−1)} denotes an n-element set equipped with a specified injection into ℕ that gives its elements meaning as natural numbers.) The inductive property of ℕ enables us to define the truth value of any formula on such a structure, so we can define a model of a theory to be a structure in which all the axioms are true.
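The last step, defining the truth value of a formula in a structure, can be sketched in a few lines of Python. This is an illustration under my own simplifications, not part of the argument: formulas are nested tuples rather than literal natural-number codes, only one binary function symbol and one binary relation symbol appear, and all the tags ("var", "fn", "rel", "forall") are names I made up.

```python
# Truth of formulas in a finite structure (M, fns, rels), where fns codes
# the binary function symbols 0..n-1 and rels the binary relation symbols.

def term_value(t, structure, env):
    """Value of a term: a variable ("var", name) or ("fn", j, t1, t2)."""
    _, fns, _ = structure
    if t[0] == "var":
        return env[t[1]]
    _, j, t1, t2 = t
    return fns[j](term_value(t1, structure, env), term_value(t2, structure, env))

def truth_value(formula, structure, env=None):
    """Truth of `formula` in `structure` under the variable assignment `env`."""
    env = env or {}
    M, fns, rels = structure
    tag = formula[0]
    if tag == "rel":                         # ("rel", j, t1, t2)
        _, j, t1, t2 = formula
        return (term_value(t1, structure, env),
                term_value(t2, structure, env)) in rels[j]
    if tag == "not":
        return not truth_value(formula[1], structure, env)
    if tag == "forall":                      # M is finite, so the quantifier
        _, x, body = formula                 # is a finite check
        return all(truth_value(body, structure, {**env, x: a}) for a in M)
    raise ValueError("unknown connective: %r" % tag)

# Example structure: Z/3 with addition as function 0 and equality as relation 0.
M = {0, 1, 2}
struct = (M,
          {0: lambda a, b: (a + b) % 3},
          {0: {(a, a) for a in M}})

x, y = ("var", "x"), ("var", "y")
commutes = ("forall", "x", ("forall", "y",
            ("rel", 0, ("fn", 0, x, y), ("fn", 0, y, x))))
idempotent = ("forall", "x", ("rel", 0, ("fn", 0, x, x), x))
```

On this structure, `commutes` comes out true and `idempotent` false, as expected for addition mod 3.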

In other words, all the work of reflection is already done. All that remains for structural set theory to do is point out that existing mathematics is already structural.

Posted by: Mike Shulman on September 24, 2009 5:58 PM | Permalink | PGP Sig | Reply to this

Re: What is a structured object?

This has been a very interesting discussion, and I hope Mike won’t mind (since he says he’s busy) if I touch upon some of what he was saying above, and outline a construction of an internal category of finite groups within a structural set theory.

As a warmup, let’s construct an internal category Fin equivalent to the category of finite sets. We take the set of objects Fin_0 to be ℕ, the set of natural numbers, with one element n ≥ 0 for each finite cardinality.

As Mike was saying, in order to construe objects n ∈ ℕ as giving actual finite sets, we construct a “family” φ: F → ℕ where each fiber F_n is a set of cardinality n. For example, consider the function

φ: ℕ × ℕ → ℕ : (m, n) ↦ m + n + 1

Then, for each n ≥ 0, the fiber φ^{-1}(n) is a set of cardinality n. This fiber will also be denoted [n].
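As a quick computational sanity check of this choice of family (purely illustrative, and the names `phi` and `fiber` are mine), one can compute the fibers by brute force and confirm that φ^{-1}(n) has exactly n elements:

```python
# Check that the fiber of phi over n has exactly n elements,
# namely (0, n-1), (1, n-2), ..., (n-1, 0).

def phi(m, n):
    return m + n + 1

def fiber(n, bound=50):
    """phi^{-1}(n), found by searching a finite box that certainly contains it."""
    return {(m, k) for m in range(bound) for k in range(bound) if phi(m, k) == n}

for n in range(8):
    assert len(fiber(n)) == n     # [0] is empty, [1] a singleton, and so on
```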

Next, using the existence of dependent products in a structural set theory like ETCS, one may construct the family of morphisms between finite sets,

ψ: Fin_1 → ℕ × ℕ,

where the fiber over (m, n) ∈ ℕ × ℕ is [n]^{[m]}, the set of functions from [m] to [n]. In other words, an element f of Fin_1 “is” a function between finite sets. Let us write dom(f) for the first component of ψ(f) and cod(f) for the second component, so that ψ(f) = ⟨dom(f), cod(f)⟩. This gives us functions

dom, cod: Fin_1 ⇉ Fin_0

which are part of the structure of an internal category Fin; the rest of the structure consists of identity and composition functions

id: Fin_0 → Fin_1   and   c: Fin_1 ×_{Fin_0} Fin_1 → Fin_1,

which are not hard to construct. In the end, the internal category constructed is equivalent to the category of finite sets.

Now let us continue by sketching the internal category of finite groups. To construct a set G_0 whose elements represent all isomorphism classes of finite groups, we construct a family

card: G_0 → ℕ

where each fiber card^{-1}(n) is the set of all group structures on the set [n]: the subset of

[n]^{[n] × [n]} × [n] × [n]^{[n]}

whose members (m, e, i) are those triples which obey the equational axioms (appropriate to the theory of groups) for multiplication m, identity e, and inversion i. We may construe elements g of G_0 as “finite groups”. In particular, the “underlying set” of a finite group g ∈ G_0 is

U(g) = φ^{-1}(card(g))

Finally, we construct the set G_1 of finite group homomorphisms. This is the set of those triples

(g, f, h) ∈ G_0 × Fin_1 × G_0

such that dom(f) = card(g), cod(f) = card(h), and the function f satisfies the equations necessary to make it a homomorphism from the group structure g to the group structure h.

This completes the sketch of an internal category equivalent to the category of finite groups. While it’s just a sketch, all the formal details can be filled in within the framework of a structural set theory such as ETCS or SEAR.
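To illustrate how concrete the fiber card^{-1}(n) is, here is a brute-force Python enumeration of all group structures (m, e, i) on the set {0, …, n−1}. The representation of m by a multiplication table is my own choice, and this is a finite illustration of the fiber, not part of the formal construction.

```python
# Enumerate all triples (m, e, i) on {0, ..., n-1} satisfying the group axioms.
from itertools import product

def group_structures(n):
    X = range(n)
    found = []
    # every binary operation X x X -> X, represented as a dict of pairs
    for table in product(X, repeat=n * n):
        m = {(a, b): table[a * n + b] for a in X for b in X}
        # a two-sided identity, if one exists (it is then unique)
        es = [e for e in X if all(m[e, a] == a == m[a, e] for a in X)]
        if not es:
            continue
        e = es[0]
        # two-sided inverses for every element, if they exist
        try:
            i = {a: next(b for b in X if m[a, b] == e == m[b, a]) for a in X}
        except StopIteration:
            continue
        # associativity
        if all(m[m[a, b], c] == m[a, m[b, c]]
               for a in X for b in X for c in X):
            found.append((m, e, i))
    return found

# Labeled counts: n!/|Aut(G)| summed over isomorphism classes G of order n.
assert len(group_structures(2)) == 2   # Z/2, with either element as identity
assert len(group_structures(3)) == 3   # 3!/|Aut(Z/3)| = 6/2
```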

Which brings me to a question. Sometime earlier Arnold wrote:

At present, every formalization of a piece of mathematics is a mess; this was not the point.

What I was referring to was the overhead in the length of the formalization. With ZF, you can formalize a concept once as a tuple, and then always use the concept on a formal level.

and then

This is what I was aiming at. For reflection purposes, one cannot work in pure SEAR, while one can do that in pure ZF.

As a matter of fact, there are bi-interpretability theorems which show that any construction in Zermelo set theory (Bounded Zermelo set theory with Choice, to be more precise) can be expressed in the structural theory ETCS, and vice versa, and certainly one can augment ETCS with additional axioms to recover the full power of ZF. Similarly, if I recall correctly, Mike has basically said in his article that SEAR is bi-interpretable with (has the same expressive power as) ZF. So it is not clear to me why Arnold believes that for reflection purposes, one can work with ZF but not with SEAR. For example, what was sketched above indicates that one can reflect finite groups within (say) ETCS at a formal level. Mike said a little more about reflection in his later comment here.

Posted by: Todd Trimble on September 25, 2009 6:59 AM | Permalink | Reply to this

Re: What is a structured object?

I hope Mike won’t mind (since he says he’s busy)

I should hope I wouldn’t mind either, no matter how busy I am! (-: I hope I haven’t given the impression that I own structural set theory or something. As many people have been saying, all of this stuff (except perhaps some details of SEAR) is decades old.

Posted by: Mike Shulman on September 25, 2009 8:00 AM | Permalink | PGP Sig | Reply to this

Re: What is a structured object?

MS: As many people have been saying, all of this stuff (except perhaps some details of SEAR) is decades old.

If this is true, it should be easy to point to a paper or book that contains in terms of ETCS the definition of the basic concepts of category theory, including the examples of a few concrete categories (comparable in richness of structure to the category of finite groups).

Thus I’d appreciate getting such a decades old reference that backs up your claim.

Posted by: Arnold Neumaier on September 25, 2009 10:09 AM | Permalink | Reply to this

Re: What is a structured object?

it should be easy to point to a paper or book that contains in terms of ETCS the definition of the basic concepts of category theory, including the examples of a few concrete categories (comparable in richness of structure to the category of finite groups).

As I’ve been saying over and over again, I don’t think anyone has felt the need to do this sort of thing, because once the basic structure of ETCS (say) is developed sufficiently it becomes “obvious” to people who think like we do that the rest of mathematics can follow, and everyone would rather spend their time pushing the boundaries. Rewriting Bourbaki by changing a word here and there isn’t a really fun way to spend one’s time, nor likely to be counted as a significant contribution to mathematics when one is applying for jobs. That isn’t to say that I don’t wish that someone had, so that I could point you to it! Mathematics is full of things that are “understood” by people who work in a given field for a long time before being carefully written down with enough details to make sense to others.

You will find this perspective running implicitly through many books on topos theory, and they are actually doing something more general: considering how mathematics can be developed on the basis of any elementary topos. But again, they probably don’t supply enough details about how to do this to satisfy you.

Posted by: Mike Shulman on September 25, 2009 3:49 PM | Permalink | PGP Sig | Reply to this

Re: What is a structured object?

I’ll second what Mike said: for those people who have absorbed the methods that are explained in a book like Moerdijk and Mac Lane’s text, the sort of explicit detail I laid out is more along the lines of an exercise whose solution would be well-understood by many. It’s probable that it would be carried out in more explicit detail only when an outsider comes along and begins asking a different set of questions like you are doing here, so what you are looking for exactly might be hard to track down in the literature.

Posted by: Todd Trimble on September 25, 2009 4:20 PM | Permalink | Reply to this

Re: What is a structured object?

If this is true, it should be easy to point to a paper or book that contains in terms of ETCS the definition of the basic concepts of category theory, including the examples of a few concrete categories (comparable in richness of structure to the category of finite groups).

This may not exist, because any basic textbook on category theory has to mention foundations to deal with size issues, and this discussion is unlikely to be independent of material vs structural foundations.

However, any modern algebra book, if it doesn't talk about either set theory or category theory too much, will do this. For example, take Lang, remove (or rewrite) only the two pages on Logical Prerequisites, and the rest (including the Appendix on more advanced set theory!) is fine as it is. (I haven't checked every page, but I did skim through Chapter I and Appendix 2.)

There is a constant abuse of language (which should probably be remarked upon if one rewrites the Logical Prerequisites) where a subset S of a set X is conflated with the underlying set of S (and also an element a of X that belongs to S is conflated with the unique corresponding element of the underlying set of S), but this is no worse than the abuse (not remarked upon!) that begins Section V.1 in my (1993) edition:

Let F be a field. If F is a subfield of a field E, […]

Literally, a subfield of E is (as Lang defined it) a subset of E, not a field in its own right. (In ZFC, a subset of E might happen to equal the ordered triple that is a field, but if so then that is not what Lang wants here!) Structural set theory uses the same abuse of language, although now also for unstructured sets just as much as for structured sets such as fields.

Lang also discusses category theory, but he doesn't indicate how to formalise it, so that text doesn't need any changing either. (What is a ‘collection’? Lang doesn't say. The unwary reader may assume that it's the same as a ‘set’ and be led to a paradox on the next page!)

Posted by: Toby Bartels on September 25, 2009 9:38 PM | Permalink | Reply to this

Re: What is a structured object?

Mike: you didn’t give me that impression (or even that you were pretending to such ownership (-: ). In fact, I salute both you and Toby for all your hard work in providing all those many thoughtful responses. I think all of us have been learning a lot from the exchange.

Posted by: Todd Trimble on September 25, 2009 12:30 PM | Permalink | Reply to this

Re: What is a structured object?

TT: This completes the sketch of an internal category equivalent to the category of finite groups.

OK, I get the idea of how to reflect things. Once one has the group as a single object (and in contrast to Mike Shulman, you modelled it that way), the basic obstacle to full reflection is gone.

One builds some machinery that mimics the material structure of ZF, for example by providing triples that encode the group. Then one uses this structure to do what one is used to doing in the standard reduction of mathematics to ZF.

I agree that one can probably fill in all details, and that this gives a way to define formally what the category FG of finite groups is, and hence what a finite group is, namely an element of Ob(FG).

Thus I now grant that (and understand how) ETCS - and maybe SEAR in a similar way - may be viewed as being a possible foundation of all of mathematics (when enhanced with enough large cardinals to handle large categories).

What I no longer understand now, however, is the claim that this way of organizing mathematics is superior to that of basing it on ZF since it is structural rather than material.

For I find the meaning of a finite group implied by the construction you gave not any more natural than the meaning of a natural number implied by its ZF construction by von Neumann.

It is ugly, and no mathematician thinks of this as being the essence of finite groups.

Moreover, for a (general) group, one has a similar messy construction, and a finite group is no longer a group but only “becomes” a group under the application of a suitably defined functor.

This flies in the face of the ordinary understanding of every algebraist of the notions of group and finite group.

In the attempts (in this discussion) to capture the essence of mathematics the proponents introduce so much artificial stuff in the form of trivial but needed functors that the result no longer resembles the essence to be captured.

Thus the structural, ETCS-based approach is no better at capturing the essence of mathematics than the material, ZF-based approach.

Both create lots of structure accidental to the construction, structure that is not in the nature of the mathematics described but in the nature of forcing mathematics into an ETCS-theoretic or ZF-theoretic straitjacket.

Posted by: Arnold Neumaier on September 25, 2009 10:36 AM | Permalink | Reply to this

Re: What is a structured object?

For I find the meaning of a finite group implied by the construction you gave not any more natural than the meaning of a natural number implied by its ZF construction by von Neumann.

I think you are misunderstanding the point of the construction. The meaning of a finite group is still “a finite set G equipped with a multiplication m: G × G → G and a unit e ∈ G such that …”. Just like the meaning of a Cauchy sequence of rationals is “a function ℕ → ℚ such that …”. It’s only when you want to consider “the category of finite groups” or “the set of Cauchy sequences” as an abstract object that you need to construct a set whose elements code for finite groups or Cauchy sequences.

Posted by: Mike Shulman on September 25, 2009 3:31 PM | Permalink | PGP Sig | Reply to this

Re: What is a structured object?

Arnold wrote:

What I no longer understand now, however, is the claim that this way of organizing mathematics is superior to that of basing it on ZF since it is structural rather than material.

For I find the meaning of a finite group implied by the construction you gave not any more natural than the meaning of a natural number implied by its ZF construction by von Neumann.

It is ugly, and no mathematician thinks of this as being the essence of finite groups.

Moreover, for a (general) group, one has a similar messy construction, and a finite group is no longer a group but only “becomes” a group under the application of a suitably defined functor.

This flies in the face of the ordinary understanding of every algebraist of the notions of group and finite group.

In the attempts (in this discussion) to capture the essence of mathematics the proponents introduce so much artificial stuff in the form of trivial but needed functors that the result no longer resembles the essence to be captured.

Thus the structural, ETCS-based approach is no better at capturing the essence of mathematics than the material, ZF-based approach.

Okay, a lot of opinions are being expressed here. Let me first say that the charge of “ugliness” is an aesthetic judgment, not part of formalized mathematics. Given the strictures I placed myself under to satisfy your demands (showing that a group could be expressed as a single element), the notion was bound to look harder than the ordinary understanding of the algebraist, whose “essence” [as you like to say] is simply, as we have been saying over and over,

  • A group is a set equipped with a group structure

which I maintain is structural in essence: there is no reference in that definition to the fact that elements may themselves have elements. The word “structural” means that it is abstract structure that is paramount, not the internal ontology of elements, which is necessarily not invariant under isomorphism – internal ontology of elements is a consideration which is alien to the practice of working mathematicians (unless they are investigating ZF perhaps, from a platonist point of view).

Presumably, if FMathL is well-developed, the human user can work in the customary style of sets+structure, and it is the job of the computer to then translate (or shoehorn) that into a single object or element. I don’t think the computer would care or have an opinion whether that’s done in ZF or SEAR or whatever, although obviously consideration must be given to what is the most efficient way to do the shoehorning.

Rather than say the structuralist view is “better” (it may certainly be better for certain purposes), and bring in aesthetic disagreements which may well be irreconcilable, I would say that at least in some respects, the structural view is closer to the way mathematics has traditionally been practiced. For example, the idea that a point on the real line may have elements which themselves have elements is, I think you will admit, peculiar to twentieth-century mathematics (and maybe to some extent now), and is an idea that is utterly irrelevant to working practice. And yet this abnormality is an undeniable consequence if one takes ZF and particularly a global membership relation as one’s foundations. I believe there’s some merit in rejecting those consequences as abnormal and irrelevant to mathematics.

On the other hand, a different twentieth-century development which has proven itself extremely relevant to current practice is category theory, which emphasizes universal properties and invariance of structure with respect to isomorphism. A structural development like ETCS takes those precepts very seriously indeed and embeds them as part of the formal development, whereas those precepts for a committed ZF-er would have to remain at the level of “morality” and are not part of the formal set-up.

Don’t get me wrong – as an abstract structure, the cumulative hierarchy is a recursively rich, powerful, and interesting mathematical structure. But as foundations, it’s not particularly pertinent to how mathematicians think about L^2 and such things. Those of us committed to category theory have come a bit closer to the essence, I believe, by focusing on things like universal properties as far more relevant to practice.

Posted by: Todd Trimble on September 25, 2009 4:04 PM | Permalink | Reply to this

Re: What is a structured object?

On the other hand, a different twentieth-century development which has proven itself extremely relevant to current practice is category theory, which emphasizes universal properties and invariance of structure with respect to isomorphism.

I’d like to add my 5 cents worth to this discussion by agreeing with Todd. I am not a category theorist and never will be — category theory hurts my head. On the other hand I find it very useful to try to think like a category theorist. Even (especially!) when I am working on something that appears quite far from category theory, like dynamical systems or symplectic toric geometry.

Posted by: Eugene Lerman on September 25, 2009 9:49 PM | Permalink | Reply to this

Re: What is a structured object?

Eugene wrote:

I am not a category theorist and never will be — category theory hurts my head.

The only thing stopping you is that you still think it’s bad for your head to feel that way. It’s actually good — it’s the feeling of new neurons growing.

It’s sort of like the aches and pains you get after lifting more weights than you’re used to. Good weightlifters still feel those aches; they just learn to like them.

Posted by: John Baez on September 26, 2009 3:46 AM | Permalink | Reply to this

it’s the feeling of new neurons growing; Re: What is a structured object?

No pain, no gain, in the visceral brain, or the complex plane.

Posted by: Jonathan Vos Post on September 26, 2009 4:55 PM | Permalink | Reply to this

Re: What is a structured object?

The worrying thing about your weight-lifter analogy is that body builders tear their muscles to promote growth.

Posted by: David Corfield on September 26, 2009 6:23 PM | Permalink | Reply to this

Re: What is a structured object?

David wrote:

The worrying thing about your weight-lifter analogy is that body builders tear their muscles to promote growth.

And what’s worrying about that? I bet the ‘aching head’ feeling I get when struggling to learn new concepts is somehow analogous to the ‘torn muscle’ feeling I get whenever I up the amount of weight I lift at the gym. I bet there’s some real ‘damage’ to one’s conceptual/neurological structure whenever one struggles really hard to master difficult new ideas: comfortable old connections are getting torn apart. But then new improved connections grow to take their place!

I think the people who do well at learning new things are the ones who learn to enjoy the ache. In the case of the ‘torn muscle’ feeling, the pleasure comes from 1) knowing that one is getting stronger, 2) the endorphin high, 3) a learned association between the two. Maybe something similar happens in the intellectual realm.

Posted by: John Baez on September 26, 2009 10:13 PM | Permalink | Reply to this

Re: What is a structured object?

I will say: as someone who has begun a strength-training regime fairly recently, and whose aching arms feel like useless appendages right now, this mini-thread is helping a little bit. Thanks!

Posted by: Todd Trimble on September 27, 2009 4:20 PM | Permalink | Reply to this

No fiber bundle pain, no gain; Re: What is a structured object?

Ironically, the pain from body building comes from fiber bundles. Or, actually, tearing the membranes surrounding bundles of fibers.

Skeletal muscle is made up of bundles of individual muscle fibers called myocytes. Each myocyte contains many myofibrils, which are strands of proteins (actin and myosin) that can grab on to each other and pull. This shortens the muscle and causes muscle contraction.

It is generally accepted that muscle fiber types can be broken down into two main types: slow twitch (Type I) muscle fibers and fast twitch (Type II) muscle fibers. Fast twitch fibers can be further categorized into Type IIa and Type IIb fibers.

These distinctions seem to influence how muscles respond to training and physical activity, and each fiber type is unique in its ability to contract in a certain way. Human muscles contain a genetically determined mixture of both slow and fast fiber types. On average, we have about 50 percent slow twitch and 50 percent fast twitch fibers.

Andersen, J.L.; Schjerling, P.; Saltin, B. “Muscle, Genes and Athletic Performance.” Scientific American, September 2000, p. 49.

McArdle, W.D.; Katch, F.I.; Katch, V.L. (1996). Exercise Physiology: Energy, Nutrition and Human Performance.

Lieber, R.L. (1992). Skeletal Muscle Structure and Function: Implications for Rehabilitation and Sports Medicine. Baltimore: Williams and Wilkins.

Thayer, R.; Collins, J.; Noble, E.G.; Taylor, A.W. “A decade of aerobic endurance training: histological evidence for fibre type transformation.” Journal of Sports Medicine and Physical Fitness, 2000 Dec; 40(4).

Posted by: Jonathan Vos Post on September 28, 2009 7:09 AM | Permalink | Reply to this

Clues To Reversing Aging Of Human Muscle Discovered; Re: No fiber bundle pain, no gain; Re: What is a structured object?

DOING Math (what Erdos called “being alive”) also helps reverse the effects of aging on the Brain. I don’t much like the common analogy: “The brain is a muscle; use it or lose it” because, you know, the brain is NOT a muscle. Yet regular and vigorous use IS beneficial, and to an extent that surprises many people.

Clues To Reversing Aging Of Human Muscle Discovered

… “Our study shows that the ability of old human muscle to be maintained and repaired by muscle stem cells can be restored to youthful vigor given the right mix of biochemical signals,” said Professor Irina Conboy, a faculty member in the graduate bioengineering program that is run jointly by UC Berkeley and UC San Francisco, and head of the research team conducting the study. “This provides promising new targets for forestalling the debilitating muscle atrophy that accompanies aging, and perhaps other tissue degenerative disorders as well.”…

Morgan E. Carlson, Charlotte Suetta, Michael J. Conboy, Per Aagaard, Abigail Mackey, Michael Kjaer, Irina Conboy. Molecular aging and rejuvenation of human muscle stem cells. EMBO Molecular Medicine, 2009; DOI: 10.1002/emmm.200900045

Posted by: Jonathan Vos Post on September 30, 2009 9:01 PM | Permalink | Reply to this

Re: What is a structured object?

If you say this …

I agree that one can probably fill in all details, and that this gives a way to define formally what the category FG of finite groups is, and hence what a finite group is, namely an element of Ob(FG).

then naturally you will say this …

What I no longer understand now, however, is the claim that this way of organizing mathematics is superior to that of basing it on ZF since it is structural rather than material.

A finite group ‘is’ a set equipped with a group structure. If it is vital to encode this formally as a single object, then supplement SEAR or ETCS with a dependent type theory with dependent sums. But it is not essential to mathematical practice to do so.

If you want to have a collection of finite groups (or whatever), then any foundations requires some reasoning to show that your collection is valid. (After all, a collection of literally ‘all’ finite groups is impossible in ZFC, as is a collection of all groups whatsoever in either ZFC or ETCS.) Although other methods may be available in some cases, the uniform way to do this is by using the Axiom of Collection: you find some way to index your objects by a set, and the axiom gives you your collection.

In material set theory, you can set things up so that each object is literally an element of the collection, which is convenient; this wouldn't make sense in structural set theory, so you instead introduce an abuse of language in which the ‘elements’ of the collection are actually the fibres over the elements of the index set (together with the structures defined on those fibres).

I said that material set theory is convenient, but in fact it is not convenient enough! Even in ZFC, there is no small category FG such that a finite group is literally the same as an object of FG. Instead, if you insist on recovering the notion of finite group from the category FG, then you can define a finite group to be a set $U$ together with an object $S$ of FG and a bijection between $U$ and the underlying set of $S$. In ZFC, presumably the ‘underlying set’ of $S$ is the first entry in a tuple $(S,m)$; in ETCS, the ‘underlying set’ of $S$ is as defined in Todd's comment. (In both cases, it takes another step to recover the group in the usual sense, as a set together with a group operation.) Once again, structural set theory prevents a potential mistake (thinking that $G$ is not a finite group because it is not literally an object of FG) by throwing up a typing error.
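The contrast between the two encodings can be rendered as a toy programming sketch (no claim that either foundation literally works this way; all names here, like `FinGroup`, are invented for illustration). In the tuple-style encoding the underlying set is a projection; in the typed encoding a group is not literally a set, and asking a set-theoretic question of it is a type error:

```python
# Toy illustration of the two encodings discussed above (all names invented).
from dataclasses import dataclass
from typing import Callable, FrozenSet

# Tuple-style ("material") encoding: a group is literally a pair (S, m),
# and the underlying set is the first entry of the tuple.
Z2_tuple = (frozenset({0, 1}), lambda x, y: (x + y) % 2)
underlying_set = Z2_tuple[0]          # projection onto the first entry

# Typed ("structural") encoding: carrier and operation are wrapped in a
# distinct type, so a group is *not* literally a set.
@dataclass(frozen=True)
class FinGroup:
    carrier: FrozenSet[int]
    op: Callable[[int, int], int]

Z2 = FinGroup(frozenset({0, 1}), lambda x, y: (x + y) % 2)

assert 0 in Z2.carrier    # fine: ask the carrier
try:
    0 in Z2               # category error: a FinGroup is not a set
except TypeError:
    pass                  # the type system "throws up a typing error"
```

The `try/except` at the end is the analogue of the typing error mentioned above: the mistaken question cannot even be asked of the wrapped object.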

Moreover, for a (general) group, one has a similar messy construction, and a finite group is no longer a group but only “becomes” a group under the application of a suitably defined functor.

Hopefully you see now that this is not true in ETCS, but even so … this is no worse than the fact that a Riemannian manifold only “becomes” a manifold under the application of (in the structured-sets-as-tuples formalisation) projection onto the first entry (or possibly even something a bit more complicated).

Posted by: Toby Bartels on September 25, 2009 9:39 PM | Permalink | Reply to this

Re: What is a structured object?

AN: Moreover, for a (general) group, one has a similar messy construction, and a finite group is no longer a group but only “becomes” a group under the application of a suitably defined functor.

TB: this is no worse than the fact that a Riemannian manifold only “becomes” a manifold under the application of (in the structured-sets-as-tuples formalisation) projection onto the first entry (or possibly even something a bit more complicated).

I think the standard mathematical language teaches something different that gets lost both by encoding it into ZF and by encoding it in ETCS or SEAR, though in different ways.

In mathematical practice, to say that an object is a group or a manifold says that it has certain properties. To say that it is a finite group or a Riemannian manifold adds properties but of course preserves all previous properties.

Similarly, to say that a subset H of a group G is a subgroup if it is closed under products and inversion is not an abuse of notation (as was claimed in the discussion on SEAR). The subset H is not only a set and a subset of G but inherits from the group a product mapping $H \times H \to G$ (and even one $H \times G \to G$, etc.); if the subgroup condition holds, this is a mapping $H \times H \to H$, and hence the binary operation alluded to in calling it a subgroup.
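This point is directly computable. The sketch below (names are illustrative, not from any foundation) takes the restriction of G's product to a subset H and checks exactly the subgroup condition: that the restricted mapping lands back in H.

```python
# A subset H of a group G automatically inherits the restriction of G's
# product to H x H -> G; the "subgroup condition" says this restriction
# lands back in H.  All names here are illustrative.
def is_subgroup(H, G_elems, op, inv):
    """Check that H (a subset of G) is closed under products and inverses."""
    assert H <= G_elems                       # H really is a subset of G
    closed_under_op = all(op(x, y) in H for x in H for y in H)
    closed_under_inv = all(inv(x) in H for x in H)
    return closed_under_op and closed_under_inv

# Z/6 under addition mod 6; H = {0, 2, 4} is the subgroup generated by 2.
G = set(range(6))
op = lambda x, y: (x + y) % 6
inv = lambda x: (-x) % 6

assert is_subgroup({0, 2, 4}, G, op, inv)     # a subgroup
assert not is_subgroup({0, 1}, G, op, inv)    # 1 + 1 = 2 escapes {0, 1}
```

Note that `op` is never redefined for H: the binary operation of the subgroup is literally the inherited one, which is the point being made above.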

Similarly, to say that $L^2(\mathbb{R})$ and $L^2(\mathbb{R}^3)$ are separable Hilbert spaces does not strip them of any distinguishing property these spaces have by construction, although the category of separable Hilbert spaces contains only one object up to isomorphism.

Thus almost all objects mathematicians talk about are almost always equipped with lots of stuff through their context, but neither the formalization in ZF nor that in SEAR or ETCS (or Coq, etc.) takes account of that.

That one doesn’t use all this extra structure all the time is not to be handled by deleting entries from the tuple (in ZF) or by applying a forgetful functor (in the structural approach), but by the same common sense that logicians use when they list in some formal natural deduction only the stuff they actually used.

That the categorial way alone cannot capture this essence of mathematics is quite obvious from simple examples:

If $G \in Ob(Grp)$, any sane mathematician infers that $G$ is equipped with a set structure with which the assumption $x, y, z \in G$ makes sense, and infers that there is a product operation for which $xy \in G$ and $(xy)z = x(yz)$.

But this only holds if $Grp$ is the category materially constructed by the definition of $Grp$, and not (as claimed in this discussion – I don’t remember by whom) if one forgets this construction once the category is formed, and only retains the isomorphism class of the category.

Thus the “structural” point of view actually loses structure!

Sometimes the loss of structure is dramatic: the category $CLOF$ of closed linearly ordered fields and the category $E7$ of undirected graphs isomorphic to the $E_7$ Dynkin diagram are isomorphic, but objects from these two categories have very different properties. There is not even a canonical isomorphism between the two categories. Here the essence is completely lost.

Posted by: Arnold Neumaier on September 30, 2009 12:27 PM | Permalink | Reply to this

Re: What is a structured object?

I assume that by “properties” you mean “properties or structure or stuff” (around here we use a precise meaning of property according to which a finite group is a group with extra properties, but a Riemannian manifold is not a manifold with extra properties (but rather extra structure)).

I agree that both ZF and ETCS/SEAR handle this issue clumsily, albeit clumsily in different ways. However, I think this argument:

If G∈Ob(Grp) any sane mathematician infers that G is equipped with a set structure with which the assumption x,y,z∈G makes sense, and infers that there is a product operation for which xy∈G and (xy)z=x(yz).

But this only holds if Grp is the category materially constructed by the definition of Grp, and not… if one forgets this construction once the category is formed, and only retains the isomorphism class of the category.

misses the point. If one wants to treat Grp as an abstract category, then one forgets how its objects were constructed (which has nothing to do with materiality), just as if one wants to treat $A_5$ as an abstract group, one forgets that its elements have a natural action on some 5-element set. However, nothing forces us to do that forgetting as soon as the object is formed, and quite often we don’t.

But, as I said, I agree that both ZF and ETCS/SEAR are clumsy about moving between different levels of properties or structure. This would be something that would be great for a higher-level formalization to improve on.

Actually, it strikes me right now that this issue is very similar to class inheritance in object-oriented programming. When we say that a Riemannian manifold is a manifold, the “is a” really has the same meaning as in OOP: a Riemannian-manifold object can be used anywhere that a manifold is expected, but it doesn’t thereby lose its Riemannianness (although if we access it only through a manifold ptr then we can’t use any of its Riemannianness). From this point of view, the clumsiness of existing foundations amounts to requiring all upcasts to be explicit.
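Mike's OOP analogy can be sketched in a few lines of Python (the class names and the `metric` attribute are invented for illustration, not a real geometry library): a subclass instance passes anywhere the superclass is expected, without losing its extra structure.

```python
# A sketch of the OOP analogy above; all names are invented.
class Manifold:
    def __init__(self, dim):
        self.dim = dim

class RiemannianManifold(Manifold):      # the "is a" of inheritance
    def __init__(self, dim, metric):
        super().__init__(dim)
        self.metric = metric             # the extra structure

def dimension(m: Manifold) -> int:       # a context expecting only a Manifold
    return m.dim

sphere = RiemannianManifold(2, metric="round")
assert dimension(sphere) == 2            # implicit upcast: just works
assert sphere.metric == "round"          # Riemannianness is not lost
```

The clumsiness of existing foundations, in this picture, is having to write something like `dimension(forget_metric(sphere))` instead of letting the upcast happen silently.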

Posted by: Mike Shulman on September 30, 2009 5:53 PM | Permalink | PGP Sig | Reply to this

Re: What is a structured object?

MS: I assume that by “properties” you mean “properties or structure or stuff”

Yes. For me extra properties and extra structure are synonymous. It is just something more that can be profitably exploited for reasoning.

MS: However, nothing forces us to do that forgetting as soon as the object is formed, and quite often we don’t.

This moral sounds quite different, and much more agreeable, than the one repeated many times that I had to put up with before:

“However, once the construction is finished, we generally forget about it” “but in each case once the construction is performed, its details are forgotten. I always assumed, without really thinking much about it, that all modern mathematicians thought in this way” “once the construction is performed, the fact that you used “the same” objects is discarded.” “When you construct one category from another, you might use the “same” set of objects, but once you’ve constructed it, there is no relationship between the objects, because after all any category is only defined up to equivalence.” (quotes from your earlier mails)

“Two categories may have an object in common, but you should never use that fact.” “You’re completely (intentionally?) missing the distinction I drew between a construction demonstrating the existence of a model of a structure and the subsequent use of the properties of a structure. As I said before, ‘moral’ (which was someone else’s term) refers to the latter segment, not the former. […]” (John Armstrong)

You now seem to say that all the categories can be considered as concrete categories or as abstract categories, depending on the purpose the mathematician wants to achieve. This is fine with me. Indeed, the standard (material) definition of a category is precisely that of a concrete category. And in concrete categories I am allowed to do all the stuff you wanted to forbid: compare objects of different categories, check whether the objects of one form a subclass of those of the others, create intersections of the classes of objects of two different categories, etc. Once this is allowed, I have no problems at all with the categorial language (except for lack of fluency in expressing myself in it). One has all this structure around unless one deliberately forgets it. There is no moral that tells one that one should forget it, except if one wants to forget it.

It was only the strange moral that was imposed on it without having any formal justification that bothered me.

MS: it strikes me right now that this issue is very similar to class inheritance in object-oriented programming. […] From this point of view, the clumsiness of existing foundations amounts to requiring all upcasts to be explicit.

Yes. This is why FMathL will have on the specification level a much more flexible type-like system that borrows much more from the theory of formal languages than from the theory of types.

Posted by: Arnold Neumaier on September 30, 2009 6:49 PM | Permalink | Reply to this

Re: What is a structured object?

You now seem to say that all the categories can be considered as concrete categories or as abstract categories depending on the purpose the mathematician wants to achieve.

Yes, of course. The comments you quoted were in a different context, explaining that (for example) the particular construction of the real numbers as Dedekind cuts is usually forgotten once we have the real numbers, so that it is better if you can forget it rather than actually have the real numbers be Dedekind cuts as in material set theory.

I made this same point here.

Indeed, the standard (material) definition of a category is precisely that of a concrete category.

No, I don’t think so. Some people have a precise definition of a “concrete category,” but here I’m thinking of it in a more vague way like “a category together with some information preserved from its construction.” I don’t see what this has to do with materiality.

And in concrete categories I am allowed to do all the stuff you wanted to forbid: compare objects of different categories, check whether the objects of one form a subclass of those of the others, create intersections of the class of objects of two different categories, etc.

No. If two concrete categories $C$ and $D$ both have a forgetful functor to $Set$ (being part of the information you remembered from their constructions), then you can ask whether the underlying sets of an object $x \in C$ and $y \in D$ are isomorphic, or whether every set that underlies an object of $C$ also underlies an object of $D$, or consider the collection of all sets that underlie both an object of $C$ and an object of $D$, but these are quite different things from the forbidden ones.

Posted by: Mike Shulman on September 30, 2009 8:25 PM | Permalink | PGP Sig | Reply to this

Re: What is a structured object?

AN: Indeed, the standard (material) definition of a category is precisely that of a concrete category.

MS: No, I don’t think so. Some people have a precise definition of a “concrete category,”

You had at least the additional qualifier of an abstract category, which seems to be something different from the category as defined in the textbooks.

MS: but here I’m thinking of it in a more vague way like “a category together with some information preserved from its construction.”

I am referring to the standard definition of a (small) category C found everywhere, with the standard interpretation of Ob(C) as a class (or set) in the traditional sense (not SEAR, not ETCS, which, in most textbooks, do not figure early if at all).

There is nothing vague in this definition beyond what is vague in any mathematical discourse.

This definition does not ask you to forget anything about the category you constructed.

Indeed, the definition does not even provide a formal mechanism for forgetting. The reason is presumably either that such an automatic mechanism was never intended by those who invented and traded the definition, or that it is difficult to formalize rigorously at this stage.

On the contrary, to forget something you need to do something to the category, and no such doing is formally specified in any introduction to category theory.

It is an additional moral that you want to impose without specifying it axiomatically.

But what is not in the axioms can be ignored by anyone working with them, without harming in the least the correctness of what is done, and without affecting any consistent interpretation of the axioms.

AN: And in concrete categories I am allowed to do all the stuff you wanted to forbid: compare objects of different categories, check whether the objects of one form a subclass of those of the others, create intersections of the class of objects of two different categories, etc.

MS: No.

My example of the categories $C_{abcd}$ etc. is still there; nobody has shown me any conflict with the standard definitions of a category and a subcategory (interpreted with Ob(C) as a class in the traditional sense).

If you want to consistently uphold your No, you’d have to prove my assertions there wrong!

Posted by: Arnold Neumaier on September 30, 2009 9:08 PM | Permalink | Reply to this

Re: What is a structured object?

We have already been over this same territory several times. In my view we have given adequate responses to all of these issues, including your category $C_{abcd}$. I don’t have time to repeat the same arguments again, especially since I have no reason to believe the communication would be any more successful the second or third time around. So I guess we’re at an impasse here.

Posted by: Mike Shulman on October 1, 2009 5:00 AM | Permalink | PGP Sig | Reply to this

Re: What is a structured object?

I am referring to the standard definition of a (small) category C found everywhere

So are we. You're focussing on the question of whether you can take two arbitrary categories C and D and ask whether C is a subcategory of D, but really you should (to avoid confusion with other issues around categories) start with the question of whether you can take two arbitrary sets C and D and ask whether C is a subset of D. (Or perhaps use groups instead of sets.) Certainly you can do the former if you can do the latter, which is obvious enough looking at the standard definition. But doing the latter is already objectionable (at the very least, an abuse of language) from a structuralist perspective.

with the standard interpretation of Ob(C) as class (or set) in the traditional sense (not SEAR, not ETCS, […]).

How can you tell?

Saunders Mac Lane, one of the two people who first defined categories, is on record as preferring ETCS as a foundation of mathematics. (This is quoted in that McLarty paper that's been linked here.) He considers his concept of category perfectly well formalised by structural set theory.

There is an additional complication, which goes beyond merely having structural foundations, that even within a single arbitrary small category, one should not be able to compare objects for equality (only for isomorphism); this is the problem of evil. ETCS and SEAR do allow this, while I would prefer a foundation of category theory that does not. I know some ways to approach this, but I don't think that it's a solved problem yet.

Posted by: Toby Bartels on October 1, 2009 9:07 AM | Permalink | Reply to this

Re: What is a structured object?

TB: really you should (to avoid confusion with other issues around categories) start with the question of whether you can take two arbitrary sets C and D and ask whether C is a subset of D.

According to the first paragraph of the Prerequisites in Serge Lang, Algebra, second printing 1970 (who treats categories in Chapter I.7), I am allowed to do this. I take this to be the standard point of view.

Lang’s context allows me to do everything I did with $C_{abcd}$ etc., although your moral forbids it.

TB: Saunders Mac Lane, one of the two people who first defined categories, is on record as preferring ETCS as a foundation of mathematics.

So he allows only bounded comprehension in mathematics?

If your view is right, it depends on the foundations of mathematics whether one is allowed to do the things I do. This would mean that the foundations are not equivalent.

But this would conflict with the result by Osius (which I still need to check) that ETCS+R is equivalent to ZFC.

I think you cannot consistently claim both.

Posted by: Arnold Neumaier on October 1, 2009 10:47 AM | Permalink | Reply to this

Re: What is a structured object?

TB: Saunders Mac Lane, one of the two people who first defined categories, is on record as preferring ETCS as a foundation of mathematics.

So he allows only bounded comprehension in mathematics?

Toby didn’t say that; he said “preferred foundations”. Saunders would have been very happy to allow you to speak if you were giving him an instance of unbounded separation, and was well familiar with ZFC and its cousins.

Saunders’ position was that just about all core mathematics (what goes on in basic courses on functional analysis, algebraic topology, and so on) can be developed on the basis of ETCS. Not all developments – he was well aware that some set-theoretic constructions required going beyond ETCS. I think he chose not to be too exercised by that, but he may have had some occasional doubts. (I got to know Saunders rather well during my Chicago years, so I think I can say this.) He was also much concerned with making ETCS more accessible to people; I think this worried him more than any limitations of ETCS.

Posted by: Todd Trimble on October 1, 2009 2:40 PM | Permalink | Reply to this

Re: What is a structured object?

According to the first paragraph of the Prerequisites in Serge Lang, Algebra, second printing 1970 (who treats categories in Chapter I.7), I am allowed to do this. I take this to be the standard point of view.

And so it is. And yet, nowhere does Lang actually use the idea that one can take two arbitrary sets and ask whether one is contained in the other; he never needs to. He may take a set $U$ and then consider an arbitrary subset of $U$; what this means can be defined (or even taken as axiomatic) in structural foundations. And he may take two arbitrary subsets of some set $U$ and ask whether one is contained in the other, which can also be defined structurally. But there is no need in ordinary mathematics to take two arbitrary sets and ask whether one is contained in the other; even if one thinks it meaningful, it never matters.

Lang’s context allows me to do everything I did with $C_{abcd}$ etc., although your moral forbids it.

I'm not sure why you keep saying this. Is there anything that you did with $C_{abcd}$ etc. that we have not yet formalised structurally?

If your view is right, it depends on the foundations of mathematics whether one is allowed to do the things I do. This would mean that the foundations are not equivalent.

ETCS is equivalent to BZC (which is ZFC without replacement and with only bounded separation). ETCS+R is equivalent to ZFC (since replacement and bounded separation together imply full separation).

Posted by: Toby Bartels on October 1, 2009 7:58 PM | Permalink | Reply to this

Re: What is a structured object?

In mathematical practice, to say that an object is a group or a manifold says that it has certain properties.

This connects with the idea earlier that ‘ordered monoids are the objects in the intersection of Order and Monoid satisfying the compatibility relation’. As I said then, I would be very interested to see a formalism in which this can be taken literally!

But it would be tricky. We should be able to say, for example, that a ring (that is, an associative unital ring) is an object that is both an abelian group and a monoid, satisfying a compatibility relation. But since every abelian group is already a monoid, surely $AbGrp \cap Mon = AbGrp$, so now it has only one structure, which the compatibility condition forces to be trivial! (For an even worse example, try a commutative rig, where now both structures are commutative monoid structures.)

One thing that you could do is to say that a ring is an object that is both an additive abelian group and a multiplicative monoid, satisfying a compatibility condition. Then you seem to have to define monoids twice, and you're forbidden to say that $(\,]0,\infty[\,,\; \cdot\,,\; (a,b) \mapsto a^{\log b})$ is a ring, even though we have found it useful to say so. Of course, there may be ways around that, but I don't know them.
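For readers who haven't met this example: on $]0,\infty[$, ring "addition" is ordinary multiplication and ring "multiplication" is $(a,b) \mapsto a^{\log b} = e^{\log a \cdot \log b}$ (the image of the usual ring $(\mathbb{R},+,\cdot)$ under $x \mapsto e^x$). A quick numerical check of the ring laws:

```python
# Numerical sanity check: ]0, oo[ with "addition" = multiplication and
# "multiplication" = (a, b) |-> a^(log b) = exp(log a * log b).
import math

add = lambda a, b: a * b                                 # abelian group, identity 1
mul = lambda a, b: math.exp(math.log(a) * math.log(b))   # identity e

a, b, c = 2.0, 3.0, 5.0
close = lambda x, y: math.isclose(x, y, rel_tol=1e-9)

assert close(mul(a, b), mul(b, a))                           # commutativity
assert close(mul(mul(a, b), c), mul(a, mul(b, c)))           # associativity
assert close(mul(a, add(b, c)), add(mul(a, b), mul(a, c)))   # distributivity
assert close(mul(math.e, a), a)                              # e is the unit
```

Note also that 1, the "zero" of this ring, annihilates: $a^{\log 1} = a^0 = 1$, exactly as the ring axioms demand.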

The way that I do know to formalise the idea that a ring may be defined as somehow both an abelian group and a monoid is to start with $AbGrp \times Mon$ and then carve out $Ring$ with a compatibility condition which includes having the same underlying set. (On the face of it, this is evil, but I know ways around that. And in any case, there's no point worrying about evil if one doesn't even have structuralism.) This only makes sense, as far as I can tell, if a group is a set equipped with some structure rather than simply a set satisfying some property.

Once one has grown out of the idea that a group is literally simply a certain kind of set, then it's not so hard to accept that an abelian group might not be literally simply a certain kind of group, even when that can still be done.

Sometimes the loss of structure is dramatic: the category $CLOF$ of closed linearly ordered fields and the category $E7$ of undirected graphs isomorphic to the $E_7$ Dynkin diagram are isomorphic, but objects from these two categories have very different properties. There is not even a canonical isomorphism between the two categories. Here the essence is completely lost.

Again, you can always put that structure back if you want it. Then you have categories equipped with some structure rather than just categories. (In particular, you might equip $CLOF$ with the structure of its inclusion into the category of fields, and you might equip $E7$ with its inclusion into the discrete category of Dynkin diagrams, which itself has functors to various categories, such as $LieAlg$.) Incidentally, there is a canonical relationship between these categories; there is an adjoint equivalence between them that is unique up to unique isomorphism, which is enough. But of course, the structures that we like to put on them are very different (even though we could put each structure on the other category if we wished).

Posted by: Toby Bartels on September 30, 2009 6:32 PM | Permalink | Reply to this

Re: What is a structured object?

AN: In mathematical practice, to say that an object is a group or a manifold says that it has certain properties.

TB: This connects with the idea earlier that ‘ordered monoids are the objects in the intersection of Order and Monoid satisfying the compatibility relation’. As I said then, I would be very interested to see a formalism in which this can be taken literally!

I am working on that and will soon show you how it can be done in FMathL.

TB: Once one has grown out of the idea that a group is literally simply a certain kind of set, then it’s not so hard that an abelian group might not be literally simply a certain kind of group, even when that can still be done.

I never had this idea, hence could not outgrow it.

I always had the idea that although a group G is different from the set making up the elements of G, G contains precisely these elements. Thus I always doubted the semantic legitimacy of the extensionality property of sets, since mathematical practice does not support it.

For exactly the same reasons I oppose the idea that an abelian group should not be literally a group.

It is like claiming that a person is not literally a man or a woman, because you add structure in the form of a gender. This is completely foreign to my understanding of language.

Mathematical language shares this additive property of natural language, and good foundations should preserve this important feature.

TB: you can always put that structure back if you want it.

In ordinary language, in informal mathematical language, and in FMathL you never lose it, unless you want to lose it.

Posted by: Arnold Neumaier on September 30, 2009 8:35 PM | Permalink | Reply to this

Re: What is a structured object?