Categories for the Working Philosopher

November 18, 2013

Posted by John Baez

Elaine Landry, in the philosophy department at U. C. Davis, is putting together a book called Categories for the Working Philosopher.

Here are the contributors and their (perhaps tentative) topics:

Samson Abramsky — Computer Science, etc.
John Baez — Applied Mathematics
John Bell — Logic/Model Theory
Bob Coecke — Quantum Mechanics and Ontology
Robin Cockett — Proof Theory/Linear Logic
David Corfield — Geometry
Andrée Ehresmann — Biology
Hans Halvorson — The Structure of Physical Theories
Kohei Kishida — Modal Logic
Jim Lambek — Special Relativity
Jean-Pierre Marquis — First-Order Logic with Dependent Sorts
Colin McLarty — Set Theory
Michael Moortgat — Linguistics and Computational Semantics
Michael Shulman — Univalent Foundations
David Spivak — Mathematical Modeling
James Weatherall — Spacetime Theories

It looks like a nice lineup!

I say the topics are tentative because I just changed my own. I’d been wanting to write about category theory and the foundations of applied mathematics, since I’d talked about that a while ago at U. C. Irvine. However, this seems like a bad idea for a few reasons.

First, I feel my thoughts on this subject aren’t best expressed at a philosophical level right now. I think I should first go ahead and do applied mathematics using category theory. And indeed, that’s one of the main things I’m doing these days, with the help of 7 grad students.

Second, a lot of authors in this book will be talking about more traditional regions of overlap between philosophy and category theory — like logic, and the foundations of mathematics and physics. Since I’m still interested in such topics, it makes sense to write something on this theme.

Third, since Mike Shulman will be writing about univalent foundations and Jean-Pierre Marquis will be talking about first-order logic with dependent sorts (or ‘FOLDS’ for short), I think it would be good to write about some general issues which both these initiatives address.

So, here’s a draft of an abstract of my paper.

Notions of Sameness

The role of equality and other notions of ‘sameness’ in mathematics is fundamental, yet still far from settled. When we say $2 + 3 = 5$, a naive schoolchild may wonder why we are claiming different expressions are the same. When we say $5 = 5$, we have identical expressions on both sides of the equation, so this problem does not arise. Yet equations of this form, $x = x$, are of almost no use in mathematics! Ironically, the more different the two sides of an equation look, the more valuable it is.

A standard way of dealing with this is with the idea of ‘reference’. A formal theory consists of syntactic expressions and rules for manipulating them. A ‘model’ for such a theory, formulated in some meta-theory, maps expressions to mathematical entities in the meta-theory. In particular, for the theory of arithmetic, expressions such as $2 + 3$ and $5$ are mapped to natural numbers: we say they ‘refer’ to these numbers. The sentence ‘$2 + 3 = 5$’ is mapped to the truth value ‘true’ because $2 + 3$ and $5$ are mapped to the same natural number.
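As a rough illustration of ‘reference’, here is a minimal sketch in which Python plays the role of the meta-theory. The names `denote` and `holds` and the tuple encoding of expressions are assumptions made for this example, and numerals are crudely represented by Python integers.

```python
# Expressions are syntax: either a numeral, or a tuple (operator, left, right).
# Assumption: numerals are represented directly by Python ints for brevity.

def denote(expr):
    """Map a syntactic expression to the natural number it refers to."""
    if isinstance(expr, int):
        return expr
    op, left, right = expr
    if op == "+":
        return denote(left) + denote(right)
    raise ValueError(f"unknown operator: {op!r}")

def holds(equation):
    """An equation is true when both sides refer to the same number."""
    left, right = equation
    return denote(left) == denote(right)

two_plus_three = ("+", 2, 3)

print(two_plus_three == 5)         # False: as syntax, the two expressions differ
print(holds((two_plus_three, 5)))  # True: they refer to the same number
```

Note that `holds` decides the equation by calling Python's own `==` on numbers, which is exactly the recourse to equality in the meta-theory.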

This is fine as far as it goes. However, the recourse to a meta-theory, and the explanation of equality in the theory in terms of equality in the meta-theory, suggests that we may be merely postponing the need to think about what equality really means.

The standard axioms obeyed by equality (reflexivity, symmetry and transitivity) certainly agree with our intuitions about the meaning of this concept. Yet these axioms are really the definition of an ‘equivalence relation’, and there are many other equivalence relations besides equality. This suggests that these axioms are capturing intuitions about what it means for two things to be ‘the same in a way’, not necessarily equal.
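To make this concrete, the following sketch checks the three axioms by brute force on a small finite set. The relation chosen, ‘same remainder mod 3’, and the helper name `is_equivalence` are illustrative assumptions, not something from the text above.

```python
from itertools import product

def is_equivalence(relation, elements):
    """Check reflexivity, symmetry and transitivity on a finite collection."""
    reflexive = all(relation(x, x) for x in elements)
    symmetric = all(
        relation(y, x)
        for x, y in product(elements, repeat=2)
        if relation(x, y)
    )
    transitive = all(
        relation(x, z)
        for x, y, z in product(elements, repeat=3)
        if relation(x, y) and relation(y, z)
    )
    return reflexive and symmetric and transitive

def same_remainder_mod_3(x, y):
    """Two numbers are 'the same in a way' if they agree modulo 3."""
    return x % 3 == y % 3

numbers = range(9)
print(is_equivalence(same_remainder_mod_3, numbers))  # True
print(same_remainder_mod_3(1, 4), 1 == 4)             # True False
```

The relation passes all three tests even though it identifies numbers that are not equal, such as 1 and 4.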

Leibniz’s principle of ‘identity of indiscernibles’ was a great step forward in understanding equality. He said that $x = y$ if and only if for all predicates $P$, $P(x) \iff P(y)$. The intuition is a powerful one: things with the same properties are equal. However, when mathematics was formalized on the basis of set theory in the early 20th century, this principle played little role. The reason is that it uses second-order logic (we are quantifying over the predicate $P$), while for important technical reasons, these systems are usually formulated in first-order logic.
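Written out in symbols, the principle and the short argument for its ‘if’ direction look roughly as follows. This is a sketch in ordinary second-order notation; the auxiliary predicate $P_x$ is introduced here only for illustration.

$$ x = y \iff \forall P \, \bigl( P(x) \iff P(y) \bigr) $$

The left-to-right direction is substitution of equals. For the converse, choose the predicate

$$ P_x(z) \; :\iff \; z = x . $$

Then $P_x(x)$ holds by reflexivity, so the hypothesis $P_x(x) \iff P_x(y)$ gives $P_x(y)$, that is, $y = x$. The quantifier $\forall P$ ranges over predicates rather than individuals, which is what makes the statement second-order.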

With the introduction of category theory, a series of important breakthroughs began, which gradually transformed the whole discussion of equality. It is time for philosophers to pay attention to this changed landscape.

First, category theory formalizes the concept of ‘isomorphism’. Roughly speaking, two things are isomorphic if they are the same in a way. An isomorphism is a specific way. There can be many such ways.

For example, a cube has 48 symmetries. Each of these is an isomorphism between the cube and itself. Since the cube is equal to itself, it follows that the cube is isomorphic to itself; this is nothing special, since it is true of any shape. But the fact that the cube is isomorphic to itself in 48 ways is something interesting.

In general, the isomorphisms from an object in a category to itself form a group. Thus, group theory is automatically brought into our discussion of ‘sameness’ when we formalize this concept using isomorphisms in a category. This is a powerful step, for many reasons. One is that group theory is a rich and lively branch of mathematics. Another is that ever since the work of Klein, we have been able to think of group theory as a distilled, concentrated approach to geometry. The example of the cube hints at this, but Klein and subsequent mathematicians went much further.
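The count of 48 and the group structure can be checked by brute force. The sketch below makes one modelling assumption: a symmetry of the cube is taken to be a permutation of its 8 corners that sends edges to edges. The names `cube_symmetries` and `compose` are made up for this example.

```python
from itertools import combinations, permutations, product

VERTICES = list(product((0, 1), repeat=3))  # the 8 corners of the unit cube

# Two corners share an edge when they differ in exactly one coordinate.
EDGES = {
    frozenset((i, j))
    for i, j in combinations(range(8), 2)
    if sum(a != b for a, b in zip(VERTICES[i], VERTICES[j])) == 1
}
EDGE_PAIRS = [tuple(edge) for edge in EDGES]

def cube_symmetries():
    """All permutations of the corners that send every edge to an edge."""
    return [
        p
        for p in permutations(range(8))
        if all(frozenset((p[i], p[j])) in EDGES for i, j in EDGE_PAIRS)
    ]

def compose(p, q):
    """Apply q first, then p."""
    return tuple(p[q[i]] for i in range(8))

SYMMETRIES = cube_symmetries()
print(len(SYMMETRIES))  # 48

# Group check: the identity is a symmetry, and composites of symmetries are too.
as_set = set(SYMMETRIES)
print(tuple(range(8)) in as_set)                                          # True
print(all(compose(p, q) in as_set for p in SYMMETRIES for q in SYMMETRIES))  # True
```

The search runs over all 40320 permutations of the corners, so it is a check of the claim rather than an efficient way to build the group.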

In addition to isomorphisms between an object and itself, we can also consider isomorphisms between objects that are not equal. This gives a ‘groupoid’: a powerful generalization of the concept of group. And groupoids naturally bring topology into the game. The reason is that for any space there is a groupoid with the points of that space as objects, and certain equivalence classes of paths in that space as isomorphisms.

What does this mean? It may sound technical, but it is fundamental. It means simply that two points are ‘the same in a way’ if we can get from one to the other by following some path. Furthermore, the path is the way. It is no coincidence that ‘way’ is sometimes used to mean ‘path’: we are capturing some ancient intuitions here!

In short, by formalizing the concept of sameness using isomorphisms instead of equality, we bring group theory, geometry and topology closer to the foundations of mathematics, giving them a kind of inevitability that they might not otherwise seem to possess. We also see that these three subjects are tightly connected. A lot of important mathematics, and also theoretical physics, flows from this realization.

For example, all the forces we understand in nature—electromagnetism, the strong and weak nuclear forces, and gravity—can be described as ‘gauge fields’. In simple terms, a gauge field is a recipe saying how a particle transforms as we move it along a path in spacetime. Formalizing this brings in groupoids and their connection to group theory, topology and geometry.

To see how, note that ‘spacetime’ is really just an example of a space, where we treat time as an additional dimension. Thus we can associate to spacetime a groupoid in the manner discussed, where the objects are points and the morphisms are equivalence classes of paths. Particles, on the other hand, have symmetries of a geometrical nature, and these symmetries form a group. A gauge field is a map from the groupoid associated to spacetime to the group of symmetries of a particle.
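A toy, discrete version of this idea can be sketched in code. Everything here is an assumption made for illustration: ‘spacetime’ is just three points `a`, `b`, `c` joined by edges, the symmetry group is the group of permutations of three things, and the edge data in `EDGE_TRANSPORT` is invented. The point is only that transport along paths respects composition and reversal.

```python
IDENTITY = (0, 1, 2)

def compose(p, q):
    """Product of permutations: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """The permutation undoing p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

# Hypothetical gauge field: one group element per directed edge.
EDGE_TRANSPORT = {
    ("a", "b"): (1, 0, 2),
    ("b", "c"): (0, 2, 1),
    ("c", "a"): (2, 0, 1),
}

def transport(start, end):
    """Group element for one step; walking an edge backwards gives the inverse."""
    if (start, end) in EDGE_TRANSPORT:
        return EDGE_TRANSPORT[(start, end)]
    return inverse(EDGE_TRANSPORT[(end, start)])

def holonomy(path):
    """Group element obtained by moving a particle along a path of points."""
    result = IDENTITY
    for start, end in zip(path, path[1:]):
        result = compose(transport(start, end), result)
    return result

first, second = ["a", "b"], ["b", "c", "a"]
# Composing paths corresponds to multiplying group elements.
print(holonomy(first + second[1:]) == compose(holonomy(second), holonomy(first)))  # True
# Going out and straight back changes nothing.
print(holonomy(["c", "a", "c"]) == IDENTITY)  # True
```

In this toy example the loop `a, b, c, a` has nontrivial holonomy, so the result of transport genuinely depends on the path taken and not just on its endpoints.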

For another example, we can reconsider the equation $2 + 3 = 5$. This takes on a whole new appearance in the light of category theory. We can see it as summarizing an isomorphism in the category of finite sets. Understanding this requires that we think about ‘decategorification’, which is the process of treating isomorphisms as equations.
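Here is a small sketch of that reading, with finite sets modelled as Python lists. The helper names `disjoint_union` and `bijections` and the particular sets are assumptions for this example. Addition becomes disjoint union, an isomorphism becomes a bijection, and decategorification is just taking `len`.

```python
from itertools import permutations

def disjoint_union(left, right):
    """Tag each element with its side so shared elements stay distinct."""
    return [("left", x) for x in left] + [("right", y) for y in right]

def bijections(source, target):
    """Every one-to-one correspondence between two finite lists, as dicts."""
    if len(source) != len(target):
        return []
    return [dict(zip(source, image)) for image in permutations(target)]

two = ["a", "b"]
three = ["a", "b", "c"]
five = [0, 1, 2, 3, 4]

union = disjoint_union(two, three)
isomorphisms = bijections(union, five)

print(len(union) == len(five))  # True: decategorified, this says 2 + 3 = 5
print(len(isomorphisms))        # 120: the sets are 'the same' in 5! different ways
```

The equation records only that some bijection exists; the list `isomorphisms` keeps track of which one is meant.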

All this is quite worthy of thought, but mathematicians have by now gone much further in their investigation of sameness. Category theory is just the tip of an iceberg, namely the theory of infinity-categories. The key idea is to stick firmly to the intuition that sameness is more precisely described by specifying an isomorphism than by merely asserting equality. Thus, instead of talking about whether two morphisms are equal or not, we should go ahead and talk about isomorphisms between them. For this we can use a structure with objects, morphisms between objects, 2-morphisms between morphisms, and so on ad infinitum. This is called an ‘infinity-category’.

Examples are not hard to come by. In topology they arise starting from a space by considering points, paths, ‘paths of paths’ (that is, one-parameter families of paths), and so on. This particular kind of infinity-category is an ‘infinity-groupoid’, meaning that the morphisms at all levels are invertible up to morphisms at the next higher level.

Infinity-groupoids also play an important role in string theory: they give higher generalizations of groups, which serve to describe symmetries in ‘higher gauge theory’. The reason is that just as ordinary groups are used to describe how particles transform when moved along paths, higher groups describe how strings transform when moved along paths of paths, and so on.

In the last decade or so, there has been an interesting shift in attitudes regarding the role of equality in infinity-category theory. For quite a while, it seemed natural in certain circles that we should ban talk of equality whenever possible, speaking instead of isomorphisms, or more precisely, ‘equivalences’. Michael Makkai’s ‘first-order logic with dependent sorts’ makes it possible to enforce such a ban within the foundations of mathematics. The consequences are still just beginning to be explored.

However, around 2005, Vladimir Voevodsky, building on previous work, began promoting the ‘axiom of univalence’. This says, briefly, that ‘equivalence is equivalent to equality’. This axiom has the effect of rehabilitating equality. Indeed, one might say it effectively redefines equality to mean equivalence! There is now a powerful program, called ‘homotopy type theory’, seeking to redo the foundations of mathematics based on this idea.
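The slogan can be stated a little more precisely. The following is a sketch of the usual formulation, in the notation of the Homotopy Type Theory book, where $\mathcal{U}$ is a universe of types and $A \simeq B$ is the type of equivalences between $A$ and $B$. For any types $A, B : \mathcal{U}$ there is a canonical map turning identifications into equivalences:

$$ \mathsf{idtoeqv} : (A =_{\mathcal{U}} B) \to (A \simeq B) $$

The axiom of univalence asserts that this map is itself an equivalence, so that

$$ (A =_{\mathcal{U}} B) \simeq (A \simeq B) . $$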

There are several ways to think about the axiom of univalence. One can see it as a sophisticated updating of Leibniz’s principle of the identity of indiscernibles. However, a full understanding of this axiom requires understanding how the foundations of mathematics become infected by topology when we try to base mathematics on infinity-groupoids. Sets are re-envisioned as a special case of spaces, and the axiom of univalence states a property of the ‘space of all spaces’.

One goal of this paper is to explain this in simple terms. Another, more general goal is to show how the study of equality, isomorphism, equivalence and other notions of sameness leads to fruitful new interactions between logic, topology, and physics, which make rich topics for philosophical inquiry.

Posted at November 18, 2013 4:09 AM UTC

53 Comments & 0 Trackbacks

Re: Categories for the Working Philosopher

You work at another UC and you still can’t tell Irvine from Davis? They’re 400 miles apart.

Posted by: D. Eppstein on November 18, 2013 5:14 AM

Re: Categories for the Working Philosopher

Fixed. Since I met Elaine Landry at a workshop at U. C. Irvine, I mistakenly assumed she was from there.

Posted by: John Baez on November 18, 2013 7:01 AM

Re: Categories for the Working Philosopher

This is absolutely terrific. I think quite a lot about how to update our theory of truth and world, and you say a lot of what I’ve been thinking better than I could.

I’m certainly not an expert mathematician, but from my amateur trawlings, I had strongly suspected that category theory, homotopy type theory, univalence, and the Curry-Howard isomorphism were promising routes to formalize my intuitions.

Because of thoughts of these stripes, I’ve been thinking about the relationships between physics, logic, information theory, causality and structure, etc.

It was nice to see this suspicion confirmed to some extent. I guess I need to study (a lot) more math though.

Oh, and that book you’re contributing to looks fascinating!

Posted by: Brian C on November 18, 2013 7:46 AM

Re: Categories for the Working Philosopher

Thanks for the encouragement.

If you want to read something more about what I just said, I strongly recommend this:

Steve Awodey, Structuralism, invariance, and univalence.

because it’s nontechnical but still gets into some very important issues. If that whets your appetite, this free book may be the next step:

Homotopy Type Theory: Univalent Foundations of Mathematics.

It’s fairly self-contained and has a lot of material in plain English.

Posted by: John Baez on November 18, 2013 8:36 PM

Re: Categories for the Working Philosopher

Hi John, Good luck with this project. Are you aware of the use of Leibniz’s PII in the philosophy of physics, notably in the theory of indistinguishable particles? A nice introduction, which also goes into modifications of ZF-style set theory by introducing Ur-elements in this context, is the book “Identity in physics: a historical, philosophical, and formal analysis” by Steven French and Décio Krause (Oxford, 2006).

Such discussions would surely benefit from some form of categorification. Of course, as you know, the DHR analysis in algebraic quantum field theory heavily relies on category theory, so in that context indistinguishable particles have already been subjected to this tool. For an update, see the paper “The conventionality of parastatistics” by Hans Halvorson, David Baker, and Noel Swanson (forthcoming in British Journal for the Philosophy of Science).

On a different note, what you write about $x = x$ reminded me of “The Blind Spot: Lectures on Logic” by Jean-Yves Girard (European Mathematical Society, 2011), in which he discusses such issues (though, again, not categorically).

PS I support the call for you guys to write an Open Access book!

Best wishes, Klaas

Posted by: Klaas Landsman on November 28, 2013 8:48 AM

Re: Categories for the Working Philosopher

Mine goes like this:

Reviving the Philosophy of Geometry

Looking through Roberto Torretti’s book ‘Philosophy of Geometry from Riemann to Poincaré’ (1978), it is natural to wonder why, at least in the Anglophone community, we have no such subject today. By and large it is fair to say that any philosophical interest in geometry shown there is directed at the appearance of geometric constructions in physics, without any thought being given to the conceptual development of the subject within mathematics itself. This is a result of a conception we owe to the Vienna Circle and their Berlin colleagues that one should sharply distinguish between mathematical geometry and physical geometry. Inspired by Einstein’s relativity theory, this account, due to Schlick and Reichenbach, takes mathematical geometry to be the study of the logical consequences of some Hilbertian axiomatisation. For its application in physics, in addition to a mathematical geometric theory, one needs laws of physics and then ‘coordinating principles’ which relate these laws to empirical observations. From this position, the mathematics itself fades from view, as a more or less convenient choice in which to express a physical theory. No interest is taken in which axiomatic theories deserve the epithet ‘geometric’.

However, in the 1920s this view of geometry did not go unchallenged, as Hermann Weyl, similarly inspired by relativity theory, was led to very different conclusions. His attempted unification of electromagnetism with relativity theory was the product of a coherent geometric, physical and philosophical vision. While this unification was not directly successful, it did give rise to modern gauge field theory. Weyl, of course, also went on to make a considerable contribution to quantum theory. And while Einstein gave initial support to Moritz Schlick’s account of his theory, he later became an advocate of the idea that mathematics provides important conceptual frameworks in which to do physics:

Experience can of course guide us in our choice of serviceable mathematical concepts; it cannot possibly be the source from which they are derived; experience of course remains the sole criterion of the serviceability of a mathematical construction for physics, but the truly creative principle resides in mathematics. (Herbert Spencer Lecture, Oxford 1933)

We may imagine then that an important chapter in any sequel to Torretti’s book would describe both Reichenbach’s and Weyl’s views on geometry. This is done in Thomas Ryckman’s excellent ‘The Reign of Relativity’ (Oxford 2005). Ryckman ends his book with a call for philosophical inquiry into “what sense a ‘geometrized physics’ can have”. It seems that the time is right to take up this challenge in the context of a new foundational approach to geometry provided by higher category theory.

The question to be addressed in this chapter is what philosophical sense should be made of a new approach to geometry which aims to provide a language for all fundamental physics. Univalent foundations (cf. Michael Shulman’s chapter) provide the syntax for theories which can be interpreted in $\infty$-toposes. The basic shapes of mathematics are now the so-called ‘homotopy $n$-types’. But the story does not end here, since to do all that needs to be done in modern geometry, and especially in the geometry necessary for modern physics, we need to add further structures. As we add extra properties and structures to $\infty$-toposes, characterised by qualifiers (local, $\infty$-connected, cohesive, differentially cohesive), increasing amounts of mathematical structure are made possible internally. It is the claim of Urs Schreiber that cohesive $\infty$-toposes provide the right environment to approach Hilbert’s sixth problem on axiomatising physics, allowing the formulation of relativity theory and all quantum gauge theories, including the higher-dimensional ones occurring in string theory.

Cohesiveness in this sense arose from earlier formulations of the notion in the case of 1-toposes by Bill Lawvere, motivated in turn by philosophical reflection on geometry and physics. The claim, however, is that for these concepts to take on their full power they must be extended to the context of higher topos theory, where differential cohomology finds its natural setting. Now, rather than the mathematics necessary for physics being viewed, as it often is at present from set-theoretic foundations, as elaborate and unprincipled, we can see the simplicity of the necessary constructions through the universal constructions of higher category theory.

Posted by: David Corfield on November 18, 2013 8:57 AM

Re: Categories for the Working Philosopher

Great, David! I changed your topic from “TBA” to “Geometry”.

I hope you and Mike and perhaps Jean-Pierre Marquis can post not only your abstracts but also your paper drafts here on this blog. That might help us think about these issues and improve our papers.

Posted by: John Baez on November 18, 2013 8:24 PM

Re: Categories for the Working Philosopher

Good idea. It would certainly help integrate the papers.

Posted by: David Corfield on November 19, 2013 8:47 AM

Re: Categories for the Working Philosopher

So did everyone notice the two puns in my abstract?

Posted by: John Baez on November 18, 2013 10:27 PM

Re: Categories for the Working Philosopher

Was one of them the one about ‘way’ as having two meanings?

Posted by: David Corfield on November 19, 2013 8:57 AM

Re: Categories for the Working Philosopher

No, I meant two that I didn’t explicitly highlight. They’re more silly.

Posted by: John Baez on November 19, 2013 5:23 PM

Re: Categories for the Working Philosopher

Thank you for the clarity of your explanations.

May I ask to be put on a list to be notified when this book is published?

Birrell Walsh birrell@well.com

Posted by: Birrell Walsh on November 18, 2013 11:35 PM

Re: Categories for the Working Philosopher

Alas, I’m not organized enough to keep such a list. I’ll certainly say something about this book here when it appears. Right now it seems likely to be published by Oxford U. Press. But even before it appears, I’ll post a link to a version of my article here, so you can get it for free.

Posted by: John Baez on November 19, 2013 1:39 AM

Re: Categories for the Working Philosopher

This looks amazing!

John, Mike, David: Instead of going OUP, could you be challenged into creating an open access book, like we did for the univalent foundations? With such an impressive line-up, and n-cafe style proofreading, it is unclear what external quality control would add.

If I understand correctly, the (lulu-)sales for our book are very good compared to what one would expect from a commercial publisher.

Perhaps I should reverse the question. What would be the reasons for not doing this? John, you were the one who inspired most of us (me) to start thinking about these issues.

[ I guess the decision may not be yours, but Elaine Landry’s. ]

Posted by: Bas Spitters on November 19, 2013 8:04 AM

Re: Categories for the Working Philosopher

I guess the decision may not be yours, but Elaine Landry’s.

Right. This book wasn’t my idea. If it were, of course I’d make it all open-access. I should have suggested this to Landry when I first got invited to contribute, but I didn’t think of it. Sorry!

I suspect that in philosophy and other humanities disciplines the idea of self-publishing would be quite hard to sell, since promotions and tenure are largely based on books rather than articles (unlike math, where it’s the reverse), and getting published by a ‘good publisher’ (like Oxford U. Press) is considered important. The number of sales is important to the publisher but not, I think, to most authors.

I would have been happy to push for it, though.

John, you were the one who inspired most of us (me) to start thinking about these issues.

Thanks, that makes me very happy.

Posted by: John Baez on November 19, 2013 7:01 PM

Re: Categories for the Working Philosopher

I suspect that in philosophy and other humanities disciplines the idea of self-publishing would be quite hard to sell, since promotions and tenure are largely based on books … getting published by a ‘good publisher’ (like Oxford U. Press) is considered important.

Landry’s CV from her page says she became a Full Professor in 2013, so it really shouldn’t matter for her.

Posted by: RodMcGuire on November 19, 2013 7:16 PM

Re: Categories for the Working Philosopher

You think promotions and the quest for more prestige stop at the level of Full Professor? If they did, a lot of us would kick back and relax at that point.

Posted by: John Baez on November 19, 2013 7:45 PM

Re: Categories for the Working Philosopher

I’ll post my abstract when I’m at work and I have a moment. For now I want to say: please leave something for me to talk about! (-:

Posted by: Mike Shulman on November 19, 2013 6:16 AM

Re: Categories for the Working Philosopher

This looks really interesting, but (as a philosopher with a mathematical background) I have a slight quibble, which is that it seems very biased towards foundations. This is a bit frustrating from the point of view of a philosopher-who-is-not-a-philosopher-of-mathematics, because mainstream philosophy has really moved away from foundationalism in the last couple of decades. But philosophy of mathematics (or at least the part of the philosophy of mathematics which does formal work) has remained stubbornly foundationalist, which means that (from the point of view of a pwinapom) many problems which are basically problems of the ontology, or metaphysics, of mathematics get addressed as problems about the foundations of mathematics. And important questions about mathematical practice never seem to get addressed at all.

Now, ideally, one would have thought that category theory could help us to break out of this: that one could develop an account of category theory as a formal theory of mathematical practice, and that this would allow us to connect with, for example, Wittgenstein on rule following, and such like. But apparently this is not to be (or have I been misreading the chapter titles?)

And I’ve left out Kreisel, of course, in this rather grumpy description of the philosophy of mathematics: he has quite a lot to say about mathematical practice.

Posted by: Graham White on November 19, 2013 1:47 PM

Re: Categories for the Working Philosopher

You might think when a new foundational language comes along, it shouldn’t just be a case of doing the same old thing with the new language, an analytic philosophy Mark II, as I call it. Instead, it might give us cause to remember that we are historically-situated beings whose best efforts will no doubt be seen as quite primitive in centuries to come. So we might try to say what is rational about the way practices of intellectual enquiry are modified.

I’ve devoted plenty of time to this, and have some more ideas waiting to be written up properly.

But at the same time I can see the attraction of seeing what we might make of category theory/HoTT as replacements for set theory and logic in their usual philosophical functions.

Posted by: David Corfield on November 19, 2013 4:01 PM

Re: Categories for the Working Philosopher

David wrote:

You might think when a new foundational language comes along, it shouldn’t just be a case of doing the same old thing with the new language, an analytic philosophy Mark II, as I call it. Instead, it might give us cause to remember that we are historically-situated beings whose best efforts will no doubt be seen as quite primitive in centuries to come. So we might try to say what is rational about the way practices of intellectual enquiry are modified.

That sounds very much like what you’d do, since you’re interested in history, and how people do things. But as you know, a lot of philosophers interested in math and logic focus directly on math and logic, largely ignoring their sociohistorical context.

Even such people shouldn’t be “doing the same old thing with the new language.” The new language radically transforms our vision of the most basic issues—for example, what it means for two things to be the same. Everything should be rethought from the ground up.

And it’s not as if this language came along by chance. In fact it came along because people were rethinking everything from the ground up! Constructing this language was a deeply philosophical activity, at least for some of us.

Unfortunately, many people in philosophy departments didn’t join this process. They missed the fun. Unlike Leibniz, they didn’t dive in when the new ideas were first being developed. And that means a lot of analytic philosophers are stuck with a lot of catching up to do. They may also be stuck clarifying other people’s big ideas, instead of having big ideas of their own.

Why? Why didn’t more of them dive into topos theory, homotopical algebra and $n$-category theory when the foundations were first getting laid?

This loops around back to your original point. I bet a lot of analytic philosophers believed the foundations of mathematics were largely stable by now: that things wouldn’t radically change from the ground up.

That would be sad. It consigns philosophers to small jobs, like polishing the doorknobs of a grand hotel — instead of exploring wild territories where the roads have scarcely been built.

Posted by: John Baez on November 19, 2013 8:12 PM

Re: Categories for the Working Philosopher

That was why starting this blog off with Klein 2-geometry was such fun.

Look where it ended up, with the higher Klein geometry needed for supergravity.

Posted by: David Corfield on November 20, 2013 11:28 AM

Re: Categories for the Working Philosopher

Graham wrote:

This looks really interesting, but (as a philosopher with a mathematical background) I have a slight quibble, which is that it seems very biased towards foundations.

I agree. On the bright side, categorical and $n$-categorical foundations bring logic and the foundations of math closer to the topics ordinary mathematicians care about: as I said, group theory, geometry and topology stop seeming like potentially arbitrary topics that can be built ‘on top of’ set-theoretic foundations; they instead become fully integrated with the foundations. This is probably still news to many people.

On the dark side, category theorists, and philosophers of math who like category theory, still don’t pay enough attention to applied math. I was originally going to battle against this by writing a paper on category theory and the foundations of applied mathematics, where I’d be trying to use the word ‘foundations’ in a provocative new way. But I decided to give up and let Samson Abramsky and Andrée Ehresmann fight that battle. There are still a few things I want to say about how $n$-categories affect basic concepts like ‘equality’, and this seems like a good chance to do that.

These days, I’m mostly busy applying $n$-categories to biology, chemistry, electrical engineering and the like. The applications of $n$-categories to the foundations of math and physics are very important, and ten years ago that’s what I loved to think about, but by now there are too many greyhounds running on that particular race track.

Posted by: John Baez on November 20, 2013 1:57 AM

Re: Categories for the Working Philosopher

Hi John - I think you are right about the application of category theory to basic concepts like “equality”. On my blog I have done a little bit of speculative thinking on this possibility, as a non-mathematician interested in the field, in relationship to the work of François Laruelle. Perhaps this line of thought may be too fast-and-loose, makeshift, and not as rigorous as many would like, but I have come to believe higher categorical thinking generally imitates the internal logic of Gandhian satyagraha (satya/truth as sheaf + agraha/insistence as stack) and its emergent emphasis upon the truth-force of Nonviolence (at the level of 2-categories).

Grothendieck for his part uses the term “radical equivalence” when it comes to 2-cats. For these and many other connections, I think taking Gandhi seriously as a heavy-thinking philosopher (he wrote many volumes which routinely go unread despite his popular image) over and above a social figure, in this Grothendieckian and non-Laruellean fashion, might prove extremely interesting. Following this idea, the n-categories for n>2 would then begin to get us into Zen Buddhism and other kinds of enlightened or nondual awareness. Food for thought, take it or leave it. All my best, and thank you for your work. - David

Posted by: inthesaltmine on December 9, 2013 1:58 AM | Permalink | Reply to this

Re: Categories for the Working Philosopher

It’s a delight to see someone who can quote Gandhi, Laruelle and n-category theory in one paragraph. I don’t agree with your optimism about a clear path from n-category theory to nondualism. The philosophical traditions aren’t (obviously) commensurable even if they are available to each other.

That said, if I were to choose a way to bring the two together, I would look at Madhyamika as the glue, for it brings a dialectical form of reasoning with a very sophisticated approach to metaphysical questions.

Posted by: Rajesh Kasturirangan on December 9, 2013 3:23 PM | Permalink | Reply to this

Re: Categories for the Working Philosopher

Rajesh, it is truly a delight to read your comment here. I am humbled that you would grant my rather experimental musings a serious and constructive response.

In pausing to reflect, I came to believe you are right that my optimism here is most certainly misplaced. There appears to be a “felt and bodily” issue to nondualism which cannot quite be eclipsed by even the most complex of formalisms in n-category theory. Given your work on organisms and world-embeddedness, you are probably in a position where you can see this a bit more clearly than I.

Perhaps your insight also comes to bear in conversations that deal with the relationship between category theory and its applications to biology generally. I am thinking of course of the work of those like Robert Rosen and Nils Baas, and specifically the question of “Life itself” seen through this more biological lens.

In any case, it was originally Gandhi’s saying that non-violence is “a truth as old as the hills” which got me intrigued, for the next thing you know I am exploring “the hills” in their very spatiality as a matter of topos theory in order to glean the truths in non-violence. I will look into your suggestion. Thank you!

Posted by: inthesaltmine on December 10, 2013 1:16 AM | Permalink | Reply to this

Re: Categories for the Working Philosopher

Typo:

“Yet equations of this form, x = x, are of almost no use [in] mathematics!”

Posted by: Blake Stacey on November 19, 2013 7:16 PM | Permalink | Reply to this

Re: Categories for the Working Philosopher

Thanks—fixed!

Posted by: John Baez on November 19, 2013 11:45 PM | Permalink | Reply to this

Re: Categories for the Working Philosopher

There seems to me to be a jump in the jargon intensity here:

This particular kind of infinity-category is an ‘infinity-groupoid’, meaning that the morphisms at all levels are invertible up to morphisms at the next higher level.

The “invertible up to” phrasing reads as more arcane or argot-y than the rest of the abstract.

Posted by: Blake Stacey on November 20, 2013 12:56 AM | Permalink | Reply to this

Re: Categories for the Working Philosopher

Yes, by the time I write the actual paper I’ll need to explain this jargon before using it… and somehow figure out how to avoid it in the introduction. (This long ‘abstract’ may turn into the introduction.)

One point that’s important here is to explain the attitude ‘everything is only true up to something’. It’s related to the concept of ‘flab’ in homotopy theory: instead of wanting things to be rigidly true ‘on the nose’, we’re often happy for them to be true up to homotopy, where the homotopies obey certain equations up to homotopy, ad infinitum.

Posted by: John Baez on November 20, 2013 4:52 AM | Permalink | Reply to this

Re: Categories for the Working Philosopher

Here’s mine. Of course, I didn’t know when I wrote it that John would already be telling people about $\infty$ -groupoids.

Univalent Foundations

Mathematics is conventionally founded on set theory, so that even basic concepts such as the natural numbers are defined as particular sets. For instance, $0$ is the empty set, while $1$ is the set $\{0\}$ . Such a foundation for mathematics was a major innovation, providing powerful tools and a common language for mathematicians, as well as a common context for philosophy. However, there are problems with set theory as the foundation. One is that it takes a lot of work to define mathematical objects using sets, and the resulting definitions include a lot of irrelevant information, as famously pointed out by Benacerraf. But a deeper problem is set theory’s “axiom of extensionality”, which says that two sets are equal precisely when they have the same elements.
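To make the point about irrelevant information concrete, here is a minimal sketch in Python. It is only an illustration: the function names `von_neumann` and `zermelo` are chosen here, and Python's `frozenset` stands in for sets. Both encodings agree that $0$ is the empty set and $1$ is $\{0\}$, yet they disagree on a question that arithmetic never asks.

```python
def von_neumann(n):
    """Encode n as the set of all smaller numbers: n = {0, 1, ..., n-1}."""
    s = frozenset()
    for _ in range(n):
        s = s | frozenset([s])
    return s


def zermelo(n):
    """Encode n as the singleton of its predecessor: n = {n-1}."""
    s = frozenset()
    for _ in range(n):
        s = frozenset([s])
    return s


# Both encodings match the text: 0 is the empty set and 1 is {0}.
assert von_neumann(0) == zermelo(0) == frozenset()
assert von_neumann(1) == zermelo(1) == frozenset([frozenset()])

# "Is 1 an element of 3?" has no arithmetic meaning,
# yet each encoding is forced to give an answer.
one_in_three_vn = von_neumann(1) in von_neumann(3)
one_in_three_z = zermelo(1) in zermelo(3)
print(one_in_three_vn, one_in_three_z)
```

The two answers differ, which is one way to read Benacerraf's observation: the set-theoretic definition carries structure that has nothing to do with the numbers themselves.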

The issue is that often mathematicians treat structures as “the same” in looser ways. For instance, two topological spaces are considered the same when one can be deformed into the other, such as a donut and a coffee cup. Importantly, two spaces can be “the same” in more than one way, and a space can even be “the same as itself” in more than one way. Similarly, the vector potential of electromagnetism admits different configurations that are observationally indistinguishable — and yet the specific transformations exhibiting such equivalences are relevant. This phenomenon of “gauge equivalence” underlies all of modern quantum field theory. Thus set theory is not well-adapted to describe the practical use of mathematical objects.

Univalent Foundations is a new proposed foundation for mathematics which resolves this problem. Its basic objects are not sets, but $\infty$ -groupoids, a notion from higher category theory which is designed specifically for this purpose. Like a set, an $\infty$ -groupoid contains “elements”, but also “equalities” relating pairs of elements. This recalls the notion of Bishop that a “set” is given by stating how to construct its elements and how to prove that two of them are equal. However, two elements of an $\infty$ -groupoid can be related by more than one equality. Thus, topological spaces and gauge field configurations are the elements of $\infty$ -groupoids, whose equalities are deformations and gauge transformations, respectively.

In fact, the equalities between two elements of an $\infty$ -groupoid form the elements of another $\infty$ -groupoid, and so on. The resulting infinite structure may seem horrendously complicated; and indeed, homotopy theorists and higher category theorists have worked hard to understand the combinatorial complexity of $\infty$ -groupoids, defined as structures inside set theory. However, under Univalent Foundations much of this difficulty is obviated, because $\infty$ -groupoids are its basic objects, just as sets are the basic objects of axiomatic set theory. Thus, rather than trying to define $\infty$ -groupoids, we give rules describing their behavior.

There are many possible variations in the list of rules, but three particular rules are most responsible for the characteristic features of Univalent Foundations. The first is called equality induction: in a simple form it states that given an equality $e$ between two elements $x$ and $y$ , any statement or construction involving $x$ can be “transported along $e$ ” to yield a statement involving $y$ . Such a principle of “substitution of equals for equals” is of course fundamental to the meaning of “equality”; the new feature of Univalent Foundations is that the result can depend on the particular equality $e$ .
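In symbols, using notation that is standard in homotopy type theory (the names $P$, $\mathrm{transport}$ and $\mathrm{refl}$ do not appear in this abstract and are introduced here only for illustration): if $P$ is a family of statements or constructions depending on an element, then an equality $e : x = y$ gives a function

$\mathrm{transport}^P(e) : P(x) \to P(y),$

and transporting along the trivial equality of $x$ with itself does nothing:

$\mathrm{transport}^P(\mathrm{refl}_x)(u) = u.$

Two different equalities $e, e' : x = y$ may give different functions $P(x) \to P(y)$, which is the dependence on $e$ mentioned above.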

The second characteristic feature of Univalent Foundations is the univalence axiom. It specifies the equalities in an $\infty$ -groupoid whose elements are $\infty$ -groupoids; thus it is analogous to set theory’s extensionality axiom in characterizing equalities between the theory’s basic objects. However, rather than extensionality’s rigid elementwise comparison, univalence says that for two $\infty$ -groupoids $X$ and $Y$ to be equal, it suffices to have functions $f:X\to Y$ and $g:Y\to X$ and equalities $g\circ f = \mathrm{id}_X$ and $f\circ g = \mathrm{id}_Y$ ; i.e. for $X$ and $Y$ to be isomorphic. This is the common notion of “sameness” used in practice for mathematical structures; univalence makes it coincide with the foundational notion of equality.
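A common way to write this, again in standard homotopy type theory notation rather than the wording of this abstract, is

$(X = Y) \simeq (X \simeq Y),$

where the right-hand side denotes the equivalences between $X$ and $Y$. Read informally: the equalities between $X$ and $Y$ correspond to the equivalences between them. The precise definition of $\simeq$ is somewhat more refined than a pair of mutually inverse functions, so the description above should be taken as the informal version.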

The third characteristic feature is higher inductive definitions, which provide the basic ways to construct $\infty$ -groupoids, analogous to the other axioms of set theory. Like Bishop, we specify objects and equalities; but also perhaps equalities between equalities, and so on. These specifications can also be recursive; for instance, we obtain the natural numbers with the usual inductive definition: a natural number is either zero or the successor of some other natural number.
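The ordinary inductive part of this can be sketched in Python. This is only an illustration under stated assumptions: the class names `Zero` and `Succ` and the helpers `add` and `to_int` are hypothetical, and Python has no way to express the higher constructors (equalities between elements, equalities between equalities) that make a definition genuinely "higher".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zero:
    """Base constructor: zero is a natural number."""


@dataclass(frozen=True)
class Succ:
    """Recursive constructor: the successor of a natural number."""
    pred: object  # expected to be a Zero or a Succ


def add(m, n):
    """Addition defined by recursion on the first argument."""
    if isinstance(m, Zero):
        return n
    return Succ(add(m.pred, n))


def to_int(n):
    """Count the Succ constructors, for display only."""
    count = 0
    while isinstance(n, Succ):
        count += 1
        n = n.pred
    return count


two = Succ(Succ(Zero()))
three = Succ(two)
print(to_int(add(two, three)))
```

Every function out of such a type is determined by saying what it does on each constructor, which is the sense in which the definition is inductive.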

Finally, one of the most intriguing aspects of Univalent Foundations is that despite its category-theoretic character, it is actually based on a different foundational tradition: constructive type theory. Thus it also inherits a computational character, and can straightforwardly be formalized in computer proof assistants. It also retains type theory’s paradigm of propositions as types, which unifies logic and mathematics in one formal system. In fact, Univalent Foundations offers a finer analysis of this unification, revealing traditional logic and traditional set theory as the first and second rungs, respectively, on an infinite ladder of more complex structures.

It remains to be seen whether Univalent Foundations will replace set theory as a foundation for mathematics, or even emerge as a respectable competitor. Regardless, however, its philosophical implications are profound. It suggests that the basic objects not only of mathematics, but of logic and thought, need not be featureless set-like collections, but may contain intricate “higher equality structures”. In other words, whenever we think of a type of object, we must also think (perhaps implicitly) of when two such objects are equal, when two such equalities are equal, and so on. We are forced to this not only by abstract mathematics, but also by computer science and quantum physics; thus it emerges as a fundamental aspect of reality.

Posted by: Mike Shulman on November 21, 2013 11:24 PM | Permalink | Reply to this

Re: Categories for the Working Philosopher

We should think about how to divide up topics and in what order our chapters should be read. It seems that the most natural order would be (1) John, (2) me, (3) David. I guess Jean–Pierre’s chapter should come after John’s as well, but it probably doesn’t matter where it goes in relation to mine and David’s. (Actually, there are also a lot of other chapters with potential relationship to UF: Abramsky, Bell, Cockett, Kishida, McLarty. I’m looking forward to reading their abstracts as well and making more connections.)

John, can you say a bit more about what you were actually planning on talking about in your chapter, for those of us who already know the subject? Your abstract mentions a bunch of things that the rest of us are writing whole chapters about (FOLDS, gauge field theories, univalence). How do you see your chapter in relation to ours?

Posted by: Mike Shulman on November 21, 2013 11:34 PM | Permalink | Reply to this

Re: Categories for the Working Philosopher

Mike wrote:

John, can you say a bit more about what you were actually planning on talking about in your chapter, for those of us who already know the subject?

Hmm… saying more about that strongly resembles actually writing my chapter.

But that’s okay; we can try to coordinate things a bit by talking here. A bit of overlap is actually good. It helps to hear the same ideas explained by different people—both to understand the ideas, and to be convinced that they’re not just the ravings of a lone loonie!

I really want to focus heavily on the concept of equality and its various refinements, like isomorphism, equivalence, and maybe weak equivalence and Quillen equivalence if I get that far…

…. along with various attempts to understand or define equality, like the model-theoretic approach to first order logic with equality, and Leibniz’s principle of the ‘identity of indiscernibles’, and the axiom of univalence, and maybe more.

I don’t want to explain homotopy type theory any more than I absolutely need to, so that leaves a lot of interesting ground for you to cover, including: the idea of ‘types’, the idea of propositions as types, and higher inductive definitions, and the computational character of this setup. I don’t really want to say anything about that stuff. I’ll focus on its attitude toward equality and equivalence.

I don’t want to explain any more about FOLDS than I absolutely need to, either. I mainly want to talk about its attitude toward equality and equivalence.

I really do want to talk about why it’s worth caring about equality and its refined versions. I’ll probably explain three reasons for moving from ‘equality between elements of sets’ to ‘isomorphism between objects of groupoids’:

1. The Erlangen program, which in its modern version says that a groupoid is an ‘incidence geometry’, with a subgroupoid as a ‘type of figure’. Motto: groupoids are geometries. (Hmm, I may retreat to groups here, since that seems easier and more historically accurate, though really we can use groupoids.)
2. The idea of a groupoid as a homotopy 1-type. Motto: groupoids are spaces.
3. The idea of putting 1 and 2 together with a functor. This gives us the idea that a functor between groupoids is like a connection on a space, which lets us do parallel transport while preserving some geometry. Motto: functors between groupoids are gauge fields.

I probably won’t talk much about ‘cohesion’ here, e.g., the role that differential structures play in 1, 2, or 3. Nor will I say much in detail about gauge theory. So that leaves plenty of ground for David to cover.

I don’t know how much I’ll be able to get into how 1-3 generalize from groupoids to $\infty$ -groupoids. But I definitely do want to talk about ‘categorification’ and how taking the attitude ‘never say equals, only provide equivalences’ pushes us into the $\infty$ -groupoidal generalization of 1-3, which is proving so fruitful today.

I will probably want to ponder an example from arithmetic, too: namely, how the set of natural numbers gets re-envisioned as the groupoid of finite sets and then the space of finite subsets of $\mathbb{R}^\infty$ .

A lot of this is stuff I’ve said many times in fragmentary ways in different places. It’s all about how taking the seemingly static concept of equality and seeing it as a process reveals a rich, dynamic world.

If I don’t restrain myself, I might call this paper What Is “Is”?

Posted by: John Baez on November 22, 2013 1:27 AM | Permalink | Reply to this

Re: Categories for the Working Philosopher

Sounds great! Sorry if I sounded defensive; I think your plan is excellent.

I also haven’t thought really deeply about what I’ll say, but currently I’m not thinking of saying very much about the computational aspects, since there are other chapters about computation, and UF arguably doesn’t really have much new to say there as compared with other forms of type theory as a foundation. I mean, there is the issue of how to compute with univalence and HITs, and what they can do for us, but one can say that that is just bringing HoTT in line with how type theories in general ought to behave. I’m planning instead to focus on the higher categorical point of view (since this is, after all, a book about category theory) of UF as a synthetic theory of $\infty$ -groupoids, as in my CT13 slides, and how the $n$ -type hierarchy unifies logic and mathematics.

Posted by: Mike Shulman on November 23, 2013 7:05 PM | Permalink | Reply to this

Re: Categories for the Working Philosopher

Mike wrote:

I’m planning instead to focus on the higher categorical point of view (since this is, after all, a book about category theory) of UF as a synthetic theory of $\infty$ -groupoids, as in my CT13 slides, and how the $n$ -type hierarchy unifies logic and mathematics.

Great!

Speaking of this, I hope you say a bit about how homotopy $(-1)$ -types (or whatever they call them these days) are related to propositions. It’s a simple but really quite powerful idea. You may recall a paper we wrote about Postnikov towers including $(-2)$ - and $(-1)$ -types. I’m glad to see that outlook has caught on.

Posted by: John Baez on November 23, 2013 9:28 PM | Permalink | Reply to this

Re: Categories for the Working Philosopher

Yes, I certainly will!

Posted by: Mike Shulman on November 23, 2013 11:14 PM | Permalink | Reply to this

Re: Categories for the Working Philosopher

Dear John,

Have you seen this paper by Barry Mazur? It is also a philosophical discussion of equality in category theory, so might be worth a look.

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.125.559&rep=rep1&type=pdf

Posted by: Colin Zwanziger on November 22, 2013 3:06 PM | Permalink | Reply to this

Re: Categories for the Working Philosopher

No, I hadn’t read that paper by Barry Mazur! Thanks, I’m reading it and will refer to it. It’s nice because among other things it’s an introduction to category theory. My article will be quite different because I’ll assume the basics of category theory are known. (Is that wise? Ideally this book would have an appendix explaining these basics; we certainly don’t want every paper to explain them.)

Posted by: John Baez on November 22, 2013 5:46 PM | Permalink | Reply to this

Re: Categories for the Working Philosopher

Sounds reasonable to me—I’m looking forward to reading the result. (I’m a linguistics/philosophy/toposophy student so the book sounds awesome in general as well.)

Posted by: Colin Zwanziger on November 23, 2013 1:40 AM | Permalink | Reply to this

Re: Categories for the Working Philosopher

I am a student of linguistics with some math background, and have started to relearn categories actually for the purpose of exploring Moortgat’s work! I would be very excited to see what he has to say about the philosophical relationship between the subjects beyond the basic tools necessary for constructing the proofs of maps between grammars and such. I have long had an intuition about the connection between syntax and algebra, but only recently discovered the rich body of work following Lambek, so I am very excited about a book like this. I also have a hobby in geometry, and getting to see these subjects together reframed in category theory would be fun.

Posted by: Zach Stone on November 22, 2013 5:18 PM | Permalink | Reply to this

equality and similarity

In machine learning, I work a lot with “similarity measures”. For numbers on the real line this is our everyday notion of “approximately equal”. You could say “3.04+2.1 is more similar to 5 than to 10”. In machine learning you’ll also have objects represented as points in a space, where the absolute value of the distance is the similarity measure. The Euclidean or Mahalanobis distance is commonly used. For more complex objects, similarity can be measured by the “cost” of transforming one into the other using some set of operations. Time series, for instance, can be compared using “dynamic time warping”. In applications, these methods get mixed in with feature extraction. A text document may occupy a point in a high-dimensional space, but a point perhaps only defined by keywords.
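As a rough illustration of two of the measures mentioned here, the following Python sketch computes a Euclidean distance and a textbook dynamic time warping cost. The function names and the choice of $|a - b|$ as the local cost are assumptions made for this example, not a description of any particular library.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two points given as equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def dtw_cost(s, t):
    """Dynamic time warping cost between two numeric sequences.

    Uses |a - b| as the local cost and the usual three-way recurrence
    (step in s, step in t, or step in both).
    """
    n, m = len(s), len(t)
    inf = float("inf")
    # table[i][j] is the best cost of aligning s[:i] with t[:j].
    table = [[inf] * (m + 1) for _ in range(n + 1)]
    table[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(s[i - 1] - t[j - 1])
            table[i][j] = local + min(
                table[i - 1][j], table[i][j - 1], table[i - 1][j - 1]
            )
    return table[n][m]


# "3.04 + 2.1 is more similar to 5 than to 10" on the real line.
value = 3.04 + 2.1
closer_to_five = abs(value - 5) < abs(value - 10)

print(euclidean((0, 0), (3, 4)))
print(dtw_cost([1, 2, 3], [1, 2, 2, 3]))
```

In this sketch the warping cost between `[1, 2, 3]` and `[1, 2, 2, 3]` is zero even though the two sequences are not literally equal, so a cost of zero here means "the same up to warping" rather than strict equality.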

My point is that the concept of “similarity”, while often tweaked to fit the current application, always goes hand in hand with an equality. Distance zero, transforming cost of zero, etc. But when we go into isomorphisms and further to groupoids and topology, it’s not clear where that leaves similarity. I understand the concept of the donut and the teacup being topologically equivalent because they can be morphed into one another by following some continuous path. But these are two objects that are dissimilar when viewed in one framework and exactly equal in another. It’s the familiar story of dissimilar things being shown to have a common foundation when generalizing and stripping away “non-essential” properties. If I want to measure the similarity to a pot with two handles, I’m breaking the concept of topological equivalence by punching an extra hole.

I would very much have liked to apply more advanced concepts in machine learning’s similarity matchings, but while “equality” is progressing rapidly, “similarity” sometimes seems like it’s stuck back with Euclid.

Posted by: M. on November 28, 2013 3:19 PM | Permalink | Reply to this

Re: equality and similarity

That’s fascinating to hear! I have been thinking a bit about developing a mathematical theory of resources, of which your comments have reminded me. So I have some questions and remarks.

similarity can be measured by the “cost” of transforming one into the other using some set of operations

Let’s call these objects $x$ and $y$ and write $C(x,y)$ for the cost of transforming $x$ into $y$ . This quantity is a nonnegative real number. What properties would we expect this quantity to satisfy? Certainly, every object should transform to itself at no cost:

$C(x,x) = 0.$

Furthermore, the cost of transforming $x$ into $z$ should be at most the sum of the costs of transforming $x$ into $y$ and $y$ into $z$ ,

$C(x,z) \leq C(x,y) + C(y,z).$

These two properties constitute the definition of a Lawvere metric space, which, for a category theorist, is the most natural way of giving mathematical meaning to the term “distance”. In particular, the symmetry property $C(x,y)=C(y,x)$ does typically not hold: turning coffee beans into coffee costs much less than turning coffee back into beans!
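The two axioms are easy to check mechanically on a finite example. The sketch below is only an illustration: the three "resources" and their conversion costs are invented numbers, and an impossible conversion is given cost infinity, which is an extra assumption (the text above takes costs to be non-negative reals, while Lawvere's definition also permits $\infty$).

```python
from itertools import product

# Hypothetical resources and one-way conversion costs, for illustration only.
POINTS = ["beans", "ground", "coffee"]
COSTS = {
    ("beans", "ground"): 1.0,
    ("ground", "coffee"): 2.0,
    ("beans", "coffee"): 3.0,
}


def conversion_cost(x, y):
    """Cost of turning x into y; infinity if no conversion is listed."""
    if x == y:
        return 0.0
    return COSTS.get((x, y), float("inf"))


def is_lawvere_metric(points, cost):
    """Check C(x,x) = 0, non-negativity, and the triangle inequality."""
    zero_self = all(cost(x, x) == 0 for x in points)
    non_negative = all(cost(x, y) >= 0 for x, y in product(points, repeat=2))
    triangle = all(
        cost(x, z) <= cost(x, y) + cost(y, z)
        for x, y, z in product(points, repeat=3)
    )
    return zero_self and non_negative and triangle


def is_symmetric(points, cost):
    """Check C(x,y) = C(y,x), which a Lawvere metric need not satisfy."""
    return all(cost(x, y) == cost(y, x) for x, y in product(points, repeat=2))


print(is_lawvere_metric(POINTS, conversion_cost))
print(is_symmetric(POINTS, conversion_cost))
```

With these made-up numbers the axioms hold while symmetry fails, matching the coffee example: brewing has a finite cost and un-brewing does not.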

Does this non-symmetry of cost also arise in the situations that you mentioned? And are you sometimes not only interested in how much it costs to transform $x$ into $y$ , but also in the different ways in which this can be done? If the answer to one or both of these questions is yes, then you are probably looking for enriched category theory, and I would love to hear more about what exactly you’re doing!

Posted by: Tobias Fritz on November 29, 2013 12:18 AM | Permalink | Reply to this

Re: equality and similarity

Why should the cost, $C(X,Y)$ , of transforming $X$ to $Y$ be non-negative?

There appears to be a negative cost to transform $H_2 + O \to H_{2}O$ .

Posted by: RodMcGuire on November 30, 2013 7:00 PM | Permalink | Reply to this

Re: equality and similarity

Rod wrote:

Why should the cost, C(X,Y), of transforming X to Y be non-negative?

I don’t know — maybe negative values should be allowed! But here are two arguments for why they shouldn’t.

First, in many situations, the cost $C(X,Y)$ measures how much $X$ needs to be distorted in order to make it convertible to $Y$ . Since this “distortion” is necessarily a non-negative number, this results in non-negative $C$ . Since you’re not the first person to ask this question, I get the impression that “cost function” may not be an appropriate term for this concept. What else could one call it?

Second, from a purely mathematical perspective, imposing non-negativity gives rise to the nicer abstract theory. For example, we may ask what the average cost of turning $X$ into $Y$ is in a “mass production” setting in which one works with a large number $n$ of copies of each,

$\lim_{n\to\infty} \frac{C(nX,nY)}{n} = ?$

The assumption of non-negativity is crucial for guaranteeing that this limit exists. If we allowed negative values and $C(nX,nY)$ gets too negative too quickly, then the limit would not exist! So we need the set of allowed cost values to have a smallest element, and more generally to be complete. With this assumption, one can state and prove a theorem about such an “asymptotic cost” which generalizes the Hahn-Banach theorem. There is much more to say here and I’ve been glossing over some important details, but the point I’m trying to make is that the non-negativity assumption has useful consequences.
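One way to make the role of the lower bound concrete uses an extra assumption that is not stated above: that handling two batches together never costs more than handling them separately, so that $a_n = C(nX,nY)$ satisfies $a_{m+n} \leq a_m + a_n$ . Under that assumption, Fekete's subadditive lemma gives

$\lim_{n\to\infty} \frac{a_n}{n} = \inf_{n \geq 1} \frac{a_n}{n},$

where the right-hand side may be $-\infty$ in general. If $a_n \geq 0$ for all $n$ , the infimum is bounded below by $0$ and the limit is a finite non-negative number. Without the lower bound this can fail: for instance, $a_n = -n^2$ is subadditive, but $a_n / n = -n$ has no finite limit.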

There appears to be a negative cost to transform $H_2 + O \to H_2 O$ .

You seem to assume that the “cost” is necessarily given by the energy input required for the reaction to take place. This seems like a reasonable thing to do, and maybe this is how it should be done. On the other hand, the energy itself can also be counted as a resource, so that one gets something like this:

$H_2 + O \to H_2 O + 3 eV.$

In this kind of description, the cost function would not take values in the non-negative reals, but rather in the booleans: $C(X,Y)$ is then a truth value stating whether the transformation of $X$ into $Y$ is possible or impossible. I’m confused about how this relates to a formulation in terms of real-valued cost functions.

Posted by: Tobias Fritz on November 30, 2013 10:51 PM | Permalink | Reply to this

Re: equality and similarity

It’s also worth noting that in many applications people work with cost functions that don’t satisfy the triangle inequality.

Posted by: Mark Meckes on December 1, 2013 10:20 AM | Permalink | Reply to this

Re: equality and similarity

That’s interesting to hear! Do you have a concrete example?

Posted by: Tobias Fritz on December 1, 2013 2:34 PM | Permalink | Reply to this

Re: equality and similarity

I’ve tried to reply a couple times but my web browser is having serious problems. For the moment let me just quickly suggest looking at the Wikipedia page on “Transportation theory (mathematics)”.

Posted by: Mark Meckes on December 3, 2013 10:18 AM | Permalink | Reply to this

Re: equality and similarity

Thanks, Mark. I know a bit about transportation theory—mostly because we’ve discussed it before! It should be a very nice example of the framework that I have in mind.

As I understand it, the relevant “resources” in the context of transportation theory are the probability measures on some given space, which comes equipped with a distance function. The “cost” of turning one probability measure into another is given by the Wasserstein distance $W_1$ . For any two points in the space, the Wasserstein distance between the associated Dirac measures is just the original distance of the two points. Therefore, if the original distance function violates the triangle inequality, then so does $W_1$ . Is this correct so far?

Are you pointing out that transportation theory can be developed without assuming the triangle inequality? I imagine that this is true, but then I would have to ask: are there any real-world applications of transportation theory in which the distance function fails to satisfy the triangle inequality? I would like to look at those to figure out whether there’s indeed something wrong about assuming the triangle inequality, or whether the resources formalism could accommodate these examples in a different way.

Or did you have a different kind of failure of the triangle inequality in mind?

Posted by: Tobias Fritz on December 4, 2013 12:28 AM | Permalink | Reply to this

Re: equality and similarity

In my earlier comment which I failed to post I tried to be more informative about what I had in mind. Let’s try again.

I was thinking of the probability measures not as resources, but as distributions of resources. That is, a “resource” consists of a unit of stuff at a given point in a metric space, so that stuff at point $x$ and stuff at point $y$ are two different resources. Transportation theory considers a cost $c(x,y)$ of transporting a unit of stuff from point $x$ to point $y$ , so in this interpretation it’s also the cost of turning one resource into another. The most obvious and classical cost function is the distance: $c(x,y) = d(x,y)$ , but people also consider $c(x,y) = d(x,y)^p$ for $p > 0$ , which for $p > 1$ generally fails the triangle inequality. Wasserstein distances then arise when you look for the most efficient way to turn one distribution of resources into another one. A popular image is moving a sandpile from one configuration into another.

The Wasserstein distances $W_p$ for $p > 1$ satisfy the triangle inequality even though the underlying cost functions don’t. Unsurprisingly, the $p = 2$ case has the richest theory, which grew out of applications (which I know nothing about) to fluid mechanics — check out this book for some background.

Posted by: Mark Meckes on December 4, 2013 8:53 AM | Permalink | Reply to this

Re: equality and similarity

Mark wrote:

[..] a “resource” consists of a unit of stuff at a given point in a metric space, so that stuff at point x and stuff at point y are two different resources. Transportation theory considers a cost c(x,y) of transporting a unit of stuff from point x to point y, so in this interpretation it’s also the cost of turning one resource into another.

That makes sense intuitively, but your usage of the term “resource” is not consistent with its usage in the mathematical framework for resources which I’m trying to develop. In the following, let me explain two reasons for why this is the case.

Wasserstein distances then arise when you look for the most efficient way to turn one distribution [..] into another one.

That’s exactly the kind of question addressed by the resources formalism: how efficiently can one turn one thing into another one? This is one reason why the distributions themselves should be considered to be the “things” here: we are really interested in how efficiently these can be converted into each other, and this kind of question is exactly what the resources formalism is for.

Moreover, the resources formalism contains another important piece of structure which I haven’t mentioned until now. Namely, the set of resources should come equipped with a binary operation $+$ which specifies how resources can be combined. We have briefly seen this above in the context of chemical reactions, where it was already written as “ $+$ ”. I don’t want to explain right now what the axioms imposed on $+$ are, so suffice it to say that they are equivalent to saying that the resources should be the objects of a symmetric monoidal category enriched over the non-negative reals.

Now my point is that you do not get such a binary operation if you take the resources to be the points of the underlying metric space. But you do get such a binary operation if you take the resources to be the measures on that space! Namely, the binary operation here is just addition of measures: if you have two distributions of stuff on your space, you can combine those into a new distribution of stuff by just adding both piles of stuff around each point. This makes intuitive sense as a “combination of resources”. It also forces us to drop the assumption of normalization of the distributions. I think that dropping this normalization makes intuitive sense as well, but since this comment is already too long, I won’t go into any more detail for now.

The Wasserstein distances $W_p$ for $p\gt 1$ satisfy the triangle inequality even though the underlying cost functions don’t.

How odd! I’ll have to read a bit to try and understand this.

So given that really the distributions themselves should be considered to be the resources, I think that this actually corroborates the triangle inequality postulate for the cost of turning one resource into another. And optimal transport will be a central example of the formalism!

I also understand now that the term “cost function” in the sense of cost of conversion between resources is misleading terminology, but I still haven’t found a better replacement. Any suggestions would be very welcome.

Posted by: Tobias Fritz on December 4, 2013 3:54 PM | Permalink | Reply to this

Re: equality and similarity

Thanks for the explanations. However, in a suitably modified form my point (that optimal transportation theory uses costs which don’t satisfy the triangle inequality) still stands.

If, as above, you have a cost function $c$ which tells you how expensive it is to move stuff from one point of a space $X$ to another, then there is an associated optimal transportation cost between two measures $\mu$ and $\nu$ on $X$ : $W_c(\mu,\nu) = \inf_\pi \int c(x,y) \, d\pi(x,y),$ where the infimum is over measures $\pi$ on $X \times X$ with marginals $\mu$ and $\nu$ . When $c = d^p$ , the optimal transportation cost isn’t the Wasserstein distance $W_p$ , it’s $W_p^p$ , which for $p > 1$ doesn’t satisfy the triangle inequality. People find it convenient to take the $p$ th root so that they can work with a metric, just as it’s convenient to take a $p$ th root in defining $L^p$ norms so that they actually turn out to be norms. But it sounds like for your purposes, $W_c = W_p^p$ is the more natural quantity to call the “cost”, or whatever else you settle on.
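A small numerical check may help here. The Python sketch below computes $W_c$ by brute force in the special case of uniform measures on equally many points, where the infimum over couplings is attained at a permutation. The function names and sample points are illustrative assumptions, and the brute-force search is only sensible for a handful of points.

```python
from itertools import permutations


def transport_cost(xs, ys, c):
    """Optimal transport cost W_c between uniform measures on xs and ys.

    Assumes len(xs) == len(ys); in that case an optimal coupling can be
    taken to be a permutation, so all permutations are simply tried.
    """
    n = len(xs)
    if n != len(ys):
        raise ValueError("this sketch needs equally many points on both sides")
    best = min(
        sum(c(xs[i], ys[j]) for i, j in enumerate(perm))
        for perm in permutations(range(n))
    )
    return best / n


def squared_distance(x, y):
    """Cost c = d^p with d(x, y) = |x - y| and p = 2."""
    return abs(x - y) ** 2


mu, nu, rho = [0.0, 1.0], [1.0, 2.0], [2.0, 3.0]

# W_2^2 is the optimal transport cost itself.
cost_mu_nu = transport_cost(mu, nu, squared_distance)
cost_nu_rho = transport_cost(nu, rho, squared_distance)
cost_mu_rho = transport_cost(mu, rho, squared_distance)

# W_2 is its square root.
w2_mu_nu = cost_mu_nu ** 0.5
w2_nu_rho = cost_nu_rho ** 0.5
w2_mu_rho = cost_mu_rho ** 0.5
```

In this example the direct cost `cost_mu_rho` exceeds `cost_mu_nu + cost_nu_rho`, so $W_2^2$ violates the triangle inequality, while the square roots satisfy it.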

Posted by: Mark Meckes on December 4, 2013 7:14 PM | Permalink | Reply to this

Re: equality and similarity

In rereading Sebastiano Vigna’s A Guided Tour in the Topos of Graphs I noticed at the end that he says

Theorem 4 applies also to the topoi discussed in this section, so several categories of standard mathematical structures (with their standard morphisms) turn out automatically to be quasitopoi: just to name a few, simple finite undirected (schlicht) graphs (but with self-loops allowed), tolerance spaces and (of course) binary endorelations, a.k.a. digraphs without parallel arcs.

(A brief overview: his “topos of graphs”, $\mathcal{G}$ , refers to the directed multi-arc notion of a graph.

His Theorem 4 says in part “The objects of $\mathcal{G}$ which are separated for the double negation topology are exactly the graphs without parallel arcs.”

For $\Sigma$ a graph which is a “delooped” set (i.e. one node with self-arrows corresponding to set members), the slice category $\mathcal{G}/\Sigma$ is a topos, and its separated objects are “transition systems”, which form a quasitopos.)

I haven’t been able to google up a good simple discussion of tolerance spaces or of them seen as quasi-toposes. Tolerance spaces are based on tolerance relations (a generalization of equivalence that is symmetric, reflexive, but not transitive), and my googling indicates that some use them in machine learning and handling fuzzy sets.
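For a concrete instance of a tolerance relation, take “within distance 1” on a few real numbers. The Python sketch below checks that this relation is reflexive and symmetric but not transitive; the helper names and the choice of points are hypothetical and only for illustration.

```python
from itertools import product


def close(x, y, eps=1.0):
    """Tolerance relation: x and y are related when they differ by at most eps."""
    return abs(x - y) <= eps


def is_reflexive(rel, pts):
    return all(rel(x, x) for x in pts)


def is_symmetric(rel, pts):
    return all(rel(x, y) == rel(y, x) for x, y in product(pts, repeat=2))


def is_transitive(rel, pts):
    return all(
        rel(x, z)
        for x, y, z in product(pts, repeat=3)
        if rel(x, y) and rel(y, z)
    )


points = [0.0, 1.0, 2.0]

reflexive = is_reflexive(close, points)
symmetric = is_symmetric(close, points)
transitive = is_transitive(close, points)
```

Transitivity fails because 0 is close to 1 and 1 is close to 2, but 0 is not close to 2.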

Maybe someone more knowledgeable could chip in.

Posted by: RodMcGuire on December 17, 2013 10:03 PM | Permalink | Reply to this

Re: equality and similarity

I’ve never heard of the term ‘tolerance space’, but if the category of tolerance spaces is equivalent to the category of sets equipped with a symmetric reflexive relation, then this sort of thing is familiar to me, although I don’t know how to give a really simple explanation.

Define a “symmetric graph” to be a presheaf over the category whose objects are sets $\{1\}$ , $\{1, 2\}$ and whose morphisms are functions between them. It’s not very hard to see that such a presheaf is essentially equivalent to an undirected reflexive graph with multiple edges between pairs of points allowed. (The presence of the non-trivial automorphism on $\{1, 2\}$ means that edges always come in pairs $a \to b, b \to a$ , where each such pair is indicated by a single undirected edge between $a$ and $b$ , or equivalently by a double-headed arrow $a \leftrightarrow b$ .) The category of symmetric graphs, being a category of presheaves, is a topos.

For any topos, one can consider the category of separated presheaves for a given topology, and this turns out always to be a quasitopos. For categories of graphs, it turns out to be interesting to consider the $\neg\neg$ -topology, and it’s typically the case that the separated presheaves there are those graphs with at most one edge between any given pair of points. In the case of symmetric graphs, this means we consider those symmetric graphs in which, for any two vertices $a$ , $b$ , there is at most one undirected edge between $a$ and $b$ ; but this means we are just considering reflexive symmetric relations on the set of vertices. Thus the category of sets equipped with a reflexive symmetric relation forms a quasitopos.

This sounds rather fancy, and in a sense it is. But it might help to explain the $\neg\neg$ -separatedness condition in slightly more down-to-earth terms. In abstract generality, to say a presheaf $F$ is $\neg\neg$ -separated means that whenever we have a $\neg\neg$ -dense subpresheaf of a representable presheaf,

$i: R \hookrightarrow \hom(-, c),$

the induced map between hom-sets

$F(c) \simeq Nat(\hom(-, c), F) \stackrel{Nat(i, 1_F)}{\to} Nat(R, F)$

is an injection. Fortunately, the category we’re taking presheaves over here is simple enough that it’s easy to figure out what such $\neg\neg$ -dense subpresheaves of representables look like. In fact, there’s only one non-trivial example (where “trivial” means an identity subobject): it’s the inclusion of symmetric graphs that looks like this:

$(\underset{\circlearrowleft}{\bullet} \;\;\; \underset{\circlearrowleft}{\bullet}) \hookrightarrow (\underset{\circlearrowleft}{\bullet} \underset{}{\leftrightarrow} \underset{\circlearrowleft}{\bullet}) = \hom(-, \{1, 2\}).$

We have for any presheaf $F$

$Nat((\underset{\circlearrowleft}{\bullet} \;\;\; \underset{\circlearrowleft}{\bullet}), F) \cong F(\{1\}) \times F(\{1\}), \qquad Nat(\hom(-, \{1, 2\}), F) \cong F(\{1, 2\})$

and the separatedness condition thus says that the source-target pairing

$F(\{1, 2\}) \stackrel{\langle s, t\rangle}{\to} F(\{1\}) \times F(\{1\})$

is an injection, which is exactly the condition that for any pair of vertices $a, b$ there is at most one edge $a \to b$ . In this way, separated presheaves are seen to be equivalent to sets equipped with a symmetric reflexive relation.
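The condition is easy to test on finite examples. The Python sketch below encodes a symmetric graph as a dictionary from edge names to (source, target) pairs, which is an encoding chosen here for illustration: the reflexivity and symmetry structure is left implicit in how the edges are listed, and the function names are hypothetical.

```python
def is_separated(edges):
    """True when the pairing <s, t> is injective.

    edges maps each edge name to its (source, target) pair, so injectivity
    means that no two distinct edges share both source and target.
    """
    pairs = list(edges.values())
    return len(pairs) == len(set(pairs))


def to_relation(edges):
    """Forget edge names, keeping only which pairs of vertices are joined."""
    return set(edges.values())


# One undirected edge between a and b, plus the loop at each vertex.
simple_graph = {
    "loop_a": ("a", "a"),
    "loop_b": ("b", "b"),
    "e": ("a", "b"),
    "e_reversed": ("b", "a"),
}

# The same graph with a second, parallel undirected edge between a and b.
multi_graph = dict(simple_graph)
multi_graph["f"] = ("a", "b")
multi_graph["f_reversed"] = ("b", "a")

relation = to_relation(simple_graph)
```

Only `simple_graph` passes the check, and for it `relation` is a reflexive symmetric relation on the vertices `a` and `b` that determines the graph completely.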

Posted by: Todd Trimble on December 18, 2013 12:43 PM | Permalink | Reply to this

The n-Category Café
