Re: A Categorical Manifesto
For me, two of the most interesting aspects of category theory in computer science have been monads and generalised folds/unfolds.
If M is a functor that happens to be a monad, then given an arrow (i.e. a function, because we’re working in the category of types and functions) A->MB and an arrow B->MC, we can compose them to make an arrow A->MC, even though the head of the first arrow (MB) doesn’t match the tail of the second (B). This pattern is incredibly common when programming, especially in functional programming languages. Rather than having to write the glue to compose these functions over and over again, monads give a really nice way to express the glue once and simplify the composition. And to grasp monads it really helps to understand some category theory; after all, they are categorical constructs.
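To see the glue concretely, here is a minimal Haskell sketch (my illustration, not the poster’s) using the Maybe monad. The Kleisli composition operator (>=>) from Control.Monad is exactly the A->MB, B->MC composition described above:

    import Control.Monad ((>=>))

    -- Two arrows into the Maybe monad; either step can fail.
    safeRecip :: Double -> Maybe Double
    safeRecip 0 = Nothing
    safeRecip x = Just (1 / x)

    safeSqrt :: Double -> Maybe Double
    safeSqrt x
      | x < 0     = Nothing
      | otherwise = Just (sqrt x)

    -- The monad supplies the glue: (a -> m b) composed with (b -> m c)
    -- gives (a -> m c), with failure threaded through automatically.
    recipThenSqrt :: Double -> Maybe Double
    recipThenSqrt = safeRecip >=> safeSqrt

Without the monad, one would write the Nothing-propagating case analysis by hand at every composition site.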
Folds are a ubiquitous operation in computing. A fold is easiest to understand by example. Given an ordered list of numbers [1,2,3,4] we can turn it into an expression 1+2+3+4+0. We put a + between each pair of values and a value at the end (often the identity for +). Clearly we can generalise to any binary operator and any ‘end’ value. This is a fold, and summing a list is clearly an example of one (and you have to admit, summing a list is a pretty ubiquitous operation). Many (all?) functions on lists can be expressed as folds, so understanding folds gives a nice way to abstract your code and see the commonality between different algorithms. Additionally, folds generalise to many data structures besides lists. To fully grasp this needs a little category theory and the notion of an F-algebra.
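In Haskell the list fold is foldr, and the F-algebra view can be spelled out directly. A sketch of my own, encoding the list functor as F X = Either () (Int, X):

    -- Summing is the fold with + between elements and 0 at the end:
    sumList :: [Int] -> Int
    sumList = foldr (+) 0    -- foldr (+) 0 [1,2,3,4] = 1+(2+(3+(4+0)))

    -- The same fold via an explicit F-algebra for the list functor:
    type ListF x = Either () (Int, x)

    alg :: ListF Int -> Int           -- an F-algebra carried by Int
    alg (Left ())      = 0            -- the 'end' value
    alg (Right (n, s)) = n + s        -- the binary operator

    -- Every F-algebra induces a fold (a catamorphism):
    cata :: (ListF r -> r) -> [Int] -> r
    cata f []       = f (Left ())
    cata f (x : xs) = f (Right (x, cata f xs))

Here cata alg [1,2,3,4] computes 10; swapping in a different algebra gives a different list function from the same recursion scheme.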
And the fun thing is that as soon as we view this stuff through category theory it becomes clear that these notions have duals. On the one hand we find comonads, and on the other we find F-coalgebras and unfolds. It turns out that many algorithms are naturally structured via comonads, and that unfolds are as ubiquitous as folds.
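Again a small Haskell illustration of the dual direction (mine, not the poster’s): where a fold consumes a list, an unfold grows one from a seed.

    import Data.List (unfoldr)

    -- Count down from n: emit the seed, shrink it, stop at zero.
    countdown :: Int -> [Int]
    countdown = unfoldr step
      where
        step 0 = Nothing           -- seed exhausted: stop
        step n = Just (n, n - 1)   -- emit n, continue from n - 1

    -- countdown 4 == [4,3,2,1]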
And of course the whole Curry-Howard isomorphism is really about category theory and I’ve found that to be quite illuminating.
A nice readable account of folds and unfolds can be found here. There’s no shortage of tutorials on monads in computer science, so I won’t bother with a link to one of those.
Re: A Categorical Manifesto
Goguen’s description of ‘Foundations’ on page 15 is very relevant to our discussions on ‘causal’ ordering in the Foundations thread:
Foundations should provide general concepts and tools that reveal the structures and interrelations of various areas of mathematics and its applications, and that help in doing and using mathematics.
Re: A Categorical Manifesto
I have a more philosophical answer that abstracts away even from the gory details of category theory.
If we remember that most mathematics arose from generalizing the notion of counting, set theory (and the mathematical edifice built up from it) can be seen as formalizing the primitive intuitions about counting things. It is a natural basis for numerical and numerically-originating ideas.
In contrast, computation is mainly about mapping inputs to outputs. The study of mappings and their composition, which gives rise to category theory, is a natural way to formulate the particular concerns of computational thinking.
The fact that the two approaches are dual allows us to build upon the rich legacy of numerically-founded work, while still allowing new insights into the fundamental mechanism of our scientific concern.
Re: A Categorical Manifesto
Can anyone comment on section 2.7, Programs and Program Schemes, and what connection, if any, it has to programs as morphisms in a CCC?
Re: A Categorical Manifesto
“A while back Gina asked why computer scientists should be interested in categories. …”
Wow! How sweet to have a whole (interesting) post related to my question. Thanks a lot!
Ulam on Banach; Re: A Categorical Manifesto
Stanisław Marcin Ulam, another mathematician of the Lwów School of Mathematics, in his autobiography, attributes this to Banach:
“Good mathematicians see analogies. Great mathematicians see analogies between analogies.”
Is this a prescient hint of the campaign to categorize, re-categorize, and/or n-categorize mathematical and Physics foundations?
Anyone think that the quote can be made axiomatical? That “Very Great mathematicians see analogies between analogies between analogies between analogies.”
And what is the omega-closure of this? Or should it be embedded in a universe including strongly unreachable analogies?
Just wondering, as an admirer of Ulam who was to have coauthored with him, but suffered when he died before we could agree on the abstract.
Re: A Categorical Manifesto
Banach (according to Ulam): “Good mathematicians see analogies. Great mathematicians see analogies between analogies.”
Jonathan asked: “Anyone think that the quote can be made axiomatical? That ‘very great mathematicians see analogies between analogies between analogies between analogies’?”
Actually, Banach’s alleged saying sounds appealing but rather suspect, as Jonathan’s suggestion shows by pushing it towards absurdity.
We can check this by examples: I am sure there are many examples of useful analogies in mathematics (what are the best such examples, in your opinion?), but can you give a few examples of analogies between analogies? And can you give a single example of an analogy between analogies between analogies?
(We can also check it by generalizing: does Banach’s saying (or Jonathan’s) make sense if we replace “mathematicians” by “scientists”, “police detectives”, or “medical doctors”?)
So maybe Banach’s statement is not much better when scrutinized than the common public perception:
“Good mathematicians can multiply 2 digit numbers in their heads and great mathematicians can multiply 5-digit numbers in their heads!”
(And what about computer science?)
Gödel numbering endofunctions; Re: A Categorical Manifesto
Good examples à la Euclid, who may indeed have been an anthology editor, as has been suggested for Homer.
If Archimedes and Euler and Gauss are the greatest mathematicians, what analogies between analogies… is each best known for? Was Grothendieck or Turing or Gödel or von Neumann the deepest into analogies of analogies of analogies…?
So what is the answer to my query on combinatorially enumerating the endofunctions on endofunctions on n points?
Graphs of morphisms of graphs; Re: A Categorical Manifesto
Has this been generalized to 2-categories?
- Brown, R., Morris, I., Shrimpton, J. and Wensley, C.D., Graphs of morphisms of graphs.
Abstract: This is an account for the combinatorially minded reader of various categories of directed and undirected graphs, and their analogies with the category of sets. As an application, the endomorphisms of a graph are in this context not only composable, giving a monoid structure, but also have a notion of adjacency, so that the set of endomorphisms is both a monoid and a graph. We extend Shrimpton’s (unpublished) investigations on the morphism digraphs of reflexive digraphs to the undirected case by using an equivalence between a category of reflexive, undirected graphs and the category of reflexive, directed graphs with reversal. In so doing, we emphasise a picture of the elements of an undirected graph as involving two types of edges with a single vertex, namely ‘bands’ and ‘loops’. Such edges are distinguished by the behaviour of morphisms with respect to these elements.
And I’ll ask the simplest way that I know: how many 2-categories are there on n objects, where the n objects are mapped to themselves by all possible endofunctions, and the n functions are mapped to themselves by all possible endofunctionals?
John Baez partly answered; Re: A Categorical Manifesto
John Baez partly answered my question here.
Re: A Categorical Manifesto
Why should computer scientists be interested in categories? Here’s my answer. Partly, of course, because they’ll help us understand the “Interesting Truths” of computer science: the “subtle unifying insights between broad classes of mathematical structures”, to quote the paragraph from Greg Egan’s novel Incandescence that I posted recently:
‘Interesting Truths’ referred to a kind of theorem which captured subtle unifying insights between broad classes of mathematical structures. In between strict isomorphism — where the same structure recurred exactly in different guises — and the loosest of poetic analogies, Interesting Truths gathered together a panoply of apparently disparate systems by showing them all to be reflections of each other, albeit in a suitably warped mirror. Knowing, for example, that multiplying two positive integers was really the same as adding their logarithms revealed an exact correspondence between two algebraic systems that was useful, but not very deep. Seeing how a more sophisticated version of the same could be established for a vast array of more complex systems — from rotations in space to the symmetries of subatomic particles — unified great tracts of physics and mathematics, without collapsing them all into mere copies of a single example.
But more specifically, I’m interested in them because I think they have huge — and under-used — potential in understanding human cognition and in building “intelligent” computers. I’m going to give several examples, which will make quite a long post. I hope that’s OK.
My first example concerns understanding how “logical unification” and “holographic reduced representations” are related. Logical unification is a kind of pattern matching. (It’s often just called “unification”, but I’ll preface with “logical” to avoid confusion with “unify” as in “insight”. Logicians reading this, please forgive such a cursory description.)
For instance, in the programming language Prolog, unifying the term 2*x+3*y with the term U+V will bind the variable U to 2*x and the variable V to 3*y. Prolog programmers love this, because it’s such an easy way to get at pieces of structures.
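To make the operation precise, here is a minimal first-order unification sketch — my own toy version, in Haskell rather than Prolog, with the occurs check omitted for brevity:

    import qualified Data.Map as M

    -- A toy term language: variables, and function symbols with arguments.
    data Term = Var String | Fun String [Term] deriving (Eq, Show)

    type Subst = M.Map String Term

    -- Apply a substitution, chasing chains of variable bindings.
    -- (No occurs check, so a cyclic binding would loop.)
    apply :: Subst -> Term -> Term
    apply s (Var v)    = maybe (Var v) (apply s) (M.lookup v s)
    apply s (Fun f ts) = Fun f (map (apply s) ts)

    unify :: Term -> Term -> Maybe Subst
    unify t u = go [(t, u)] M.empty
      where
        go [] s = Just s
        go ((a, b) : rest) s =
          case (apply s a, apply s b) of
            (x, y) | x == y -> go rest s
            (Var v, x)      -> go rest (M.insert v x s)
            (x, Var v)      -> go rest (M.insert v x s)
            (Fun f as, Fun g bs)
              | f == g && length as == length bs
                            -> go (zip as bs ++ rest) s
            _               -> Nothing

Encoding 2*x+3*y and U+V as Terms and unifying them yields the substitution binding U to 2*x and V to 3*y, just as Prolog does.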
But — computer science also knows a different kind of pattern matching, implemented by so-called “holographic reduced representations” or HRRs. These encode structures as bitwise or numeric vectors in very high-dimensional spaces, using circular convolution and circular correlation (or analogous operations) to build and decompose vectors. I’ll show an example in the next paragraph; for those who want to know more, these references give some background:
- Pentti Kanerva, Dual Role of Analogy in the Design of a Cognitive Computer. In Advances in Analogy Research: Integration of Theory and Data from the Cognitive, Computational, and Neural Sciences. Workshop. Sofia, Bulgaria, July 17–20, 1998.
- Jane Neumann, Learning the Systematic Transformation of Holographic Reduced Representations. Cognitive Systems Research, 3, (2002), 227-235.
- Chris Eliasmith, Structure Without Symbols: Providing A Distributed Account of Low-level and High-level Cognition. Southern Society for Philosophy and Psychology Conference March, 1997.
Here’s an example from Kanerva’s paper. The following equation computes a bitwise vector encoding data about France:
France = [Capital*Paris + Location*WesternEurope + Currency*Euro]
Here, + is vector addition, * is exclusive-or. The brackets [] denote a “normalised mean”, whose exact definition is given in Kanerva’s paper. Capital, Location and Currency are three randomly-generated vectors. The vector Capital represents a slot or field whose value will represent the capital of France; similarly for the vectors Location and Currency.
Paris, WesternEurope and Euro are three more randomly-generated vectors, representing the values of these slots. It’s important to note that all six vectors should be interpreted as atomic symbols: the actual values of their components have no significance.
Having computed the vector France, we can “probe” it by exclusive-oring with the vector Capital:
WhatCity = Capital * France
         = Capital * [Capital*Paris + Location*WesternEurope + Currency*Euro]
         = Capital*Capital*Paris + Capital*Location*WesternEurope + Capital*Currency*Euro
         = Paris + (a bit of other stuff).
This works because * (that is, exclusive-or) is its own inverse, and because it distributes over the normalised mean [].
The final stage is to do what Kanerva calls a “clean-up”. You assume there is an associative memory that contains the six vectors originally generated. Fed any vector, it will quickly return whichever of the six is closest to it. Which, in this case, we hope will be Paris.
For this to work, various conditions on the distribution of vectors must be satisfied. But if they are, you can do some really nifty analogical reasoning. For example, you can ask how France is related to Paris by exclusive-oring the vector France with the vector Paris. This works the same way as above, and after the clean-up, will return Capital. Kanerva goes on to describe more ambitious reasoning tasks, such as forming a vector that represents Sweden, then asking “What is the Paris of Sweden”?
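Here is a runnable Haskell sketch of the whole bind/probe/clean-up cycle — my own reconstruction of Kanerva’s scheme, using bit lists, a per-coordinate majority vote for the normalised mean, and the random package for the random vectors:

    import Data.Bits (xor)
    import Data.List (transpose, minimumBy)
    import Data.Ord (comparing)
    import System.Random (randomRIO)

    type Vec = [Int]    -- long vectors of 0/1 bits

    bind :: Vec -> Vec -> Vec
    bind = zipWith xor  -- exclusive-or: its own inverse

    -- Normalised mean: per-coordinate majority vote.
    bundle :: [Vec] -> Vec
    bundle vs = [ if 2 * sum col > length vs then 1 else 0
                | col <- transpose vs ]

    hamming :: Vec -> Vec -> Int
    hamming a b = length (filter id (zipWith (/=) a b))

    -- Clean-up memory: return the stored vector nearest the probe.
    cleanup :: [Vec] -> Vec -> Vec
    cleanup memory v = minimumBy (comparing (hamming v)) memory

    randomVec :: Int -> IO Vec
    randomVec n = mapM (const (randomRIO (0, 1))) [1 .. n]

    demo :: IO ()
    demo = do
      [capital, location, currency, paris, wEurope, euro]
          <- mapM (const (randomVec 10000)) [1 .. 6 :: Int]
      let france   = bundle [ bind capital paris
                            , bind location wEurope
                            , bind currency euro ]
          whatCity = bind capital france   -- the probe: Capital * France
          answer   = cleanup [capital, location, currency,
                              paris, wEurope, euro] whatCity
      print (answer == paris)  -- True with overwhelming probability

The probe works because Capital*Capital cancels, leaving Paris plus noise at roughly a quarter of the coordinates; the clean-up memory removes the noise.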
On the surface, holographic reduced representations are utterly different from logical unification. But I can’t help feeling that, at a deeper level, they are closely related. And there is a categorical formulation of logical unification, which regards it as finding coequalisers in a category whose objects are sets of terms and whose arrows are substitutions. This is described in the first reference below, by Rydeheard and Burstall. They say their formulation is derived from an observation by Goguen. So it may be based (I’m not an expert) on the ideas in the second reference:
So, can we extend that categorical formulation to holographic reduced representations? I don’t know. But if we could, we would better understand how they are related to logic programming, and we might gain new tools for analogical reasoning. It’s worth trying.
I’m going to post some more examples, but I’ll do so in a new comment, because I’m having terrible problems with HTML “character data” that the filter doesn’t like and that I can’t locate. So it’s easier to do so if I break the comment up into shorter pieces.
Re: A Categorical Manifesto
(I’ve just posted one comment in reply to this thread, here. Presumably it will appear above this one anyway. I broke the post I was going to make into several pieces, because of problems finding out why the blog’s filter didn’t like my text.)
My theme is that category theory has heaps to offer Artificial Intelligence and cognitive science. We ought to be doing all we can to promote it, and to make it easy to learn. And we should be showing people the kinds of task it could be useful for. Which includes unifying deeply disparate areas of computing. Which AI and cognitive science are full of!
So here’s my second example: neural networks, and logic-based representations. I suppose most readers have come across neural networks, but for those who haven’t, I’ll describe them briefly. Neural networks consist of “nodes” connected in a network. The connections between nodes correspond to synapses between biological neurons. Typically, a node sums the signals from its input nodes, multiplying each input signal by the strength of the corresponding connection. It may then process the sum in some way — e.g. by thresholding — to calculate an output and an “activation”. The entire network can be regarded as a vector of these activation values, representing the data in the network.
Most kinds of neural network can learn by adjusting the strengths of their connections, thus mimicking — in a very simplified way — how the brain learns by adjusting synaptic strengths.
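For concreteness, a cartoon of one such node and a crude Hebbian-flavoured weight update in Haskell — my simplification, not taken from any paper cited here:

    type Weight = Double

    -- One node: weighted sum of inputs, then a hard threshold.
    node :: [Weight] -> Double -> [Double] -> Double
    node weights threshold inputs =
      if sum (zipWith (*) weights inputs) > threshold then 1 else 0

    -- Learning by adjusting connection strengths: nudge each weight
    -- in proportion to the activity on its input line.
    learn :: Double -> [Double] -> [Weight] -> [Weight]
    learn rate inputs = zipWith (\x w -> w + rate * x) inputs

Real networks layer thousands of such nodes and use subtler update rules, but this is the computational kernel.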
In Artificial Intelligence, there is a vast cultural chasm between neural nets and symbolic ways to represent knowledge. It’s rare for the same researcher to work on both, and students usually get taught about them in different courses. This reflects the differences between the two. The symbols in symbolic representations denote things. But in some kinds of neural net, the individual nodes don’t, although a whole set of nodes might. Therefore, such nets are often called “subsymbolic”, a word introduced by Paul Smolensky in the following paper:
(Holographic reduced representation vectors are also subsymbolic.)
As well as the “symbolic/subsymbolic divide”, there is a “compositional/non-compositional divide”. Symbolic ways to represent knowledge are “compositional”, in that the meaning of a sequence of symbols is determined by the meanings of its parts. Compositionality is good: if you can represent “coffee cup” and “cake”, it’s nice to be able to represent “coffee cup and cake” too. But if you have (say) an image-processing net which has learnt to represent images of coffee cups by one pattern of activations, and images of cakes by another, you can’t assume that superimposing or adding the patterns will represent a coffee cup and a piece of cake. So there has been a lot of argument about questions like: is the brain compositional; is composition by superimposing activation patterns the only kind of compositionality; and (as engineers) how can we design nets that are compositional?
Yet another difference is that symbolic ways to represent knowledge are often said to be “contextual”: they are leaky, in that what happens at one node can be very sensitive to data elsewhere in the network. Which is different from my Prolog program that will happily unify 2*x+3*y with U+V, giving the same results no matter what else is in my laptop’s memory. Again, there’s been a lot of argument about this. Unfortunately, a lot of the key papers on this and the other topics, though on-line, are locked up in pay-to-view sites. The reference below seems a good, and free, starting point:
I mention these differences not only to show how far apart neural nets are from symbolic systems, but also because category theory must have a lot to offer in answering such questions. Which is why the two references below are exciting:
In the first, Brown and Porter suggest using colimits to understand brain activity in terms of the component structures and neurons.
Healy goes further. He sets up a category Conc whose objects are theories representing concepts. Arrows represent “is a subconcept of”: there is an arrow from c1 to c2 if c1 is a subconcept of c2. He sets up another category Neur, whose objects are nodes in a particular neural network. Arrows represent the “can activate” relationship: there is an arrow from n1 to n2 if n1 can activate n2. Such an arrow tells us something about the neural network, namely that there is a path between n1 and n2, and that all the synaptic connections along this path are strong enough to propagate a signal from n1 to n2.
This wouldn’t be interesting unless the categories Conc and Neur were related, and indeed, Healy lets us define functors from Conc to Neur. In effect, each functor maps a network of concepts, represented as theories, to regions in a neural network where these concepts are represented.
What’s nice is that we can combine these neural-network representations by using natural transformations. I hope I have this right, because Healy doesn’t give much detail, but the point seems to be that a single neural network might learn different instances of the same concept.
For example, imagine an ambidextrous robot with mirror-identical arms and hands. Each hand has touch sensors: the left hand’s sensors are wired to one set of input nodes in the network, and the right hand’s to another set of input nodes. Assume the robot is taught to grasp an orange with its left hand, and then, independently, with its right hand. The region of its neural network fed from the left hand sensors learns a set of associations between finger pressure and movement; the region fed from the right hand learns, independently, a very similar set. This is an example I made up, by the way, so blame me if it turns out to be misleading.
The left hand and right hand associations can then be regarded as two functors, L and R, from Conc to Neur. I’m not sure what the logical content of the objects in Conc would be — theories describing what an orange is for, or what its shape is, or how to grasp it — but at any rate, they’d say something about oranges, and they’d be represented in two different regions in the neural network. We can merge these, Healy says, by setting aside a third region in the network which combines corresponding outputs from each of the first two.
But is this in fact a valid representation of the learnt concepts? Yes, if we let natural transformations guide us. As well as functors L and R, let there be a functor M which maps Conc to this third region. Set up mappings from L to M and R to M. Then, Healy shows, the third region is a valid representation if these mappings are natural transformations.
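Spelling out the condition Healy is invoking — it is just the standard definition of a natural transformation, written in the plain arrow notation used above. A mapping η from L to M assigns to each concept c an arrow η(c): L(c) -> M(c) in Neur, and naturality demands that for every arrow f: c1 -> c2 in Conc,

η(c2) ∘ L(f) = M(f) ∘ η(c1).

In words: propagating activation through the left-hand representation and then merging into the third region agrees with merging first and propagating there. If that square commutes for every f, the merged region respects the conceptual structure, which is the sense in which it is a valid representation.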
Can such ideas be applied to other kinds of neural network? Can we show other kinds of neural network to be reflections, in suitably warped Eganian mirrors, of logical theories? Can we characterise neural networks by properties of the categories containing them? (I suspect many of these categories won’t have products, because of interference between subnetworks, but that’s only a passing thought.)
And — very importantly, because we engineers have such small minds, so need to break problems into tiny pieces before we can understand them — can we use colimits and other tools to help us build modular neural nets?
… The next section of this comment will follow shortly.
Re: A Categorical Manifesto
This is the third part of my comment. It continues from here.
My third example is similar in spirit to the last. In the early days of programming, one of Artificial Intelligence’s motivating ideas was the Physical Symbol System Hypothesis put forward by Allen Newell and Herbert A. Simon in the following paper:
This states that: “A physical symbol system has the necessary and sufficient means for intelligent action”. I suppose every hacker believes this, and hopes in their deepest soul that, one day, they will write the program that shows this to be true.
More recently, various researchers have proposed using dynamical systems ideas to understand cognition. This is the Dynamical Systems Hypothesis, and there’s some background on it in these references:
I was reminded of this by a nice paper I found, written by David Tall about the learning of mathematical concepts. He wrote it before the Dynamical Systems Hypothesis became popular, but it uses dynamical systems ideas to explain how learning one concept might be blocked by a conflicting earlier version of the same concept:
So, in a spirit similar to that of my neural network section, I ask: can we use category theory to help us map between logical descriptions of knowledge and their implementation as dynamical systems? Can we use it to help us build modular dynamical systems implementations?
Re: A Categorical Manifesto
This is the fourth part of a long comment I’m posting; it continues from here. Just to recap, the theme is that category theory has heaps to offer Artificial Intelligence and cognitive science. We ought to be doing all we can to promote it, and to make it easy to learn. And we should be showing people the kinds of task it could be useful for. Which includes unifying deeply disparate areas of computing. Which AI and cognitive science are full of!
I’m going to ratchet up the stakes, and turn — I’ll explain why in a minute — to Margaret Thatcher. In the 1980s, she was Prime Minister of Britain, and Denis Thatcher was her husband. Ronald Reagan was President of America. So, if you were to ask an American “Who is First Lady of Britain?”, what could they reply? A reasonable answer is “Denis Thatcher”. But to make it, you must be willing to “slip” the concept of First Lady, weakening it to First Spouse as you transfer from an American frame of reference to a British one.
Now consider the following two number patterns:
A: 1 2 3 4 5 5 4 3 2 1
B: 1 2 3 4 4 3 2 1
What is to B as 4 is to A? Most people would answer 3. It’s to the left of the central pair of numbers in B, just as 4 is to the left of the central pair of numbers in A. 4 in A and 3 in B are like Nancy Reagan and Denis Thatcher: they’re not the “most distinguished point” in their landscape, but they are closely related to that point. They fill the same rôle.
Here is another number problem:
A: 1 2 3 4 5 5 4 3 2 1
D: 1 1 2 2 3 3 4 4 5 4 4 3 3 2 2 1 1
What is to D as 4 is to A? A brute-force attempt to find D’s central pair won’t work. But it seems reasonable to “slip” the concepts of “pair” and “singleton” as we transfer from A’s frame of reference to D’s. Which gives us the answer 4 4.
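A tiny Haskell sketch (my own) makes the brute-force reading, and its failure on D, precise: look for the unique adjacent equal pair and return the element just to its left.

    -- Literal-minded solver for "what is to this pattern as 4 is to A".
    leftOfCentralPair :: Eq a => [a] -> Maybe a
    leftOfCentralPair xs =
      case [ i | (i, (a, b)) <- zip [0 ..] (zip xs (drop 1 xs)), a == b ] of
        [i] | i > 0 -> Just (xs !! (i - 1))
        _           -> Nothing  -- no unique pair: brute force breaks down

    -- leftOfCentralPair [1,2,3,4,5,5,4,3,2,1] == Just 4  (pattern A)
    -- leftOfCentralPair [1,2,3,4,4,3,2,1]     == Just 3  (pattern B)
    -- On pattern D every doubled value forms a pair, so the function
    -- returns Nothing: answering 4 4 needs the concepts to "slip".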
These problems come from a column that Douglas Hofstadter wrote for Scientific American. He collected them, with later additions, into a book, and I’m particularly interested here in two of its chapters:
- Douglas Hofstadter, Metamagical Themas, Basic Books, 1985. See Variations on a Theme as the Crux of Creativity (Chapter 12) and Analogies and Roles in Human and Machine Thinking (Chapter 24).
Hofstadter argues that analogy is at the core of human thought and creativity. To quote from Chapter 24:
Don’t press an analogy too far, because it will always break down. In that case, what good are analogies? Why bother with them? What is the purpose of trying to establish a mapping between two things that do not map onto each other in reality? The answer is surely very complex, but the heart of it must be that it is good for our survival (or our genes’ survival), because we do it all the time. Analogy and reminding, whether they are accurate or not, guide all our thought patterns. Being attuned to vague resemblances is the hallmark of intelligence, for better or for worse.
Therefore, he says, human analogical thought is what we need to study. And we should do so within “microdomains” such as the number problems, because then — in proper scientific spirit — we can isolate analogical thought from other phenomena.
He was particularly interested in why we slip some concepts more than others, and in what one might call rôles versus literals. As an example of the first: I’m in the pub, and I accidentally spill my pint of beer over my shoes. Why is it natural for me to “slip” my situation to “Good job I don’t have to go anywhere smart tonight!”, but not to “Shame gravity isn’t weaker, so the beer would have stayed in the air!”?
As an example of the second, why, when solving the two number problems, do we perceive 4 as filling a rôle at all in pattern A, then try to find a value for the equivalent rôle in patterns B and D? Why don’t we just say, “Well, the obvious answer is the same value, 4”?
There are some lovely examples in those two chapters, which I do recommend to anyone interested. But what’s the connection with categories?
Hofstadter and his colleagues wrote several programs to implement their theories of analogical thought in microdomains. These programs include Copycat:
And they include Tabletop:
- Robert French, The Subtlety of Sameness: a Theory and Computer Model of Analogy-Making, MIT Press, 1995.
Doesn’t that title sound like something to interest every red-blooded n-categorist? Anyway, Tabletop solves problems such as “I am sitting at a table and I touch my nose. What should you touch?”. Or “I touch the flower vase in the centre. What should you touch?”. Or, one more: “I touch the cup near me. You have a glass but no cup. What should you touch?”
Now, these programs are complicated. They contain swarms of biologically-inspired enzyme-like entities swimming around in a kind of symbol-filled cytoplasmic glop, glomming onto bits of concepts, and slowly crystallising out into a finished answer to the problem. They lack — as far as I know — any easily understandable semantics.
So, I’m going to suggest, can we use categories to give them such a semantics? Or, more generally, to formalise our ideas about the best solutions to the kind of analogy problems that Hofstadter and his colleagues discuss? A good starting point here would be Goguen’s ideas on conceptual blending, already mentioned earlier in this thread. For example:
Let me finish this section of my comment with two more Hofstadter quotes. As I said at the beginning, the stakes are high:
[…] I have been emphasising the idea of the internal structure of one concept. In my view, the way that concepts can bond together and form conceptual molecules on all levels of complexity is a consequence of their internal structure. What results from a bond may surprise us, but it will nonetheless always have been completely determined by the concepts involved in the fusion, if only we could understand how they are structured. Thus the crux of the matter is the internal structure of a single concept, and how it “reaches out” towards things it is not. The crux is not some magical, mysterious process that occurs when two indivisible concepts collide; it is a consequence of the divisibility of concepts into subconceptual elements. As must be clear from this, I am not one to believe that wholes elude descriptions in terms of their parts. I believe that if we come to understand the “physics of concepts”, then perhaps we can derive from it a “chemistry of creativity”, just as we can derive the principles of the chemistry of atoms and molecules from those of the physics of quanta and particles. But as I said earlier, it is not just around the corner. Mental bonds will probably turn out to be no less subtle than chemical bonds.
You can think of concepts as stars, and knob-twiddling as carrying you from one point on an orbit to another point. If you twiddle enough, you may well find yourself deep within the attractive zone of an unexpected but interesting concept and be captured by it. You may thus migrate from concept to concept. In short, knob-twiddling is a device that carries you from one concept to another, taking advantage of their overlapping orbits.
Of course, all this cannot happen with a trivial model of concepts. We see it happening all the time in minds, but to make it happen in computers or to locate it physically in brains will require a fleshing-out of what concepts really are. It’s fine to talk of “orbits around concepts” as a metaphor, but developing it into a full scientific notion that either can be realised in a computer model or can be located inside a brain is a huge task. This is the task that faces cognitive scientists if they wish to make “concept” a legitimate scientific term. This goal, suggested at the start of this article, could be taken to be the central goal of cognitive science, although such things are often forgotten in the inane hoopla that is surrounding artificial intelligence more and more these days.
“These days” were in the 1980s, so perhaps we can disregard the remark about the inane hoopla. Though there seems to be a lot of inane hoopla surrounding computing in general. But what about the rest of the quotes? Does it make sense to think about a “chemistry of concepts”; to make “concept” a legitimate scientific notion?
Actually, Michael Healy, who I mentioned earlier for his work on mapping concepts to neural nets, has, with coauthors Thomas Caudell and Timothy Goldsmith, just released another report:
This is their abstract:
Categorization and the judgement of similarity are fundamental in cognition. We propose that these and other activities are based upon an underlying structure of knowledge, or concept representation, in the brain. Further, we propose that this structure can be represented mathematically in a declarative form via category theory, the mathematical theory of structure. We test the resulting mathematical model in an experiment in which human subjects provide judgements of similarity for pairs of line drawings using a numerical scale to represent degrees of similarity. The resulting numerical similarities are compared with those derived from the category-theoretic model by comparing diagrams. The diagrams represent distributed concept structures underlying the line drawings. To compare with a more conventional analysis technique, we also compare the human judgements with those provided by a two-dimensional feature space model equipped with a distance metric for the line drawings. The results are equally favorable for both models. Because of this and the putative explanatory power of the category-theoretic model, we propose that this model is worthy of further exploration as a mathematical model for cognitive science.
Which is extremely interesting, and reminiscent of Hofstadter’s remark: “Thus the crux of the matter is the internal structure of a single concept […]”.
I’ll finish this comment, Web demons permitting, in my next post.
Re: A Categorical Manifesto
This is the final part of a comment on things I believe category theory has to offer Artificial Intelligence and cognitive science. It continues from here.
I’ll finish by talking about a Lawvere paper. It’s the following reference — which, I’m pleased to note, is public at Google Books — and I mention it because I found it in a book on cognition:
Here’s how Lawvere concludes the paper:
Despite some simplification in the above, needed for rapid description, I hope that I have made clear that there is a great deal of useful precision lying behind my illustrations, and a great deal to be developed on the same basis. Thus I believe to have demonstrated the plausibility of my thesis that category theory will be a necessary tool in the construction of an adequately explicit science of knowing.
Now, I don’t know enough to describe these “illustrations” succinctly, but they include his notion of unity-and-identity-of-opposites, which, I discovered from this reference:
is also known as an adjoint cylinder. I mention this only because it sounds like something wonderful.
Lawvere’s illustrations also include the notion of an adequacy comonad. Now, I hadn’t come across adequacy comonads before, but they sound as though they might be very useful in understanding approximations to concepts. Moreover, the adequacy comonad sounds like an exquisite and intricate piece of mathematical machinery, and I hope someone will give me a working model. One I can pick up, prod, peer into, pull to pieces, and generally play with. Steam or electric.
However, I’ve another reason for mentioning the notion. Lawvere uses it to show how the study of closed categories contributes to the study of metric spaces (in this case, via the use of adequacy comonads in the geodesic remetrisation of a metric space). He then says that, conversely, we can take notions such as “radius” and “engulfing” from the study of metric spaces and create “precise analogues [of these notions] for concrete generals enriched in any (even non-posetal) closed categories V”.
I don’t need to fear the strange phrase “concrete general”, because John Baez explains it in:
A “concrete general” is the category of all the models of a theory. (Is there more to it than that?)
Now, I don’t know whether Lawvere is using the words “radius” and “engulfing” in some sense far removed from their everyday meaning. But if not, can his remark also apply to other spatial and dynamic notions? Such as “orbit” (around a concept); “attractive zone” (of a concept); “migrate” (from concept to concept). In other words, to the metaphors that Hofstadter uses, and that he says will need to be made precise before we can create a “chemistry of concepts”. If so, then category theory will be an invaluable tool for cognitive science.
Even if I’m completely wrong in that last paragraph, Lawvere says category theory “will be a necessary tool in the construction of an adequately explicit science of knowing”, which is good enough for me. So I did an experiment. I searched Google and CiteseerX on his main title, Tools for the Advancement of Objective Logic.
Apart from citations in library catalogues, and papers on philosophy or pure category theory, I found only two authors whose papers seemed, on a quick skim, to depend in some important way on Lawvere’s paper. For interest, I give the references below. (The second author, Zippora Arzi Gonczarowski, had one paper on a pay-to-view site that I couldn’t read, but the abstract shows me that it’s about the same topic as the paper below):
Lawvere’s paper seems filled with crystals of compressed insight, glittering through the surface of the text like quartz through a covering of dry pine-needles. I find them intensely satisfying to contemplate, in the same way I do some El Lissitzky posters and De Stijl typography. Perhaps, high in my brain’s abstraction hierarchy, the same neurons are activated by both: category theory as the ultimate abstract sculpture.
But, aesthetics aside, how can we bring such power to more researchers than the two who cited it? I’m using Lawvere’s paper as a representative of category theory generally, of course. Ask a first-year LISP hacker to iterate over a list, and they’ll reach naturally for recursion or a MAP function. Can we bring up a generation of category-theory hackers, who, when you ask them to model putting something together, will reach naturally for a colimit?
Assuming that category theory really does have something to offer AI and cognitive science.