## September 28, 2007

### Progic III

#### Posted by David Corfield

So we’ve seen in this thread that, even when we work with nice basic finite sets, the probability monad doesn’t get along too well with logical structure, there being none of that pleasant adjointness between categories of predicates over sets.

However, the probability monad, $P$, does come along with other structure of its own. In particular, $Hom_{Kleisli(P)}(1, Y)$ is a Riemannian manifold, with the Fisher information metric, which is crucially invariant under reparameterization. Recall that this space is composed of maps $1 \to P Y$, i.e., probability distributions over $Y$. The Fisher information metric (see, e.g., section 4 of Guy Lebanon’s thesis) takes on a simple form for finite $Y$ (p. 21).
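As a minimal sketch of that simple form, assume the standard expression for the Fisher metric on the interior of a finite simplex: tangent vectors $u, v$ (each summing to zero) at a point $p$ pair as $\sum_i u_i v_i / p_i$, and the resulting geodesic distance is $2 \arccos \sum_i \sqrt{p_i q_i}$. Whether this is written in exactly the same way on p. 21 of the thesis is an assumption; the function names are illustrative.

```python
import math

def fisher_inner(p, u, v):
    """Fisher information inner product of tangent vectors u, v at p.

    p is a strictly positive probability vector; u and v should each sum
    to zero so that they are tangent to the simplex.
    """
    return sum(ui * vi / pi for pi, ui, vi in zip(p, u, v))

def fisher_rao_distance(p, q):
    """Geodesic distance for the Fisher metric: 2 * arccos(sum sqrt(p_i q_i))."""
    bc = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    return 2.0 * math.acos(min(1.0, bc))  # clamp to guard against rounding

p = [0.5, 0.25, 0.25]
u = [0.1, -0.1, 0.0]              # a tangent direction: entries sum to zero
print(fisher_inner(p, u, u))       # squared length of u at p
print(fisher_rao_distance([1.0, 0.0], [0.0, 1.0]))  # between the two vertices
```

The division by $p_i$ is what makes directions near the boundary of the simplex "long": the same perturbation counts for more where the probability it perturbs is small.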

Then there’s all sorts of fun you can have with geodesics and (non-metric) connections in the subject called information geometry, which, it seems, it helps to be Japanese to understand.

But how about $Hom_{Kleisli(P)}(X, Y)$? Well, sticking with finite sets, Lebanon showed that, broadly speaking, the only ‘sensible’ metric to put on this space of conditional distributions, a product of simplices, is the product Fisher information metric (see p. 22 of his thesis). We have then in particular that the space of probabilistic predicates on $Y$, $Hom_{Kleisli(P)}(Y, 2)$, is a Riemannian manifold.
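To make the ‘product of simplices’ concrete, here is a small sketch for finite sets: an arrow $X \to Y$ in the Kleisli category is a row-stochastic matrix (one point of a simplex per element of $X$), and Kleisli composition is matrix multiplication. The encoding of $2$ as the ordered pair (false, true) and the names `compose` and `is_stochastic` are assumptions made for illustration.

```python
from fractions import Fraction as F

def compose(k, h):
    """Kleisli composite X -> Y -> Z of row-stochastic matrices k (X by Y) and h (Y by Z)."""
    return [[sum(k_row[y] * h[y][z] for y in range(len(h)))
             for z in range(len(h[0]))]
            for k_row in k]

def is_stochastic(m):
    """Check that every row is a probability distribution."""
    return all(sum(row) == 1 and all(e >= 0 for e in row) for row in m)

# A distribution on Y = {0, 1, 2} is an arrow 1 -> Y: a matrix with a single row.
prior = [[F(1, 2), F(1, 4), F(1, 4)]]

# A probabilistic predicate on Y is an arrow Y -> 2: for each y, the
# probabilities of (false, true). Each row is a point of a 1-simplex.
predicate = [[F(1), F(0)],
             [F(1, 2), F(1, 2)],
             [F(0), F(1)]]

# Composing gives an arrow 1 -> 2: the overall probability of (false, true).
truth = compose(prior, predicate)
print(truth)
```

A predicate on a three-element $Y$ is thus a point of a product of three intervals, which is where the product Fisher metric would live.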

Now, a common thing to do with the space of distributions on $Y$ is to look at subspaces satisfying various constraints. So we might have a function $f: Y \to \mathbb{R}$, and look at distributions, $p$, which satisfy $f \cdot p = \int p(y) f(y) \, dy = c$, for some constant $c$.

[Aside: That composition could have been taken as happening in the Kleisli category, if we interpret the function $f$ as a probability distribution on $\mathbb{R}$ conditional on $Y$, which for each $y$ in $Y$ is all concentrated at a single real value, and then we take the mean of the ensuing distribution over $\mathbb{R}$. So I suppose $f$ could have been any arrow in $Hom_{Kleisli(P)}(Y, \mathbb{R})$.]
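The aside can be checked on a finite example. The sketch below treats $f$ as the deterministic kernel $y \mapsto \delta_{f(y)}$, composes it with $p$ to get a distribution on $\mathbb{R}$, and takes the mean; the result agrees with the direct sum $\sum_y p(y) f(y)$. The particular $p$ and $f$ are made up for illustration.

```python
from fractions import Fraction as F

def pushforward(p, f):
    """Compose p: 1 -> Y with the deterministic kernel y |-> delta_{f(y)}."""
    out = {}
    for y, mass in p.items():
        out[f[y]] = out.get(f[y], F(0)) + mass
    return out

def mean(q):
    """Mean of a finitely supported distribution on the reals."""
    return sum(value * mass for value, mass in q.items())

p = {"a": F(1, 2), "b": F(1, 4), "c": F(1, 4)}   # a distribution on Y
f = {"a": F(0), "b": F(2), "c": F(2)}            # a function Y -> R

q = pushforward(p, f)                 # distribution on R, supported at 0 and 2
direct = sum(p[y] * f[y] for y in p)  # the constraint value computed directly
print(mean(q), direct)
```

Replacing the deterministic `f` by a genuinely random kernel $Y \to \mathbb{R}$ would only change `pushforward`; the mean step stays the same.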

Anyway, these submanifolds of $P Y$ have some ‘logical’ structure to them. The submanifold satisfying all of a set of constraints is the intersection of the submanifolds that each satisfy one of the constraints.

Final question: is there anything interesting to say about what’s happening geometrically in the space of probabilistic predicates, $Hom_{Kleisli(P)}(Y, 2)$?

Posted at September 28, 2007 10:37 AM UTC

### Re: Progic III

David wrote:

So we’ve seen in this thread that, even when we work with nice basic finite sets, the probability monad doesn’t get along too well with logical structure, there being none of that pleasant adjointness between categories of predicates over sets.

Both probability theory and logic are much too good to ‘not get along well’ with each other. They get along fine. It’s merely our job to discover how.

In such fundamental mathematical subjects, nothing can be less than perfect! Any apparent imperfection is merely an imperfection in our own understanding. If we follow the tao, all will be well.

But how do we find the tao? I still think it’s a great project to understand precisely how the probability monad gets along with the ‘logical’ operations on the category of sets (or Polish spaces, or whatever): products, coproducts, more general limits and colimits, exponentials, etcetera.

It would also be nice, if you plan to use Polish spaces for your probability monad, to determine how good the category of Polish spaces actually is. I can’t tell if this category is really ‘right’ or just a sloppy stopgap measure: the first thing somebody happened to try.

For example, is the category of Polish spaces cartesian closed? That is, roughly speaking: does the space of maps between Polish spaces become a Polish space in a nice way? If not, we should switch to something better.

I want to work in a category that’s cartesian closed. In other words — for those who don’t grok this jargon — given spaces $X$ and $Y$, I want to have a space of maps from $X$ to $Y$:

$Y^X$

with a natural isomorphism

$Z^{X \times Y} \cong {(Z^Y)^X}$

just like we’re used to in the category of sets.

Furthermore, I want a probability measure on this space of maps to give a map from $X$ to probability measures on $Y$. In other words, I want a map

$P(Y^X) \to (P Y)^X$

Doesn’t that seem sensible?
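For finite sets the map is easy to write down, and a sketch may help: draw a random function $f$ from the given distribution on $Y^X$, and for each $x$ record the law of $f(x)$. The sets, the distribution `mu`, and the name `to_kernel` below are illustrative assumptions.

```python
from fractions import Fraction as F
from itertools import product

X = ["x0", "x1"]
Y = ["h", "t"]

# Every function X -> Y, encoded as the tuple of values (f(x0), f(x1)).
functions = list(product(Y, repeat=len(X)))

def to_kernel(mu):
    """Send a distribution mu on Y^X to the kernel x |-> law of f(x), with f ~ mu."""
    kernel = {x: {y: F(0) for y in Y} for x in X}
    for f, mass in mu.items():
        for i, x in enumerate(X):
            kernel[x][f[i]] += mass
    return kernel

# A distribution on the four functions X -> Y.
mu = {("h", "h"): F(1, 2), ("h", "t"): F(1, 4),
      ("t", "h"): F(0), ("t", "t"): F(1, 4)}
assert set(mu) == set(functions)

kernel = to_kernel(mu)
print(kernel)
```

Each row of the resulting kernel only sees one coordinate of the random function, so any correlation between $f(x_0)$ and $f(x_1)$ is forgotten.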

I don’t know if this map wants to be an isomorphism — I could figure it out in five minutes, but I’ll let someone else. This map should make some diagrams commute… but I don’t know what those diagrams are. Someone who has thought about monads on cartesian closed categories might know.

As mentioned before, I also want a map

$P X \times P Y \to P(X \times Y)$

Of course this map should not be an isomorphism, since only product measures are in its range. I bet this map will make $P$ into a symmetric monoidal monad.
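A small finite sketch of this map and of why it cannot be onto: the product of two distributions always factors through its marginals, while a perfectly correlated joint distribution does not. The helper names are hypothetical, and joint distributions are assumed to list every pair explicitly, including those of mass zero.

```python
from fractions import Fraction as F

def product_measure(p, q):
    """The map PX x PY -> P(X x Y): independent product of p and q."""
    return {(x, y): px * qy for x, px in p.items() for y, qy in q.items()}

def is_product(joint):
    """True when the joint distribution equals the product of its own marginals."""
    p, q = {}, {}
    for (x, y), mass in joint.items():
        p[x] = p.get(x, F(0)) + mass
        q[y] = q.get(y, F(0)) + mass
    return joint == product_measure(p, q)

p = {"a": F(1, 3), "b": F(2, 3)}
q = {0: F(1, 2), 1: F(1, 2)}
independent = product_measure(p, q)

# Two perfectly correlated fair coins: not of the form p x q.
correlated = {(0, 0): F(1, 2), (0, 1): F(0), (1, 0): F(0), (1, 1): F(1, 2)}

print(is_product(independent), is_product(correlated))
```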

$P(X + Y)?$

This one seems sort of funny. There’s not one best map

$P X + P Y \to P(X + Y)$

Instead, there’s one natural map for each $p \in [0,1]$: given a probability measure on $X$ and a probability measure on $Y$, you can flip a $p$-weighted coin to pick either $X$ or $Y$, then randomly pick an element from the space you’ve chosen.

All these structures should fit together into some nice entity — which may or may not have been given a name and studied to death.

Posted by: John Baez on September 28, 2007 11:12 PM

### Re: Progic III

For example, is the category of Polish spaces cartesian closed?

I don’t know, but my gut tells me almost certainly not. The reason is that if you forget Polish spaces for a minute and consider more general spaces, then there’s one theorem (see here for instance) that says the “exponentiable” objects in the category of Hausdorff spaces (i.e., objects $X$ such that $X \times (-)$ has a right adjoint $(-)^X$) are precisely the locally compact Hausdorff spaces. I strongly suspect that a similar statement could be made for the category of Polish spaces: that the exponentiable ones are the locally compact ones. But we know there are Polish spaces which are not locally compact, e.g., $L^2(\mathbb{R})$.

I want a map $P(Y^X) \to (P Y)^X$. Doesn’t that seem sensible?

It would come from a strength on $P$, by “currying” the composite

$P(Y^X) \times X \to P(Y^X \times X) \to P(Y)$

where the first map is a component of strength and the second is $P$ applied to an evaluation map.

I don’t know if this map wants to be an isomorphism — I could figure it out in five minutes, but I’ll let someone else. This map should make some diagrams commute.

Not an isomorphism: consider when $X$ is a sum of two terminal objects; you’d get $P(Y \times Y) \cong P Y \times P Y$, which fails because a joint distribution is not determined by its marginals.
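The failure can be seen on the smallest example. In the sketch below, with $Y = \{0, 1\}$, the map $P(Y \times Y) \to P Y \times P Y$ is taken to be "send a joint distribution to its pair of marginals" (an assumption about which comparison map is meant); two different joint distributions then land on the same pair, so the map is not injective.

```python
from fractions import Fraction as F

def marginals(joint):
    """The map P(Y x Y) -> PY x PY: send a joint distribution to its two marginals."""
    first, second = {}, {}
    for (a, b), mass in joint.items():
        first[a] = first.get(a, F(0)) + mass
        second[b] = second.get(b, F(0)) + mass
    return first, second

# Two fair coins that always agree.
correlated = {(0, 0): F(1, 2), (0, 1): F(0), (1, 0): F(0), (1, 1): F(1, 2)}

# Two independent fair coins.
independent = {(a, b): F(1, 4) for a in (0, 1) for b in (0, 1)}

print(marginals(correlated))
print(marginals(independent))
```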

The commutative diagrams (which I won’t put down here) are easily derivable from the coherence conditions associated with a strength, involving compatibility between the strength and associativity, unit isomorphisms for the product. They’re easy to figure out even if you’ve never seen them.

Posted by: Todd Trimble on September 29, 2007 3:21 AM

### Re: Progic III

This map $P(Y^X) \to (P Y)^X$ is what’s used when you think of a Gaussian process as a distribution over functions. From this distribution you generate a conditional probability of $Y$ given $X$.

With the ‘function’ picture, you can set priors by favouring, say, smooth functions.

Posted by: David Corfield on September 30, 2007 4:00 PM

### Re: Progic III

This might not help, but what about restricting the morphisms? My (rather unreliable) recollection is that the category of compactly generated Hausdorff spaces and proper maps between them is the ‘right’ cartesian closed category to start doing algebraic topology in, f’rinstance.

Posted by: Yemon Choi on October 2, 2007 12:25 AM

### Re: Progic III

I’m sort of glad you brought this up. Polish spaces (more generally, metric spaces) are compactly generated Hausdorff, and compactly generated Hausdorff spaces and continuous maps between them famously form a cartesian closed category (I hadn’t even thought about proper maps! – interesting). So the rationale I gave for my gut feeling was actually somewhat stupid, and I was wondering whether anyone would remind us of CGHaus before I got around to it.

So while I’m still betting against Polish spaces (and continuous maps) forming a ccc, I don’t have a good rationale at present. Anyone?

Posted by: Todd Trimble on October 2, 2007 2:38 AM

### Re: Progic III

Are we talking about continuous maps here? (Sort of weird, actually, if we’re doing measure theory.) Then what topology will our desire for cartesian closedness force us to put on $Y^X$? It might be possible to show that this topology can’t possibly be separable and metrizable if $X$ and $Y$ are fairly big. For example, even $X = Y = \mathbb{R}$.

Are Polish spaces really a convenient setting for measure theory, or have category theorists already found a better one? Inquiring minds want to know.

Posted by: John Baez on October 2, 2007 6:57 AM

### Re: Progic III

Yes, Polish spaces are a natural setting for measure theory. From the measure-theoretic point of view, all uncountable Polish spaces are isomorphic to $\mathbb{R}$.

Posted by: Walt on October 2, 2007 2:26 PM

### Re: Progic III

John, at the MIT Science Library there is a book called “Random Probability Measures on Polish Spaces” by Hans Crauel which I have not yet looked at.

Also, see this Wikipedia entry on Polish spaces.

The continuous image of a Polish space is an analytic set or Souslin set. In a separable metric space, any Souslin set is universally measurable.

A universally measurable set means a set $E$ that is measurable for each Borel measure on a topological space. Thus, given a measure $\mu$, there exist Borel sets $G$ and $F$ with $G \subseteq E \subseteq F$ and $\mu(F \setminus G) = 0$, the Borel sets being the members of the sigma-algebra generated by all closed subsets.

Posted by: Charlie Stromeyer Jr on October 5, 2007 3:10 PM

### Re: Progic III

There’s not one best map

$P X + P Y \to P(X + Y)$

I’m faintly puzzled by this remark, in that there is certainly a canonical map of that type, regardless of what the functor $P$ might be. It’s just $[P(i_1), P(i_2)]$, where $i_1: X \to X+Y$ and $i_2: Y\to X+Y$ are the injection maps for the coproduct.
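For finite sets this canonical map can be sketched directly. Below, the coproduct is modelled by tagged values, `P` is the pushforward action of the finite probability functor on a function, and `copair` builds $[g, h]$; all of these names and the tagging convention are assumptions made for illustration.

```python
from fractions import Fraction as F

def P(f):
    """Action of the finite probability functor on a function f: pushforward along f."""
    def pushforward(p):
        out = {}
        for x, mass in p.items():
            out[f(x)] = out.get(f(x), F(0)) + mass
        return out
    return pushforward

# Coproduct X + Y as tagged values, with its two injections.
def i1(x):
    return ("left", x)

def i2(y):
    return ("right", y)

def copair(g, h):
    """[g, h]: A + B -> C, acting on tagged values."""
    def go(tagged):
        tag, value = tagged
        return g(value) if tag == "left" else h(value)
    return go

canonical = copair(P(i1), P(i2))   # PX + PY -> P(X + Y)

p = {"a": F(1, 3), "b": F(2, 3)}   # a distribution on X
result = canonical(("left", p))    # the same distribution, now living on X + Y
print(result)
```

The input is a single distribution, on $X$ or on $Y$, so the output is always supported entirely on one summand.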

Posted by: Robin on September 29, 2007 11:18 AM

### Re: Progic III

Robin wrote:

John wrote:

There’s not one best map

$P X+P Y\to P(X+Y)$

I’m faintly puzzled by this remark…

I was just being silly. For some reason I was talking about ways to get a probability measure on $X + Y$ from a probability measure on $X$ and a probability measure on $Y$. These are maps

$P X \times P Y \to P(X + Y)$

The mix of $\times$ and $+$ here is awkward, so it’s not shocking that there’s no ‘best’ map of this sort. Instead, we get one for each $p \in [0,1]$: given a probability measure on $X$ and a probability measure on $Y$, you can flip a $p$-weighted coin to pick either $X$ or $Y$, then randomly pick an element from the space you’ve chosen.
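A sketch of this family of maps for finite sets, one for each coin weight: the function `mix` below puts the weight on the $X$ side and the remainder on the $Y$ side (which side the coin favours is an arbitrary choice here), again using tagged values for $X + Y$.

```python
from fractions import Fraction as F

def mix(weight, p, q):
    """PX x PY -> P(X + Y): choose X with probability `weight`, otherwise Y."""
    out = {("left", x): weight * mass for x, mass in p.items()}
    out.update({("right", y): (1 - weight) * mass for y, mass in q.items()})
    return out

p = {"a": F(1)}                       # a distribution on X
q = {"b": F(1, 2), "c": F(1, 2)}      # a distribution on Y

mixed = mix(F(1, 4), p, q)
print(mixed)
```

No single weight is singled out by the structure of $X$ and $Y$ alone, which is one way to see why there is no ‘best’ map of this type.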

I should have been thinking about ways to get a probability measure on $X + Y$ from a probability measure on $X$ or a probability measure on $Y$. These are maps

$P X + P Y \to P(X + Y),$

a much more sensible thing to ponder.

There’s an obvious best map of this sort, which is just the one you described!

Posted by: John Baez on October 2, 2007 6:49 AM

### Re: Progic III

Hi David. Can I request a “progic iv” post in which you synthesize and linearize progics i-iii and associated comments? It looks very interesting but I find it difficult to follow in its current format; for instance it took me a long time to find the definition of the “probability monad.” Of course it might be a chore.

Cheers,

Posted by: anonymous on September 29, 2007 9:53 PM

### Re: Progic III

Progic I-III looks interesting. I sure wish I could read it. MathML, yarghgh!

Posted by: Chris Hillman on September 30, 2007 4:13 AM

### Re: Progic III

The big question is: why can all of us read MathML, and not you? You’ve never said what the problem is.

Posted by: John Baez on October 2, 2007 6:34 AM

### Re: Progic III

I just wanted David to know that I’m interested, in re the discussion of possibly migrating the Cafe to a wiki which possibly would forgo dependence upon MathML or other “non-standard” fonts/software at the user end. By coincidence, a few months ago I posted elsewhere a very vague suggestion that probability should emerge from logic, so naturally I was intrigued to see that some researchers seem to be hard at work on realizing this idea!

David, are you considering turning these posts into an expository(?) eprint I can download somewhere?

Posted by: Chris Hillman on October 4, 2007 5:52 PM

### Re: Progic III

I think I’m still at the fishing around stage. But when I get a moment, it might be an idea to piece together what we’ve got so far. There was a lot of good material in our 2-geometry series which lies languishing.

There’s the fascinating similarity of probability theory, logic and other forms of ‘matrix’ mechanics. But then there’s the peculiarity of probability theory, e.g., where does its geometry come from? Or is it that the Fisher information metric has its counterparts for other rigs or generalized rings?

I recently came across something of interest for the thread in Cartier’s Mad Day’s Work, on p. 396 he’s talking about valuations, i.e., mappings of sets of propositions to {0, 1}:

…instead of requiring the valuation $v(A)$ of a proposition to assume only the values 0 and 1, one may postulate more generally that $v(A)$ is a real number between 0 and 1. This is roughly the strategy of “fuzzy” logic.

To which he appends the note:

Carathéodory and Kappos used this method to construct an alternative to the axiomatization of probability due to Kolmogorov. Despite the philosophical interest of this method, it is technically more cumbersome than Kolmogorov’s approach, especially for the study of stochastic processes.

What is known of Kappos, D. Probability Algebras and Stochastic spaces, Academic Press, (1969)?

Posted by: David Corfield on October 5, 2007 10:25 AM

### Re: Progic III

The work by Carathéodory and Kappos and the relation to von Neumann’s work on quantum probability is explained very nicely by Rota.
[G.-C. Rota, “Twelve problems in probability no one likes to bring up”, in Algebraic Combinatorics and Computer Science: A Tribute to Gian-Carlo Rota, H. H. Crapo and D. Senato (eds.), 1st edition, 2001]

It is his first ‘problem in probability that no one likes to bring up’.
Scans can be found here: p1, p2, p3.

Recently, Coquand and Palmgren made the connection between the Kappos-Carathéodory point-free measure theory and point-free topology, aka locale theory, topology in a topos. They used metric Boolean algebras, but this can be extended to use Heyting algebras, i.e., locales. (I do this in forthcoming work with Coquand.)

Lawvere has observed that much of the theory of stochastic processes can be captured using the Kleisli category of the Giry monad. We define this monad on the category of compact regular locales (the formal counterpart of compact Hausdorff spaces). By the Stone-Yosida representation theorem the Riesz space of bounded measurable functions may be represented by continuous functions on a compact regular locale. So, not much seems to be lost by restricting to this category. Combined with Lawvere’s insight, it is not clear to me why it should a priori be difficult to treat stochastic processes formally/algebraically. [Which does not mean that I claim to have done this already.]

To our surprise it turned out that this (commutative!) algebraic treatment of probability theory can be translated to quantum probability by using a different topos. For instance, integrals become states on a C*-algebra and valuations become measures on projections. The details can be found here.

Finally, as I pointed out before, the use of valuations on a locale interpreted in a topos extends ordinary probability theory.

Posted by: Bas Spitters on October 5, 2007 12:48 PM

### Re: Progic III

Bas Spitters, this post you have made is awesome, and what you say is related to what Steve Vickers said about Scott domains in my last post in the thread “Deep Beauty: Understanding the Quantum World”.

You can see how these concepts are related by reading this Wikipedia entry on complete Heyting algebras and their other entry on Scott domains.

Posted by: Charlie Stromeyer Jr on October 5, 2007 8:47 PM

### Re: Progic III

Bas, I have found an entire paper about domain theory via locales called “Domain theory in logical form” by S. Abramsky in Annals of Pure and Applied Logic 51: 1-77 (1991) (although you might be able to find a copy online).

I am now reading about the specialization order that Steve Vickers mentioned as well as the Scott topology in the book “Continuous lattices and domains” by G. Gierz et al. You can see the reference at the bottom of this Wikipedia entry about domain theory.

Posted by: Charlie Stromeyer Jr on October 6, 2007 6:15 PM

### Re: Progic III

To our surprise it turned out that this (commutative!) algebraic treatment of probability theory can be translated to quantum probability by using a different topos. For instance, integrals become states on a C*-algebra and valuations become measures on projections. The details can be found here.

Posted by: Bas Spitters on October 5, 2007 1:37 PM

### Re: Progic III

Dear Charlie,

The paper by Abramsky is a good start. It is part of a circle of ideas saying that
topology = logic of observations.
Two other related sources are Vickers’ book
Topology via Logic
and Smyth’s chapter Topology in the Handbook of Logic in Computer Science.

Recently, Steve Vickers wrote a deeper chapter on topology via constructive logic connecting these ideas with topos theory. In short, a Grothendieck topos is a model of a geometric predicate logic.

These aspects (and information systems) play a key role in my paper with Chris Heunen on the application of the internal (and geometric) logic of a topos to Isham et al.’s ideas on topos theory and physics. In that paper we emphasize that geometric logic, as the logic of observations, should also be relevant for physics.

Finally, I expect you know about the book by Segal/Kunze. They continued the development of von Neumann’s ideas on continuous geometries to a complete development of integration theory. References can be found in our paper above.

Bas

Posted by: Bas Spitters on October 6, 2007 9:15 PM

### Re: Progic III

Thank you, Bas, for providing these references and I will look at your paper with Chris Heunen. Actually, I was not aware of the book by Segal and Kunze, and so I will read some of it next time I go to the library.

Perhaps you might be interested to know that the mathematician Tsemo Aristide defines a notion of a sheaf of n-categories over a topos in Section 8 page 37 of this paper.

Posted by: Charlie Stromeyer Jr on October 6, 2007 10:55 PM

### Re: Progic III

Dear Bas,

I forgot to mention something else which might interest you:

In his paper “A regular completion for the variety generated by the three-element Heyting algebra”, John Harding remarks on how this algebra is related to the Pierce sheaf (a type of sheaf which I briefly mentioned in the thread “Deep Beauty: Understanding the Quantum World”). See Harding’s remark at the bottom of page 7 of this paper.

Posted by: Charlie Stromeyer Jr on October 7, 2007 4:09 PM
