## May 5, 2018

### The Fisher Metric Will Not Be Deformed

#### Posted by Tom Leinster

The pillars of society are those who cannot be bribed or bought, the upright citizens of integrity, the incorruptibles. Throw at them what you will, they never bend.

In the mathematical world, the Fisher metric is one such upstanding figure.

What I mean is this. The Fisher metric can be derived from the concept of relative entropy. But relative entropy can be deformed in various ways, and you might imagine that when you deform it, the Fisher metric gets deformed too. Nope. Bastion of integrity that it is, it remains unmoved.

You don’t need to know what the Fisher metric is in order to get the point: the Fisher metric is a highly canonical concept.

Let’s start with Shannon entropy. Given a finite probability distribution $p = (p_1, \ldots, p_n)$, its Shannon entropy is defined as

$H(p) = - \sum_i p_i \log p_i.$

(I’ll assume all probabilities are nonzero, so there are no problems with things being undefined.)
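For concreteness, here is that formula in a few lines of Python (just a sketch; the function name is my own):

```python
import math

def shannon_entropy(p):
    """Shannon entropy H(p) = -sum_i p_i log(p_i), in natural-log units."""
    return -sum(pi * math.log(pi) for pi in p)

# The uniform distribution on n points has the maximum possible entropy, log(n):
print(shannon_entropy([0.25] * 4))       # ≈ log(4) ≈ 1.3863
print(shannon_entropy([0.7, 0.2, 0.1]))  # a less uniform, lower-entropy example
```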

This is the most important type of “entropy” for finite probability distributions: it has uniquely good properties. But it admits a couple of families of deformations that share most of those properties. One is the family of Rényi entropies, indexed by a real parameter $q$:

$H_q(p) = \frac{1}{1 - q} \log \sum p_i^q.$

Another is the family of entropies that I like to call the $q$-logarithmic entropies (because they’re what you get if you replace the logarithm in the definition of Shannon entropy by a $q$-logarithm), and that physicists call the Tsallis entropies (because Tsallis was about the tenth person to discover them). They’re defined by

$S_q(p) = \frac{1}{1 - q} \biggl( \sum p_i^q - 1 \biggr).$

There’s obviously a problem with the definitions of the Rényi entropy $H_q(p)$ and the $q$-logarithmic entropy $S_q(p)$ when $q = 1$. They don’t make sense. But both converge to the Shannon entropy $H(p)$ as $q \to 1$, and that’s what I mean by “deformation”.
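Here's a throwaway numerical check of that convergence in Python (function names are mine):

```python
import math

def shannon_entropy(p):
    return -sum(pi * math.log(pi) for pi in p)

def renyi_entropy(p, q):
    """Rényi entropy H_q(p) = (1/(1-q)) log sum_i p_i^q, for q != 1."""
    return math.log(sum(pi ** q for pi in p)) / (1 - q)

def tsallis_entropy(p, q):
    """q-logarithmic (Tsallis) entropy S_q(p) = (1/(1-q)) (sum_i p_i^q - 1)."""
    return (sum(pi ** q for pi in p) - 1) / (1 - q)

p = [0.5, 0.3, 0.2]
print(shannon_entropy(p))  # the q -> 1 limit of both families
for q in (0.9, 0.99, 1.01, 1.1):
    print(q, renyi_entropy(p, q), tsallis_entropy(p, q))
```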

An easy way to prove this is to use l’Hôpital’s rule. And the same l’Hôpital argument shows that it’s easy to dream up new deformations of Shannon entropy (not that they’re necessarily interesting). For any function $\lambda : (0, \infty) \to \mathbb{R}$, define a kind of “entropy of order $q$” as

$\frac{1}{1 - q} \cdot \lambda \biggl( \sum p_i^q \biggr).$

If you want to show that this converges to $H(p)$ as $q \to 1$, all you need to assume about $\lambda$ is that $\lambda(1) = 0$ and $\lambda'(1) = 1$.

Taking $\lambda = \log$ satisfies these conditions and gives Rényi entropy. The simplest function $\lambda$ satisfying the conditions is the linear approximation to the function $\log$ at $1$, namely, $\lambda(x) = x - 1$. And that gives $q$-logarithmic entropy.
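In code, the general recipe looks like this (a sketch, with $\lambda$ passed in as a Python function; any $\lambda$ with $\lambda(1) = 0$ and $\lambda'(1) = 1$ gives roughly the same answer for $q$ near $1$):

```python
import math

def lambda_entropy(p, q, lam):
    """The "entropy of order q": (1/(1-q)) * lam(sum_i p_i^q), for q != 1."""
    return lam(sum(pi ** q for pi in p)) / (1 - q)

p = [0.5, 0.3, 0.2]
q = 0.999
print(lambda_entropy(p, q, math.log))                   # lam = log: Rényi entropy
print(lambda_entropy(p, q, lambda x: x - 1))            # lam(x) = x - 1: q-logarithmic
print(lambda_entropy(p, q, lambda x: (x * x - 1) / 2))  # another lam with lam(1)=0, lam'(1)=1
# All three are within about 0.001 of the Shannon entropy of p.
```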

That’s entropy, defined for a single probability distribution. But there’s also relative entropy, defined for a pair of distributions on the same finite set. The formula is

$H(p \| r) = \sum_i p_i \log(p_i/r_i),$

where $p$ and $r$ are probability distributions on $n$ elements.

I won’t explain here why relative entropy is important. But very roughly, you can think of it as measuring the difference between $p$ and $r$. It’s always nonnegative, and it’s equal to zero just when $p = r$. However, it would be a bad idea to use the word “distance”: it’s not symmetric, and more importantly, it doesn’t satisfy the triangle inequality.

Actually, relative entropy is slightly more like a squared distance. A little calculus exercise shows that when $p$ and $r$ are close together,

$H(p \| r) = \sum \frac{1}{2p_i} (p_i - r_i)^2 + o(\|p - r\|^2).$

The sum here is just the Euclidean squared distance scaled by a different factor along each coordinate axis.
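That calculus exercise is easy to check numerically. A quick Python sketch (function names are mine):

```python
import math

def rel_entropy(p, r):
    """Relative entropy H(p || r) = sum_i p_i log(p_i / r_i)."""
    return sum(pi * math.log(pi / ri) for pi, ri in zip(p, r))

def quadratic_term(p, r):
    """The claimed leading term: sum_i (p_i - r_i)^2 / (2 p_i)."""
    return sum((pi - ri) ** 2 / (2 * pi) for pi, ri in zip(p, r))

p = [0.5, 0.3, 0.2]
for eps in (0.1, 0.01, 0.001):
    r = [0.5 + eps, 0.3 - eps, 0.2]  # a nearby distribution
    print(eps, rel_entropy(p, r) / quadratic_term(p, r))
# The ratio tends to 1 as r approaches p.
```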

But it’s still wrong to think of relative entropy as a squared distance. Its square root fails the triangle inequality. So, it’s not a metric in the sense of metric spaces.

However, you can use the square root of relative entropy as an infinitesimal metric — that is, a metric in the sense of Riemannian geometry. It’s called the Fisher metric, at least up to a constant factor that I won’t worry about here. And it makes the set of all probability distributions on $\{1, \ldots, n\}$ into a Riemannian manifold.

This works as follows. The set of probability distributions on $\{1, \ldots, n\}$ is the $(n - 1)$-simplex $\Delta_n$ (whose boundary points I’m ignoring). It’s a smooth manifold in the obvious way, and every one of its tangent spaces can naturally be identified with

$T = \{ t = (t_1, \ldots, t_n) \in \mathbb{R}^n : t_1 + \cdots + t_n = 0 \}.$

The “little calculus exercise” above tells us that when you treat $H(-\|-)$ as an infinitesimal squared distance, the resulting norm on the tangent space $T$ at $p$ is given by

$\|t\|^2 = \sum_i \frac{1}{2p_i} t_i^2.$

Or equivalently, by the polarization identity, the resulting inner product on $T$ is given by

$\langle t, u \rangle = \sum_i \frac{1}{2p_i} t_i u_i.$

And that’s the Riemannian metric on $\Delta_n$. By definition, it’s the Fisher metric.

(Well: it’s actually $1/2$ times what’s normally called the Fisher metric, but as I said, I’m not going to worry too much about constant factors in this post.)

Summary so far: We’re working on the space $\Delta_n$ of probability distributions on $n$ elements. There is a machine which takes as input anything that looks vaguely like a squared distance on $\Delta_n$, and produces as output a Riemannian metric on $\Delta_n$. When you give this machine relative entropy as its input, what it produces as output is the Fisher metric.

Now the fun starts. Just as the entropy of a single distribution can be deformed in at least a couple of ways, the relative entropy of a pair of distributions has interesting deformations. Here are two families of them. The Rényi relative entropies are given by

$H_q(p \| r) = \frac{1}{q - 1} \log \sum p_i^q r_i^{1 - q},$

and the $q$-logarithmic relative entropies are given by

$S_q(p \| r) = \frac{1}{q - 1} \biggl( \sum p_i^q r_i^{1 - q} - 1 \biggr).$

Again, $q$ is a real parameter here. Again, both $H_q(p \| r)$ and $S_q(p \| r)$ converge to the standard relative entropy $H(p \| r)$ as $q \to 1$. And again, it’s easy to write down other families of deformations in this sense: define a kind of “relative entropy of order $q$” by

$H^\lambda_q(p \| r) = \frac{1}{q - 1} \lambda \biggl( \sum p_i^q r_i^{1 - q} \biggr)$

where $\lambda$ is any function satisfying the same two conditions as before: $\lambda(1) = 0$ and $\lambda'(1) = 1$. This generalizes both the Rényi and $q$-logarithmic relative entropies, by taking $\lambda(x)$ to be either $\log x$ or $x - 1$.
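Numerically, all of these deformed relative entropies do converge to the standard relative entropy as $q \to 1$. A quick Python check (function names are mine):

```python
import math

def rel_entropy(p, r):
    return sum(pi * math.log(pi / ri) for pi, ri in zip(p, r))

def deformed_rel_entropy(p, r, q, lam):
    """H^lam_q(p || r) = (1/(q-1)) * lam(sum_i p_i^q r_i^(1-q)), for q != 1."""
    return lam(sum(pi ** q * ri ** (1 - q) for pi, ri in zip(p, r))) / (q - 1)

p = [0.5, 0.3, 0.2]
r = [0.4, 0.4, 0.2]
print(rel_entropy(p, r))  # the q -> 1 limit
for q in (0.99, 1.01):
    print(q,
          deformed_rel_entropy(p, r, q, math.log),         # relative Rényi
          deformed_rel_entropy(p, r, q, lambda x: x - 1))  # relative q-logarithmic
```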

Let’s feed this very general kind of relative entropy into the machine. A bit of calculation shows that

$H^\lambda_q(p \| r) = q \sum_i \frac{1}{2p_i} (p_i - r_i)^2 + o(\|p - r\|^2)$

for any function $\lambda$ satisfying those same two conditions. The right-hand side is just what we saw before, multiplied by $q$. So, the output of the machine — the Riemannian metric on $\Delta_n$ that comes from this generalized entropy — is just $q$ times the Fisher metric!
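Here is a numerical glimpse of that computation: for distributions close together, the ratio of $H^\lambda_q(p \| r)$ to $\sum (p_i - r_i)^2 / 2 p_i$ comes out as $q$, whichever admissible $\lambda$ you pick (a Python sketch; names are mine):

```python
import math

def deformed_rel_entropy(p, r, q, lam):
    """H^lam_q(p || r) = (1/(q-1)) * lam(sum_i p_i^q r_i^(1-q)), for q != 1."""
    return lam(sum(pi ** q * ri ** (1 - q) for pi, ri in zip(p, r))) / (q - 1)

def quadratic_term(p, r):
    return sum((pi - ri) ** 2 / (2 * pi) for pi, ri in zip(p, r))

p = [0.5, 0.3, 0.2]
eps = 1e-4
r = [0.5 + eps, 0.3 - eps, 0.2]  # a small perturbation of p
for q in (0.5, 2.0, 3.0):
    for lam in (math.log, lambda x: x - 1):
        print(q, deformed_rel_entropy(p, r, q, lam) / quadratic_term(p, r))
# Each ratio is approximately q, independently of lam: the machine's output
# is q times the Fisher metric, whatever deformation went in.
```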

So: when you deform the notion of relative entropy and feed it into the machine, the same thing always happens. No matter which deformation you put in, the machine spits out the same Riemannian metric on $\Delta_n$ (at least, up to a constant factor). It’s always the Fisher metric.

A thrill-seeker would call that result disappointing. They might have been hoping that deforming relative entropy would lead to interestingly deformed versions of the Fisher metric. But there are no such things. Try as you might, the Fisher metric simply refuses to be deformed.

Posted at May 5, 2018 12:00 AM UTC


### Re: The Fisher Metric Will Not Be Deformed

I haven’t been at all scholarly in this post. I’ve skipped a whole bunch of calculations and differential-geometric details. I haven’t given references. I haven’t said what (if anything) is new.

Let me remedy some of those defects here. There’s a general theory of how to take a kind of faux squared distance on a manifold (e.g. relative entropy) and extract from it a Riemannian metric. It’s apparently due to Eguchi, whose work I haven’t seen yet; I’ve just read the summary in section 3.2 of Amari and Nagaoka’s book Methods of Information Geometry. The term they use for a “faux squared distance” is contrast function.

In my post, I sketched the proof of the fact that if you start with any of the $q$-logarithmic (Tsallis) relative entropies, the resulting Riemannian metric on the simplex is just the Fisher metric, up to a constant factor. This is certainly known, and can be found in information geometry texts such as the new book by Ay, Jost, Lê and Schwachhöfer. I think they use the term “$\alpha$-divergence” for this relative entropy (where their $\alpha$ is essentially our $q$), and the “$\alpha$-connection” is also an important part of the story.

I don’t know whether the same fact for the Rényi entropies is widely known. Last summer, I met Nihat Ay, one of the authors of this book, at Luminy. I asked him whether he knew what Riemannian metric on $\Delta_n$ came out of the Rényi entropies, and he said he didn’t, but he correctly guessed that it would essentially be the Fisher metric again. So maybe it’s somehow intuitive to experts.

Posted by: Tom Leinster on May 5, 2018 12:31 AM | Permalink | Reply to this

### Re: The Fisher Metric Will Not Be Deformed

There is a literature on q-generalized versions of Fisher information that may interest you. I don’t know whether this might have implications for the arguments that you make about invariance. In particular, have a look at:

Casas, M., Chimento, L., Pennini, F., Plastino, A., and Plastino, A. R. (2002). Fisher information in a Tsallis non-extensive environment. Chaos, Solitons and Fractals, 13: 451-459.

Posted by: James Juniper on May 5, 2018 4:13 AM | Permalink | Reply to this

### Re: The Fisher Metric Will Not Be Deformed

Thanks for the reference!

As you say, that’s about Fisher information rather than the Fisher metric. But the two things are closely related, and a priori, I’d expect a deformation or generalization of Fisher information to go hand in hand with a deformation or generalization of the Fisher metric. I’ve only just downloaded the paper, so at the moment I have no idea what could be going on. If you figure it out, please let me know!

Posted by: Tom Leinster on May 5, 2018 1:16 PM | Permalink | Reply to this

### Re: The Fisher Metric Will Not Be Deformed

I suppose my post was an implicit invitation for people to tell me that the Fisher metric can be deformed! Let’s see if I can begin to figure out what’s happening in the paper that James linked to.

If I’m understanding correctly, they do something like the following. For a real parameter $q$, they define the $q$-Fisher information of a family $(f(-; \theta))_\theta$ of probability distributions to be

$\int f(x; \theta)^{q - 2} \biggl( \frac{\partial f}{\partial \theta} \biggr)^2 d x.$

When $q = 1$, this gives the standard Fisher info. They prove some kind of generalized Cramér-Rao inequality for it.

They don’t discuss the Fisher metric, but given the definition above, the obvious thing to do is to put

$\langle t, u \rangle = \sum p_i^{q - 2} t_i u_i.$

Here $t, u \in \mathbb{R}^n$ are tangent vectors at a probability distribution $p$ on $\{1, \ldots, n\}$, that is, points in $\mathbb{R}^n$ such that $\sum t_i = 0 = \sum u_i$. Again, this recovers the standard Fisher metric when $q = 1$.

You can do this, but I haven’t yet understood why you’d want to.

Posted by: Tom Leinster on May 8, 2018 11:31 AM | Permalink | Reply to this

### Re: The Fisher Metric Will Not Be Deformed

The nLab article information geometry needs a lot of work, but I’ve just added a link to John Baez’s great series of articles, Information Geometry.

Posted by: David Corfield on May 5, 2018 9:42 AM | Permalink | Reply to this

### Re: The Fisher Metric Will Not Be Deformed

Good. Part 7 of that series has an account of the basic fact at the heart of my post: that the Riemannian metric induced by the square root of ordinary relative entropy is the Fisher metric.

Posted by: Tom Leinster on May 5, 2018 1:08 PM | Permalink | Reply to this

### Re: The Fisher Metric Will Not Be Deformed

Here’s a puzzle I don’t know the answer to.

The Fisher metric makes the simplex of probability distributions on an $n$-element set isometric to a subset of the unit sphere in $\mathbb{R}^n$.

This fact makes the Fisher metric very canonical… or does it?

The usual round metric on the sphere is highly canonical because its symmetry group—technically, its isometry group—has the maximum possible dimension of any isometry group of any Riemannian manifold of that dimension.

In short, changing the round sphere in any way, except making it larger or smaller, makes it less symmetrical.

But the subset of the sphere corresponding to the simplex with its Fisher metric inherits almost none of the sphere’s symmetries: only the discrete group $\mathrm{S}_n$ that permutes the vertices of the simplex. Any metric you define on the space of probability distributions on the $n$-element set will have those permutation symmetries, if you don’t favor some elements of that set over others.

So the puzzle is: what does the large symmetry group of the round sphere have to do with the Fisher metric? Is it just a red herring?

Posted by: John Baez on May 5, 2018 10:39 AM | Permalink | Reply to this

### Re: The Fisher Metric Will Not Be Deformed

One could come at it from another angle:

1. Relative entropy is a canonical concept, because it’s uniquely characterized by a short list of properties that you and I both know well.

2. Since the Fisher metric on the simplex is derived from relative entropy by a canonical process, that means that the Fisher metric is also a canonical concept.

3. And transporting the Fisher metric from the simplex to the positive orthant of the sphere along the diffeomorphism you implicitly mention (take square roots in each coordinate), which is also a natural enough move, gives the round metric on the orthant.

So, we should also accept the round metric on the orthant as a canonical concept.

That’s not a symmetry argument, except to the extent that symmetry (permutation of coordinates) is involved in the axiomatization of relative entropy. But it is an argument — if one was needed! — that the geodesic metric on an orthant of the sphere is a natural thing.

Posted by: Tom Leinster on May 5, 2018 1:23 PM | Permalink | Reply to this

### Re: The Fisher Metric Will Not Be Deformed

What if I instead said the following?

The flat metric makes the simplex of probability distributions on an $n$-element set isometric to a subset of $\mathbb{R}^{n+1}$.

The usual flat metric on $\mathbb{R}^{n+1}$ is highly canonical because its symmetry group—technically, its isometry group—has the maximum possible dimension of any isometry group of any Riemannian manifold of that dimension.

But it’s not at all clear to me that the flat metric is very canonical in the context of the simplex of probability distributions, as such.

Posted by: Mark Meckes on May 6, 2018 3:18 AM | Permalink | Reply to this

### Re: The Fisher Metric Will Not Be Deformed

Since I’m a radical at heart, I’m inclined to a solution like this: since the Fisher metric reveals the space of probability distributions to be a piece of the unit sphere, but the symmetries only become evident when we consider the whole sphere, not just this piece, let’s figure out how to do probability theory using the whole sphere.

So, instead of working with probabilities $p_i \in [0,1]$ with

$\sum_{i = 1}^n p_i = 1$

we should work with some quantities $\psi_i \in [-1,1]$ with

$\sum_{i = 1}^n \psi_i^2 = 1$

These new quantities should be related to the probabilities by

$p_i = \psi_i^2$

I believe the map $\psi \mapsto p$ maps the positive orthant of the unit sphere isometrically to the simplex with its Fisher metric. But if this is true, it seems every orthant of the unit sphere should be mapped isometrically to the simplex with its Fisher metric! After all, there’s nothing special about the positive orthant if what we’re doing to $\psi \in S^{n-1}$ is squaring each of its components.

(It’s a bit hard to visualize how the sphere gets folded down to the simplex by this map, but I can do it when $n = 2$.)
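A quick finite-difference check of this (a Python sketch; it uses the post’s convention for the Fisher metric, with its extra factor of $1/2$, so the constant ratio comes out as $2$ rather than $1$):

```python
import math

p = [0.5, 0.3, 0.2]      # a point of the simplex
t = [0.1, -0.04, -0.06]  # a tangent vector: components sum to 0

# Squared length of t in the post's Fisher metric at p:
fisher = sum(ti ** 2 / (2 * pi) for pi, ti in zip(p, t))

# Push t through p = psi^2: psi_i = sqrt(p_i), d(psi_i) = t_i / (2 sqrt(p_i)).
# The resulting vector is tangent to the unit sphere at psi.
dpsi = [ti / (2 * math.sqrt(pi)) for pi, ti in zip(p, t)]
round_sq = sum(d ** 2 for d in dpsi)  # squared length in the round metric

print(fisher / round_sq)  # ≈ 2.0 for every p and t: isometric up to scale
```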

Of course, physicists will notice that $\psi \in S^{n-1}$ looks like a quantum state, and the relation between $\psi$ and $p$ looks like the usual relation between amplitudes and probabilities… except that we’re doing real quantum theory instead of the more common complex quantum theory.

I’ve written about real quantum theory and it’s a perfectly fine thing, though not quite as nice as the complex version.

Just as the symmetries of an $n$-dimensional complex Hilbert space form the group $U(n)$, those of a real Hilbert space form the group $O(n)$. Those are the symmetries I was after.

Mathematicians may flinch at this intrusion of “physics” in what had been a discussion of probability theory, but I think that would be unfair. Quantum theory is really just an alternative version of probability theory, different from “classical” probability theory but just as consistent. The only reason it’s part of “physics” is that our universe happens to use this alternative version, and the first people to discover this alternative version were people who carefully studied atoms and stuff.

There are plenty of other reasons to get worried about my radical attempt to “explain” why the Fisher metric makes the simplex round. In fact there are so many that I don’t think I’ll try to list them right now, or try to deal with them. I’ll just mention that physicists typically pass from quantum theory back to classical probability theory by choosing an orthonormal basis $e_i$ of an $n$-dimensional Hilbert space. This lets them define components

$\psi_i = \langle e_i, \psi \rangle$

for any vector in the Hilbert space, and probabilities $p_i = |\psi_i|^2$ for any unit vector $\psi$. But when we have a real Hilbert space, this choice determines a “positive orthant” in the unit sphere, and breaks the symmetries from $O(n)$ down to $S_n$. So, the extra symmetries of the sphere are lost when we pick this basis… but the roundness of the Fisher metric on the simplex may be a kind of ghostly hint that these symmetries had been there.

I realize all this sounds rather weird.

Posted by: John Baez on May 8, 2018 6:35 AM | Permalink | Reply to this

### Re: The Fisher Metric Will Not Be Deformed

Mathematicians may flinch at this intrusion of “physics” in what had been a discussion of probability theory, but I think that would be unfair.

John, I have a feeling you may very reasonably have had me in mind when you wrote that. I completely accept what you write in that paragraph.

Personally speaking, I have nothing against physics. I loved physics at school, though I was dismayed that when I was about 17 or 18, I suddenly dropped from being top of the class to not-so-top of the class. Apparently my ability to do it began to fall off a cliff, the way we often see happening with our mathematics students.

My only problem with physics is an extremely simple one: I don’t know much of it. So when I read explanations of mathematics in physical terms, I very often can’t understand them. That’s all.

If I learn some piece of mathematics, and it also happens to be a part of physics, then great: I’ve accidentally learned some physics (though usually without really understanding the physical background). For instance, I accidentally learned a bit about thermodynamics from working with entropy.

Mind you, I suppose there’s something else going on in the psychological background. Suppose someone were to tell me a story about points on the unit sphere, but insisted on calling them “quantum states” rather than “points on the sphere”. (You didn’t do that — it’s just an example.) Then I’d feel I was missing out on something important, because I know that “quantum state” means a great deal to some people and not much to me. The words we use matter. I’d prefer the other person to either stick to “points on the sphere”, so that the story had to be told and motivated purely in mathematical terms, or take the time to explain what a quantum state is physically, so that I could share that understanding.

Anyway, John, your comment makes perfect sense even to me :-)

Posted by: Tom Leinster on May 8, 2018 7:46 PM | Permalink | Reply to this

### Re: The Fisher Metric Will Not Be Deformed

The buzzword “quantum state” is really relevant here, because there are two kinds of probability theory, classical and quantum, and thinking about them together sheds some new light on the Fisher metric.

Just as in “classical” probability theory we have “classical observables” (say real-valued functions on a finite set whose expected values we wish to compute) and “classical states” (probability distributions on that finite set, which allow us to compute an expected value for any real-valued function on that set), in “quantum” probability theory we have “quantum observables” (say self-adjoint operators $A$ on a finite-dimensional Hilbert space) and “quantum states” (say points $\psi$ on the unit sphere in that Hilbert space, which combine with observables to give expected values $\langle \psi, A \psi \rangle$). There’s a standard procedure for seeing the first setup as a special case of the second: restrict attention to self-adjoint operators that are diagonal in some orthonormal basis, so that

$\langle \psi, A \psi \rangle = \sum_{i \in X} A_{i i} |\psi_i|^2$

where now $A_{i i}$ gives us our “classical observable” (a real-valued function on $X$) and $|\psi_i|^2$ gives us our “classical state” (a probability distribution on $X$).

This is stuff physicists think about a lot: in fact, if I had to stuff as much knowledge about quantum physics as possible into two sentences, so a future race of pure mathematicians could learn nonobvious facts about physics, the above paragraph might be my first try.

What I noticed was that all this stuff makes it very natural to think of the simplex of “classical states” with its Fisher metric as both a subspace and a quotient space of the sphere of “quantum states” with its maximally symmetrical—i.e., round—metric.

I feel this has got to be important, but I need to dream up a way to do something with it.

Posted by: John Baez on May 15, 2018 11:20 PM | Permalink | Reply to this

### Re: The Fisher Metric Will Not Be Deformed

Interesting, Tom! Do you know if something similar works for the space of probability distributions on infinite sets, especially countable ones?

Posted by: James Borger on May 5, 2018 10:57 AM | Permalink | Reply to this

### Re: The Fisher Metric Will Not Be Deformed

Thanks.

I can only guess at how this might work for probability distributions on infinite sets. Information geometers certainly consider this scenario, but I don’t know much about it. Maybe John can say something.

All I can say for sure is that relative entropy (unlike ordinary Shannon entropy) generalizes gracefully from finite sets to arbitrary measurable spaces. Given probability measures $\nu$ and $\mu$ on a measurable space $\Omega$, one defines the entropy of $\nu$ relative to $\mu$ to be

$H(\nu \| \mu) = \int_\Omega \log \biggl( \frac{d\nu}{d\mu} \biggr) d\nu \in [0, \infty],$

where that’s a Radon–Nikodym derivative on the right-hand side. For instance, when $\Omega = \mathbb{N}$, this reduces to the same formula as for finite sets but with the sum now countably infinite. But I haven’t attempted to go deeper.
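As a sanity check on the countable case, here’s the series compared against a closed form for two geometric distributions (a Python sketch; the closed form follows by summing the defining series directly):

```python
import math

def geometric(a, n):
    """First n weights of the geometric distribution p_k = (1-a) a^k on N."""
    return [(1 - a) * a ** k for k in range(n)]

a, b = 0.4, 0.6

# Closed form for H(Geom(a) || Geom(b)):
#   log((1-a)/(1-b)) + (a/(1-a)) log(a/b)
closed = math.log((1 - a) / (1 - b)) + (a / (1 - a)) * math.log(a / b)

# Truncation of the defining sum; the tail beyond k = 200 is negligible here.
p, r = geometric(a, 200), geometric(b, 200)
series = sum(pk * math.log(pk / rk) for pk, rk in zip(p, r))

print(closed, series)  # the two agree to many decimal places
```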

Posted by: Tom Leinster on May 5, 2018 1:31 PM | Permalink | Reply to this

### Re: The Fisher Metric Will Not Be Deformed

James wrote:

Do you know if something similar works for the space of probability distributions on infinite sets, especially countable ones?

It must. I describe part of the story here:

Like most people in this game, I skip right past countably infinite sets and work more generally with measure spaces, or more precisely measurable spaces: spaces equipped with a $\sigma$-algebra of subsets. (For a countable set you can just use all subsets.)

Given two probability measures $\mu,\nu$ on a measurable space $\Omega$, their relative entropy is defined by

$S(\mu, \nu) = \int_\Omega \; \log(\frac{d\mu}{d\nu}) \; d\mu$

where $\frac{d\mu}{d\nu}$ is the Radon–Nikodym derivative. So, the relative entropy is, at least prima facie, only well-defined when $\mu$ is absolutely continuous with respect to $\nu$.

To avoid worrying about this all the time, and get a bunch of probability measures whose relative entropies are all well-defined, I fix a $\sigma$-finite measure $\omega$ on my measurable space $\Omega$, and then consider probability measures of the form $p \omega$ where $p \in L^1(\Omega,\omega)$ is any nonnegative function that integrates to 1: that is, any probability distribution.

With these “boring technicalities” taken care of in an appendix, I then let myself derive the Fisher metric from relative entropy in a fairly free-wheeling way. Anyone who wants to dig deeper into the technicalities should read this:

• Raymond Streater, Introduction to non-parametric estimation, in Algebraic and Geometric Methods in Statistics, eds. Paolo Gibilisco, Eva Riccomagno, Maria Piera Rogantin and Henry P. Wynn, Cambridge U. Press, Cambridge, 2009.

The technicalities are not actually boring if you like analysis.

One thing neither Streater nor I do is to develop the picture of probability distributions with their Fisher metric as forming an ‘orthant’ of an infinite-dimensional sphere. This should just be

$\{ \psi \in L^2(\Omega,\omega) : \psi \ge 0 \; \text{ and } \; \int_\Omega \psi^2 \, d\omega = 1 \}$

and I’m hoping its interior is a Hilbert manifold. Here $\psi$ is just another way of talking about our probability distribution $p$: we have $p = \psi^2$.

Streater shoots off in another direction and makes probability distributions into a Banach manifold in a different way, but I fear I’m exhausting everyone’s patience with infinite-dimensional manifolds.

Posted by: John Baez on May 8, 2018 5:10 PM | Permalink | Reply to this

### Re: The Fisher Metric Will Not Be Deformed

It’s interesting that both the (relative) Rényi and (relative) Tsallis entropies fall into a common family of deformations of (relative) Shannon entropy – those of the form $H_q^\lambda$ – and that this rigidity result holds for this family of deformations. It leaves me wondering:

• Are there good theoretical reasons to be particularly interested in deformations of the form $H_q^\lambda$?

I’d be happy with a theoretical motivation to study any family of deformations in this ballpark – more or less general, etc. I think I’m a bit hamstrung because I don’t even know a good theoretical motivation for studying $\ell_p$ spaces, which seems like a simpler question with a family resemblance that I should learn about first.

Another question this leaves me with is:

• Does the rigidity of the Fisher metric extend to any even more general family of deformations?

Posted by: Tim Campion on May 6, 2018 8:19 PM | Permalink | Reply to this

### Re: The Fisher Metric Will Not Be Deformed

Good questions.

There are various characterization theorems for the Rényi and $q$-logarithmic entropies, both ordinary and relative. These are good theoretical reasons for being interested in those entropies.

However, I don’t know of any similar characterization theorems for quantities of the form $H_q^\lambda$. It wouldn’t surprise me if such theorems were known to somebody, though, because this kind of thing has been investigated pretty thoroughly over the years.

Posted by: Tom Leinster on May 6, 2018 8:47 PM | Permalink | Reply to this

### Re: The Fisher Metric Will Not Be Deformed

For reasons to study $\ell_p$ and $L_p$ spaces, try checking out this MathOverflow question and this n-Cafe post.

Posted by: Mark Meckes on May 6, 2018 10:17 PM | Permalink | Reply to this

### Re: The Fisher Metric Will Not Be Deformed

There is a way to get a q-deformed analog of the Fisher metric on the finite dimensional probability simplex along with an associated divergence. See e.g.:

https://arxiv.org/abs/0911.1764

and the source of many of the definitions (from the information geometry community):

https://pdfs.semanticscholar.org/3531/20af3a8be7586b89a37151879a32811c4546.pdf

Posted by: Marc Harper on May 8, 2018 2:35 PM | Permalink | Reply to this

### Re: The Fisher Metric Will Not Be Deformed

In earlier discussions:

https://johncarlosbaez.wordpress.com/2012/08/24/more-second-laws-of-thermodynamics/#comment-35387

the following deformation of the relative Rényi entropy appeared:

$\frac{1}{\frac{1}{T}-\beta}\sum_i p_i^{T\beta}q_i^{(\frac{1}{T}-\beta)}$

At first glance it doesn’t seem to fall into your class of deformations, but I might be overlooking something.

So I just dump this here in case this might be interesting for you.

### Re: The Fisher Metric Will Not Be Deformed

By the way: on a much more silly note, the title of this blog article reminds me of a certain song.

Posted by: John Baez on May 15, 2018 11:22 PM | Permalink | Reply to this

### Re: The Fisher Metric Will Not Be Deformed

I had it in mind too :-)

Posted by: Tom Leinster on May 16, 2018 10:45 AM | Permalink | Reply to this
