Enrichment and the Legendre-Fenchel Transform II

May 22, 2014

Enrichment and the Legendre-Fenchel Transform II

Posted by Simon Willerton

$MathML-enabled post (click for more details).$

Last time, I described the Legendre-Fenchel transform. For a real vector space $V$ , the transform gives a bijection between the convex functions on $V$ and the convex functions on its dual $V^{#}$ . This time I’ll show how this is naturally an example of the profunctor nucleus construction in enriched category theory.

The key to this relationship is that the natural pairing $V\times V^{#}\to \mathbb{R}$ between a vector space and its dual can be thought of as profunctor between discrete $\overline{\mathbb{R}}$ -categories, where $\overline{\mathbb{R}}=[-\infty ,\infty ]$ is the extended real numbers thought of as a closed, symmetric monoidal category.

As I’ll explain, we can then apply the machinary of the nucleus construction to get a pair of maps between the $\overline{\mathbb{R}}$ -presheaves on $V$ , which just means the $\overline{\mathbb{R}}$ -valued functions on $V$ , and the $\overline{\mathbb{R}}$ -copresheaves, ie the $\overline{\mathbb{R}}$ -valued functions, on $V^{#}$ . Restricting to the part on which the pair of maps is bijective, we get that the transform gives a bijection between the respective sets of convex functions.

This category theoretic treatment gives us a little more than the standard treatment. Usually people put an ordering on the functions and observe that Legendre-Fenchel transform is order reversing. The category theory tells us that we can refine this and put a generalized, generalized metric space structure on the space of functions so that the transform is distance non-increasing, and is in fact distance preserving in the case of convex functions. This is illustrated by the above picture, as we will discover below!

$MathML-enabled post (click for more details).$

1. Enriching over the extended real numbers

The basics

We will be thinking about enriched categories over $\overline{\mathbb{R}}\coloneqq [-\infty ,+\infty ]$ , the extended real numbers. We make $\overline{\mathbb{R}}$ into a category by using the ordering $\ge$ , so there is a morphism $m\to n$ if and only if $m\ge n$ . (The opposite ordering is considered in Lawvere’s paper State categories, closed categories, and the existence of semi-continuous entropy functions.)

We make $\overline{\mathbb{R}}$ into a symmetric monoidal category using addition ‘ $+$ ’ and we can use subtraction ‘ $-$ ’ as the internal hom to make it into a closed symmetric monoidal category: this is because $m+n\ge p\qquad \text {if and only if}\qquad m\ge p-n.$ It’s important here to note how this arithmetic extends to the infinite elements, in particular, despite notational appearances, $-\infty$ is not the additive inverse of $+\infty$ , indeed $\infty -\infty$ is a rather ambiguous thing to write as $(+\infty )-(+\infty )=-\infty \qquad \text {and}\qquad (+\infty )+(-\infty )=+\infty .$ Full addition and subtraction tables can be found in the nice Bachelor’s thesis of Soichiro Fujii.

We’ve got a closed symmetric monoidal category $\overline{\mathbb{R}}$ so we can consider categories enriched over $\overline{\mathbb{R}}$ . An $\overline{\mathbb{R}}$ -category is a generalized, generalized metric space!

Remember that a generalized metric space in the sense of Lawvere is like a metric space but the main difference is that we don’t impose that the ‘distances’ are symmetric, so in general $\mathrm{d}(x,y)\ne \mathrm{d}(y,x)$ .

An $\overline{\mathbb{R}}$ -category is further generalized in that we allow negative ‘distances’. For want of a better name, these generalized, generalized metric spaces I will call ‘ $\overline{\mathbb{R}}$ -spaces’. This means that an $\overline{\mathbb{R}}$ -space is a set $X$ with a function $\mathrm{d}\colon X\times X\to \overline{\mathbb{R}}$ such that

$0\ge \mathrm{d}(x,x)$ for all $x\in X$ ;
$\mathrm{d}(x,y)+\mathrm{d}(y,z)\ge \mathrm{d}(x,z)$ for all $x,y,z\in X$ .

Putting $x=y=z$ into the second inequality gives that $\mathrm{d}(x,x)$ is either $0$ or $-\infty$ , with latter only possible if $\mathrm{d}(x,y)$ and $\mathrm{d}(y,x)$ are infinite for every $y\in X$ .

An example of an $\overline{\mathbb{R}}$ -space is $\mathbb{R}^{n}$ equipped with $\overline{\mathbb{R}}$ -distance $\mathrm{d}(x,y):=\max _{i}\{ y_{i}-x_{i}\} .$ If you know about the Hilbert metric you might wish to think how this is related.

Vector spaces

Now suppose that we have a real vector space $V$ . We think of this as a discrete $\overline{\mathbb{R}}$ -spaces so that $\mathrm{d}(v,w)=+\infty$ for $v\ne w$ and $\mathrm{d}(v,v)=0$ . Similarly, we think of the dual vector space $V^{#}\coloneqq \text {Hom}(V,\mathbb{R})$ as a discrete $\overline{\mathbb{R}}$ -space. The tautological pairing $\langle \, ,\, \rangle \colon V\times V^{#}\to \overline{\mathbb{R}}$ can be considered as an $\overline{\mathbb{R}}$ -profunctor between these $\overline{\mathbb{R}}$ -spaces.

We can now interpret what the $\overline{\mathbb{R}}$ -spaces of presheaves and opcopresheaves on our vector spaces are. As the vector spaces are considered as discrete $\overline{\mathbb{R}}$ -metric spaces the $\overline{\mathbb{R}}$ -space $\hat{V}$ of presheaves on $V$ and $\check{V^{#}}$ the $\overline{\mathbb{R}}$ -space of opcopresheaves on $V^{#}$ , will just be the sets of $\overline{\mathbb{R}}$ -valued functions with no constraints on them, the interesting pieces of structure are the $\overline{\mathbb{R}}$ metrics: $\begin{aligned} \mathrm{d}(f_{1},f_{2})&\coloneqq \sup _{x\in V}\{ f_{2}(x)-f_{1}(x)\} \in [-\infty ,+\infty ]\qquad \text {for all}\, \, f_{1},f_{2}\colon V\to \overline{\mathbb{R}}\\ \mathrm{d}(g_{1},g_{2})&\coloneqq \sup _{k\in V^{#}}\{ g_{1}(k)-g_{2}(k)\} \in [-\infty ,+\infty ]\qquad \text {for all}\, \, g_{1},g_{2}\colon V^{#}\to \overline{\mathbb{R}}. \end{aligned}$ The distance from a function $f_{1}$ to a function $f_{2}$ is the most that you have to go ‘up’ from $f_{1}$ to get to $f_{2}$ , where going down means going ‘up’ a negative amount.

For example, for the functions illustrated at the top of the post we have $\mathrm{d}(f_{1},f_{2})=1,\qquad \mathrm{d}(f_{2},f_{1})=3,$ and $\mathrm{d}(f_{1}^{\ast },f_{2}^{\ast })=1,\qquad \mathrm{d}(f_{2}^{\ast },f_{1}^{\ast })=3.$ The fact that these distances coincide is not mere coincidence, as we shall below.

2. The nuclear construction

Now I will remind you of the general construction of profunctor nucleus (this needs a better name!) that I blogged about last year. We start with a $\mathcal{V}$ -profunctor between two $\mathcal{V}$ -categories and obtain a duality between certain subcategories of presheaves on one category and copresheaves on the other. In practice, these often turn out to be interesting dualities.

For instance, if we work over the monoidal category $\{ \mathrm{true}, \mathrm{false}\}$ then a profunctor between discrete categories amounts to a relation between sets, whereas a presheaf amounts to a subset, so from a relation between between sets you get a bijection between certain distinguished sets of subsets. This underpins many dualities in mathematics. Here is a classic example.

Suppose we take one set to be $\mathbb{C}^{n}$ , the other set to be polynomials in $n$ -variables, and take the relation to be that $x\in \mathbb{C}^{n}$ is related to $p\in \mathbb{C}[x_{1},\dots ,x_{n}]$ if $p(x)=0$ . Then the resulting duality is the classical algebraic geometry correspondence between algebraic subsets of $\mathbb{C}^{n}$ and radical ideals of $p\in \mathbb{C}[x_{1},\dots ,x_{n}]$ .

I should remind you what process leads us here. Starting with a closed symmetric category $\mathcal{V}$ , two $\mathcal{V}$ -categories $\mathcal{A}$ and $\mathcal{B}$ , and a profunctor $\mathcal{M}\colon \mathcal{A}^{\mathrm{op}}\otimes \mathcal{B}\to \mathcal{V}$ between them, we obtain an adjunction between presheafs on $\mathcal{A}$ and opcopresheaves on $\mathcal{B}$ . $\mathcal{M}^{\ast }\colon \hat{\mathcal{A}}\leftrightarrows \check{\mathcal{B}}\, \colon \!\mathcal{M}_{\ast }.$ The left and right adjoints can be given by end formulas: for instance if $p\colon \mathcal{A}^{\mathrm{op}}\to \mathcal{V}$ is a presheaf on $\mathcal{A}$ then $(\mathcal{M}^{\ast }p)(b) = \int _{a} [p(a),\mathcal{M}(a,b)].$ The nucleus is defined as the centre of this adjunction. This means that it consists of the induced isomorphism between the part on which the monad is an isomorphism and the comonad is an isomorphim.

3. The synthesis

We can now feed our vector space pairing in to the profunctor nucleus machine. What pops out of the machine first is an $\overline{\mathbb{R}}$ -adjunction between function spaces: $\Fun(V,\overline{\mathbb{R}})\leftrightarrows \Fun(V^{#},\overline{\mathbb{R}}).$ We can denote both of the maps by asterisks, and we find that both of them have the same form, so for $f\colon V\to \overline{\mathbb{R}}$ a function on $V$ and $g\colon V^{#}\to \overline{\mathbb{R}}$ a function on $V^{#}$ we find $\begin{aligned} f^{\ast }(k)\coloneqq \sup _{x\in V}\big \{ \langle k,x\rangle -f(x)\big \} ;\qquad g^{\ast }(x)\coloneqq \sup _{k\in V^{#}}\big \{ \langle k,x\rangle -g(k)\big \} . \end{aligned}$ These are, of course, precisely the Legendre-Fenchel transform and its reverse!

The standard theory of the profunctor nucleus tells us that $f^{\ast }=f^{\ast \ast \ast }$ and $f\mapsto f^{\ast \ast }\, \, \text {is an idempotent operation.}$ This means that if we restrict to the image inside $\Fun(V,\overline{\mathbb{R}})$ then we get an isomorphism. The image consists of the (lower semi-continuous) convex functions, so we get $\Cvx(V,\overline{\mathbb{R}})\cong \Cvx(V^{#},\overline{\mathbb{R}}).$ This is the usual Fenchel-Legendre duality.

From our machine we get a little more than this. As we have an adjunction we know that $\mathrm{d}(f^{\ast },g)=\mathrm{d}(f,g^{\ast }),$ : spelling that out we have the following theorem.

Theorem: Suppose that we have functions $f\colon V\to \overline{\mathbb{R}}$ and $g\colon V^{#}\to \overline{\mathbb{R}}$ then $\sup _{k\in V^{#}}\{ f^{\ast }(k)-g(k)\} = \sup _{x\in V}\{ g^{\ast }(x)-f(x)\} .$

I would welcome any nice interpretation of this result!

On top of that, we know that the transform is a distance non-increasing map so we have $\mathrm{d}(f_{1},f_{2})\ge \mathrm{d}(f_{1}^{\ast },f_{2}^{\ast }),$ . Spelling this out as well gives the following.

Theorem: Suppose that we have functions $f_{1},f_{2}\colon V\to \overline{\mathbb{R}}$ then $\sup _{x\in V}\{ f_{2}(x)-f_{1}(x) \} \ge \sup _{k\in V^{#}}\{ f_{1}^{\ast }(k)-f_{2}^{\ast }(k) \} .$

As the reverse transform is also distance non-increasing, and as we know that $f=f^{\ast \ast }$ when $f$ is convex, we get the immediate corollary that distance is preserved for convex functions.

Corollary: Suppose that we have (lower semi-continuous) convex functions $f_{1},f_{2}\colon V\to \overline{\mathbb{R}}$ then $\mathrm{d}(f_{1},f_{2})=\mathrm{d}(f_{1}^{\ast },f_{2}^{\ast })$ , in other words, $\sup _{x\in V}\{ f_{2}(x)-f_{1}(x) \} = \sup _{k\in V^{#}}\{ f_{1}^{\ast }(k)-f_{2}^{\ast }(k) \} .$

This is illustrated by the example at the top of the post (click for a larger picture), where we have already observed that $\mathrm{d}(f_{1},f_{2})=1=\mathrm{d}(f_{1}^{\ast },f_{2}^{\ast }),\qquad \mathrm{d}(f_{2},f_{1})=3=\mathrm{d}(f_{2}^{\ast },f_{1}^{\ast }).$

This $\overline{\mathbb{R}}$ -metric on the space of (convex) functions seems to be overlooked in literature that I’ve seen. (In many cases the distance between convex functions will be infinite.) However, people do consider the partial order $\ge$ on the set of convex functions: $f_{1}\ge f_{2}$ when $f_{1}(x)\ge f_{2}(x)$ for all $x$ . The $\overline{\mathbb{R}}$ -distance is a refinement of this partial order in the $f_{1}\ge f_{2}$ if and only if $\mathrm{d}(f_{1},f_{2})\le 0$ .

The standard fact that the Legendre-Fenchel transform is order reversing is then a corollary of the above corollary: in other words we get that for convex functions $f_{1},f_{2}\colon V\to \overline{\mathbb{R}}$ we have $f_{1}\ge f_{2}\quad \Leftrightarrow \quad f_{2}^{\ast }\ge f_{1}^{\ast }.$

If anyone has seen these $\overline{\mathbb{R}}$ -metrics being used in this context I’d be happy to hear about it.

Anyway, the encroachment of category theory into analysis continues…

Posted at May 22, 2014 9:58 PM UTC

TrackBack URL for this Entry: https://golem.ph.utexas.edu/cgi-bin/MT-3.0/dxy-tb.fcgi/2743

15 Comments & 1 Trackback

Re: Enrichment and the Legendre-Fenchel Transform II

$MathML-enabled post (click for more details).$

Nice post!

The corollary (with $f_1$ an arbitrary function) is called Toland–Singer duality, and seems to be known in the field of continuous optimization. But the simpler first theorem yields it immediately and it’s pleasant that the theorem is just the adjunction isomorphism!

Posted by: Soichiro Fujii on May 23, 2014 7:45 PM | Permalink | Reply to this

Re: Enrichment and the Legendre-Fenchel Transform II

$MathML-enabled post (click for more details).$

Soichiro is too modest to mention it, but readers of this blog may want to have a look at his/her bachelor’s thesis, A categorical approach to L-convexity, which begins like this:

The study presented in this thesis is first motivated and inspired by a recent paper by Simon Willerton [10].

Here [10] is Simon’s paper Tight spans, Isbell completions and semi-tropical modules, in which Simon shows that the invariant part of the Isbell conjugacy adjunction of a metric space $X$ is closely related to the so-called tight span or injective envelope of $X$ .

Posted by: Tom Leinster on May 25, 2014 2:52 AM | Permalink | Reply to this

Re: Enrichment and the Legendre-Fenchel Transform II

$MathML-enabled post (click for more details).$

Soichiro Fuji observed

The corollary (with $f_1$ an arbitrary function) is called Toland–Singer duality

That’s great, thanks! From a quick google these results do seem to be used.

In case people are interested, the Toland is John Toland who I’ve only known in his guise as Director of the Isaac Newton Institute and had no idea what sort of mathematics he did. The Singer is Ivan Singer who I’ve not come across before.

Posted by: Simon Willerton on May 25, 2014 10:18 AM | Permalink | Reply to this

Re: Enrichment and the Legendre-Fenchel Transform II

$MathML-enabled post (click for more details).$

At the risk of sounding like a spammer, very interesting post! I wish all blogs were like yours. You make Internet better place.

I got the deluxe version of the post by asking Simon to explain it to me in person. Reading through it now, I’m pleased to see that he didn’t miss anything out.

One thing I particularly like is how convexity appears naturally. It’s bothered me before that when dealing with $[0, \infty]$ as a monoidal category (as you do when understanding metric spaces as enriched categories), the lax monoidal functors are the subadditive functions (those satisfying $f(x) + f(y) \geq f(x + y)$ ), and the colax monoidal functors are the superadditive functions. Now subadditivity and superadditivity are sometimes important conditions, but they don’t seem as important as convexity and concavity, which feel rather similar. (For instance, if $f: [0, \infty] \to [0, \infty]$ is concave and fixes $0$ then it’s subadditive.) So it’s good to get convexity to simply materialize in an unforced way.

Posted by: Tom Leinster on May 25, 2014 2:43 AM | Permalink | Reply to this

Re: Enrichment and the Legendre-Fenchel Transform II

$MathML-enabled post (click for more details).$

At the risk of sounding like a spammer, very interesting post! I wish all blogs were like yours. You make Internet better place.

You wrote:

Theorem: Suppose that we have functions $f\colon V\to \overline{\mathbb{R}}$ and $g\colon V^{#}\to \overline{\mathbb{R}}$ . Then

$\sup _{k\in V^{#}}\{ f^{\ast }(k)-g(k)\} = \sup _{x\in V}\{ g^{\ast }(x)-f(x)\} .$

I would welcome any nice interpretation of this result!

Have you tried some classic examples? In classical mechanics we can take $f$ to arise from a Lagrangian so that $f^*$ arises from the corresponding Hamiltonian, as explained here. In thermodynamics there’s a whole host of examples — and indeed my second link here was written by the Committee on Legendre Transforms in Chemical Thermodynamics.

You are saying something about all these examples. I don’t know quite what it means: it could be important or it could be ‘trivial’ from a physics point of view. It might be easier to understand if you restricted to functions $f$ and $g$ on $V^{#}$ with $f^{\ast\ast} = f$ , $g^{\ast\ast} = g$ and wrote

$\sup_{k\in V^{#}}\{ f(k)-g(k)\} = \sup_{x\in V}\{ g^{\ast }(x)-f^{\ast}(x)\} .$

Functions that are their own double transforms are good enough to understand what’s going on in most applications, I think — physicists assume functions have this property.

Posted by: John Baez on May 25, 2014 2:39 PM | Permalink | Reply to this

Re: Enrichment and the Legendre-Fenchel Transform II

$MathML-enabled post (click for more details).$

Now I’m afraid that from a classical mechanics viewpoint your equation says that

$p \cdot \dot{q} = p \cdot \dot{q}$

where $p$ is momentum and $\dot{q}$ is velocity.

Posted by: John Baez on May 25, 2014 2:44 PM | Permalink | Reply to this

Re: Enrichment and the Legendre-Fenchel Transform II

$MathML-enabled post (click for more details).$

John said

I’m afraid that from a classical mechanics viewpoint your equation says that $p \cdot \dot{q} = p \cdot \dot{q}$ .

Can you explain where this came from?

And why is it a bad thing? We’re saying that two calculations give the same answer – it’s not unusual in these situations to discover that they are two ways of calculating a thing which has a natural description in another way.

Posted by: Simon Willerton on May 25, 2014 7:04 PM | Permalink | Reply to this

Re: Enrichment and the Legendre-Fenchel Transform II

$MathML-enabled post (click for more details).$

I think that all John means by “I’m afraid that” is: we already knew it.

Posted by: Jesse C. McKeown on May 26, 2014 3:32 PM | Permalink | Reply to this

Re: Enrichment and the Legendre-Fenchel Transform II

$MathML-enabled post (click for more details).$

John, your rephrasing is my Corollary. Having $f=f^{\ast\ast}$ is precisely my condition about $f$ being (lower semi-continuous) convex. The lower semi-continuous bit might sound scary, but it only comes into play when thinking about convex functions which take infinite values in places: it’s just saying that if $f$ is finite on an interval and infinite elsewhere, then it needs to be finite on the closed interval.

Posted by: Simon Willerton on May 25, 2014 7:11 PM | Permalink | Reply to this

Re: Enrichment and the Legendre-Fenchel Transform II

$MathML-enabled post (click for more details).$

It seems to me that this identity should have something to do with the semiclassical limit (the one passing from the Fourier/Laplace to the Legendre transform) of Parseval’s identity. There’s a relevant discussion over at MathOverflow. Parseval’s identity states that the Fouerier transform preserves the $L^2$ inner product. On the other hand, it seems that the Legendre transform preserves something lie a signed $\sup$ “metric”.

Posted by: Igor Khavkine on May 27, 2014 3:45 PM | Permalink | Reply to this

Re: Enrichment and the Legendre-Fenchel Transform II

$MathML-enabled post (click for more details).$

John wrote:

I’m afraid that from a classical mechanics viewpoint your equation says that $p \cdot \dot{q} = p \cdot \dot{q}$ .

Simon wrote:

Can you explain where this came from?

In physics we often start with a Lagrangian $L(q,v)$ that’s a function of position $q \in \mathbb{R}^n$ and velocity $v \in \mathbb{R}^n$ , and take its Legendre transform in the $v$ variables and get the Hamiltonian

$H(q,p) = p \cdot v - L(q,v)$

where the momentum $p \in \mathbb{R}^n$ is defined by

$p_i = \frac{\partial L}{\partial v_i}$

and we then solve for velocity in terms of momentum to think of $p \cdot v - L(q,v)$ as a function of $q$ and $p$ .

(Before I was writing $\dot{q}$ for $v$ , as physicists are wont to do, but now I’m afraid you’re unfamiliar with this stuff and will interpret $\dot{q}$ as the time derivative of something, which tends to trip people up in this context.)

Conversely, we can start with the Hamiltonian and take its Legendre transform to get the Lagrangian.

This may not look like the Legendre transform you’re talking about, but it is! More precisely, it’s a special case (where all the functions are so nice that taking the Legendre transform twice gets you back where you started) of a generalization (where all the functions depend on some extra variables $q_i$ , which don’t affect the Legendre transform). A quick intro is here.

And why is it a bad thing?

I was hoping your theorem would give some relations between Lagrangians and Hamiltonians which physicists would find exciting. But when I tried to see what it said in the above situation, it seemed to become something well-known, namely $p \cdot v = p \cdot v$ .

I’m not sure it gives this, which is why I used the word ‘afraid’.

John, your rephrasing is my Corollary. Having $f=f^{\ast\ast}$ is precisely my condition about $f$ being (lower semi-continuous) convex.

Okay, good. In a typical physics intro course, all these ‘niceness properties’ are taken for granted. Only very bold physicists, or very mathematical ones, would think about a Lagrangian that’s not lower semi-continuous as a function of velocity! Indeed, Legendre transforms as practiced by physicists typically assume that functions are nice enough to take on their minimum at a unique point, which is the point where the derivative vanishes. I’m sure you can find more careful treatments, but this is good enough for government work.

(You seem to be using maxima, but that’s just some difference in conventions.)

Posted by: John Baez on May 29, 2014 12:51 PM | Permalink | Reply to this

Re: Enrichment and the Legendre-Fenchel Transform II

$MathML-enabled post (click for more details).$

It maybe worth pointing out that the Legendre-Fenchel transform John is talking about takes a function on a tangent bundle and turns it into a function on the cotangent bundle. In other words we are talking about families of vector spaces, not a single vector space (and its dual).

Posted by: Eugene Lerman on May 29, 2014 6:53 PM | Permalink | Reply to this

Re: Enrichment and the Legendre-Fenchel Transform II

$MathML-enabled post (click for more details).$

Unwrapping the definitions, the first Theorem is that both sides are

(1)

\sup_{k,x} \Big( \langle k,x\rangle - (f(x)+g(k)) \Big)

or, in other words, that limits commute with limits; and maybe it’s saying that this two-variable supremum is what one should be interested in, rather than the two ways to calculate it; and what it looks like is comparing a bilinear thing with a sum of independent things.

Posted by: Jesse C. McKeown on May 26, 2014 3:57 PM | Permalink | Reply to this

Re: Enrichment and the Legendre-Fenchel Transform II

$MathML-enabled post (click for more details).$

I’ve finally finished the paper on this material and uploaded it to the arxiv.

The Legendre-Fenchel transform from a category theory perspective arxiv:1501.03791

Posted by: Simon Willerton on January 16, 2015 10:19 AM | Permalink | Reply to this

Re: Enrichment and the Legendre-Fenchel Transform II

$MathML-enabled post (click for more details).$

Congratulations! I’ve printed it out, which as we all know, is as good as reading it.

Posted by: Tom Leinster on January 16, 2015 2:17 PM | Permalink | Reply to this

Read the post The Space of Physical Frameworks (Part 2)
Weblog: The n-Category Café
Excerpt: What's a thermostatic system, and what are some basic things we can do with them?
Tracked: September 9, 2024 4:03 PM

The n-Category Café

Skip to the Main Content

May 22, 2014