Optimal Transport and Enriched Categories II
Posted by Simon Willerton
The dual transport problem from general linear programming duality
Last time I described the basic, discrete optimal transport problem in which there are some suppliers wanting to supply certain amounts of some resource (eg loaves of bread) and some receivers wanting to receive a certain amount of that resource. There’s a cost associated to transporting a unit of the resource from a supplier to a receiver, this cost is fixed for each pair of supplier and receiver. You want to plan how much resource to send from each supplier to each receiver so as to minimize the total cost.
I then said that finding this minimum cost was equivalent to finding the maximum revenue of a different problem, where someone, for each supplier, fixes a price that they’re willing to pay for a unit of the resource and, for each receiver, they fix a price that they’ll charge for a unit of resource. These prices are subject to a constraint from the original transportation cost. You’ll see again what that is in the post below.
This time I want to show how this second problem is the ‘linear programming dual’ of the first, ‘primal’ problem – this general theory of such linear programming duality is likely to be in every undergraduate course on optimization. I’ll end with some extra comments on general programming duality.
The primal linear programming problem
One formulation of a standard linear programming problem is that, for some given sets of constants , and , you wish to find non-negative quantities which minimize the sum
subject to the inequality constraints
We can phrase this problem more compactly as follows: given an -by- matrix and two vectors , we wish to find a vector which minimizes the cost
subject to the constraint
where this last inequality is interpreted componentwise. Similarly, I’ll sometimes write for .
People generally say is a feasible solution if it satisfies the constraint. Of course, there might not be any feasible solutions, eg. if with the constraint .
We can write for the minimum cost over all feasible solutions and for an optimal solution.
The eagle-eyed will have seen that I shouldn’t really say minimum here, I should really say greatest lower bound, as either there may be no feasible solutions, or the cost function might be unbounded on the feasible set, eg. minimize subject to . So an optimal solution might not exist for either of those two reasons, and if it does exist then it might not be unique.
The dual linear programming problem
Given the data , and as above, we can form the dual problem which is to find a which maximizes the value
subject to the constraint
Again, people say is a feasible solution if it satisfies the constraint.
Similar to above, write for the maximum value and for a maximizer, or optimal solution.
Note that in the dual problem you have a variable for each constraint in the primal problem, and a constraint for each variable in the primal problem.
Linear programming duality
It is easy to see that any feasible solution of the dual problem gives a value which is a lower bound for the cost of any feasible solution of the primal problem. This is simply because
This immediately gives weak linear programming duality which is that the maximum value of the dual problem is a lower bound for the maximum cost of the primal problem:
In fact, we have a stronger result.
Strong Linear Programming Duality. Given the data , and as above we have that if either the primal or the dual problem have an optimal solution then both have one, and the maximum value of the dual problem is equal to the minimum cost of the primal problem, ie. .
Perhaps unsuprisingly, given how I’ve written this, strong duality is a lot harder to prove. One way to prove it involves the use of a hyperplane separation theorem – this is hinting at the fact that the convexity of the set of feasible solutions, makes things nice in this situation. I won’t go into the proof here.
Deriving the dual transport problem
I’m now in a position to show you how to derive the dual optimal transport problem from last time. I’ll rewrite the optimal transport problem in the standard form of a linear programming problem as above and then read off the dual problem.
The primal problem
Recall that given a transport cost matrix with supply vector and demand vector , we wish to minimize the transport cost
subject to the supply and demand constraints
Note that I’ve used negation to reverse the supply constraint.
Now we define the following
Then our transport problem is precisely find which minimizes subject to . This is in the standard form so we can now write down the dual problem.
The dual problem
The dual problem is then immediately written down: given , and as above, maximize over subject to . Writing , this becomes the following.
Maximize
subject to
This is precisely the dual optimal transport problem as I gave last time.
Next time, I want to explain how you get a duality between the prices at suppliers/bakeries, and the prices at receivers/cafés. This will lead into a connection with the profunctor nucleus that I’ve posted about before.
I will end now with a digression that grew out of a reply to a question of Tom’s from the last post.
Appendix: general Lagrangian programming duality
This part is not strictly part of the narative of this series of posts, so feel free to skip it. I will mention a more general type of duality for optimization problems here, leave you to show how it generalizes the linear programming duality above and I will give some hints as to how this might tie in to enriched category theory. (The problem I’ll state will have non-negative input variables and inequality constraints, but the analysis below easily adapts to having inputs ranging over all of and having equality constraints. The case I give is just the nicest formulation for the linear programming duality example I’m interested in in this post.)
Suppose you want to minimize an arbitrary function over subject to the arbitrary constraints for . Equivalently, you could consider minimizing the following function over the whole of , ie. without any constraints:
Another way of expressing this function is via where is the indicator function on the feasible set – the subset of points satisfying the constraints – meaning is on the feasible set and off it.
It shouldn’t take you long to see that you can write the indicator function in the following form, because of the nature of the constraints:
where I’ve written for the vector .
This gives that the function you are trying to minimize over is
You can define the Lagrangian function . So what you wish to find is
And by the max-min inequality you get
If the Lagrangian function is particularly nice in some sense then you will get an equality here and you can say that a minimax theorem holds. For instance, you might expect this to happen when you have the cost function and the constraints being convex in some sense.
You are perhaps led at this point to define the Lagrangian dual function as follows:
Then the Lagrangian dual problem to the original constrained problem is to maximize for . As we’ve seen by the max-min inequality,
this means that the maximum of the dual problem is a lower bound for the minimum of the original constrained problem: this is weak duality. In nice cases, where the minimax theorem holds, the optima of the two problems will be equal: this is strong duality.
Before I go on to what this might all have to do with enriched category theory, you should check that this ties into the linear programming duality that this post is supposed to be about! Feel free to put your answer in the comments.
Nice, easy exercise. For the primal linear programming problem “Minimize for subject to .” find the Lagrangian function and the Lagrangian dual function . Show that the Lagrangian dual problem “Maximize for .” is equivalent to the dual linear programming problem “Maximize for subject to .” from earlier in this post.
Now, how should this Lagrangian duality picture relate to enriched category theory? What you will see in the next few posts is that you should be thinking in terms of enriching over the monoidal category of extended real numbers, where an object is either a real number or , and there is a unique morphism precisely when . In this category coproducts are infima and products are suprema. You should think of and as discrete -categories and functions on them as functors (or presheaves if you like).
Viewed in this way, the max-min inequality
is a manifestation of the canonical morphism that exists between expressions with a limit and colimit interchanged: for a functor , there is a canonical morphism
see the nLab. Similarly the minimax theorem holding is perhaps to be thought of as a statement about certain colimits and limits commuting.
Furthermore, the function gives us the pairing given by . This can be viewed as a profunctor. In the usual way of things profunctorial, which we’ll see more of in this series of posts, this profunctor gives rise to a functor from functors (presheaves) on to functors on given by a coend , which translated back into the language here is which is precisely the definition of the Lagrangian dual function.
At the moment, it is somewhat speculative whether there’s anything deeper categorical going on here, and, as I intimated, this is a slight digression from the main thread of this series of posts. So I will stop here for now.
Re: Optimal Transport and Enriched Categories II
Transportation problems have the special property that if the supply and demand conditions are integers, then there is an optimal assignment which is also integral. (See, for example, Kathleen Trustrum, Linear Programming (1971) p.35). Is this reflected in the categorical point of view?