Characterizing the Generalized Means
Posted by Tom Leinster
Generalized means are things like arithmetic means and geometric means. They can be ‘fair’, giving all their inputs equal status, or they can be weighted. I guess the first major result on them was the theorem that the arithmetic mean is always greater than or equal to the geometric mean.
Another, later, landmark was the 1934 book Inequalities of Hardy, Littlewood and Pólya, where they proved a characterization theorem for generalized means. It looks like this:
If you have some sort of ‘averaging operation’ with all the properties you’d expect of something called an averaging operation, then there aren’t many options: it must be of a certain prescribed form.
That’s ancient history. It could be even more ancient than Hardy, Littlewood and Pólya: I don’t know whether the characterization in their book is due to them, or whether it’s older still.
Yesterday, however, I posted about a new theorem of Guillaume Aubrun and Ion Nechita that gives a startlingly simple characterization of the $p$-norms. Since $p$-norms and generalized means are closely related, I wondered, out loud, whether it might be possible to deduce from their result a simple new characterization of generalized means. And if I’m not mistaken, the answer is yes.
To say it more precisely: unless I’ve made a mistake, it’s possible to derive from their result a simple characterization of the generalized means of orders $\geq 1$. (I’ll explain what that means in a moment.) Whether it’s new is harder to know; I’m not familiar enough with the literature to be confident on that.
A quick write-up is here.
Here’s the statement. I’ll write $\Delta_I = \{ w \in \mathbb{R}^I : w_i \geq 0, \sum_i w_i = 1 \}$ for the set of probability distributions on a finite set $I$. For any real $p \geq 1$, $w \in \Delta_I$ and $x \in [0, \infty)^I$, we can form the mean of order $p$ of the numbers $x_i$, weighted by the $w_i$: $M_p(w, x) = \big(\sum_i w_i x_i^p\big)^{1/p}.$ There’s a sensible way to do it for $p = \infty$ too: $M_\infty(w, x)$ is the maximum of the $x_i$ over those $i$ with $w_i > 0$. These operations $M_p$ are the generalized means that we’re going to characterize.
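As a quick sanity check of the definition, here is a minimal Python sketch of $M_p$ (the function name and the treatment of $p = \infty$ as a weighted max are my choices, not from the post):

```python
import math

def power_mean(w, x, p):
    """Weighted mean of order p: (sum_i w_i * x_i**p) ** (1/p).

    w is a probability distribution (nonnegative, summing to 1) and the
    x_i are nonnegative.  p = math.inf gives the maximum of x_i over
    indices of positive weight, the limiting case mentioned above.
    """
    assert abs(sum(w) - 1) < 1e-12 and all(wi >= 0 for wi in w)
    if p == math.inf:
        return max(xi for wi, xi in zip(w, x) if wi > 0)
    return sum(wi * xi ** p for wi, xi in zip(w, x)) ** (1 / p)

# p = 1 is the weighted arithmetic mean:
power_mean([0.5, 0.5], [2.0, 8.0], 1)    # 5.0
# p = 2 is the weighted quadratic mean, which is >= the arithmetic one:
power_mean([0.5, 0.5], [2.0, 8.0], 2)    # sqrt(34), about 5.83
```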
Here we go. A system of means $M$ consists of a function $M: \Delta_I \times [0, \infty)^I \to [0, \infty)$ for each finite set $I$, satisfying:
- Normalization: $M((1), (c)) = c$ for all $c \in [0, \infty)$. (This tells you what the average of a one-member family is.)
- Monotonicity: $M(w, x) \leq M(w, y)$ if $x_i \leq y_i$ for all $i$.
- Functoriality: $M(f w, x) = M(w, x f)$ whenever $f: I \to J$ is a map of finite sets, $w \in \Delta_I$ and $x \in [0, \infty)^J$. I haven’t defined the notation $f w$ and $x f$, nor have I explained the intuitive idea behind this slick notation; you can figure it out for yourself or click the link.
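On the standard reading (my assumption here, not spelled out in the post), $f w$ is the pushforward of the distribution $w$ along $f$, and $x f$ is the composite $x \circ f$. A small numerical check that the means $M_p$ satisfy functoriality under that reading:

```python
def power_mean(w, x, p):
    # weighted mean of order p, as defined in the post
    return sum(wi * xi ** p for wi, xi in zip(w, x)) ** (1 / p)

def pushforward(f, w, size_J):
    """(fw)_j = sum of w_i over all i with f(i) = j.

    f is encoded as a list: f[i] is the image of i in J = {0, ..., size_J - 1}.
    """
    fw = [0.0] * size_J
    for i, wi in enumerate(w):
        fw[f[i]] += wi
    return fw

def pullback(f, x):
    """(xf)_i = x_{f(i)}, i.e. x composed with f."""
    return [x[j] for j in f]

# f : {0,1,2} -> {0,1}, collapsing 0 and 1 to the same point
f = [0, 0, 1]
w = [0.2, 0.3, 0.5]      # distribution on I = {0,1,2}
x = [4.0, 9.0]           # values on J = {0,1}
lhs = power_mean(pushforward(f, w, 2), x, 2)   # M(fw, x)
rhs = power_mean(w, pullback(f, x), 2)         # M(w, xf)
# lhs and rhs agree (up to rounding)
```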
A system of means is multiplicative if $M((v_i w_j)_{i, j}, (x_i y_j)_{i, j}) = M(v, x) M(w, y)$ whenever $v \in \Delta_I$, $w \in \Delta_J$, $x \in [0, \infty)^I$, $y \in [0, \infty)^J$. It satisfies the triangle inequality if $M(w, x + y) \leq M(w, x) + M(w, y).$
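A numerical spot-check (not a proof, of course) that the power means are multiplicative and satisfy the triangle inequality, for one sample value of $p$:

```python
def power_mean(w, x, p):
    # weighted mean of order p, as defined in the post
    return sum(wi * xi ** p for wi, xi in zip(w, x)) ** (1 / p)

p = 3.0

# Multiplicativity: the mean of the products x_i * y_j under the
# product distribution (v_i * w_j) factors as M(v, x) * M(w, y).
v, x = [0.3, 0.7], [1.0, 4.0]
w, y = [0.5, 0.5], [2.0, 3.0]
vw = [vi * wj for vi in v for wj in w]
xy = [xi * yj for xi in x for yj in y]
assert abs(power_mean(vw, xy, p)
           - power_mean(v, x, p) * power_mean(w, y, p)) < 1e-9

# Triangle inequality (Minkowski's inequality; this is where p >= 1 matters):
u = [0.5, 0.5]
a, b = [1.0, 4.0], [5.0, 1.0]
lhs = power_mean(u, [ai + bi for ai, bi in zip(a, b)], p)
assert lhs <= power_mean(u, a, p) + power_mean(u, b, p)
```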
For each $p \in [1, \infty]$, we have the generalized mean $M_p$ of order $p$. It’s a multiplicative system of means satisfying the triangle inequality. And there are no others:
Theorem Every multiplicative system of means satisfying the triangle inequality is of the form $M_p$ for some $p \in [1, \infty]$.
Proof here, I hope.
Re: Characterizing the Generalized Means
Very nice! I haven’t had a chance to look at the proof yet, but here’s a tiny correction to the post: in the statement of the theorem you wrote “norms” when you meant “means”.