**Guest post by Nadezda Chernyavskaya - ETH Zurich**

*Bavarian Evening at the Lindau Nobel Laureate Meetings - Picture Credit: Christian Flemming*

I'm stuck in bed with a bad back. I have been for a few days now. I am using the time to work on my summer writing projects, and to talk to students and postdocs by *Skype* (tm). But it is hard to work when out sick, and it isn't necessarily a good idea. I'm not advocating it!

I resolved (for now, to my current satisfaction) my issues from a few days ago, about likelihoods for catalogs. I showed that the likelihood that I advocate does not give biased inferences, and does permit inference of the selection function (censoring process) along with the inference of the world. I did this with my first ever use of the Google (tm) Colaboratory (tm). I wanted to see if it works, and it does. My notebook is here (subject to editing and changing, so no promises about its state when you go there). If your model includes the censoring process—that is, if you want to parameterize and learn the catalog censoring along with the model of the world—then (*contra* Loredo, 2004) you have to use a likelihood function that depends on the selection function at the individual-source level. And I think it is justified, because it is the assumption that the universe plus the censoring is the thing which is generating your catalog. That's a reasonable position to take.
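To make the point concrete, here is a tiny self-contained sketch (not the notebook itself; the exponential "flux" model and the hard detection cut are invented for illustration) showing that a per-source likelihood that includes the selection function recovers the truth, while ignoring the censoring biases the inference:

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy "world": source brightnesses x ~ Exponential(rate=lam_true).
lam_true, cut = 2.0, 0.5
x = rng.exponential(1.0 / lam_true, size=50_000)

# Censoring: the survey only detects sources with x > cut.
detected = x[x > cut]

# Naive MLE that ignores the selection function: biased.
lam_naive = 1.0 / detected.mean()

# Per-source likelihood that includes the selection,
#   p(x | lam, detected) = lam * exp(-lam * x) / exp(-lam * cut)  for x > cut,
# whose maximum is at lam = 1 / (mean(x) - cut): unbiased.
lam_selected = 1.0 / (detected.mean() - cut)

print(lam_naive, lam_selected)  # the second is close to lam_true = 2.0
```

The corrected estimator maximizes the likelihood of each detected source *given that it was detected*; the naive one pretends the catalog is uncensored.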

*guest post by Sophie Libkind and David Jaz Myers*

This post continues the series from the Adjoint School of Applied Category Theory 2019.


Now that we have self-driving cars and talking watches, holographic data analysis should be the next innovation inspired by Hollywood spy films. Think of when Tom Cruise in *Minority Report* shuffles information around a 3D screen. As a step towards this future, Fong and Spivak formulated a graphical system for doing regular logic in their paper Graphical Regular Logic. You can see Spivak give a talk about it here. This visual knowledge representation system has the potential to be an intuitive user-interface for regular logic. It would allow laypeople to harness the power of regular logic without the clunkiness of the formal language.

The road from regular logic to graphical regular logic has two major stops: regular categories and regular calculi.

In this post, we’ll start on the right with graphical regular logic and work our way leftwards to regular categories.

Graphical Regular Logic is meant to be broadly applicable and easy-to-use. So in this user’s guide we will apply it to something as non-mathy as a neighborhood watch program that is run by beings as gibberishy as a clan of minions.

When we imagine the headquarters of the neighborhood watch program, there’s a big screen with glowing blue dots to move, connect, and zoom in and out of. In graphical regular logic, these orbs are called **shells**. A shell has finitely many ports, each labeled with a type, and wires can connect ports of the same type. In the neighborhood watch program, we care about types like `person` and `location`, and we use shells like the following.

A shell is the foundation for talking about a relationship between the types labeled on the ports. A relationship between a person and a location (more formally *a predicate in context $\Gamma_{\text{person}, \text{location}}$*) is like a Mad Libs with a blank space for a person and a location:

To represent such a relationship, we take a shell with a port for each blank space and fill the shell with the corresponding predicate.

We would read the right cell as “There exists a person $p$ and a location $\ell$ such that $p$ works at $\ell$”. Every shell represents a possible relationship between the wires coming out of it. There is no implied directionality on these wires.

Now comes the fun part where we really get to feel like Tom Cruise. How do we manipulate the filled shells in a way that reflects logical moves?

The most basic operation is to substitute one filled shell for another. If someone lives in a location, then it’s also true that they have visited that location. Therefore, if we see the filled shell corresponding to *(person) lives in (location)* then we should be able to replace it with the filled shell corresponding to *(person) has visited (location)*. The red arrows below indicate when we can substitute one filled shell with another.

In general, we are allowed to substitute in the midst of a more complicated diagram.

Sometimes we want a coarser view of the neighborhood, so we will zoom out and blur two shells together. For example, suppose we have two relationships:

- Bob works at Gru’s house
- Edith has a roommate

And we want to turn them into one relationship:

- Bob works at Gru’s house and Edith has a roommate

So we start with a relationship between a person and a location and a relationship on one person. And we want to transform these into one relationship about two people and a location. This process is called **trench coating** and goes as follows: (1) stack the two shells, (2) enclose them in a trench coat, (3) ignore the coat’s interior.

At last the true purpose of this example comes to light:

Now filling the inner shells with predicates determines a predicate for the trench coat in a way that corresponds with logical AND.
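As a toy illustration (our own, not from the paper), one can model a predicate in a context as a set of tuples; trench coating is then exactly the product of the two sets of tuples, i.e. logical AND:

```python
# Predicates in context, modeled as sets of tuples:
# a predicate on (person, location) is a set of pairs,
# a predicate on (person,) is a set of 1-tuples.
works_at = {("Bob", "Gru's house")}   # (person) works at (location)
has_roommate = {("Edith",)}           # (person) has a roommate

def trench_coat(p, q):
    """Stack two filled shells: the coat's predicate is the conjunction,
    i.e. the product of the two sets of tuples."""
    return {a + b for a in p for b in q}

combined = trench_coat(works_at, has_roommate)
print(combined)  # the single tuple ("Bob", "Gru's house", "Edith")
```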

We will discuss this parallel more deeply when we show how to put a logic on the graphical calculus.

If we can zoom out, then we should also be able to zoom in to get a more fine-grained view of the neighborhood. Suppose we have a predicate which fills a trench coat. Then it defines a predicate on the inner shells by a process called **dotting off**.

Note that zooming in and out do not compose to the identity. In fact, in the next section we will show that they are adjoint operations.

To really tell a story about the neighborhood we need to be able to combine information and build complicated relationships out of the basic ones. For example, suppose we have two relationships:

- Agnes lives with Margo
- Margo lives in Gru’s house

And we want to derive the relationship:

- Agnes lives in Gru’s house

**Step 1** Draw the shells corresponding to the input, inside of the shell corresponding to the output

**Step 2** Draw an intermediate shell labeled with ports for the wires to attach through.

**Step 3** Connect the wires.

Once we have hooked up shells, any choice of predicates for the inner shells defines a relationship for the outer shell. In the example above, we have the entailment:

However if we fill the inner shells differently, we may get a different filling for the outer shell.

We would read the bottom left diagram as “There exists a person $p_1$, a person $p_2$, and a location $\ell$ such that $p_1$ lives with $p_2$ and $p_1$ works at $\ell$”.

With these four operations, we can visually represent any deduction in regular logic. Next, we give a taste of how a graphical calculus presents a regular logic.

Before we dive into the particular relationship between the graphical calculus and regular categories, let’s take a little bit to dwell on some abstract ruminations.

Lawvere divides logic into two opposing modes: *subjective* and *objective*.

- *Subjective* logic is the logic we use to reason with – the syntax and the rules of inference. For example, the axioms of a boolean algebra.
- *Objective* logic is the logical arrangement of the objects we are interested in studying. For example, the power set of a given set.

We use the subjective logic to describe and reason about the objective logic at hand. This gives us a transformation, sometimes called *semantics*, from subjective to objective reasoning. For example, we can use the axioms of a boolean algebra to reason about the subsets of a given set.

In nice cases, we can “objectify” our subjective logic by freely generating objective situations of our syntax. By stretching what we mean by syntax, we can also build subjective logics out of our objective situations. If everything goes perfectly, these two processes are inverse. A case where everything does go perfectly is Seely’s equivalence between first order (constructive) theories and hyperdoctrines, where the subjective first order constructive logic is interpreted into the objective logic of a *hyperdoctrine*.

In the situation at hand, the graphical calculus is our subjective logic for dealing with the objective logic of a regular category. We would like to put a logic back on our calculus – that is, build a regular category from a graphical calculus. However, we can’t do it with the pictures alone; we have to give a way for the pictures to *mean something*. How should we go about this?

The objects under consideration in predicate logic are the
*parts of a domain of discourse*. The association of each
object to its order of parts gives something akin to a
*hyperdoctrine*. A
hyperdoctrine is just an axiomatization of this association
for first order constructive logic. We need something like
this to add meaning to our pictures.

But hyperdoctrines are notoriously complex; they have a
number of axioms that on a first read seem unmotivated (or,
at least, they seemed that way to us). While they all turn
out to directly correspond to features of the logic being
modeled, it would be nice to have a more basic, conceptual
axiomatization from which the desired properties follow. In
the case of regular logic—the logic of “true”, “and”,
and “exists”—David and Brendan find this conceptual
axiomatization in the form of *ajax functors*. The
definition of an ajax functor is not terribly complicated,
but to understand why it does what we want takes a bit of
work. So we’ll meander our way to the definition.

Regular logic is the logic of “truth”, “and” and “exists” – let’s bring them into the ajax frame in that order. Our ambient setting will be the category $\textbf{Order}$ of ordered sets and order preserving maps. In fact, we will think of this as a *2-category* where the arrows between order preserving maps $f,\, g : A \to B$ are *dominations* $f \vdash g$, that is, $\forall a.\, f(a) \vdash g(a)$ (since we are thinking logically, I’ll denote the order relation by $a \vdash b$, which you can read as $a$ *entails* $b$ if you like). If you are familiar with enriched category theory, this is the 2-category of Boolean-enriched categories.

“Truth”, or $\top$, in an order theoretic sense, is the most general proposition. It has a mapping-in property; namely, from any proposition $p$, one can infer $\top$. If $E$ is our ordered set of propositions and entailments, then $\top : 1 \to E$ would be an order preserving function from the terminal order. To say that $\top$ has the universal property we expect, we can ask that $\top : 1 \to E$ be *right adjoint* to the terminal functor $! : E \to 1$. In other words, we ask that the top holds if and only if the bottom holds

$\frac{p \vdash \top}{\ast \vdash \ast}$

where $\vdash$ is our ordering and $\ast$ is the unique element of $1$. Since, per the axioms of an ordering, $\ast \vdash \ast$ always holds, we always have $p \vdash \top$, as we wanted.

“And”, or $\wedge$, in an order theoretic sense, is the *meet* of two elements. The proposition $p \wedge q$ is the most general proposition which proves both $p$ and $q$. Its mapping in property can be expressed by

$\frac{x \vdash p,\quad x \vdash q}{x \vdash p \wedge q}$

or, in other words, that $\wedge : E \times E \to E$ is right adjoint to the diagonal map $\Delta : E \to E \times E$.
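Both adjunctions can be checked exhaustively on a small example, say the powerset of $\{1,2,3\}$ ordered by inclusion, with $\top$ the whole set and $\wedge$ given by intersection (an illustrative sketch of ours, not the paper's notation):

```python
from itertools import chain, combinations

U = {1, 2, 3}
# E = all subsets of U, ordered by inclusion (inclusion plays the role of ⊢).
E = [frozenset(s) for s in chain.from_iterable(
    combinations(sorted(U), r) for r in range(len(U) + 1))]

def entails(a, b):
    return a <= b  # subset order

top = frozenset(U)

# Top is right adjoint to ! : E -> 1, so p ⊢ top for every p.
assert all(entails(p, top) for p in E)

# Meet (intersection) is right adjoint to the diagonal:
# x ⊢ p and x ⊢ q  iff  x ⊢ p ∧ q.
for x in E:
    for p in E:
        for q in E:
            assert (entails(x, p) and entails(x, q)) == entails(x, p & q)

print("both adjunctions verified on the powerset of {1,2,3}")
```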

Now, it won’t come as a surprise that “truth” and “and”
together form a monoid, but we can give a slick proof of
this knowing that they are right adjoints. This is because
$! : E \to 1$ and $\Delta : E \to E \times E$ give $E$ the
structure of a *comonoid* and taking right adjoints is
contravariantly functorial. Monoids like this are common
enough that we have a name for them:

**Definition:** In a monoidal 2-category, an *adjoint
monoid* is a monoid whose unit and multiplication are right
adjoints.

We just saw that all meet semi-lattices are adjoint monoids
in the 2-category of orders, but we can flip the argument on
its head to see that every adjoint monoid is a meet
semi-lattice as well. This is because the 2-category of
orders is *cartesian*, and in a cartesian category every
object has a unique comonoid structure given by $!$ and
$\Delta$. Since the left adjoints of the structure maps of
an adjoint monoid form a comonoid, the right adjoints must
be the top and meet.

“Exists” is a quantifier, and quantification is all about changing the domain of discourse. So we can’t really talk about “exists” without bringing in a system of domains of discourse, and this means bringing in a regular category. Let’s get back to regular categories now. A regular category is a category in which we can use the logic of “truth” (a terminal object), “and” (pullbacks), and “exists” (images), and it all works the right way (see the nLab page for a formal definition).

Recall that a *subobject* of an object $A$ is a monomorphism $S \hookrightarrow A$, considered up to isomorphism over $A$. A subobject $S$ is contained in $T$ when there is a map $S \to T$ over $A$; this makes the subobjects of any object into an order. A relation $R : A \to B$ is a subobject $R \hookrightarrow A \times B$ of a product; using variables, we say $R(a, b)$ when the pair $(a, b)$ is in $R$.

In a regular category, relations can be composed by “existing out the middle”: $RS(a, c) \coloneqq \exists b.\, R(a, b) \wedge S(b, c)$. This gives a category of relations, which we will call $\mathcal{R}$.
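For finite relations represented as sets of pairs, “existing out the middle” is a one-liner (a toy model in the spirit of the neighborhood watch; the relation names are illustrative):

```python
def compose(R, S):
    """RS(a, c) := ∃b. R(a, b) ∧ S(b, c), for relations as sets of pairs."""
    return {(a, c) for (a, b) in R for (b2, c) in S if b == b2}

lives_with = {("Agnes", "Margo")}          # (person) lives with (person)
lives_in = {("Margo", "Gru's house")}      # (person) lives in (location)

# Composing "existing out the middle" over Margo:
print(compose(lives_with, lives_in))       # relates Agnes to Gru's house
```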

Note that a subobject of $A$ is equivalently a relation $S \in \mathcal{R}(I, A)$, where $I$ is the terminal object of the regular category. Therefore, the functor

${Sub}(-) \coloneqq \mathcal{R}(I, -) : \mathcal{R} \to\textbf{Order}$

represented by $I$ sends an object to its order of subobjects.

This functor (or, really, 2-functor, since it respects the ordering of relations) gets a lax monoidal structure from the cartesian product of the regular category from which we built the category of relations $\mathcal{R}$. I’ll write this product as $\otimes$ as a reminder that it isn’t cartesian in the category of relations; that is, $A \otimes B = A \times B$ on objects, but it doesn’t have the same universal property in the category of relations. So, $I \otimes I \cong I$, and therefore we get a lax monoidal structure

$\mathcal{R}(I, A) \times \mathcal{R}(I, B) \to \mathcal{R}(I \otimes I, A \otimes B) \cong \mathcal{R}(I, A \otimes B)$

which on the level of logic looks like $(\phi(a),\, \psi(b)) \mapsto \phi(a) \wedge \psi(b)$. We also have the identity relation on $I$

$\top : \ast \to \mathcal{R}(I,I)$

which is logically “truth”.

There’s something special about these laxators: they have universal properties! In particular, they are both *right adjoints*, with left adjoints

$(\pi_1, \pi_2) : \mathcal{R}(I, A \otimes B) \to \mathcal{R}(I, A) \times \mathcal{R}(I, B)$

and

$\mathcal{R}(I, I) \to \ast$

or, logically, $\chi(a, b) \mapsto (\exists b.\, \chi (a, b), \exists a.\, \chi (a, b))$, and $\nu \mapsto \ast$. That these are adjoint means we have the following correspondences:

$\frac{\exists b.\, \chi(a, b) \vdash \varphi(a) \quad \exists a.\, \chi(a, b) \vdash \psi(b)}{\chi(a, b) \vdash \varphi(a) \wedge \psi(b)}$

These are just the introduction rules for truth and existential quantification!

We call these kinds of functors *ajax* for *adjoint lax*.

**Definition:** A lax monoidal 2-functor $F: \mathcal{A} \to \mathcal{B}$ between monoidal 2-categories is *ajax* if its laxators are right adjoints.

Ajax functors are very interesting objects in their own right, and we’d love to just ramble on about them for a good while. But we won’t.

Except for one thing. Just as lax functors preserve monoids, ajax functors preserve adjoint monoids. Now a regular category is in particular cartesian, so every object has a unique comonoid structure given by $!$ and $\Delta$, and the regular category can be embedded contravariantly in its category of relations by sending $f : A \to B$ to its “backwards graph” $b = f(a)$. This gives every object of the category of relations a (not necessarily unique) *monoid* structure. But, there is also a covariant embedding into the category of relations, sending $f : A \to B$ to its “graph” $f(a) = b$, and the “graph” and “backwards graph” of a function are adjoints in the category of relations. So every object of $\mathcal{R}$ is in fact an *adjoint monoid*, and is therefore preserved by the ajax functor ${Sub}$!
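The graph / backwards-graph adjunction can be checked concretely for a finite function, with relations ordered by inclusion and composed by the convention $RS(a,c) = \exists b.\, R(a,b) \wedge S(b,c)$ (again an illustrative sketch of ours):

```python
def compose(R, S):
    """RS(a, c) := ∃b. R(a, b) ∧ S(b, c)."""
    return {(a, c) for (a, b) in R for (b2, c) in S if b == b2}

A = {0, 1, 2}
B = {"x", "y"}
f = {0: "x", 1: "x", 2: "y"}           # a function f : A -> B

graph = {(a, f[a]) for a in A}         # "f(a) = b"
backwards = {(f[a], a) for a in A}     # "b = f(a)"

id_A = {(a, a) for a in A}
id_B = {(b, b) for b in B}

# Unit and counit of graph ⊣ backwards graph in the category of relations:
assert id_A <= compose(graph, backwards)   # id_A contained in graph ; backwards
assert compose(backwards, graph) <= id_B   # backwards ; graph contained in id_B
print("graph is left adjoint to backwards graph")
```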

This shows that ${Sub}(A)$ is in fact always a meet semi-lattice. So, in conclusion, the fact that ${Sub}$ is ajax not only gives us “exists”, but “truth” and “and” as well. This is the structure we will adorn our graphical calculus with.

If you’re a mathematician (and not a minion), then while reading the user’s guide to graphical regular logic you may have wondered “where does the entailment between predicates come from?” or “when a choice of predicates for some shells determines the predicate for another shell, where does that mapping live?” A regular calculus explicitly defines the structure between predicates that we use implicitly in the game of graphical calculus. In this section, we will see how different aspects of a regular calculus underpin the building blocks and operations of the graphical calculus. The major parallels are:

- labels for ports and wires - a set of types $T$
- shells - objects of $\mathbb{F}\text{Rb}(T)$, the category of relations of the free regular category on the set of “types” $T$
- filling shells with predicates - a pofunctor $P: \mathbb{F}\text{Rb}(T)\to\textbf{Order}$
- zooming in and out - $P$ satisfies the *ajax condition*
- connecting shells - morphisms in $\mathbb{F}\text{Rb}(T)$

Intuitively, $\mathbb{F}\text{Rb}(T)$ is isomorphic to the category with (1) objects given by shells, (2) morphisms given by connecting shells, and (3) monoidal structure given by trench coating. This isomorphism is described in detail in Graphical Regular Logic and sketched in the pictures below:

So this is the regular categorical incarnation of our pictures, and we will adorn it with a structure that mimics the behavior of the ajax functor ${Sub}$ and corresponds to the graphical actions of filling shells.

**Definition:** A *regular calculus* is a pair $(T, P)$ where $T$ is a set and $P : \mathbb{F}\text{Rb}(T) \to \textbf{Order}$ is an ajax functor. A morphism of regular calculi is a pair $(F, F^{\sharp})$ where $F : T \to T'$ and $F^{\sharp}$ is a monoidal natural transformation

that is strict in every respect.

The functor $P$ defines an ordering on the predicates that can fill each shell. Given a context $\Gamma$, the elements of $P(\Gamma)$ behave as predicates on $\Gamma$ and the partial ordering corresponds to entailment. For example,

Thus filling a shell $\Gamma$ with a predicate is equivalent to choosing an element of the poset $P(\Gamma)$. Furthermore, we are allowed to substitute one predicate for another according to the ordering of $P(\Gamma)$.

The next operations are trench coating and dotting off. Recall that in graphical regular logic filling the inner shells, $\Gamma$ and $\Gamma'$, of a trench coat defines a predicate for the coat, $\Gamma \oplus \Gamma'$. Now that we know filling a shell $\Gamma$ corresponds to picking an element of the poset $P(\Gamma)$, the process of trench coating translates to: Picking an element of $P(\Gamma) \times P(\Gamma')$ gives an element of $P(\Gamma \oplus \Gamma')$. In other words we have a monotonic laxator $\rho: P(\Gamma) \times P(\Gamma') \to P(\Gamma \oplus \Gamma')$. Monotonicity implies that $\rho$ commutes with entailment.

Since dotting off is a pseudo-inverse to trench coating, dotting off corresponds to an adjoint $\lambda: P(\Gamma \oplus \Gamma') \to P(\Gamma) \times P(\Gamma')$.

Note that dotting off and then trench coating possibly loses information. In the example below, this composition maps the predicate “live together” to the weaker predicate “both have roommates”.

Since $a$ entails $\rho \circ \lambda (a)$, $\lambda$ is left adjoint to $\rho$; in other words, the laxator $\rho$ is a right adjoint. This is why $P$ must be *ajax*!
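Replaying the “live together” example with finite relations makes the unit $a \vdash \rho(\lambda(a))$, and the loss of information, visible (a sketch with illustrative names):

```python
# "live together": a predicate on (person, person), as a set of pairs.
live_together = {("Agnes", "Margo"), ("Margo", "Agnes")}

def dot_off(R):
    """λ: project a predicate on a trench coat down to its two inner shells."""
    return {a for (a, _) in R}, {b for (_, b) in R}

def trench_coat(p, q):
    """ρ: conjoin two inner predicates into one coat predicate (the product)."""
    return {(a, b) for a in p for b in q}

weaker = trench_coat(*dot_off(live_together))   # "both have roommates"
assert live_together <= weaker                  # a ⊢ ρ(λ(a)) ...
assert live_together != weaker                  # ... strictly weaker here
print(sorted(weaker))
```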

The last operation is connecting shells. Suppose we have a wiring diagram corresponding to $\omega: \Gamma_{\text{in}} \to \Gamma_{\text{out}}$. As in trench coating, filling the inner shells of a wired diagram defines a predicate for the outer shell. Thus we have a map from $P(\Gamma_{\text{in}})$ to $P(\Gamma_{\text{out}})$, which is given by $P(\omega)$.

If the free regular category is an incarnation of our graphical syntax, then we can think of a regular calculus as the way we stretch our definition of syntax to encompass semantics.

In particular, every regular category gives us an example of a regular calculus. If $\mathcal{R}$ is the category of relations of a regular category, then there is an “interpretation” functor $\mathbb{F}\text{Rb}(\text{ob}\mathcal{R}) \to \mathcal{R}$; composing this with ${Sub} : \mathcal{R} \to \textbf{Order}$ gives us a regular calculus whose “types” are just the objects of $\mathcal{R}$.

**Definition:** The functor ${prd} : \text{RegCat} \to \text{RegCalc}$ is the functor sending a regular category whose category of relations is $\mathcal{R}$ to ${Sub} \circ {Interpret} : \mathbb{F}\text{Rb}(\text{ob}\mathcal{R}) \to \textbf{Order}$.

The bulk of the paper is devoted to constructing a left adjoint to this functor, and proving that it is fully faithful. We’ll just give a sketch of the idea.

If we have a regular calculus $P : \mathbb{F}\text{Rb}(T) \to \textbf{Order}$, we can try and think of it as assigning to each context (object of $\mathbb{F}\text{Rb}(T)$) its order of “subobjects”. So the objects of our “syntactic” regular category ${syn}(P)$ should be these “subobjects”, reified. So, for any $\Gamma \in \mathbb{F}\text{Rb}(T)$ and any $\phi \in P(\Gamma)$, we should have an object $\{ x : \Gamma \mid \phi(x) \}$ of ${syn}(P)$. This is just a fancy name for the pair $(\Gamma, \phi)$, one that tells us how to think of this pair.

A relation $R : \{x : \Gamma_1 \mid \phi_1(x)\} \to \{x : \Gamma_2 \mid \phi_2(x)\}$ should be a subobject of the product of the two objects represented by $\phi_1$ and $\phi_2$, or $R \in P(\Gamma_1 \oplus \Gamma_2)$. This should restrict nicely to $\phi_1$ and $\phi_2$ in the sense that $P(\pi_1)(R) \vdash \phi_1$ and $P(\pi_2)(R) \vdash \phi_2$. Functions can now be defined as functional relations, and with no small effort one can check that this gives a regular category.

The syntactic regular category of a predicate regular calculus is equivalent to the original regular category. This means that ${prd}$ is a reflective inclusion of regular categories into regular calculi, and justifies us using the graphical calculus, equipped with an order of “propositions” for each object varying with each signature, as a tool for reasoning about regular categories.

The reason that ${prd}$ is not an equivalence is that ${syn}$ changes the “types” of a graphical calculus. Perhaps if types were considered up to isomorphism in a “univalent” way, we could get an equivalence of categories; but the closer we move toward an equivalence between our subjective and objective logics, the less “syntactical” our subjective logic becomes.

I’m at the Aspen Center for Physics this week, for a workshop on Scattering Amplitudes and the Conformal Bootstrap.

Aspen is part of a long and illustrious tradition of physics conference sites located next to ski resorts. It’s ten years younger than its closest European counterpart, the Les Houches School of Physics, but if anything its traditions are stricter: all blackboard talks, and a minimum two-week visit. Instead of running summer schools like Les Houches, Aspen aims to inspire collaboration: to get physicists to spend time working and hiking around each other until inspiration strikes.

This workshop is a meeting between two communities: people who study the Conformal Bootstrap (nice popular description here) and my own field of Scattering Amplitudes. The Conformal Bootstrap is one of our closest sister-fields, so there may be a lot of potential for collaboration. This week’s talks have been amplitudes-focused; I’m looking forward to the talks next week that will highlight connections between the two fields.

I have written before (here, here) about the IBM Zurich group that has used atomic force microscopy in ultrahigh vacuum to image molecules on surfaces with amazing resolution. They've done it again. Starting with an elaborate precursor molecule, the group has been able to use voltage pulses to strip off side groups, so that in the end they leave behind a ring of eighteen carbon atoms, each bound to its neighbor on either side. The big question was, would it be better to think of this molecule as having double bonds between each carbon, or would it be energetically favorable to break that symmetry and have the bonds alternate between triple and single. It turns out to be the latter, as the final image shows a nonagon with nine-fold rotational symmetry. Here is a video where the scientists describe the work themselves (the video is non-embeddable for some reason). Great stuff.

I have a few questions about equalizers. I have my own reasons for wanting to know the answers, but I’ll admit right away that these questions are evil in the technical sense. So, investigating them requires a certain morbid curiosity… and I have a feeling that some of you will be better at this than I am.

Here are the categories:

$Rex$ = [categories with finite colimits, functors preserving finite colimits]

$SMC$ = [symmetric monoidal categories, strong symmetric monoidal functors]

Both are brutally truncated stumps of very nice 2-categories!

My questions:

1) Does $Rex$ have equalizers?

2) Does $SMC$ have equalizers?

3) We can get a functor $F \colon Rex \to SMC$ by arbitrarily choosing an initial object for each category in $Rex$ to serve as the unit of the tensor product and arbitrarily choosing a binary coproduct for each pair of objects to serve as their tensor product. Do this. Does the resulting $F$ preserve equalizers?

You see, I have some equalizers that exist in $Rex$, and I know $F$ preserves them, but I know this by direct computation. If 1) were true I’d know these equalizers exist on general grounds, and if 3) were true I’d know they’re preserved on general grounds. This might help me avoid a bit of extra fiddliness as I walk on the dark side of mathematics and engage in some evil maneuvers… especially if there’s some reference I can just cite.

2) is just my morbid curiosity acting up. I’m pretty sure the category of symmetric monoidal categories and *strict* symmetric monoidal functors has equalizers, but *strong*?
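For readers wanting the bare notion in hand: in $\mathbf{Set}$, the equalizer of $f, g : A \to B$ is just the subset of $A$ on which they agree, with its inclusion map — the delicate part of the questions above is whether analogous limits exist (and are preserved) in $Rex$ and $SMC$. A throwaway sketch:

```python
def equalizer(A, f, g):
    """Equalizer of f, g : A -> B in Set: the largest subset of A
    on which f and g agree, together with its inclusion map."""
    E = {a for a in A if f(a) == g(a)}
    include = lambda a: a
    return E, include

A = range(-3, 4)
f = lambda a: a * a
g = lambda a: a + 2
E, _ = equalizer(A, f, g)
print(sorted(E))  # solutions of a^2 = a + 2 in A, i.e. [-1, 2]
```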

I’ve just uploaded to the arXiv my paper “Quantitative bounds for critically bounded solutions to the Navier-Stokes equations“, submitted to the proceedings of the Linde Hall Inaugural Math Symposium. (I unfortunately had to cancel my physical attendance at this symposium for personal reasons, but was still able to contribute to the proceedings.) In recent years I have been interested in working towards establishing the existence of classical solutions $u$ for the Navier-Stokes equations

$\partial_t u + (u \cdot \nabla) u = \Delta u - \nabla p, \qquad \nabla \cdot u = 0$

that blow up in finite time, but this time for a change I took a look at the other side of the theory, namely the conditional regularity results for this equation. There are several such results that assert that if a certain norm of the solution stays bounded (or grows at a controlled rate), then the solution stays regular; taken in the contrapositive, they assert that if a solution blows up at a certain finite time $T_*$, then certain norms of the solution must also go to infinity. Here are some examples (not an exhaustive list) of such blowup criteria:

- (Leray blowup criterion, 1934) If $u$ blows up at a finite time $T_*$, and $0 < t < T_*$, then $\| u(t) \|_{L^\infty_x(\mathbf{R}^3)} \geq c (T_*-t)^{-1/2}$ for an absolute constant $c$.
- (Prodi–Serrin–Ladyzhenskaya blowup criterion, 1959-1967) If $u$ blows up at a finite time $T_*$, and $3 < p \leq \infty$, then $\| u \|_{L^q_t L^p_x([0,T_*) \times \mathbf{R}^3)} = +\infty$, where $\frac{2}{q} + \frac{3}{p} = 1$.
- (Beale-Kato-Majda blowup criterion, 1984) If $u$ blows up at a finite time $T_*$, then $\| \omega \|_{L^1_t L^\infty_x([0,T_*) \times \mathbf{R}^3)} = +\infty$, where $\omega \coloneqq \nabla \times u$ is the vorticity.
- (Kato blowup criterion, 1984) If $u$ blows up at a finite time $T_*$, then $\limsup_{t \to T_*} \| u(t) \|_{L^3_x(\mathbf{R}^3)} \geq c$ for some absolute constant $c$.
- (Escauriaza-Seregin-Sverak blowup criterion, 2003) If $u$ blows up at a finite time $T_*$, then $\limsup_{t \to T_*} \| u(t) \|_{L^3_x(\mathbf{R}^3)} = +\infty$.
- (Seregin blowup criterion, 2012) If $u$ blows up at a finite time $T_*$, then $\lim_{t \to T_*} \| u(t) \|_{L^3_x(\mathbf{R}^3)} = +\infty$.
- (Phuc blowup criterion, 2015) If $u$ blows up at a finite time $T_*$, then $\limsup_{t \to T_*} \| u(t) \|_{L^{3,q}_x(\mathbf{R}^3)} = +\infty$ for any $q < \infty$.
- (Gallagher-Koch-Planchon blowup criterion, 2016) If $u$ blows up at a finite time $T_*$, then $\limsup_{t \to T_*} \| u(t) \|_{\dot B^{-1+3/p}_{p,q}(\mathbf{R}^3)} = +\infty$ for any $3 < p, q < \infty$.
- (Albritton blowup criterion, 2016) If $u$ blows up at a finite time $T_*$, then $\lim_{t \to T_*} \| u(t) \|_{\dot B^{-1+3/p}_{p,q}(\mathbf{R}^3)} = +\infty$ for any $3 < p, q < \infty$.

My current paper is most closely related to the Escauriaza-Seregin-Sverak blowup criterion, which was the first to show that a critical (i.e., scale-invariant, or dimensionless) spatial norm, namely the $L^3_x(\mathbf{R}^3)$ norm, had to become large. This result now has many proofs; for instance, many of the subsequent blowup criterion results imply the Escauriaza-Seregin-Sverak one as a special case, and there are also additional proofs by Gallagher-Koch-Planchon (building on ideas of Kenig-Koch), and by Dong-Du. However, all of these proofs rely on some form of a compactness argument: given a finite time blowup, one extracts some suitable family of rescaled solutions that converges in some weak sense to a limiting solution that has some additional good properties (such as almost periodicity modulo symmetries), which one can then rule out using additional qualitative tools, such as unique continuation and backwards uniqueness theorems for parabolic heat equations. In particular, all known proofs use some version of the backwards uniqueness theorem of Escauriaza, Seregin, and Sverak. Because of this reliance on compactness, the existing proofs of the Escauriaza-Seregin-Sverak blowup criterion are qualitative, in that they do not provide any quantitative information on how fast the $L^3_x$ norm will go to infinity (along a subsequence of times).
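As a sanity check on “critical (i.e., scale-invariant)”: under the spatial part of the Navier-Stokes scaling, $u(x) \mapsto \lambda u(\lambda x)$ on $\mathbf{R}^3$, the $L^3_x$ norm is unchanged, which one can confirm numerically for a radial profile (an illustrative check only; the Gaussian profile is an arbitrary choice):

```python
import numpy as np

def L3_norm_radial(f, lam, r_max=30.0, n=300_000):
    """L^3 norm of x -> lam * f(lam * |x|) on R^3, integrating spherical shells."""
    r, dr = np.linspace(1e-6, r_max, n, retstep=True)
    integrand = np.abs(lam * f(lam * r)) ** 3 * 4 * np.pi * r**2
    return (integrand.sum() * dr) ** (1 / 3)

f = lambda r: np.exp(-(r ** 2))
norms = [L3_norm_radial(f, lam) for lam in (0.5, 1.0, 2.0)]
print(norms)  # numerically equal: the L^3 norm is invariant under the scaling
```

By the same substitution argument, the $L^p_x(\mathbf{R}^3)$ norm of $\lambda u(\lambda \cdot)$ picks up a factor $\lambda^{1-3/p}$, so $p = 3$ is exactly the dimensionless case.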

On the other hand, it is a general principle that qualitative arguments established using compactness methods ought to have quantitative analogues that replace the use of compactness by more complicated substitutes that give effective bounds; see for instance these previous blog posts for more discussion. I therefore was interested in trying to obtain a quantitative version of this blowup criterion that gave reasonably good effective bounds (in particular, my objective was to avoid truly enormous bounds such as tower-exponential or Ackermann function bounds, which often arise if one “naively” tries to make a compactness argument effective). In particular, I obtained the following triple-exponential quantitative regularity bounds:

**Theorem 1** If $u$ is a classical solution to Navier-Stokes on $[0,T] \times \mathbf{R}^3$ with

$\| u \|_{L^\infty_t L^3_x([0,T] \times \mathbf{R}^3)} \leq A \qquad (1)$

for some $A \geq 2$, then

$| \nabla^j u(t,x) | \leq \exp\exp\exp(A^{O(1)})\, t^{-(j+1)/2}$

for $0 < t \leq T$ and $j = 0, 1$.

As a corollary, one can now improve the Escauriaza-Seregin-Sverak blowup criterion to

$\limsup_{t \to T_*} \frac{\| u(t) \|_{L^3_x(\mathbf{R}^3)}}{\left(\log\log\log \frac{1}{T_*-t}\right)^c} = +\infty$

for some absolute constant $c$, which to my knowledge is the first (*very* slightly) supercritical blowup criterion for Navier-Stokes in the literature.

The proof uses many of the same quantitative inputs as previous arguments, most notably the Carleman inequalities used to establish unique continuation and backwards uniqueness theorems for backwards heat equations, but also some additional techniques that make the quantitative bounds more efficient. The proof focuses initially on points of concentration of the solution, which we define as points $(t_0, x_0)$ in spacetime where there is a frequency $N_0$ for which one has the bound

$|P_{N_0} u(t_0, x_0)| \geq A^{-C_0} N_0 \qquad (2)$

for a large absolute constant $C_0$, where $P_{N_0}$ is a Littlewood-Paley projection to frequencies $\sim N_0$. (This can be compared with the upper bound of $O(A N_0)$ for the quantity on the left-hand side that follows from (1).) The factor of $N_0$ normalises the left-hand side of (2) to be dimensionless (i.e., critical). The main task is to show that the dimensionless quantity $N_0 t_0^{1/2}$ cannot get too large; in particular, we end up establishing a bound of the form

$N_0 \leq \exp\exp\exp(A^{O(1)})\, t_0^{-1/2},$

from which the above theorem ends up following from a routine adaptation of the local well-posedness and regularity theory for Navier-Stokes.

The strategy is to show that any concentration such as (2) when is large must force a significant component of the norm of to also show up at many other locations than , which eventually contradicts (1) if one can produce enough such regions of non-trivial norm. (This can be viewed as a quantitative variant of the “rigidity” theorems in some of the previous proofs of the Escauriaza-Seregin-Sverak theorem that rule out solutions that exhibit too much “compactness” or “almost periodicity” in the topology.) The chain of causality that leads from a concentration (2) at to significant norm at other regions of the time slice is somewhat involved (though simpler than the much more convoluted schemes I initially envisaged for this argument):

- Firstly, by using Duhamel’s formula, one can show that a concentration (2) can only occur (with large) if there was also a preceding concentration
at some slightly previous point in spacetime, with also close to (more precisely, we have , , and ). This can be viewed as a sort of contrapositive of a “local regularity theorem”, such as the ones established by Caffarelli, Kohn, and Nirenberg. A key point here is that the lower bound in the conclusion (3) is precisely the same as the lower bound in (2), so that this backwards propagation of concentration can be iterated.

- Iterating the previous step, one can find a sequence of concentration points
with the propagating backwards in time; by using estimates ultimately resulting from the dissipative term in the energy identity, one can extract such a sequence in which the increase geometrically with time, the are comparable (up to polynomial factors in ) to the natural frequency scale , and one has . Using the “epochs of regularity” theory that ultimately dates back to Leray, and tweaking the slightly, one can also place the times in intervals (of length comparable to a small multiple of ) in which the solution is quite regular (in particular, enjoy good bounds on ).

- The concentration (4) can be used to establish a lower bound for the norm of the vorticity near . As is well known, the vorticity obeys the vorticity equation
In the epoch of regularity , the coefficients of this equation obey good bounds, allowing the machinery of Carleman estimates to come into play. Using a Carleman estimate that is used to establish unique continuation results for backwards heat equations, one can propagate this lower bound to also give lower bounds on the vorticity (and its first derivative) in annuli of the form for various radii , although the lower bounds decay at a gaussian rate with .

- Meanwhile, using an energy pigeonholing argument of Bourgain (which, in this Navier-Stokes context, is actually an enstrophy pigeonholing argument), one can locate some annuli where (a slightly normalised form of) the enstrophy is small at time ; using a version of the localised enstrophy estimates from a previous paper of mine, one can then propagate this sort of control forward in time, obtaining an “annulus of regularity” of the form in which one has good estimates; in particular, one has type bounds on in this cylindrical annulus.
- By intersecting the previous epoch of regularity with the above annulus of regularity, we have some lower bounds on the norm of the vorticity (and its first derivative) in the annulus of regularity. Using a Carleman estimate first employed in the backwards uniqueness theorem of Escauriaza, Seregin, and Sverak, as well as a second application of the Carleman estimate used previously, one can then propagate this lower bound back up to time , establishing a lower bound for the vorticity on the spatial annulus . By some basic Littlewood-Paley theory one can parlay this lower bound to a lower bound on the norm of the velocity ; crucially, this lower bound is uniform in .
- If is very large (triple exponential in !), one can then find enough scales with disjoint annuli that the total lower bound on the norm of provided by the above arguments is inconsistent with (1), thus establishing the claim.

The chain of causality is summarised in the following image:

It seems natural to conjecture that similar triply logarithmic improvements can be made to several of the other blowup criteria listed above, but I have not attempted to pursue this question. It seems difficult to improve the triple logarithmic factor using only the techniques here; the Bourgain pigeonholing argument inevitably costs one exponential, the Carleman inequalities cost a second, and the stacking of scales at the end to contradict the upper bound costs the third.

I worked more on my selection-function paper with Rix. I continued to struggle with understanding the controversy (between Loredo on one hand and various collaborations of my own on the other) about the likelihood function for a catalog. In my view, if you take a variable-rate Poisson process, and then censor it, where the censoring depends only on the individual properties of the individual objects being censored, you get a new variable-rate Poisson process with just a different rate function. If I am right, then there is at least one way of thinking about things such that the likelihood functions in the Bovy *et al.* and Foreman-Mackey *et al.* papers are correct. My day ended with a very valuable phone discussion of this with Foreman-Mackey. He (and I) would like to understand the difference in assumptions between us and Loredo.
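The thinning claim is easy to check by simulation. Here is a minimal Monte Carlo sketch (the rate and selection function below are made up purely for illustration, not taken from any of the papers mentioned):

```python
import numpy as np

rng = np.random.default_rng(42)

# Homogeneous Poisson process on [0, 1] with rate lam0, then censor each
# object independently with a detection probability p(x) that depends only
# on that object's own property x (here, its position).
lam0 = 50_000.0
p = lambda x: 0.2 + 0.6 * x  # made-up selection function

n = rng.poisson(lam0)                  # number of objects in the world
x = rng.uniform(0.0, 1.0, size=n)      # their properties
kept = x[rng.uniform(size=n) < p(x)]   # the censored catalog

# Thinning theorem: the catalog is again Poisson, with rate p(x) * lam0.
# Its expected total count is lam0 times the integral of p over [0, 1],
# which here is 0.5 * lam0.
expected = 0.5 * lam0
assert abs(len(kept) - expected) / expected < 0.05
```

Binning the kept objects in $x$ and comparing counts against $p(x)\,\lambda_0\,\Delta x$ verifies the modified rate pointwise as well.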

I also worked today with Soledad Villar (NYU) to develop capstone projects for the masters program in the Center for Data Science. The Masters students do research projects, and we have lots of ideas about mashing up deep learning and astrophysics.

A friend asks:

A quick question: if somebody wants to donate money to reduce his or her carbon footprint, which org(s) would you recommend that he or she donate to?

Do you have a good answer to this? I don’t want answers that deny the premise. We’re *assuming* someone wants to donate money to reduce his or her carbon footprint, and choosing an organization based on this. We’re not comparing this against other activities, like cutting personal carbon emissions or voting for politicians who want to cut carbon emissions.

Here’s my best answer so far:

The Gold Standard Foundation is one organization that tackles my friend’s question. See for example:

• Gold Standard, Offset your emissions.

Here they list various ways to offset your carbon emissions, currently with prices between $11 and $18 per tonne.

The Gold Standard Foundation is a non-profit foundation headquartered in Geneva that tries to ensure that carbon credits are real and verifiable and that projects make measurable contributions to sustainable development.

Peter Denton, Stephen Parke, Xining Zhang, and I have just uploaded to the arXiv the short unpublished note “Eigenvectors from eigenvalues“. This note gives two proofs of a general eigenvector identity observed recently by Denton, Parke and Zhang in the course of some quantum mechanical calculations. The identity is as follows:

where is the Hermitian matrix formed by deleting the row and column from .

For instance, if we have

for some real number , -dimensional vector , and Hermitian matrix , then we have

assuming that the denominator is non-zero.

Once one is aware of the identity, it is not so difficult to prove it; we give two proofs, each about half a page long, one of which is based on a variant of the Cauchy-Binet formula, and the other based on properties of the adjugate matrix. But perhaps it is surprising that such a formula exists at all; one does not normally expect to learn much information about eigenvectors purely from knowledge of eigenvalues. In the random matrix theory literature, for instance in this paper of Erdos, Schlein, and Yau, or this later paper of Van Vu and myself, a related identity has been used, namely

but it is not immediately obvious that one can derive the former identity from the latter. (I do so below the fold; we ended up not putting this proof in the note as it was longer than the two other proofs we found. I also give two other proofs below the fold, one from a more geometric perspective and one proceeding via Cramer’s rule.) It was certainly something of a surprise to me that there is no explicit appearance of the components of in the formula (1) (though they do indirectly appear through their effect on the eigenvalues ; for instance from taking traces one sees that ).

One can get some feeling of the identity (1) by considering some special cases. Suppose for instance that is a diagonal matrix with all distinct entries. The upper left entry of is one of the eigenvalues of . If it is equal to , then the eigenvalues of are the other eigenvalues of , and now the left and right-hand sides of (1) are equal to . At the other extreme, if is equal to a different eigenvalue of , then now appears as an eigenvalue of , and both sides of (1) now vanish. More generally, if we order the eigenvalues and , then the Cauchy interlacing inequalities tell us that

for , and

for , so that the right-hand side of (1) lies between and , which is of course consistent with (1) as is a unit vector. Thus the identity relates the coefficient sizes of an eigenvector with the extent to which the Cauchy interlacing inequalities are sharp.
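The identity is also easy to verify numerically. Here is a short check (not from the note itself; the matrix size and the indices $i, j$ are chosen arbitrarily) on a random Hermitian matrix:

```python
import numpy as np

n = 4
rng = np.random.default_rng(0)
X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (X + X.conj().T) / 2  # random Hermitian matrix (distinct eigenvalues a.s.)

lam, V = np.linalg.eigh(A)  # eigenvalues lam[k], unit eigenvectors V[:, k]

i, j = 1, 2  # arbitrary eigenvalue index i and coordinate index j
M = np.delete(np.delete(A, j, axis=0), j, axis=1)  # minor: delete row and column j
mu = np.linalg.eigvalsh(M)

# Eigenvector-eigenvalue identity:
# |v_{i,j}|^2 * prod_{k != i} (lam_i - lam_k) = prod_k (lam_i - mu_k)
lhs = abs(V[j, i]) ** 2 * np.prod([lam[i] - lam[k] for k in range(n) if k != i])
rhs = np.prod(lam[i] - mu)
assert np.allclose(lhs, rhs)
```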

** — 1. Relating the two identities — **

We now show how (1) can be deduced from (2). By a limiting argument, it suffices to prove (1) in the case when is not an eigenvalue of . Without loss of generality we may take . By subtracting the matrix from (and from ), thus shifting all the eigenvalues down by , we may also assume without loss of generality that . So now we wish to show that

The right-hand side is just . If one differentiates the characteristic polynomial

at , one sees that

Finally, (2) can be rewritten as

so our task is now to show that

By Schur complement, we have

Since is an eigenvalue of , but not of (by hypothesis), the factor vanishes when . If we then differentiate (4) in and set we obtain (3) as desired.

** — 2. A geometric proof — **

Here is a more geometric way to think about the identity. One can view as a linear operator on (mapping to for any vector ); it then also acts on all the exterior powers by mapping to for all vectors . In particular, if one evaluates on the basis of induced by the orthogonal eigenbasis , we see that the action of on is rank one, with

for any , where is the inner product on induced by the standard inner product on . If we now apply this to the -form , we have , while is equal to plus some terms orthogonal to . Since , Theorem 1 follows.

** — 3. A proof using Cramer’s rule — **

By a limiting argument we can assume that all the eigenvalues of are simple. From the spectral theorem we can compute the resolvent for as

Extracting the component of both sides and using Cramer’s rule, we conclude that

or in terms of eigenvalues

Both sides are rational functions with a simple pole at the eigenvalues . Extracting the residue at we conclude that

and Theorem 1 follows. (Note that this approach also gives a formula for for , although the formula becomes messier when because the relevant minor of is no longer a scalar multiple of the identity .)

For a number of projects, my group has been trying to compare point sets to point sets, to determine transformations. Some contexts have been calibration (like photometric and astrometric calibration of images, where stars need to align, either on the sky or in magnitude space) and others have been in dynamics. Right now Suroor Gandhi (NYU), Adrian Price-Whelan (Flatiron), and I have been trying to find transformations that align phase-space structures (and especially the Snail) observed in different tracers: What transformation between tracers matches the phase-space structure? These projects are going by our code name *MySpace*.

Projects like these tend to have a pathology, however, related to a pathology that Robyn Sanderson (Flatiron) and I found in a different context in phase space: If you write down a naive objective for matching two point clouds, the optimal match often has one point cloud shrunk down to zero size and put on top of the densest location on the other point cloud! Indeed, Gandhi is finding this so we decided (today) to try symmetrizing the objective function to stop it. That is, don't just compare points A to points B, but also symmetrically compare points B to points A. Then (I hope) neither set can shrink to zero usefully. I hope this works! Now to make a symmetric objective function...
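For what it's worth, here is a minimal sketch of one way to symmetrize a nearest-neighbour point-cloud objective. The chamfer-style loss and the function names are illustrative assumptions, not the actual *MySpace* objective:

```python
import numpy as np

def chamfer(a, b):
    """Mean squared distance from each point of a to its nearest point of b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).mean()

def symmetric_objective(a, b):
    # Compare A to B *and* B to A: collapsing b onto a dense spot of a
    # still leaves chamfer(a, b) large, so the degenerate optimum goes away.
    return chamfer(a, b) + chamfer(b, a)

rng = np.random.default_rng(1)
a = rng.standard_normal((200, 2))
collapsed = np.zeros((1, 2))  # one cloud shrunk down to a single point

# The one-sided objective barely penalises the collapse...
assert chamfer(collapsed, a) < 0.5
# ...but the symmetric one does.
assert symmetric_objective(a, collapsed) > 1.0
```

The pairwise-distance construction is quadratic in the number of points; for real catalogs one would swap in a k-d tree for the nearest-neighbour queries.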

I spent my research time today writing in a paper Rix (MPIA) and I are preparing about selecting sources for a catalog or target selection. The fundamental story is that you need to make a likelihood function at the end of the day. And this, in turn, means that you need a tractable and relatively accurate selection function. This all took me down old paths I have traveled with Bovy (Toronto) and Foreman-Mackey (Flatiron).

In email correspondence, Foreman-Mackey reminded me of past correspondence with Loredo (Cornell), who disagrees with our work on these things for very technical reasons. His (very nice) explanation of his point is around equations (8) through (10) in this paper: It has to do with how to factorize a probability distribution for a collection of objects obtained in a censored, variable-rate Poisson process. But our historical view of this (and my restored view after a day of struggling) is that the form of the likelihood depends on fine details of how you believe the objects of study were selected for the catalog, or censored. If they were censored only by your detector, I think Loredo's form is correct. But if they were censored for physical reasons over which you have no dominion (for example a planet transit obscured by a tiny fluctuation in a star's brightness), the selection can come into the likelihood function differently. That is, it depends on the causal chain involved in the source censoring.

Ever since telescopes were first invented, by a Dutch lens grinder in the early XVIIth century, and then demonstrated to be invaluable tools for investigating the cosmos around us by Galileo Galilei in the early 1600s, there has been a considerable, steady effort to construct bigger and better ones. Particularly bigger ones.

While I've rotated out of my "member-at-large" spot on the APS DCMP executive committee, I still want to pass this on. Having helped evaluate invited symposium proposals for the last three years, I can tell you that the March Meeting benefits greatly when there is a rich palette.

---


The DCMP invited speaker program at March Meeting 2020 is dependent on the invited sessions that are nominated by DCMP members. All invited speakers that we host must be nominated in advance by a DCMP member.

The deadline to submit a nomination is coming up soon: **August 23, 2019**.

Please take a moment to submit an invited session nomination.

Notes regarding nominations:


- All nominations must be submitted through ScholarOne by August 23, 2019.
- In ScholarOne, an invited session should be submitted as an "Invited Symposium Nomination".
- An invited session consists of 5 speakers. You may include up to 2 alternate speakers in your nomination, in the event that one of the original 5 does not work out.
- While no invited speakers will be guaranteed placement in the program until after all nominations have been reviewed, please get a tentative confirmation of interest from your nominated speakers. There will be a place in the nomination to indicate this.
- A person cannot give technical invited talks in two consecutive years. A list of people that gave technical invited talks in 2019, and are therefore ineligible for 2020, can be found on this page.
- Nominations of women, members of underrepresented minority groups, and scientists from outside the United States are especially encouraged.

Be sure to select a DCMP Category in your nomination. DCMP categories are:

07.0 TOPOLOGICAL MATERIALS (DCMP)

09.0 SUPERCONDUCTIVITY (DCMP)

11.0 STRONGLY CORRELATED SYSTEMS, INCLUDING QUANTUM FLUIDS AND SOLIDS (DCMP)

12.0 COMPLEX STRUCTURED MATERIALS, INCLUDING GRAPHENE (DCMP)

13.0 SUPERLATTICES, NANOSTRUCTURES, AND OTHER ARTIFICIALLY STRUCTURED MATERIALS (DCMP)

14.0 SURFACES, INTERFACES, AND THIN FILMS (DCMP)

15.0 METALS AND METALLIC ALLOYS (DCMP)

Thank you for your prompt attention to this matter.

Daniel Arovas, DCMP Chair, and Eva Andrei, DCMP Chair-Elect


Some of the oddballs on the $n$-Café are interested in odd-dimensional balls, but here’s a nice thing about *even*-dimensional balls: the volume of the $2n$-dimensional ball of radius $r$ is

$\frac{(\pi r^2)^n}{n!}$

Dillon Berger pointed out that summing up over all $n$ we get

$\sum_{n=0}^\infty \frac{(\pi r^2)^n}{n!} = e^{\pi r^2}$

It looks nice. But *what does it mean?*

First, why is it true?

We can show the volume of the $2n$-dimensional ball of radius $r$ is $(\pi r^2)^n/n!$ inductively. How? It’s enough to show that the volume of the unit $2(n+1)$-ball is $\pi /(n+1)$ times the volume of the unit $2n$-ball. Wikipedia gives a proof, and Greg Egan recently summarized the argument in a couple of tweets.

The idea is to map the unit $2(n+1)$-ball to the unit disk in the plane via a projection $\mathbb{R}^{2(n+1)} \to \mathbb{R}^2$. Imagine a $2(n+1)$-ball hovering over a disk! I can’t draw it, because the first interesting example is 4-dimensional. But imagine it.

The points over any point in the disk at distance $r$ from the origin form a $2n$-ball of radius $\sqrt{1 - r^2}$. We can get the volume of the $2(n+1)$-ball by integrating the volumes of these $2n$-balls over the disk.

That’s the idea. The rest is just calculus. Let the volume of the unit $2n$-ball be $V_{2n}$. We do the integral using polar coordinates:

$V_{2(n+1)} = \int_0^1 V_{2n} (1 - r^2)^n \; 2 \pi r d r$

Make the substitution $u = 1 - r^2$, so $d u = -2 r d r$. Some minus signs and 2’s cancel and we get

$V_{2(n+1)} = \pi V_{2n} \int_0^1 u^n \, d u = \frac{\pi}{n+1} V_{2n}$

as desired!

So that’s simple enough: since $V_0 = 1$ we get

$V_{2n} = \frac{\pi^n}{n!}$

And maybe that’s all that needs to be said. But suppose we follow Dillon Berger and sum the volumes of all even-dimensional balls of radius $r$. We get

$\sum_{n=0}^\infty V_{2n} r^{2n} = \sum_{n=0}^\infty \frac{(\pi r^2)^n}{n!} = e^{\pi r^2}$

This looks cute, but what does it mean?

It seems odd to sum the volumes of shapes that have different dimensions. But we do it in classical statistical mechanics when considering a system with a *variable* number of particles: a so-called ‘grand canonical ensemble’, like an open container of gas, where molecules can flow in and out.

If we have a single classical harmonic oscillator, with momentum $p$ and position $q$, its energy is

$\frac{1}{2}(p^2 + q^2)$

where I’m choosing units that get rid of distracting constants.

If we impose the constraint that the energy is $\le E$ for some number $E$, the state of the oscillator is described by a point $(p,q)$ with

$p^2 + q^2 \le 2 E$

This is a point in the disk of radius $r = \sqrt{2E}$.

Now suppose we have an arbitrary collection of such oscillators—and we don’t know how many there are! Then the state of the system is described first by a natural number $n$, the number of oscillators, and then momenta $p_1, \dots, p_n$ and positions $q_1, \dots, q_n$ obeying

$p_1^2 + q_1^2 + \cdots + p_n^2 + q_n^2 \le 2 E$

In other words, it’s a point in the *union* of all even-dimensional balls of radius $r = \sqrt{2E}$.

Physicists would call this union the ‘phase space’ of our collection of oscillators. Its volume is the sum of the volumes of the balls:

$\sum_{n=0}^\infty V_{2n} r^{2n} = e^{\pi r^2} = e^{2 \pi E}$

Actually physicists would include the constants that I’m hiding, and this is actually somewhat interesting. For example, to make the area 2-form $d p \wedge d q$ dimensionless, they divide it by Planck’s ‘quantum of action’. This is what allows them to add up volumes of balls (or other symplectic manifolds) of different dimensions and get something that doesn’t break the rules of dimensional analysis. But this also brings Planck’s constant into a question of *classical* statistical mechanics—a surprise, but a surprise that’s well-known among experts: it shows up in computations such as the entropy of an ideal gas.

And indeed, entropy is exactly what I’m interested in now! Why would a physicist care about the volume of phase space for a collection of arbitrarily many oscillators with total energy $\le E$? Here’s why: if *all we know* about this collection of oscillators is that its energy is $\le E$, its entropy is just the logarithm of this phase space volume! So, it’s

$\ln (e^{2 \pi E}) = 2 \pi E$

Nice!

Physicists will not like how I tucked all the constants under the rug, getting a formula for entropy that’s missing things like Boltzmann’s constant and Planck’s constant, as well as the mass of the particles in our harmonic oscillators, and their spring constant. I leave it as a challenge to restore all these constants correctly. I suspect that the $2\pi$ will get eaten by the $2 \pi$ hiding in Planck’s constant $\hbar = h / 2\pi$. This I know: we will get something proportional to $E$, but with units of entropy.

This is what I’ve got so far. It’s not yet a new *proof* that the volumes of all even-dimensional balls of radius $r$ sum to

$e^{\pi r^2}$

but rather a physical interpretation of this fact. Maybe someone can keep the ball rolling….

All of us living at a certain time on this planet together, and together experiencing all its earthly joys and sorrows, seeing the same sky, loving and hating what are, after all, the same things, each and every one of us condemned to suffer the same sentence, the same disappearance off the face of the earth, should really nurture the greatest tenderness towards each other, a feeling of the most heart-rending closeness, and should be literally screaming with terror and pain whenever we are parted by a fate which at any moment is fully capable of transforming every one of our separations, even if only meant to last ten minutes, into an eternal one. But, as you know, for most of the time, we are a long way from such sentiments, and often take leave of even those closest to us in the most thoughtless manner imaginable.

Ivan Bunin, “Long Ago,” 1921 (Sophie Lund, trans.)

I recently bumped into this nice paper:

• Theo Johnson-Freyd and David Treumann, $\mathrm{H}^4(\mathrm{Co}_0,\mathbb{Z}) = \mathbb{Z}/24$.

which proves just what it says: the 4th integral cohomology of the Conway group $\mathrm{Co}_0$, in the sense of group cohomology, is $\mathbb{Z}/24$. I want to point out a few immediate consequences.

The Leech lattice $\Lambda \subset \mathbb{R}^{24}$ is, up to rotations and reflections, the only lattice in 24 dimensions that is:

- **unimodular**: the volume of $\mathbb{R}^{24}/\Lambda$ is 1,
- **even**: $v \cdot v$ is even for every vector $v$ in the lattice,

and:

- contains no **roots**: that is, vectors with $v \cdot v = 2$.

It comes in two chiral forms: that is, no transformation of determinant $-1$ maps this lattice to itself.

The group of rotations that preserves the Leech lattice is called the Conway group $\mathrm{Co}_0$. It has 8,315,553,613,086,720,000 elements. It’s not simple, because the transformation $v \mapsto -v$ lies in the Conway group and commutes with everything else. If you mod out the Conway group by its center, which is just $\mathbb{Z}/2$, you get a finite simple group called $\mathrm{Co}_1$. But never mind, that’s just a distraction for now: I’m more interested in $\mathrm{Co}_0$.
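As a quick arithmetic check (not in the original post), the order quoted above agrees with the standard prime factorization $2^{22} \cdot 3^9 \cdot 5^4 \cdot 7^2 \cdot 11 \cdot 13 \cdot 23$ of $\mathrm{Co}_0$, and halving it (modding out the center $\mathbb{Z}/2$) gives the order of the simple group $\mathrm{Co}_1$:

```python
# |Co_0| from its standard prime factorization.
order = 2**22 * 3**9 * 5**4 * 7**2 * 11 * 13 * 23
assert order == 8_315_553_613_086_720_000

# Co_0 is a double cover of the simple group Co_1.
assert order // 2 == 4_157_776_806_543_360_000
```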

In fact you don’t need to understand anything I just said, except $\mathrm{Co}_0$ is a group. As soon as Johnson-Freyd and Treumann tell us that

$\mathrm{H}^4(\mathrm{Co}_0,\mathbb{Z}) = \mathbb{Z}/24,$

we know that there are 24 inequivalent ways to extend this group $\mathrm{Co}_0$ to a 3-group. Let us call these the **Conway 3-groups**. (Beware: they may not all be inequivalent as 3-groups. They are inequivalent as *extensions*.)

We can think of a 3-group as a one-object weak 3-groupoid. In any of the Conway 3-groups, the one object has 8,315,553,613,086,720,000 automorphisms. These 1-morphisms compose in a manner given by the group $\mathrm{Co}_0$. The only 2-morphisms are identities, but each 2-morphism has $\mathbb{Z}$’s worth of automorphisms, so there are infinitely many 3-morphisms.

In a general 3-group the composition of 1-morphisms is associative only up to an ‘associator’, which obeys the pentagon equation up to a ‘pentagonator’, which obeys an equation coming from the 3d associahedron. In the Conway 3-groups, the associator is trivial because there are only identity 2-morphisms. The 24 fundamentally different integral 4-cocycles on $\mathrm{Co}_0$ give us 24 choices for the pentagonator. The cocycle condition means these pentagonators obey the equation coming from the associahedron.

We can play various games with these data. For example, any group $G$ has a ‘Bockstein homomorphism’ relating its 4th cohomology with $\mathbb{Z}$ coefficients and its 3rd cohomology with $\mathrm{U}(1)$ coefficients:

$\beta \colon H^3(G, \mathrm{U}(1)) \to H^4(G, \mathbb{Z})$

and for some reason I think this is an isomorphism when $G$ is finite. Could someone check this out? I think it’s because $H^n(G,\mathbb{R})$ vanishes for $n \ge 1$, and I vaguely seem to recall that’s because of some group averaging argument, but it’s been a long time since I’ve thought about this stuff.

If so, we also get a bunch of **Conway 2-groups**: 24 different ways of extending $\mathrm{Co}_0$ to a 2-group. These are one-object weak 2-groupoids where the morphisms form the group $\mathrm{Co}_0$, each morphism has a circle’s worth of automorphisms, and while composition of morphisms is associative there is still an associator given by some 3-cocycle $\alpha \colon \mathrm{Co}_0^3 \to \mathrm{U}(1)$.

For either the 2-groups or the 3-groups, we can also start with not just $\mathrm{Co}_0$ but with the whole semidirect product $\mathrm{Co}_0 \ltimes \Lambda$, which is the full symmetry group of the Leech lattice including translations.

Perhaps the 2-groups are a bit more intuitive. Here we start with the Leech lattice symmetries, but then decree that they compose associatively *only up to a phase*, which is the $\mathrm{U}(1)$-valued 3-cocycle. There are, nicely, 24 consistent ways to do this!

There’s a lot more to say, but maybe this will get the ball rolling.

I'll admit, I wanted to rather title this post "Billionaire Awards Prizes To Failed Theories", just for the sake of being flippant. But in any joke there is a little bit of truth, as I wish to discuss below.

The (not-so-anymore) news is that the "Special Breakthrough prize" in fundamental physics, instituted a decade ago by Russian philanthropist Yuri Milner, and then co-funded by other filthy wealthy folks, recently went to three brilliant theoretical physicists: Sergio Ferrara, Dan Freedman, and Peter van Nieuwenhuizen, who in the seventies developed an elegant quantum field theory, supergravity.

This week, $3 Million was awarded by the Breakthrough Prize to Sergio Ferrara, Daniel Z. Freedman and Peter van Nieuwenhuizen, the discoverers of the theory of supergravity, part of a special award separate from their yearly Fundamental Physics Prize. There’s a nice interview with Peter van Nieuwenhuizen on the Stony Brook University website, about his reaction to the award.

The Breakthrough Prize was designed to complement the Nobel Prize, rewarding deserving researchers who wouldn’t otherwise get the Nobel. The Nobel Prize is only awarded to theoretical physicists when they predict something that is later observed in an experiment. Many theorists are instead renowned for their mathematical inventions, concepts that other theorists build on and use but that do not by themselves make testable predictions. The Breakthrough Prize celebrates these theorists, and while it has also been awarded to others who the Nobel committee could not or did not recognize (various large experimental collaborations, Jocelyn Bell Burnell), this has always been the physics prize’s primary focus.

The Breakthrough Prize website describes supergravity as a theory that combines gravity with particle physics. That’s a bit misleading: while the theory does treat gravity in a “particle physics” way, unlike string theory it doesn’t solve the famous problems with combining quantum mechanics and gravity. (At least, as far as we know.)

It’s better to say that supergravity is a theory that *links* gravity to other parts of particle physics, via supersymmetry. Supersymmetry is a relationship between two types of particles: bosons, like photons, gravitons, or the Higgs, and fermions, like electrons or quarks. In supersymmetry, each type of boson has a fermion “partner”, and vice versa. In supergravity, gravity itself gets a partner, called the gravitino. Supersymmetry links the properties of particles and their partners together: both must have the same mass and the same charge. In a sense, it can unify different types of particles, explaining both under the same set of rules.

In the real world, we don’t see bosons and fermions with the same mass and charge. If gravitinos exist, then supersymmetry would have to be “broken”, giving them a high mass that makes them hard to find. Some hoped that the Large Hadron Collider could find these particles, but now it looks like it won’t, so there is no evidence for supergravity at the moment.

Instead, supergravity’s success has been as a tool to understand other theories of gravity. When the theory was proposed in the 1970’s, it was thought of as a rival to string theory. Instead, over the years it consistently managed to point out aspects of string theory that the string theorists themselves had missed, for example noticing that the theory needed not just strings but higher-dimensional objects called “branes”. Now, supergravity is understood as one part of a broader string theory picture.

In my corner of physics, we try to find shortcuts for complicated calculations. We benefit a lot from toy models: simpler, unrealistic theories that let us test our ideas before applying them to the real world. Supergravity is one of the best toy models we’ve got, a theory that makes gravity simple enough that we can start to make progress. Right now, colleagues of mine are developing new techniques for calculations at LIGO, the gravitational wave telescope. If they hadn’t worked with supergravity first, they would never have discovered these techniques.

The discovery of supergravity by Ferrara, Freedman, and van Nieuwenhuizen is exactly the kind of work the Breakthrough Prize was created to reward. Supergravity is a theory with deep mathematics, rich structure, and wide applicability. There is of course no guarantee that such a theory describes the real world. What *is* guaranteed, though, is that someone will find it useful.

Here are some dates to help you plan your carbon emissions.

It seems ACT2019 just ended, but we’re already planning *next year’s* applied category theory conference and school! I’m happy to say that Brendan Fong and David Spivak have volunteered to run it at MIT on these dates:

- Applied Category Theory School: June 29–July 3, 2020.
- Applied Category Theory Conference: July 6–10, 2020.

The precise dates for the other big category theory conference, CT2020, have not yet been decided. However, it will take place in Genoa sometime in the interval June 18–28, 2020.

There may also be an additional applied category theory school in Marrakesh during May 25–29, 2020. More on that later, with any luck!

Also: don’t forget to submit your abstracts for the November 2019 applied category theory special session at U. C. Riverside by **September 3rd!** We’ve got a great lineup of speakers, but anyone who wants to give a talk—including the invited speakers—needs to submit an abstract to the AMS website by September 3rd. The AMS has no mercy about this.


The American Mathematical Society is having their Fall Western meeting here at U. C. Riverside during the weekend of November 9th and 10th, 2019. Joe Moeller and I are organizing a session on Applied Category Theory! We already have some great speakers lined up:

• Tai-Danae Bradley

• Vin de Silva

• Brendan Fong

• Nina Otter

• Evan Patterson

• Blake Pollard

• Prakash Panangaden

• David Spivak

• Brad Theilman

• Dmitry Vagner

• Zhenghan Wang

Alas, we have no funds for travel and lodging. If you’re interested in giving a talk, please submit an abstract here:

• General information about abstracts, American Mathematical Society.

More precisely, please read the information there and then click on the link on that page to submit an abstract. It should then magically fly through the aether to me! Abstracts are due September 3rd, but the sooner you submit one, the greater the chance that we’ll have space.

For the program of the whole conference, go here:

• Fall Western Sectional Meeting, U. C. Riverside, Riverside, California, 9–10 November 2019.

I will also be running a special meeting on diversity and excellence in mathematics on Friday November 8th. There will be a banquet that evening, and at some point I’ll figure out how tickets for that will work.

We had a special session like this in 2017, and it’s fun to think about how things have evolved since then.

David Spivak had already written *Category Theory for the Sciences*, but more recently he’s written another book on applied category theory, *Seven Sketches*, with Brendan Fong. He already had a company, but now he’s helping run Conexus, which plans to award grants of up to $1.5 million to startups that use category theory (in exchange for equity). Proposals are due June 30th, by the way!

I guess Brendan Fong was already working with David Spivak at MIT in the fall of 2017, but since then they’ve written *Seven Sketches* and developed a graphical calculus for logic in regular categories. He’s also worked on a functorial approach to machine learning—and now he’s using category theory to unify learners and lenses.

Blake Pollard had just finished his Ph.D. work at U.C. Riverside back in 2018. He will now talk about his work with Spencer Breiner and Eswaran Subrahmanian at the National Institute of Standards and Technology, using category theory to help develop the “smart grid”—the decentralized power grid we need now. Above he’s talking to Brendan Fong at the Centre for Quantum Technologies, in Singapore. I think that’s where they first met.

Eswaran Subrahmanian will also be giving a talk this time! He works at NIST and Carnegie Mellon; he’s an engineer who specializes in using mathematics to help design smart networks and other complex systems.

Nina Otter was a grad student at Oxford in 2017, but now she’s at UCLA and the University of Leipzig. She worked with Ulrike Tillmann and Heather Harrington on stratifying multiparameter persistent homology, and is now working on a categorical formulation of positional and role analysis in social networks. Like Brendan, she’s on the executive board of the applied category theory journal *Compositionality*.

I first met Tai-Danae Bradley at ACT2018. Now she will talk about her work at Tunnel Technologies, a startup run by her advisor John Terilla. They model sequences—of letters from an alphabet, for instance—using quantum states and tensor networks.

Vin de Silva works on topological data analysis using persistent cohomology so he’ll probably talk about that. He’s studied the “interleaving distance” between persistence modules, using category theory to treat it and the Gromov-Hausdorff metric in the same setting. He came to the last meeting and it will be good to have him back.

Evan Patterson is a statistics grad student at Stanford. He’s worked on knowledge representation in bicategories of relations, and on teaching machines to understand data science code by the semantic enrichment of dataflow graphs. He too came to the last meeting.

Dmitry Vagner was also at the last meeting, where he spoke about his work with Spivak on open dynamical systems and the operad of wiring diagrams. He is now working on mathematically defining and implementing (in Idris) wiring diagrams for symmetric monoidal categories.

Prakash Panangaden has long been a leader in applied category theory, focused on semantics and logic for probabilistic systems and languages, machine learning, and quantum information theory.

Brad Theilman is a grad student in computational neuroscience at U.C. San Diego. I first met him at ACT2018. He’s using algebraic topology to design new techniques for quantifying the spatiotemporal structure of neural activity in the auditory regions of the brain of the European starling. (I bet you didn’t see those last two words coming!)

Last but not least, Zhenghan Wang works on condensed matter physics and modular tensor categories at U.C. Santa Barbara. At Microsoft’s Station Q, he is using this research to help design topological quantum computers.

In short: a lot has been happening in applied category theory, so it will be good to get together and talk about it!

**Guest post by Nadezda Chernyavskaya - ETH Zurich**

*Bavarian Evening at the Lindau Nobel Laureate Meetings - Picture Credit: Christian Flemming*

It is beyond belief that not only am I again writing a post about the premature death of a colleague whom I have known for decades, but that I am doing it about two of them.

Over the past weekend, two of the world’s most influential and brilliant theoretical high-energy physicists — Steve Gubser of Princeton University and Ann Nelson of the University of Washington — fell to their deaths in separate mountain accidents, one in the Alps and one in the Cascades.

Theoretical high energy physics is a small community, and within the United States itself the community is tiny. Ann and Steve were both justifiably famous and highly respected as exceptionally bright lights in their areas of research. Even for those who had not met them personally, this is a stunning and irreplaceable loss of talent and of knowledge.

But most of us did know them personally. For me, and for others with a personal connection to them, the news is devastating and tragic. I encountered Steve when he was a student and I was a postdoc in the Princeton area, and later helped bring him into a social group where he met his future wife (a great scientist in her own right, and a friend of mine going back decades). As for Ann, she was one of my teachers at Stanford in graduate school, then my senior colleague on four long scientific papers, and then my colleague (along with her husband David B. Kaplan) for five years at the University of Washington, where she had the office next to mine. I cannot express what a privilege it always was to work with her, learn from her, and laugh with her.

I don’t have the heart or energy right now to write more about this, but I will try to do so at a later time. Right now I join their spouses and families, and my colleagues, in mourning.

Binyamin Appelbaum wrote an article in the *New York Times* about my native county, Montgomery County, Maryland, and this is what he tweeted about it:

A lot of affluent liberals in Montgomery County, outside Washington, D.C., fiercely opposed a plan to build a little more affordable housing. “Affordable is not what people move here for,” one of them told me.

The plan in question was approved unanimously by the County Council, all nine of whom were Democrats, but, as Appelbaum reported, not everyone in progressive Montgomery County was happy about it. He quoted several residents making remarks that made them look like, well, uptight snobs:

Ellen Paul, 59, said in-law suites were bad enough: “It’s changing suburbia to allow two homes on each lot. You’ll have strangers walking by your house all the time now.”

…

“That’s where the backyard trailers are going to go,” said Dale Barnhard, one of the more than 1,500 people who signed a petition opposing the “dramatic” change in rules.

or worse:

One county resident, Katherine C. Gugulis, wrote a protest letter in The Washington Post that concluded, “Just because others flee crime-ridden and poverty-stricken areas doesn’t mean Montgomery County has to be turned into a slum to accommodate them.”

I was interested in these affluent liberals and wanted to learn about them. A few minutes of Googling later, here’s what I found out. Katherine Gugulis is a Republican-appointed official. Ellen Paul, according to her public LinkedIn profile, is a former staff assistant to a Republican member of the House of Representatives, and her most recent listed activity was public relations for a Republican candidate for Montgomery County Board of Education in 2014. Dale Barnhard doesn’t have any political career, as far as I know, but he wrote a letter to the *Washington Post* last year complaining about their biased coverage of Donald Trump. Hessie Harris, who worries aloud in Appelbaum’s piece about “flophouses” and literally utters the words “There goes the neighborhood,” is listed by the FEC as contributing thousands of dollars a year to Americans for Legal Immigration; that’s a PAC which describes its mission as “our fight AGAINST the costly and deadly illegal immigration & illegal immigrant invasion of America.”

These people aren’t liberals!

I don’t doubt there are liberals in Montgomery County who oppose relaxation of zoning. (I grew up there, and I live in Madison, WI: liberal NIMBYism is not a foreign idea to me.) But why weren’t any of those people in Appelbaum’s article? Why run a piece featuring a bunch of conservatives protesting a decision by an all-Democratic county council and bill it as a portrait of progressivism failing to live up to its ideals?

Here’s my theory. I don’t think Appelbaum purposely gathered quotes from Montgomery County’s small but nonzero Republican population for his piece. I think he had a story already in mind, a story of rich liberals who profess a commitment to affordable housing but really have a lot of contempt for the kind of person who lives there, and who would certainly under no circumstances stand for such people residing in Potomac or the nice parts of Bethesda, you know, the Whitman part. Those people might *say* their opposition to density had to do with something other than snobbery. But their words would show how they truly felt about the poorly-to-mediocrely-heeled.

And he got the quotes he wanted, the quotes that supported this story. Good, salty quotes. But the people who said those things weren’t self-styled progressives. They were Republicans.

Maybe there’s a reason for that!

AB has curly hair, *really* curly hair, and strangers comment on it all the time. My stance on this is to tell her “it’s not really polite for people to randomly comment on your appearance, but it’s not impolite enough for you to be impolite back — just say thanks and move on.” Is that the right stance?

Anyway, though, today someone at the farmer’s market said “I would *die* to have hair like yours” and AB said, in a non-combative, sunny way, “How would that help you if you were dead?” and I was super proud.

*Diamond maser. Image from Jonathan Breeze, Imperial College.*

This is part 2 of a chat about some recent thoughts and results I had about de Sitter black holes, reported in this arxiv preprint. Part 1 is here, so maybe best to read that first.

Now let us turn to de Sitter black holes. I mean here any black hole for which the asymptotic spacetime is de Sitter spacetime, which is to say it has positive cosmological constant. This is of course also interesting since one of the most natural (to some minds) possible explanations for the accelerating expansion of our universe is a cosmological constant, so maybe all black holes in our universe are de Sitter black holes in some sense. This is also interesting because you often read here about explorations of physics involving negative cosmological constant, so this is a big change!

One of the things people find puzzling about applying standard black hole thermodynamics here is that there are two places where the standard techniques tell you there should be an associated temperature. There's the black hole horizon itself, and there's also the cosmological horizon. These each have a temperature, and they are not necessarily the same. For the Schwarzschild-de Sitter black hole, for example (so, no spins or charges... just a mass with a horizon associated with it, like in flat space), the black hole's temperature is always larger than that of the cosmological horizon. In fact, it runs from very large (when the black hole is small) all the way (as the black hole grows) to zero, where the two horizons coincide.
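To see the two temperatures concretely, here's a small numerical sketch (my own, not from the paper, with conventions chosen purely for illustration: geometric units and Lambda = 3). Each horizon radius r_h of the metric function f(r) = 1 - 2M/r - r^2 is assigned the standard temperature |f'(r_h)|/4π from its surface gravity:

```python
import numpy as np

# Schwarzschild-de Sitter with Lambda = 3 in geometric units:
#   f(r) = 1 - 2M/r - r^2
# The horizons are the positive roots of r*f(r) = -r^3 + r - 2M, and each
# horizon r_h carries the temperature T = |f'(r_h)| / (4 pi).
M = 0.05   # small enough that both horizons exist (here M < 1/sqrt(27))
roots = np.roots([-1.0, 0.0, 1.0, -2.0 * M])
r_b, r_c = sorted(r.real for r in roots if r.real > 0)  # black hole < cosmological

def temperature(r_h):
    fprime = 2.0 * M / r_h ** 2 - 2.0 * r_h
    return abs(fprime) / (4.0 * np.pi)

T_b, T_c = temperature(r_b), temperature(r_c)
assert T_b > T_c > 0   # the black hole horizon is always the hotter of the two
```

As M grows toward the value where the two horizons merge, both radii approach each other and the black hole temperature falls toward zero, matching the behaviour described above.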

You might wonder, as many have, how to make sense of the two temperatures. This cannot, for a start, be an equilibrium thermodynamics system. Should there be dynamics where the two temperatures try to equalise? Is there heat flow from one horizon to another, perhaps? Maybe there's some missing ingredient needed to make sense of this - do we have any right to be writing down temperatures (an equilibrium thermodynamics concept, really) when the system is not in equilibrium? (Actually, you could ask that about Schwarzschild in flat space - you compute the temperature and then discover that it depends upon the mass in such a way that the system wants to move to a different temperature. But I digress.)

The point of my recent work is that it is entirely within the realm of physics we have to hand to make sense of this. The simple system described in the previous post - the three level maser - has certain key *interconnected* features that seem relevant:

- two distinct temperatures,
- a maximum energy, and
- a natural instability (population inversion) and a channel for doing work: the maser output.

My point is that these features are all present for de Sitter black holes too, starting with the two temperatures. But you won't see the rest by staring at just the Schwarzschild case, you need to add rotation, or charge (or both). As we shall see, the ability to reduce angular momentum, or to reduce charge, will be the work channel. I'll come back to the maximum [...]

The post News from the Front XIX: A-Masing de Sitter appeared first on Asymptotia.

Earlier this month, Hao Huang (who, incidentally, was a graduate student here at UCLA) gave a remarkably short proof of a long-standing problem in theoretical computer science known as the sensitivity conjecture. See for instance this blog post of Gil Kalai for further discussion and links to many other online discussions of this result. One formulation of the theorem proved is as follows. Define the $n$-dimensional hypercube graph $Q_n$ to be the graph with vertex set $({\bf Z}/2{\bf Z})^n$, and with every vertex $x \in ({\bf Z}/2{\bf Z})^n$ joined to the $n$ vertices $x+e_1,\dots,x+e_n$, where $e_1,\dots,e_n$ is the standard basis of $({\bf Z}/2{\bf Z})^n$.

Theorem 1 (Lower bound on maximum degree of induced subgraphs of hypercube) Let $E$ be a set of at least $2^{n-1}+1$ vertices in $Q_n$. Then there is a vertex in $E$ that is adjacent (in $Q_n$) to at least $\sqrt{n}$ other vertices in $E$.

The bound $\sqrt{n}$ (or more precisely, $\lceil \sqrt{n} \rceil$) is completely sharp, as shown by Chung, Furedi, Graham, and Seymour; we describe this example below the fold. When combined with earlier reductions of Gotsman-Linial and Nisan-Szegedy, this theorem implies the sensitivity conjecture; we give these reductions below the fold also.

Let $A$ be the adjacency matrix of $Q_n$ (where we index the rows and columns directly by the vertices in $({\bf Z}/2{\bf Z})^n$, rather than selecting some enumeration of them), thus $A_{xy} = 1$ when $y = x + e_i$ for some $i = 1,\dots,n$, and $A_{xy} = 0$ otherwise. The above theorem then asserts that if $E$ is a set of at least $2^{n-1}+1$ vertices, then the $E \times E$ minor of $A$ has a row (or column) that contains at least $\sqrt{n}$ non-zero entries.

The key step to prove this theorem is the construction of rather curious variant of the adjacency matrix :

Proposition 2 There exists a $2^n \times 2^n$ matrix $B$ which is entrywise dominated by $A$ in the sense that

$$ |B_{xy}| \le A_{xy} \hbox{ for all } x, y \in ({\bf Z}/2{\bf Z})^n, \qquad (1) $$

and such that $B$ has $\sqrt{n}$ as an eigenvalue with multiplicity $2^{n-1}$.

Assuming this proposition, the proof of Theorem 1 can now be quickly concluded. If we view $B$ as a linear operator on the $2^n$-dimensional space $\ell^2(({\bf Z}/2{\bf Z})^n)$ of functions of $({\bf Z}/2{\bf Z})^n$, then by hypothesis this space has a $2^{n-1}$-dimensional subspace $V$ on which $B$ acts by multiplication by $\sqrt{n}$. If $E$ is a set of at least $2^{n-1}+1$ vertices in $Q_n$, then the space $\ell^2(E)$ of functions supported on $E$ has codimension at most $2^{n-1}-1$ in $\ell^2(({\bf Z}/2{\bf Z})^n)$, and hence intersects $V$ non-trivially. Thus the $E \times E$ minor $B_{E \times E}$ of $B$ also has $\sqrt{n}$ as an eigenvalue (this can also be derived from the Cauchy interlacing inequalities), and in particular this minor has operator norm at least $\sqrt{n}$. By Schur’s test, this implies that one of the rows or columns of this matrix has absolute values summing to at least $\sqrt{n}$, giving the claim.

Remark 3 The argument actually gives a strengthening of Theorem 1: there exists a vertex $x_0$ of $E$ with the property that for every natural number $k$, there are at least $n^{k/2}$ paths of length $k$ in the restriction of $Q_n$ to $E$ that start from $x_0$. Indeed, if we let $f$ be an eigenfunction of $B_{E \times E}$ with eigenvalue $\sqrt{n}$ on $E$, and let $x_0$ be a vertex in $E$ that maximises the value of $|f(x_0)|$, then for any natural number $k$ we have that the $x_0$ component of $B_{E \times E}^k f$ is equal to $n^{k/2} f(x_0)$; on the other hand, by the triangle inequality, this component is at most $|f(x_0)|$ times the number of length $k$ paths in the restriction of $Q_n$ to $E$ starting from $x_0$, giving the claim.

This argument can be viewed as an instance of a more general “interlacing method” to try to control the behaviour of a graph on all large subsets $E$ by first generating a matrix $M$ on the graph with very good spectral properties, which are then partially inherited by the $E \times E$ minor of $M$ by interlacing inequalities. In previous literature using this method (see e.g., this survey of Haemers, or this paper of Wilson), either the original adjacency matrix $A$, or some non-negatively weighted version of that matrix, was used as the controlling matrix $M$; the novelty here is the use of signed controlling matrices. It will be interesting to see what further variants and applications of this method emerge in the near future. (Thanks to Anurag Bishoi in the comments for these references.)

The “magic” step in the above argument is constructing $B$. In Huang’s paper, $B$ is constructed recursively in the dimension $n$ in a rather simple but mysterious fashion. Very recently, Roman Karasev gave an interpretation of this matrix in terms of the exterior algebra on ${\bf R}^n$. In this post I would like to give an alternate interpretation in terms of the operation of *twisted convolution*, which originated in the theory of the Heisenberg group in quantum mechanics.
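To see Proposition 2 concretely, here is a quick numerical check (my own sketch, not code from the post; `huang_matrix` follows the recursive construction from Huang's paper as it is usually presented):

```python
import numpy as np

def huang_matrix(n):
    # Recursive construction: B_1 = [[0, 1], [1, 0]] and
    # B_m = [[B_{m-1}, I], [I, -B_{m-1}]], so that B_m^2 = m * I.
    B = np.array([[0.0, 1.0], [1.0, 0.0]])
    for _ in range(n - 1):
        I = np.eye(len(B))
        B = np.block([[B, I], [I, -B]])
    return B

n = 4
B = huang_matrix(n)
assert np.allclose(B @ B, n * np.eye(2 ** n))   # B^2 = n I

# hence the eigenvalues are +-sqrt(n); trace zero forces multiplicity 2^{n-1} each
eigs = np.sort(np.linalg.eigvalsh(B))
assert np.allclose(eigs[: 2 ** (n - 1)], -np.sqrt(n))
assert np.allclose(eigs[2 ** (n - 1):], np.sqrt(n))

# |B| is exactly the adjacency matrix of Q_n: the entrywise domination condition
A = np.zeros((2 ** n, 2 ** n))
for x in range(2 ** n):
    for i in range(n):
        A[x, x ^ (1 << i)] = 1.0
assert np.array_equal(np.abs(B), A)
```

The induction behind the first assertion is one line: if $B_{m-1}^2 = (m-1)I$, then the block square of $B_m$ is $B_{m-1}^2 + I = mI$ on the diagonal and $B_{m-1} - B_{m-1} = 0$ off it.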

Firstly note that the original adjacency matrix $A$, when viewed as a linear operator on $\ell^2(({\bf Z}/2{\bf Z})^n)$, is a convolution operator

$$ Af = f * \mu, $$

where

$$ \mu := \sum_{i=1}^n \delta_{e_i} $$

is the counting measure on the standard basis $e_1,\dots,e_n$, and $*$ denotes the ordinary convolution operation

$$ f * g(x) := \sum_{y+z=x} f(y) g(z). $$

As is well known, this operation is commutative and associative. Thus for instance the square $A^2$ of the adjacency operator is also a convolution operator

$$ A^2 f = f * (\mu * \mu), $$

where the convolution kernel $\mu * \mu$ is moderately complicated:

$$ \mu * \mu(x) = n \delta_0(x) + \sum_{1 \le i < j \le n} 2\, \delta_{e_i + e_j}(x). $$

The factor $2$ in this expansion comes from combining the two terms $\delta_{e_i} * \delta_{e_j}$ and $\delta_{e_j} * \delta_{e_i}$, which both evaluate to $\delta_{e_i + e_j}$.

More generally, given any bilinear form $Q: ({\bf Z}/2{\bf Z})^n \times ({\bf Z}/2{\bf Z})^n \to {\bf Z}/2{\bf Z}$, one can define the *twisted convolution*

$$ f *_Q g(x) := \sum_{y+z=x} (-1)^{Q(y,z)} f(y) g(z) $$

of two functions $f, g: ({\bf Z}/2{\bf Z})^n \to {\bf R}$. This operation is no longer commutative (unless $Q$ is symmetric). However, it remains associative; indeed, one can easily compute that

$$ (f *_Q g) *_Q h(x) = f *_Q (g *_Q h)(x) = \sum_{y+z+w=x} (-1)^{Q(y,z)+Q(y,w)+Q(z,w)} f(y) g(z) h(w). $$

In particular, if we define the twisted convolution operator

$$ A_Q f := f *_Q \mu, $$

then the square $A_Q^2$ is also a twisted convolution operator

$$ A_Q^2 f = f *_Q (\mu *_Q \mu), $$

and the twisted convolution kernel $\mu *_Q \mu$ can be computed as

$$ \mu *_Q \mu(x) = \Big( \sum_{i=1}^n (-1)^{Q(e_i,e_i)} \Big) \delta_0(x) + \sum_{1 \le i < j \le n} \big( (-1)^{Q(e_i,e_j)} + (-1)^{Q(e_j,e_i)} \big)\, \delta_{e_i+e_j}(x). $$

For general bilinear forms $Q$, this twisted convolution kernel is just as messy as $\mu * \mu$ is. But if we take the specific bilinear form

$$ Q(x,y) := \sum_{1 \le j < i \le n} x_i y_j, \qquad (2) $$

then $Q(e_i,e_j) = 1$ for $j < i$ and $Q(e_i,e_j) = 0$ for $j \ge i$, and the above twisted convolution kernel simplifies to

$$ \mu *_Q \mu = n \delta_0 $$

and now $A_Q^2$ is very simple:

$$ A_Q^2 f = n f. $$

Thus the only eigenvalues of $A_Q$ are $+\sqrt{n}$ and $-\sqrt{n}$. The matrix $A_Q$ is entrywise dominated by $A$ in the sense of (1), and in particular has trace zero; thus the $+\sqrt{n}$ and $-\sqrt{n}$ eigenvalues must occur with equal multiplicity, so in particular the $+\sqrt{n}$ eigenvalue occurs with multiplicity $2^{n-1}$ since the matrix has dimensions $2^n \times 2^n$. Taking $B := A_Q$ establishes Proposition 2.
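These computations are easy to verify numerically. The sketch below (my own code; the explicit lower-triangular sum in `Q` is one concrete choice of bilinear form with the required properties) builds the twisted convolution operator as a signed matrix and confirms both the squaring identity and the entrywise domination:

```python
import numpy as np

n = 4

def Q(y, z):
    # bilinear form Q(y, z) = sum_{j < i} y_i z_j (mod 2); y, z are bitmasks
    return sum(((y >> i) & 1) * ((z >> j) & 1)
               for i in range(n) for j in range(i)) % 2

# A_Q f := f *_Q mu with mu the counting measure on e_1, ..., e_n;
# concretely (A_Q f)(x) = sum_i (-1)^{Q(x+e_i, e_i)} f(x+e_i).
AQ = np.zeros((2 ** n, 2 ** n))
A = np.zeros((2 ** n, 2 ** n))   # ordinary hypercube adjacency matrix
for x in range(2 ** n):
    for i in range(n):
        y = x ^ (1 << i)
        AQ[x, y] = (-1) ** Q(y, 1 << i)
        A[x, y] = 1.0

assert np.allclose(AQ @ AQ, n * np.eye(2 ** n))   # A_Q^2 = n I
assert np.array_equal(np.abs(AQ), A)              # entrywise domination by A
```

The cancellation of the cross terms $\delta_{e_i+e_j}$ is exactly what the sign pattern of the form buys: for $i \ne j$ the two contributions come with opposite signs.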

Remark 4 Twisted convolution $*_Q$ is actually just a component of ordinary convolution, but not on the original group $({\bf Z}/2{\bf Z})^n$; instead it relates to convolution on a Heisenberg group extension of this group. More specifically, define the Heisenberg group $H$ to be the set of pairs $(x,t) \in ({\bf Z}/2{\bf Z})^n \times ({\bf Z}/2{\bf Z})$ with group law

$$ (x,t)(y,s) := (x+y, t+s+Q(x,y)) $$

and inverse operation

$$ (x,t)^{-1} := (-x, -t-Q(x,x)) $$

(one can dispense with the negative signs here if desired, since we are in characteristic two). Convolution on $H$ is defined in the usual manner: one has

$$ f * g(\zeta) := \sum_{\eta\theta = \zeta} f(\eta) g(\theta) $$

for any $f, g: H \to {\bf R}$. Now if $f: ({\bf Z}/2{\bf Z})^n \to {\bf R}$ is a function on the original group, we can define the lift $\tilde f: H \to {\bf R}$ by the formula

$$ \tilde f(x,t) := (-1)^t f(x), $$

and then by chasing all the definitions one soon verifies that

$$ \tilde f * \tilde g = 2 \widetilde{f *_Q g} $$

for any $f, g: ({\bf Z}/2{\bf Z})^n \to {\bf R}$, thus relating twisted convolution $*_Q$ to Heisenberg group convolution $*$.

Remark 5 With the twisting by the specific bilinear form (2), convolution by $\delta_{e_i}$ and $\delta_{e_j}$ now anticommute rather than commute. This makes the twisted convolution algebra isomorphic to a Clifford algebra $Cl({\bf R}^n)$ (the real or complex algebra generated by formal generators $v_1,\dots,v_n$ subject to the relations $v_i v_j + v_j v_i = 2\delta_{ij}$ for $i,j = 1,\dots,n$) rather than the commutative algebra more familiar to abelian Fourier analysis. This connection to Clifford algebra (also observed independently by Tom Mrowka and by Daniel Matthews) may be linked to the exterior algebra interpretation of the argument in the recent preprint of Karasev mentioned above.

Remark 6 One could replace the form (2) in this argument by any other bilinear form $Q'$ that obeyed the relations $Q'(e_i,e_i) = 0$ and $Q'(e_i,e_j) + Q'(e_j,e_i) = 1$ for $i \ne j$. However, this additional level of generality does not add much; any such $Q'$ will differ from $Q$ by an antisymmetric form $\Psi$ (so that $\Psi(x,y) = -\Psi(y,x)$ for all $x,y$, which in characteristic two implies that $\Psi(x,y) = \Psi(y,x)$ for all $x,y$), and such forms can always be decomposed as $\Psi(x,y) = \phi(x+y) + \phi(x) + \phi(y)$, where $\phi(x) := \sum_{1 \le j < i \le n} \Psi(e_i,e_j)\, x_i x_j$. As such, the matrices $A_Q$ and $A_{Q'}$ are conjugate, with the conjugation operator being the diagonal matrix with entries $(-1)^{\phi(x)}$ at each vertex $x$.

Remark 7 (Added later) This remark combines the two previous remarks. One can view any of the matrices $A_{Q'}$ in Remark 6 as components of a single canonical matrix that is still of dimensions $2^n \times 2^n$, but takes values in the Clifford algebra $Cl({\bf R}^n)$ from Remark 5; with this “universal algebra” perspective, one no longer needs to make any arbitrary choices of form $Q'$. More precisely, let $\ell^2(({\bf Z}/2{\bf Z})^n; Cl({\bf R}^n))$ denote the vector space of functions from the hypercube to the Clifford algebra; as a real vector space, this is a $2^{2n}$-dimensional space, isomorphic to the direct sum of $2^n$ copies of $\ell^2(({\bf Z}/2{\bf Z})^n)$, as the Clifford algebra is itself $2^n$-dimensional. One can then define a canonical Clifford adjacency operator $\tilde A$ on this space by

$$ \tilde A f(x) := \sum_{i=1}^n f(x+e_i) v_i, $$

where $v_1,\dots,v_n$ are the generators of $Cl({\bf R}^n)$. This operator can either be identified with a Clifford-valued $2^n \times 2^n$ matrix or as a real-valued $2^{2n} \times 2^{2n}$ matrix. In either case one still has the key algebraic relations $\tilde A^2 = n$ and ${\rm tr}\, \tilde A = 0$, ensuring that when viewed as a real matrix, half of the eigenvalues are equal to $+\sqrt{n}$ and half equal to $-\sqrt{n}$. One can then use this matrix in place of any of the $A_{Q'}$ to establish Theorem 1 (noting that Schur’s test continues to work for Clifford-valued matrices because of the norm structure on $Cl({\bf R}^n)$).

To relate $\tilde A$ to the real matrices $A_{Q'}$, first observe that each point $x$ in the hypercube can be associated with a one-dimensional real subspace (i.e., a line) $\ell_x$ in the Clifford algebra by the formula

$$ \ell_x := \{ t\, v_{i_1} \cdots v_{i_k} : t \in {\bf R} \} $$

for any $x = e_{i_1} + \cdots + e_{i_k}$ (note that this definition is well-defined even if the $i_1,\dots,i_k$ are out of order or contain repetitions). This can be viewed as a discrete line bundle over the hypercube. Since $\ell_{x+e_i} = \ell_x v_i$ for any $i$, we see that the $2^n$-dimensional real linear subspace $V$ of $\ell^2(({\bf Z}/2{\bf Z})^n; Cl({\bf R}^n))$ of sections of this bundle, that is to say the space of functions $f$ such that $f(x) \in \ell_x$ for all $x$, is an invariant subspace of $\tilde A$. (Indeed, using the left-action of the Clifford algebra on $\ell^2(({\bf Z}/2{\bf Z})^n; Cl({\bf R}^n))$, which commutes with $\tilde A$, one can naturally identify $\ell^2(({\bf Z}/2{\bf Z})^n; Cl({\bf R}^n))$ with $Cl({\bf R}^n) \otimes V$, with the left action of $Cl({\bf R}^n)$ acting purely on the first factor and $\tilde A$ acting purely on the second factor.) Any trivialisation of this line bundle lets us interpret the restriction of $\tilde A$ to $V$ as a real $2^n \times 2^n$ matrix. In particular, given one of the bilinear forms $Q'$ from Remark 6, we can identify $\ell^2(({\bf Z}/2{\bf Z})^n)$ with $V$ by identifying any real function $f: ({\bf Z}/2{\bf Z})^n \to {\bf R}$ with the lift $\tilde f \in V$ defined by

$$ \tilde f(x) := (-1)^{\phi(x)} f(x)\, v_{i_1} \cdots v_{i_k} $$

whenever $x = e_{i_1} + \cdots + e_{i_k}$ with $i_1 < \cdots < i_k$, where $\phi(x) := \sum_{1 \le j < i \le n} (Q+Q')(e_i,e_j)\, x_i x_j$. A somewhat tedious computation using the properties of $Q'$ then eventually gives the intertwining identity

$$ \tilde A \tilde f = \widetilde{A_{Q'} f}, $$

and so $A_{Q'}$ is conjugate to the restriction of $\tilde A$ to $V$.

** — 1. The Chung, Furedi, Graham, and Seymour example — **

The paper of Chung, Furedi, Graham, and Seymour gives, for any $n$, an example of a subset $E$ of $Q_n$ of cardinality $2^{n-1}+1$ for which the maximum degree of $Q_n$ restricted to $E$ is at most $\lceil \sqrt{n} \rceil$, thus showing that Theorem 1 cannot be improved (beyond the trivial improvement of upgrading $\sqrt{n}$ to $\lceil \sqrt{n} \rceil$, because the maximum degree is obviously a natural number).

Define the “Möbius function” $\mu: ({\bf Z}/2{\bf Z})^n \to {\bf R}$ to be the function

$$ \mu(x) := (-1)^{x_1 + \cdots + x_n} $$

for $x = (x_1,\dots,x_n) \in ({\bf Z}/2{\bf Z})^n$. This function is extremely balanced on coordinate spaces. Indeed, from the binomial theorem (which uses the convention $0^0 = 1$) we have

$$ \sum_{x \in ({\bf Z}/2{\bf Z})^n} \mu(x) = \sum_{j=0}^n \binom{n}{j} (-1)^j = 0^n. $$

More generally, given any index set $I \subseteq \{1,\dots,n\}$ of cardinality $i$, we have

$$ \sum_{x \in ({\bf Z}/2{\bf Z})^n:\ x_l = 0 \ \forall l \notin I} \mu(x) = 0^i. $$

Now let be a partition of into disjoint non-empty sets. For each , let be the subspace of consisting of those such that for all . Then for any , we have

and the right-hand side vanishes if and equals when . Applying the inclusion-exclusion principle, we conclude that

and thus also (assuming )

so that

Thus, if denotes the set of those with , together with those with , then has to have two more elements than its complement , and hence has cardinality .

Now observe that, if with and , then , and if then unless . Thus in this case the total number of for which is at most . Conversely, if with and , then , and for each there is at most one that will make lie in . Hence in this case the total number of for which is at most . Thus the maximum degree of the subgraph of induced by is at most . By choosing the $I_j$ to be a partition of $\{1,\dots,n\}$ into $\lceil \sqrt{n} \rceil$ pieces, each of cardinality at most $\lceil \sqrt{n} \rceil$, we obtain the claim.
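As a small concrete check of this sharpness (my own brute-force sketch, independent of the explicit construction above): for $n = 4$, the minimum over all $2^{n-1}+1 = 9$-element subsets of $Q_4$ of the maximum induced degree is exactly $2 = \lceil \sqrt{4} \rceil$, matching both Theorem 1 and the example here:

```python
from itertools import combinations

n = 4
vertices = range(2 ** n)

def max_degree(S):
    # maximum, over x in S, of the number of hypercube neighbours of x in S
    Sset = set(S)
    return max(sum((x ^ (1 << i)) in Sset for i in range(n)) for x in S)

# minimum possible maximum degree over all subsets of size 2^{n-1} + 1
best = min(max_degree(S) for S in combinations(vertices, 2 ** (n - 1) + 1))
assert best == 2   # Theorem 1 forces >= sqrt(4) = 2; some subset attains it
```

The brute force over all $\binom{16}{9} = 11440$ subsets runs in well under a second; of course, for larger $n$ only the explicit construction is feasible.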

Remark 8Suppose that is a perfect square, then the lower bound here exactly matches the upper bound in Theorem 1. In particular, the minor of the matrix must have an eigenvector of eigenvalue . Such an eigenvector can be explicitly constructed as follows. Let be the vector defined by settingfor all . Expanding out the left-hand side, we wish to show that

First suppose that is of the form (3). One checks that lies in precisely when for one of the , in which case

Since , this simplifies using (3) as

giving (5) in this case. Similarly, if is of the form (4), then lies in precisely when , in which case one can argue as before to show that

and (5) again follows. Finally, if is not of either of the two forms (3), (4), one can check that is never of these forms either, and so both sides of (5) vanish.

The same analysis works for any of the other bilinear forms in Remark 6. Using the Clifford-valued operator from Remark 7, the eigenfunction is cleaner; it is defined by

when is of the form (3), and

when is of the form (4), with otherwise.

** — 2. From induced subgraph bounds to the sensitivity conjecture — **

On the hypercube $({\bf Z}/2{\bf Z})^n$, let $y_1,\dots,y_n: ({\bf Z}/2{\bf Z})^n \to {\bf R}$ denote the functions

$$ y_i(x) := (-1)^{x_i}. $$

The monomials $\prod_{i \in S} y_i$ (for $S \subseteq \{1,\dots,n\}$) in the $y_1,\dots,y_n$ are then the characters of $({\bf Z}/2{\bf Z})^n$, so by Fourier expansion every function $f: ({\bf Z}/2{\bf Z})^n \to {\bf R}$ can be viewed as a polynomial in the $y_1,\dots,y_n$ (with each monomial containing at most one copy of each $y_i$; higher powers of each $y_i$ are unnecessary since $y_i^2 = 1$). In particular, one can meaningfully talk about the degree $\deg(f)$ of a function $f$. Observe also that the Möbius function $\mu$ from the preceding section is just the monomial $y_1 \cdots y_n$.

Define the *sensitivity* $s(f)$ of a Boolean function $f: ({\bf Z}/2{\bf Z})^n \to \{-1,+1\}$ to be the largest number $s$ for which there is an $x \in ({\bf Z}/2{\bf Z})^n$ such that there are at least $s$ values of $i = 1,\dots,n$ with $f(x+e_i) \ne f(x)$. Using an argument of Gotsman and Linial, we can now relate the sensitivity of a function to its degree:

Corollary 9 (Lower bound on sensitivity) For any Boolean function $f: ({\bf Z}/2{\bf Z})^n \to \{-1,+1\}$, one has $s(f) \ge \sqrt{\deg(f)}$.

*Proof:* Write $d := \deg(f)$. By permuting the indices, we may assume that $f$ contains a non-trivial multiple of the monomial $y_1 \cdots y_d$. By restricting to the subspace $\{ x : x_{d+1} = \cdots = x_n = 0 \}$ (which cannot increase the sensitivity), we may then assume without loss of generality that $d = n$. The $y_1 \cdots y_n$ Fourier coefficient of $f$ is just the mean value

$$ \frac{1}{2^n} \sum_{x \in ({\bf Z}/2{\bf Z})^n} f(x) \mu(x) $$

of $f$ times the Möbius function $\mu$, so this mean value is non-zero. This means that one of the sets $\{ f = \mu \}$ or $\{ f = -\mu \}$ has cardinality at least $2^{n-1}+1$. Let $E$ denote the larger of these two sets. By Theorem 1, there is an $x \in E$ such that $x + e_i \in E$ for at least $\sqrt{n}$ values of $i$; since $\mu(x+e_i) = -\mu(x)$ for all $i$, this implies that $f(x+e_i) \ne f(x)$ for at least $\sqrt{n}$ values of $i$, giving the claim.
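Corollary 9 can be checked directly on small examples. The sketch below (my own code; `sensitivity` and `degree` are helper functions written for this illustration) computes the degree through the multilinear/Fourier coefficients and the sensitivity by brute force, for the ±1-valued AND of three bits:

```python
import math

n = 3

def sensitivity(f):
    # largest number of single-coordinate flips that change f, over all inputs x
    return max(sum(f(x ^ (1 << i)) != f(x) for i in range(n))
               for x in range(2 ** n))

def degree(f):
    # degree of f as a multilinear polynomial in the coordinates: the monomial
    # attached to a set S has coefficient proportional to
    # sum over subsets T of S of (-1)^{|T|} f(T)
    deg = 0
    for S in range(2 ** n):
        coeff, T = 0, S
        while True:
            coeff += (-1) ** bin(T).count("1") * f(T)
            if T == 0:
                break
            T = (T - 1) & S   # next subset of S
        if coeff != 0:
            deg = max(deg, bin(S).count("1"))
    return deg

AND = lambda x: 1 if x == 2 ** n - 1 else -1   # +1 only on input 111
assert degree(AND) == 3
assert sensitivity(AND) == 3   # at 111, flipping any bit changes the value
assert sensitivity(AND) >= math.sqrt(degree(AND))
```

AND actually has sensitivity equal to its degree; functions closer to the $s(f) \approx \sqrt{\deg(f)}$ regime come from the Chung–Furedi–Graham–Seymour-type constructions mentioned below.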

The construction of Chung, Furedi, Graham, and Seymour from the previous section can be easily adapted to show that this lower bound is tight (other than the trivial improvement of replacing $\sqrt{\deg(f)}$ by $\lceil \sqrt{\deg(f)} \rceil$).

Now we need to digress on some bounds involving polynomials of one variable. We begin with an inequality of Bernstein concerning trigonometric polynomials:

Lemma 10 (Bernstein inequality) Let $f: {\bf R} \to {\bf C}$ be a trigonometric polynomial of degree at most $n$, that is to say a complex linear combination of the functions $e^{ikt}$ for $k = -n,\dots,n$. Then

$$ \sup_{t \in {\bf R}} |f'(t)| \le n \sup_{t \in {\bf R}} |f(t)|. $$

Observe that equality holds when $f(t) = e^{int}$ or $f(t) = e^{-int}$. Specialising to linear combinations of $1, e^{it}, \dots, e^{int}$, we obtain the classical Bernstein inequality

$$ \sup_{|z| = 1} |P'(z)| \le n \sup_{|z| = 1} |P(z)| $$

for complex polynomials $P$ of degree at most $n$.

*Proof:* If one is willing to lose a constant factor in this estimate, this bound can be easily established from modern Littlewood-Paley theory (see e.g., Exercise 52 of these lecture notes). Here we use an interlacing argument due to Boas. We first restrict to the case when $f$ has real coefficients. We may normalise $\sup_t |f(t)| = 1$. Let $\varepsilon$ be a real parameter in $(-1,1)$. The trigonometric polynomial $\cos(nt)$ alternately takes the values $+1$ and $-1$ at the $2n$ values $t = \frac{\pi j}{n}$, $j = 0,\dots,2n-1$. Thus the trigonometric polynomial $\cos(nt) - \varepsilon f(t)$ alternates in sign at these values, and thus by the intermediate value theorem has a zero on each of the intervals $[\frac{\pi j}{n}, \frac{\pi (j+1)}{n}]$. On the other hand, a trigonometric polynomial of degree at most $n$ can be expressed by de Moivre’s theorem as $e^{-int}$ times a complex polynomial in $e^{it}$ of degree at most $2n$, and thus has at most $2n$ zeroes in a period. Thus we see that $\cos(nt) - \varepsilon f(t)$ has exactly one zero in each $[\frac{\pi j}{n}, \frac{\pi (j+1)}{n})$. Furthermore, at this zero, the derivative $-n\sin(nt) - \varepsilon f'(t)$ of this function must be positive if $\cos(nt)$ is increasing on this interval, and negative if $\cos(nt)$ is decreasing on this interval. In summary, we have shown that if $t$ and $\varepsilon \in (-1,1)$ are such that $\cos(nt) - \varepsilon f(t) = 0$, then $-n\sin(nt) - \varepsilon f'(t)$ has the same sign as $-n\sin(nt)$. By translating the function $f$, we also conclude that if $t, \varepsilon$ are such that $\cos(nt) - \varepsilon f(t+t_0) = 0$ for some $t_0$, then $-n\sin(nt) - \varepsilon f'(t+t_0)$ has the same sign as $-n\sin(nt)$.

If $\varepsilon \in (0,1)$ and $t_1 \in {\bf R}$, then we can find $t$ such that $\cos(nt) = \varepsilon f(t_1)$ and $-n\sin(nt) = n\sqrt{1 - \varepsilon^2 f(t_1)^2}$ is positive; applying the preceding conclusion with $t_0 := t_1 - t$, we conclude that $-n\sin(nt) - \varepsilon f'(t_1) > 0$; thus we have the upper bound

$$ \varepsilon f'(t_1) < n \sqrt{1 - \varepsilon^2 f(t_1)^2} \le n. $$

A similar argument (with $\varepsilon$ now chosen to be negative) similarly bounds $-f'(t_1)$. This gives the claim for real-valued trigonometric polynomials $f$. (Indeed, sending $\varepsilon \to \pm 1$, this argument even gives the slightly sharper bound $|f'(t_1)| \le n \sqrt{1 - f(t_1)^2}$.)

To amplify this to complex-valued trigonometric polynomials, we take advantage of phase rotation invariance. If is a complex trigonometric polynomial, then by applying Bernstein’s inequality to the real part we have

But then we can multiply by any complex phase and conclude that

Taking suprema in , one obtains the claim for complex polynomials .
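Bernstein's inequality is easy to sanity-check numerically (a sketch, not part of the proof; the suprema are approximated on a grid, so this is a plausibility check rather than a verification):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 7  # degree of the trigonometric polynomial

# Random complex coefficients c_k, so f(t) = sum_{k=-n}^{n} c_k e^{ikt}.
k = np.arange(-n, n + 1)
c = rng.normal(size=k.size) + 1j * rng.normal(size=k.size)

t = np.linspace(0.0, 2.0 * np.pi, 20001)
E = np.exp(1j * np.outer(t, k))  # E[j, m] = e^{i k_m t_j}
f = E @ c                        # samples of f
fp = E @ (1j * k * c)            # samples of f' (termwise differentiation)

lhs = np.abs(fp).max()           # approximates sup |f'|
rhs = n * np.abs(f).max()        # approximates n * sup |f|
print(lhs <= rhs)                # Bernstein's inequality, numerically
```

For a generic choice of coefficients the inequality holds with plenty of slack; equality requires concentrating all the mass on the extreme frequencies.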

The analogue of Bernstein’s inequality for the unit interval is known as Markov’s inequality for polynomials:

Lemma 11 (Markov’s inequality for polynomials) Let $P$ be a polynomial of degree $n$. Then

$$\sup_{x \in [-1,1]} |P'(x)| \leq n^2 \sup_{x \in [-1,1]} |P(x)|.$$

This bound is sharp, as is seen by inspecting the Chebyshev polynomial $T_n$, defined as the unique polynomial giving the trigonometric identity

$$T_n(\cos\theta) = \cos(n\theta). \qquad (6)$$
Differentiating (6) using the chain rule, we see that

$$T_n'(\cos\theta) = \frac{n \sin(n\theta)}{\sin\theta};$$

the right-hand side approaches $n^2$ as $\theta \to 0$, demonstrating that the factor $n^2$ here is sharp.
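One can check this sharpness numerically with NumPy's Chebyshev utilities (an illustrative sketch): $|T_n| \leq 1$ on $[-1,1]$, while $|T_n'|$ attains $n^2$ at the endpoints.

```python
import numpy as np
from numpy.polynomial.chebyshev import Chebyshev

n = 9
Tn = Chebyshev.basis(n)               # Chebyshev polynomial T_n
x = np.linspace(-1.0, 1.0, 20001)     # grid includes the endpoints x = -1, 1

sup_T = np.abs(Tn(x)).max()           # sup of |T_n| on [-1, 1]: equals 1
sup_dT = np.abs(Tn.deriv()(x)).max()  # sup of |T_n'|: equals n^2, at x = +-1

print(round(sup_T, 6))   # 1.0
print(round(sup_dT))     # 81 = n^2
```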

*Proof:* We again use an argument of Boas. We may normalise so that

The function is a trigonometric polynomial of degree at most , so by Bernstein’s inequality and the chain rule we have

for all . This already gives Markov’s inequality except in the edge regions (since ). By reflection symmetry, it then suffices to verify Markov’s inequality in the region .

From (6), the Chebyshev polynomial attains the values alternately at the different points . Thus, if , the polynomial changes sign at least times on , and thus must have all zeroes inside this interval by the intermediate value theorem; furthermore, of these zeroes will lie to the left of . By Rolle’s theorem, the derivative then has all zeroes in the interval , and at least of these will lie to the left of . In particular, the derivative can have at most one zero to the right of .

Since , is positive at , and hence positive as since there are no zeroes outside of . Thus the leading coefficient of is positive, which implies the same for its derivative . Thus is positive when .

From (9) one has , hence by (7) we see that is also positive at . Thus cannot become negative for , as this would create at least two zeroes to the right of . We conclude that in this region we have

From (7) we have , and the claim follows.

Remark 12 The following slightly shorter argument gives the slightly weaker bound . We again use the normalisation (8). By two applications of Bernstein’s inequality, the function has first derivative bounded in magnitude by , and second derivative bounded in magnitude by . As this function also has vanishing first derivative at , we conclude the bounds, and thus by the chain rule

For , one easily checks that the right-hand side is at most , giving the claim.

This implies a result of Ehlich-Zeller and of Rivlin-Cheney:

Corollary 13 (Discretised Markov inequality) Let be a polynomial of degree . If

*Proof:* We use an argument of Nisan and Szegedy. Assume for sake of contradiction that , so in particular . From the fundamental theorem of calculus and the triangle inequality one has

By a rescaled and translated version of Markov’s inequality we have

which when inserted into the preceding inequality gives after some rearranging

and then after a second application of (11) gives

Comparing with (10), we conclude that

and the claim follows after some rearranging.

Nisan and Szegedy observed that this one-dimensional degree bound can be lifted to the hypercube by a symmetrisation argument:

Corollary 14 (Multidimensional Markov inequality bound) Let be such that and for . Then

*Proof:* By averaging over all permutations of the indices (which can decrease the degree of , but not increase it), we may assume that is a symmetric function of the inputs . Using the Newton identities, we can then write

for some real polynomial of degree at most , where

is the Hamming length of . By hypothesis, , , and , hence by the mean-value theorem . Applying Corollary 13 with , we obtain the claim.

Define the *block sensitivity* of a Boolean function to be the largest number for which there is an such that there are at least disjoint subsets of with for . We have
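For small $n$, sensitivity and block sensitivity can be computed by exhaustive search (an illustrative sketch, not part of the proof; inputs are encoded as bitmasks of length $n$):

```python
def sensitivity(f, n):
    """s(f): max over inputs x of the number of single-bit flips that change f."""
    return max(
        sum(f(x) != f(x ^ (1 << i)) for i in range(n))
        for x in range(2 ** n)
    )

def block_sensitivity(f, n):
    """bs(f): max over x of the number of disjoint blocks B with f(x^B) != f(x)."""
    def best_at(x, free):
        # free: bitmask of coordinates not yet used by any chosen block
        best, B = 0, free
        while B:  # enumerate all nonempty submasks B of free
            if f(x) != f(x ^ B):
                best = max(best, 1 + best_at(x, free & ~B))
            B = (B - 1) & free
        return best
    return max(best_at(x, 2 ** n - 1) for x in range(2 ** n))

n = 3
OR = lambda x: int(x != 0)  # OR of the n input bits (x is a bitmask)
print(sensitivity(OR, n), block_sensitivity(OR, n))  # 3 3
```

For OR, the all-zeroes input is fully sensitive, so sensitivity and block sensitivity coincide; functions separating the two measures (quadratically, by a result of Rubinstein) are more elaborate.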

Theorem 15 (Sensitivity conjecture) One has

More precisely, the sensitivity conjecture of Nisan and Szegedy asserted the bound ; Huang’s result thus gives explicit values for the exponents. It is still open whether the exponent in this theorem can be improved; it is known that one cannot improve it to below , by analysing a variant of the Chung-Furedi-Graham-Seymour example (see these notes of Regan for details).

*Proof:* The lower bound for is immediate from the definitions, since the sensitivity arises by restricting the blocks in the definition of block sensitivity to singleton sets. To prove the upper bound, it suffices from Proposition 9 to establish the bound

Let . By hypothesis, there are and disjoint subsets of such that for . We may normalise , and . If we then define the pullback Boolean function by the formula

then it is easy to see that , , and for . The claim now follows from Corollary 14.

Remark 16 The following slightly shorter variant of the argument lets one remove the factor . Let be as above. We again may normalise and . For any , let be iid Bernoulli random variables that equal with probability and with probability . The quantity is a trigonometric polynomial of degree at most that is bounded in magnitude by , so by two applications of Bernstein’s inequality

On the other hand, for small , the random variable is equal to zero with probability and equal to each with probability , hence

and hence . Combining these estimates we obtain and hence .

Remark 17 The sensitivity of a Boolean function can be split as , where is the largest number for which there is an such that there are at least values of with . It is not difficult to use the case of Remark 3 to improve Proposition 9 slightly to . Combining this with the previous remark, we can thus improve the upper bound in Theorem 15 slightly to

I have a friend who is shall we say, pessimistic, about science communication. He thinks it’s too much risk for too little gain, too many misunderstandings while the most important stuff is so abstract the public will never understand it anyway. When I asked him for an example, he started telling me about a professor who works on **the continuum hypothesis**.

The continuum hypothesis is about different types of infinity. You might have thought there was only one type of infinity, but in the nineteenth century the mathematician Georg Cantor showed there were more, the most familiar of which are **countable** and **uncountable**. If you have a countably infinite number of things, then you can “count” them, “one, two, three…”, assigning a number to each one (even if, since they’re still infinite, you never actually finish). To imagine something uncountably infinite, think of a **continuum**, like distance on a meter stick, where you can always look at smaller and smaller distances. Cantor proved, using various ingenious arguments, that these two types of infinity are different: the continuum is “bigger” than a mere countable infinity.

Cantor wondered if there could be something in between, a type of infinity bigger than countable and smaller than uncountable. His hypothesis (now called the continuum hypothesis) was that there wasn’t: he thought there was no type of infinity between countable and uncountable.

(If you think you have an easy counterexample, you’re wrong. In particular, fractions are countable.)
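One concrete way to "count" the fractions is the Calkin-Wilf sequence, which visits every positive rational exactly once (a sketch in Python; the one-line recurrence is a known closed form for this sequence, not something from this post):

```python
from fractions import Fraction
from math import floor

def calkin_wilf():
    """Yield every positive rational exactly once (the Calkin-Wilf sequence)."""
    q = Fraction(1)
    while True:
        yield q
        q = 1 / (2 * floor(q) + 1 - q)

gen = calkin_wilf()
first = [next(gen) for _ in range(7)]
print([str(q) for q in first])  # ['1', '1/2', '2', '1/3', '3/2', '2/3', '3']
```

Assigning "one, two, three…" to the terms of this sequence is exactly the counting Cantor had in mind.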

Kurt Gödel didn’t prove the continuum hypothesis, but in 1940 he showed that at least it couldn’t be *disproved*, which you’d think would be good enough. In 1964, though, another mathematician named Paul Cohen showed that the continuum hypothesis *also can’t be proved*, at least with mathematicians’ usual axioms.

In science, if something can’t be proved or disproved, then we shrug our shoulders and say we don’t know. Math is different. In math, we choose the axioms. All we have to do is make sure they’re consistent.

What Cohen and Gödel really showed is that mathematics is consistent either way: whether the continuum hypothesis is true or false, the rest of mathematics still works just as well. You can add it as an extra axiom, an add-on that gives you different types of infinity but doesn’t change everyday arithmetic.

You might think that this, finally, would be the end of the story. Instead, it was the beginning of a lively debate that continues to this day. It’s a debate that touches on what mathematics is *for*, whether infinity is merely a concept or something out there in the world, whether some axioms are right or wrong and what happens when you change them. It involves attempts to codify intuition, arguments about which rules “make sense” that blur the boundary between philosophy and mathematics. It also involves the professor my friend mentioned, W. H. Woodin.

Now, can I explain Woodin’s research to you?

No. I don’t understand it myself, it’s far more abstract and weird than any mathematics I’ve ever touched.

Despite that, I can tell you something about it. I can tell you about the quest he’s on, its history and its relevance, what is and is not at stake. I can get you excited, *for the same reasons that I’m excited*, I can show you it’s important *for the same reasons I think it’s important*. I can give you the “flavor” of the topic, and broaden your view of the world you live in, one containing a hundred-year conversation about the nature of infinity.

My friend is right that the public will never understand everything. I’ll never understand everything either. But what we can do, what I strive to do, is to appreciate this wide weird world in all its glory. That, more than anything, is why I communicate science.

Writing writing writing. In the meantime:

- This is a very solid video about physics careers - especially at around the 10 minute mark.
- J. Robert Schrieffer has passed away. Truly a scientific pioneer.
- I came upon this article yesterday in *Nat Rev Physics*, and it seems like an exceptionally clear review of strong coupling physics between light and matter, beyond the usual Purcell factor cavity QED effects.
- SpaceX really proceeds in testing at a remarkable pace.
- The scanning Josephson microscope is a very impressive piece of kit!

For those who haven’t yet seen it, Erica Klarreich has a wonderful article in *Quanta* on Hao Huang’s proof of the Sensitivity Conjecture. *This* is how good popular writing about math can be.

Klarreich quotes my line from this blog, “I find it hard to imagine that even God knows how to prove the Sensitivity Conjecture in any simpler way than this.” However, even if God doesn’t know a simpler proof, that of course doesn’t rule out the possibility that **Don Knuth** does! And indeed, a couple days ago Knuth posted his own variant of Huang’s proof on his homepage—in Knuth’s words, fleshing out the argument that Shalev Ben-David previously posted on this blog—and then left a comment about it here, the first comment by Knuth that I know about on this blog or any other blog. I’m honored—although as for whether the variants that avoid the Cauchy Interlacing Theorem are actually “simpler,” I guess I’ll leave that between Huang, Ben-David, Knuth, and God.

In *Communications of the ACM*, Samuel Greengard has a good, detailed article on Ewin Tang and her dequantization of the quantum recommendation systems algorithm. One warning (with thanks to commenter Ted): the sentence “The only known provable separation theorem between quantum and classical is sqrt(*n*) vs. *n*” is mistaken, though it gestures in the direction of a truth. In the *black-box* setting, we can rigorously prove all sorts of separations: sqrt(*n*) vs. *n* (for Grover search), exponential (for period-finding), and more. In the non-black-box setting, we can’t prove any such separations at all.

Last week I returned to the US from the FQXi meeting in the Tuscan countryside. This year’s theme was “Mind Matters: Intelligence and Agency in the Physical World.” I gave a talk entitled “The Search for Physical Correlates of Consciousness: Lessons from the Failure of Integrated Information Theory” (PowerPoint slides here), which reprised my blog posts critical of IIT from five years ago. There were thought-provoking talks by many others who might be known to readers of this blog, including Sean Carroll, David Chalmers, Max Tegmark, Seth Lloyd, Carlo Rovelli, Karl Friston … you can see the full schedule here. Apparently video of the talks is not available yet but will be soon.

Let me close this post by sharing two important new insights about quantum mechanics that emerged from my conversations at the FQXi meeting:

(1) In Hilbert space, no one can hear you scream. Unless, that is, you scream the exact same way everywhere, or unless you split into separate copies, one for each different way of screaming.

(2) It’s true that, as a matter of logic, the Schrödinger equation does not imply the Born Rule. Having said that, if the Schrödinger equation were leading a rally, and the crowd started a chant of “BORN RULE! BORN RULE! BORN RULE!”—the Schrödinger equation would just smile and wait 13 seconds for the chant to die down before continuing.

I have just uploaded to the arXiv my paper “Sharp bounds for multilinear curved Kakeya, restriction and oscillatory integral estimates away from the endpoint“, submitted to Mathematika. In this paper I return (after more than a decade’s absence) to one of my first research interests, namely the Kakeya and restriction family of conjectures. The starting point is the following “multilinear Kakeya estimate”, first established in the non-endpoint case by Bennett, Carbery, and myself, and then in the endpoint case by Guth (with further proofs and extensions by Bourgain-Guth and Carbery-Valdimarsson):

Theorem 1 (Multilinear Kakeya estimate) Let be a radius. For each , let denote a finite family of infinite tubes in of radius . Assume the following axiom:

- (i) (Transversality) whenever is oriented in the direction of a unit vector for , we have
for some , where we use the usual Euclidean norm on the wedge product .

The original proof of this proceeded using a heat flow monotonicity method, which in my previous post I reinterpreted using a “virtual integration” concept on a fractional Cartesian product space. It turns out that this machinery is somewhat flexible, and can be used to establish some other estimates of this type. The first result of this paper is to extend the above theorem to the curved setting, in which one localises to a ball of radius (and sets to be small), but allows the tubes to be curved in a fashion. If one runs the heat flow monotonicity argument, one now picks up some additional error terms arising from the curvature, but as the spatial scale approaches zero, the tubes become increasingly linear, and as such the error terms end up being an integrable multiple of the main term, at which point one can conclude by Gronwall’s inequality (actually for technical reasons we use a bootstrap argument instead of Gronwall). A key point in this approach is that one obtains optimal bounds (not losing factors of or ), so long as one stays away from the endpoint case (which does not seem to be easily treatable by the heat flow methods). Previously, the paper of Bennett, Carbery, and myself was able to use an induction on scale argument to obtain a curved multilinear Kakeya estimate losing a factor of (after optimising the argument); later arguments of Bourgain-Guth and Carbery-Valdimarsson, based on algebraic topology methods, could also obtain a curved multilinear Kakeya estimate without such losses, but only in the algebraic case when the tubes were neighbourhoods of algebraic curves of bounded degree.

Perhaps more interestingly, we are also able to extend the heat flow monotonicity method to apply directly to the multilinear restriction problem, giving the following global multilinear restriction estimate:

Theorem 2 (Multilinear restriction theorem) Let be an exponent, and let be a parameter. Let be a sufficiently large natural number, depending only on . For , let be an open subset of , and let be a smooth function obeying the following axioms: for any , , extended by zero outside of , and denotes the extension operator

Local versions of this estimate, in which is replaced with for some , and one accepts a loss of the form , were already established by Bennett, Carbery, and myself using an induction on scale argument. In a later paper of Bourgain-Guth these losses were removed by “epsilon removal lemmas” to recover Theorem 2, but only in the case when all the hypersurfaces involved had curvatures bounded away from zero.

There are two main new ingredients in the proof of Theorem 2. The first is to replace the usual induction on scales scheme to establish multilinear restriction by a “ball inflation” induction on scales scheme that more closely resembles the proof of decoupling theorems. In particular, we actually prove the more general family of estimates

where denotes the local energies

(actually for technical reasons it is more convenient to use a smoother weight than the strict cutoff to the disk ). With logarithmic losses, it is not difficult to establish this estimate by an upward induction on . To avoid such losses we use the heat flow monotonicity method. Here we run into the issue that the extension operators are complex-valued rather than non-negative, and thus would not be expected to obey many good monotonicity properties. However, the local energies can be expressed in terms of the magnitude squared of what is essentially the Gabor transform of , and these are non-negative; furthermore, the dispersion relation associated to the extension operators implies that these Gabor transforms propagate along tubes, so that the situation becomes quite similar (up to several additional lower order error terms) to that in the multilinear Kakeya problem. (This can be viewed as a continuous version of the usual wave packet decomposition method used to relate restriction and Kakeya problems, which when combined with the heat flow monotonicity method allows one to use a continuous version of induction on scales methods that do not concede any logarithmic factors.)

Finally, one can combine the curved multilinear Kakeya result with the multilinear restriction result to obtain estimates for multilinear oscillatory integrals away from the endpoint. Again, this sort of implication was already established in the previous paper of Bennett, Carbery, and myself, but the arguments there had some epsilon losses in the exponents; here we were able to run the argument more carefully and avoid these losses.

*Hubble photo of Jupiter's aurorae.*

Another title for this could be "Making sense of de Sitter black hole thermodynamics", I suppose. What I'm going to tell you about is either a direct correspondence or a series of remarkable inspiring coincidences. Either way, I think you will come away agreeing that there is certainly something interesting afoot.

It is an idea I'd been tossing around in my head from time to time over the years, but somehow did not put it all together, and then something else I was working on years later, that was seemingly irrelevant, helped me complete the puzzle, resulting in my new paper, which (you guessed it) I'm excited about.

It all began when I was thinking about heat engines, for black holes in anti-de Sitter, which you may recall me talking about in posts here, here, and here, for example. Those are reciprocating heat engines, taking the system through a cycle that -through various stages- takes in heat, does work, and exhausts some heat, then repeats and repeats. And repeats.

I've told you the story about my realisation that there's this whole literature on quantum heat engines that I'd not known about, that I did not even know of a thing called a quantum heat engine, and my wondering whether my black hole heat engines could have a regime where they could be considered quantum heat engines, maybe enabling them to be useful tools in that arena...(resulting in the paper I described here)... and my delight in combining 18th Century physics with 21st Century physics in this interesting way.

All that began back in 2017. One thing I kept coming back to that really struck me as lovely is what can be regarded as the prototype quantum heat engine. It was recognized as such as far back as 1959!! It is a continuous heat engine, meaning that it does its heat intake and work and heat output all at the same time, as a continuous flow. It is, in fact, a familiar system - the three-level maser! (a basic laser also uses the key elements).

A maser can be described as taking in energy as heat from an external source, and giving out energy in the form of heat and work. The work is the desired [...]

The post News from the Front, XVIII: de Sitter Black Holes and Continuous Heat Engines appeared first on Asymptotia.

Our current understanding of the Universe includes the rather unsettling notion that most of its matter is not luminous - it does not clump into stars, that is. Nobody has a clue of what this Dark Matter (DM) really is, and hypotheses on what it could be made of are sold at a dime a dozen.

On the other hand, we clearly see the gravitational effects of DM on galaxies and clusters of galaxies, so the consensus of the scientific community is that one of those cheap theories must be true. What makes this very close to a dream situation for an experimental scientist is the fact that we do have instruments capable of detecting, or ruling out, dark matter behaving according to the majority of those possibilities.


It’s the most famous equation in all of physics, written on thousands of chalkboard stock photos. Part of its charm is its simplicity: E for energy, m for mass, c for the speed of light, just a few simple symbols in a one-line equation. Despite its simplicity, $E=mc^2$ is deep and important enough that there are books dedicated to explaining it.

What does $E=mc^2$ mean?

Some will tell you it means mass can be converted to energy, enabling nuclear power and the atomic bomb. This is a useful picture for chemists, who like to think about balancing ingredients: this much mass on one side, this much energy on the other. It’s not the best picture for physicists, though. It makes it sound like energy is some form of “stuff” you can pour into your chemistry set flask, and energy really isn’t like that.

There’s another story you might have heard, in older books. In that story, $E=mc^2$ tells you that in relativity mass, like distance and time, is relative. The more energy you have, the more mass you have. Those books will tell you that this is why you can’t go faster than light: the faster you go, the greater your mass, and the harder it is to speed up.

Modern physicists don’t talk about it that way. In fact, we don’t even write $E=mc^2$ that way. We’re more likely to write:

$$E = \frac{mc^2}{\sqrt{1 - v^2/c^2}}$$

“$v$” here stands for the velocity, how fast the mass is moving. The faster the mass moves, the more energy it has. Take $v$ to zero, and you get back the familiar $E=mc^2$.
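In code, the modern formula looks like this (an illustrative sketch, with $m$ the rest mass):

```python
from math import sqrt

c = 299_792_458.0  # speed of light, m/s

def energy(m, v):
    """Total relativistic energy E = m c^2 / sqrt(1 - v^2/c^2), m the rest mass."""
    return m * c ** 2 / sqrt(1.0 - (v / c) ** 2)

m = 1e-3  # one gram of rest mass, kg
print(energy(m, 0.0))                     # rest energy m c^2, about 9e13 J
print(energy(m, 0.5 * c) / (m * c ** 2))  # ~1.155: modest gain at half light speed
```

At everyday speeds the correction factor is indistinguishable from 1, which is why the rest-energy formula is the famous one.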

The older books weren’t lying to you, but they were thinking about a different notion of mass: “relativistic mass” $m$ instead of “rest mass” $m_0$, related like this:

$$m = \frac{m_0}{\sqrt{1 - v^2/c^2}}$$

which explains the difference in how we write $E=mc^2$.

Why the change? In part, it’s because of particle physics. In particle physics, we care about the rest mass of particles. Different particles have different rest mass: each electron has one rest mass, each top quark has another, regardless of how fast they’re going. They still get more energy, and harder to speed up, the faster they go, but we don’t describe it as a change in mass. Our equations match the old books, we just talk about them differently.

Of course, you can dig deeper, and things get stranger. You might hear that mass *does* change with energy, but in a very different way. You might hear that mass *is* energy, that they’re just two perspectives on the same thing. But those are stories for another day.

I titled this post “The Real E=mc^2”, but to clarify, none of these explanations are more “real” than the others. They’re words, useful in different situations and for different people. “The Real E=mc^2” isn’t the $E=mc^2$ of nuclear chemists, or old books, or modern physicists. It’s the theory itself, the mathematical rules and principles that all the rest are just trying to describe.

Ferromagnets are solids, in pretty nearly every instance I can recall (though I suppose it's not impossible to imagine an itinerant Stoner magnet that's a liquid below its Curie temperature, and here is one apparent example). There's a neat paper in Science this week, reporting liquid droplets that act like ferromagnets and can be reshaped.

The physics at work here is actually a bit more interesting than just a single homogeneous material that happens to be liquid below its magnetic ordering temperature. The liquid in this case is a suspension of magnetite nanoparticles. Each nanoparticle is magnetic, as the microscopic ordering temperature for Fe_{3}O_{4} is about 858 K. However, the individual particles are so small (22 nm in diameter) that they are superparamagnetic at room temperature, meaning that thermal fluctuations are energetic enough to reorient how the little north/south poles of the single-domain particles are pointing. Now, if the interface at the surface of the suspension droplet confines the nanoparticles sufficiently, they jam together with such small separations that their magnetic interactions are enough to lock their magnetizations, killing the superparamagnetism and leading to a bulk magnetic response from the aggregate. Pretty cool! (Extra-long-time readers of this blog will note that this hearkens waaaay back to this post.)
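To get a feel for the superparamagnetic regime, one can compare the single-particle anisotropy barrier $KV$ to the thermal energy $k_B T$ (a rough order-of-magnitude sketch; the anisotropy constant $K$ below is an assumed textbook-scale value for magnetite, not a number from the paper):

```python
from math import pi

k_B = 1.380649e-23  # Boltzmann constant, J/K
T = 300.0           # room temperature, K

d = 22e-9                # particle diameter from the paper, m
V = (pi / 6) * d ** 3    # volume of a spherical particle

# Effective anisotropy constant for magnetite: an assumed, order-of-magnitude
# value (~1e4 J/m^3); the real number depends on particle shape and surface.
K = 1.1e4

barrier = K * V               # energy barrier to flipping the particle's moment
print(barrier / (k_B * T))    # O(10): a modest barrier, so thermal flips happen
```

A barrier of only tens of $k_B T$ means thermal flips occur on laboratory timescales, which is the essence of superparamagnetism; jamming the particles together raises the effective barrier through their mutual interactions.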


Applied Category Theory 2019 happened last week! It was very exciting: about 120 people attended, and they’re pushing forward to apply category theory in many different directions. The topics ranged from ultra-abstract to ultra-concrete, sometimes in the same talk.

The talks are listed above. Below you can read what Jules Hedges and I wrote about all those talks:

• Jules Hedges, Applied Category Theory 2019.

I tend to give terse summaries of the talks, with links to the original papers or slides. Jules tends to give his impressions of their overall significance. They’re nicely complementary.

You can also see videos of some talks, created by Jelle Herold with help from Fabrizio Genovese:

• Giovanni de Felice, Functorial question answering.

• Antonin Delpeuch, Autonomization of monoidal categories.

• Colin Zwanziger, Natural model semantics for comonadic and adjoint modal type theory.

• Nicholas Behr, Tracelets and tracelet analysis Of compositional rewriting systems.

• Dan Marsden, No-go theorems for distributive laws.

• Christian Williams, Enriched Lawvere theories for operational semantics.

• Walter Tholen, Approximate composition.

• Erwan Beurier, Interfacing biology, category theory & mathematical statistics.

• Stelios Tsampas, Categorical contextual reasoning.

• Fabrizio Genovese, idris-ct: A library to do category theory in Idris.

• Michael Johnson, Machine learning and bidirectional transformations.

• Bruno Gavranović, Learning functors using gradient descent

• Zinovy Diskin, Supervised learning as change propagation with delta lenses.

• Bryce Clarke, Internal lenses as functors and cofunctors.

• Ryan Wisnewsky, Conexus AI.

• Ross Duncan, Cambridge Quantum Computing.

• Erwan Beurier, Memoryless systems generate the class of all discrete systems.

• Blake Pollard, Compositional models for power systems.

• Martti Karvonen, A comonadic view of simulation and quantum resources.

• Quanlong Wang, ZX-Rules for 2-qubit Clifford+T quantum circuits, and beyond.

• James Fairbanks, A compositional framework for scientific model augmentation.

• Titouan Carette, Completeness of graphical languages for mixed state quantum mechanics.

• Antonin Delpeuch, A complete language for faceted dataflow languages.

• John van de Wetering, An effect-theoretic reconstruction of quantum mechanics.

• Vladimir Zamdzhiev, Inductive datatypes for quantum programming.

• Octavio Malherbe, A categorical construction for the computational definition of vector spaces.

• Vladimir Zamdzhiev, Mixed linear and non-linear recursive types.

Yoram Alhassid asked the question at the end of my Yale Quantum Institute colloquium last February. I knew two facts about Yoram: (1) He belongs to Yale’s theoretical-physics faculty. (2) His PhD thesis’s title—“On the Information Theoretic Approach to Nuclear Reactions”—ranks among my three favorites.^{1}

Over the past few months, I’ve grown to know Yoram better. He had reason to ask about quantum statistical mechanics, because his research is up to its ears in the field. If forced to synopsize quantum statistical mechanics in five words, I’d say, “study of many-particle quantum systems.” Examples include gases of ultracold atoms. If given another five words, I’d add, “Calculate and use partition functions.” A partition function is a measure of the number of states, or configurations, accessible to the system. Calculate a system’s partition function, and you can calculate the system’s average energy, the average number of particles in the system, how the system responds to magnetic fields, etc.
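The "calculate $Z$, get everything" point can be sketched for the simplest case, a two-level system (an illustrative snippet, not from the post; units with energies in units of $k_B T$, so $\beta = 1$):

```python
import numpy as np

def thermal(energies, beta):
    """Partition function Z = sum_i e^{-beta E_i}, plus an average it yields."""
    w = np.exp(-beta * np.asarray(energies, dtype=float))
    Z = w.sum()
    p = w / Z                                 # Boltzmann probabilities
    E_avg = (p * np.asarray(energies)).sum()  # <E> = -d(ln Z)/d(beta)
    return Z, E_avg

# Two-level system with level spacing Delta, at inverse temperature beta = 1
Delta = 1.0
Z, E_avg = thermal([0.0, Delta], beta=1.0)
print(Z)      # 1 + e^{-1}, about 1.368
print(E_avg)  # Delta e^{-1} / (1 + e^{-1}), about 0.269
```

The same recipe, with a larger list of energies, gives heat capacities, magnetizations, and the other response properties mentioned above.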

My colloquium concerned quantum thermodynamics, which I’ve blogged about many times. So I should have been able to distinguish quantum thermodynamics from its neighbors. But the answer I gave Yoram didn’t satisfy me. I mulled over the exchange for a few weeks, then emailed Yoram a 502-word essay. The exercise grew my appreciation for the question and my understanding of my field.

An adaptation of the email appears below. The adaptation should suit readers who’ve majored in physics, but don’t worry if you haven’t. Bits of what distinguishes quantum thermodynamics from quantum statistical mechanics should come across to everyone—as should, I hope, the value of question-and-answer sessions:

One distinction is a return to the operational approach of 19th-century thermodynamics. Thermodynamicists such as Sadi Carnot wanted to know how effectively engines could operate. Their practical questions led to fundamental insights, such as the Carnot bound on an engine’s efficiency. Similarly, quantum thermodynamicists often ask, “How can this state serve as a resource in thermodynamic tasks?” This approach helps us identify what distinguishes quantum theory from classical mechanics.

For example, quantum thermodynamicists found an advantage in charging batteries via nonlocal operations. Another example is the “MBL-mobile” that I designed with collaborators. Many-body localization (MBL), we found, can enhance an engine’s reliability and scalability.

Asking, “How can this state serve as a resource?” leads quantum thermodynamicists to design quantum engines, ratchets, batteries, etc. We analyze how these devices can outperform classical analogues, identifying which aspects of quantum theory power the outperformance. This question and these tasks contrast with the questions and tasks of many non-quantum-thermodynamicists who use statistical mechanics. They often calculate response functions and (e.g., ground-state) properties of Hamiltonians.

These goals of characterizing what nonclassicality is and what it can achieve in thermodynamic contexts resemble upshots of quantum computing and cryptography. As a 21st-century quantum information scientist, I understand what makes quantum theory quantum partially by understanding which problems quantum computers can solve efficiently and classical computers can’t. Similarly, I understand what makes quantum theory quantum partially by understanding how much more work you can extract from a singlet (a maximally entangled state of two qubits) than from a product state in which the reduced states have the same forms as in the singlet.

As quantum thermodynamics shares its operational approach with quantum information theory, quantum thermodynamicists use mathematical tools developed in quantum information theory. An example consists of generalized entropies. Entropies quantify the optimal efficiency with which we can perform information-processing and thermodynamic tasks, such as data compression and work extraction.

Most statistical-mechanics researchers use just the Shannon and von Neumann entropies, and perhaps the occasional relative entropy. These entropies quantify optimal efficiencies in large-system limits, e.g., as the number of messages compressed approaches infinity and in the thermodynamic limit.

Other entropic quantities have been defined and explored over the past two decades, in quantum and classical information theory. These entropies quantify the optimal efficiencies with which tasks can be performed (i) if the number of systems processed or the number of trials is arbitrary, (ii) if the systems processed share correlations, (iii) in the presence of “quantum side information” (if the system being used as a resource is entangled with another system, to which an agent has access), or (iv) if you can tolerate some probability that you fail to accomplish your task. Instead of limiting ourselves to these, we also use “ε-smoothed entropies,” Rényi divergences, hypothesis-testing entropies, conditional entropies, etc.
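For reference, the standard definitions (my addition, not spelled out in the original post): the Shannon and Rényi entropies of a probability distribution $p$ are

```latex
H(p) = -\sum_i p_i \log p_i, \qquad
H_\alpha(p) = \frac{1}{1-\alpha} \log \sum_i p_i^{\alpha} \quad (\alpha \neq 1),
```

with $H_\alpha \to H$ as $\alpha \to 1$. The quantum (von Neumann) analogue is $S(\rho) = -\operatorname{Tr}(\rho \log \rho)$, and ε-smoothed variants optimize the entropy over all states within distance ε of $\rho$.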

Another hallmark of quantum thermodynamics is results’ generality and simplicity. Thermodynamics characterizes a system with a few macroscopic observables, such as temperature, volume, and particle number. The simplicity of some quantum thermodynamics served a chemist collaborator and me, as explained in the introduction of https://arxiv.org/abs/1811.06551.

Yoram’s question reminded me of one reason why, as an undergrad, I adored studying physics in a liberal-arts college. I ate dinner and took walks with students majoring in economics, German studies, and Middle Eastern languages. They described their challenges, which I analyzed with the physics mindset that I was acquiring. We then compared our approaches. Encountering other disciplines’ perspectives helped me recognize what tools I was developing as a budding physicist. How can we know our corner of the world without stepping outside it and viewing it as part of a landscape?

^{1}The title epitomizes clarity and simplicity. And I have trouble resisting anything advertised as “the information-theoretic approach to such-and-such.”

Bob Coecke, David Spivak, Christina Vasilakopoulou and I are running a conference on applied category theory:

• Applied Category Theory 2019, 15–19 July, 2019, Lecture Theatre B of the Department of Computer Science, 10 Keble Road, Oxford.

You can now see the program here, or below. Hope to see you soon!

While you and I may have been lagging behind a bit as of late, excused by a particularly hot July, the CMS collaboration has kept grinding its data, producing exquisite new results from the large amounts of proton-proton collisions that the experiment has been collecting during Run 2, i.e. until last year.

Of course, the process of going from subatomic collisions to submitted papers is a long and complex one. The first part of the journey involves triggering, storage, and reconstruction of the acquired datasets, and reduction of the data into an analysis-friendly format. While this might sound like a partly automatic and painless procedure, it involves the expense of liters of sweat by conscientious collaborators who oversee the data acquisition and their processing.

While I wait to board a flight at my favorite location on earth—Philadelphia International Airport—I figured I might as well blog something to mark the 50^{th} anniversary of Apollo 11. (Thanks also to Joshua Zelinsky for a Facebook post that inspired this.)

I wasn’t alive for Apollo, but I’ve been alive for 3/4 of the time *after* it, even though it now seems like ancient history—specifically, like a Roman cathedral being gawked at by a medieval peasant, like an achievement by some vanished, more cohesive civilization that we can’t even replicate today, let alone surpass.

Which brings me to a depressing mystery: why do so many people now deny that humans walked on the moon at all? Like, why *that* specifically? While they’re at it, why don’t they also deny that WWII happened, or that the Beatles existed?

Surprisingly, skepticism of the reality of Apollo seems to have gone all the way back to the landings themselves. One of my favorite stories growing up was of my mom, as a teenager, working as a waitress at an Israeli restaurant in Philadelphia, on the night of the Apollo 11 landing. My mom asked for a few minutes off to listen to news of the landing on the radio. The owners wouldn’t grant it—explaining that it was all Hollywood anyway, just some actors in spacesuits on a sound stage, and obviously my mom wasn’t so naïve as to think anyone was *actually* walking to the moon?

Alas, as we get further and further from the event, with no serious prospect of ever replicating it past the stage of announcing an optimistic timetable (nor, to be honest, any scientific *reason* to replicate it), as the people involved die off, and as our civilization becomes ever more awash in social-media-fueled paranoid conspiracies, I fear that moon-landing denialism will become more common.

Because here’s the thing: Apollo could happen, but *only* because of a wildly improbable, once-in-history confluence of social and geopolitical factors. It was economically insane, taking 400,000 people and 4% of the US federal budget for some photo-ops, a flag-planting, some data and returned moon rocks that had genuine scientific value but could’ve been provided much more cheaply by robots. It was dismantled immediately afterwards like a used movie set, rather than leading to any greater successes. Indeed, manned spaceflight severely *regressed* afterwards, surely mocking the expectations of every last science fiction fan and techno-utopian who was alive at that time.

One could summarize the situation by saying that, in certain respects, the Apollo program really **was** “faked.” It’s just that the way they “faked” it, involved actually landing people on the moon!

A few weeks back I posted a poll, asking you guys what sort of physics background you have. The idea was to follow up on a poll I did back in 2015, to see how this blog’s audience has changed.

One thing that immediately leaped out of the data was how many of you are physicists. As of writing this, 66% of readers say they either have a PhD in physics or a related field, or are currently in grad school. This includes 7% specifically from my sub-field, “amplitudeology” (though this number may be higher than usual since we just had our yearly conference, and more amplitudeologists were reminded my blog exists).

I didn’t use the same categories in 2015, so the numbers can’t be easily compared. In 2015 only 2.5% of readers described themselves as amplitudeologists. Adding these up with the physics PhDs and grad students gives 59%, which goes up to 64.5% if I include the mathematicians (who this year might have put either “PhD in a related field” or “Other Academic”). So overall the percentages are pretty similar, though now it looks like more of my readers are grad students.

Despite the small difference, I am a bit worried: it looks like I’m losing non-physicist readers. I could flatter myself and think that I inspired those non-physicists to go to grad school, but more realistically I should admit that fewer of my posts have been interesting to a non-physics audience. In 2015 I worked at the Perimeter Institute, and helped out with their public lectures. Now I’m at the Niels Bohr Institute, and I get fewer opportunities to hear questions from non-physicists. I get fewer ideas for interesting questions to answer.

I want to keep this blog’s language accessible and its audience general. I appreciate that physicists like this blog and view it as a resource, but I don’t want it to turn into a blog for physicists only. I’d like to encourage the non-physicists in the audience: ask questions! Don’t worry if it sounds naive, or if the question seems easy: if you’re confused, likely others are too.

One of my earliest memories is standing with my father on the balcony of my grandmother’s house in Auckland. “Ma’s House” had a spectacular view northwards, across Auckland’s Hauraki Gulf, and the Moon was visible in the early evening sky. Following my eye, my father pointed and said, “I think there are people there at the moment.”

I would love to describe this as the father-son moment that lit my lifelong passion for astronomy, but doing so would take me far from the truth. My love of astronomy must have already been well aflame, because my actual recollection is that this was the first time I saw my father – a man frequently summoned to the hospital to tend the sick, who could fix any broken toy and who was able to split logs with an axe – as imperfect, fallible and human. How could any adult, I wondered, not know *exactly* how many people were standing on that pale little ball at *any* moment?

My memory is that this moment took place during a pre-Christmas visit. Looking at the list of Apollo missions I can work out the date: the only landing in any December was Apollo 17, the last mission, just after my sixth birthday in 1972. While it was clear that there would be a hiatus in lunar travel after Apollo 17, it is still a shock to realise that I am now in my 50s and that no one significantly younger than me can have a clear memory of a moment when human beings were standing on another world. Ever since that day there have always been precisely zero people standing on the surface of the Moon.

The Apollo landings took place alongside the Vietnam War and the Civil Rights struggle and constituted the winning entry in a race with the Soviet Union, a competition animated by superpower rivalry. But they are also a story of commitment, courage, teamwork and vision, a literal moonshot that shines as a moment of optimism and purpose. Perhaps even more so when you set it against the wider turmoil of the 1960s. I am not the first person to say this, but before Apollo “flying to the Moon” was a byword for an impossible, ridiculous dream – afterwards it was a metaphor for what can be done when we set ourselves to achieve lofty goals.

Sooner or later we will go back. Several countries and even SpaceX, a private company, are developing concrete plans to return human beings to the Moon. However, even with the most optimistic schedules those new voyages are still a few years off.

If it seems that we have had an unduly long wait, it is worth recalling that after the first trips to the South Pole — with sleds pulled by men and dogs — it was close to 50 years before human beings again stood at 90 degrees South, but those subsequent visits were made with tractors and planes, and marked the beginning of a permanent human presence at the Pole.

Similarly, the Apollo missions pushed their 1960s technology to the absolute limit, and Apollo 13 escaped tragedy by the narrowest of margins. While the final missions spent several days on the lunar surface, the Apollo programme would have been hard-pressed to provide the foundation for a permanent lunar base. When humans do go back to the Moon, they will do so using spacecraft and rockets that have been developed to the point where lunar travel can conceivably become routine.

And then – perhaps in a decade or two – a permanent human presence on the Moon will develop to the point where even the best-informed parents might be forgiven for not knowing exactly how many people are standing on the Moon.

IMAGE: The header image is from the Apollo 16 mission. The launch photo above is Apollo 11, on its way to the moon atop a Saturn V rocket. Both images via NASA.

The analogy between lunar and polar exploration has been made by a number of people — I believe I first heard it from the (now sadly deceased) polymathic Columbia astronomer Arlin Crotts, when we were colleagues in the early 2000s, and it appears in his book The New Moon.

Video games have been a part of my life for about as long as I can remember: from *Paperboy* and *The Last Ninja* on the Commodore 64, when I was barely old enough to operate a keyboard, to *Mario Kart 8* and *Zelda* on the Nintendo Switch now, as a postdoc at Caltech working on quantum computing and condensed matter physics. Up until recently, I have kept my two lives separate: my love of video games and my career in quantum physics.

The realization that I could combine quantum physics with games came during an entertaining discussion with my current supervisor, Gil Refael. Gil and I were brainstorming approaches to develop a quantum version of Tetris. Instead of stopping and laughing it off, or even keeping the idea on the horizon, Gil suggested that we talk to Spyridon (Spiros) Michalakis for some guidance.

This is not the story of Quantum Tetris (yet), but rather the story of how we made a quantum version of a much older, and possibly more universally known game. This is a new game that Spiros and I have been testing at elementary schools.

And so I am super excited to be able to finally present to you: Quantum TiqTaqToe! As of *right now*, the app is available both for Android devices and iPhone/iPad:

Gil and I knew that Spiros had been involved in prior quantum games (most notably qCraft and Quantum Chess), so he seemed like the perfect contact point. He was conveniently located on the same campus, and even in the same department. But more importantly, he was curious about the idea and eager to talk.

After introducing the idea of Quantum Tetris, Spiros came up with an alternative approach. Seeing as this was going to be my first attempt at creating a video game, not to mention building a game from the ground up with quantum physics, he proposed to put me in touch with Chris Cantwell and help him improve the AI for Quantum Chess.

I thought long and hard about this proposition. Like five seconds. It was an amazing opportunity. I would get to look under the hood of a working and incredibly sophisticated video game, unlike any game ever made: the only game in the world I knew of that was truly based on quantum physics. And I would be solving a critical problem that I would have to deal with eventually, by adapting a conventional, classical rules-based game AI for quantum.

My first focus was to jump on Quantum Chess full-force, with the aim of helping Chris implement a new AI player for the game. After evaluating some possible chess-playing AI engines, including state-of-the-art players based off of Google’s AlphaZero, we landed on Stockfish as our best candidate for integration. The AI is currently hot-swappable though, so users can try to develop their own!

While some of the work for implementing the AI could be done directly using Chris’s C++ implementation of Quantum Chess, other aspects of the work required me to learn the program he had used to develop the user interface. That program is called Unity. Unity is a free game development program that I would highly recommend trying out and playing around with.

This experience was essential to the birth of Quantum TiqTaqToe. In my quest to understand Unity and Quantum Games, I set out to implement a “simple” game to get a handle on how all the different game components worked together. Having a game based on quantum mechanics is one thing; making sure it is fun to play requires an entirely different skill set.

Classic Tic-Tac-Toe is a game in which two players, called X and O, take turns in placing their symbols on a 3×3 grid. The first player to get 3 of their symbols in a line (diagonally, vertically or horizontally) wins. The game goes as far back as ancient Egypt, and evidence of the game has been found on roof tiles dating to 1300 BC [1].

Many variations of the game have existed across many cultures. The first print reference to a game called “tick-tack-toe” was in 1884. In the US the game was renamed “tic-tac-toe” sometime in the 20th century. Here’s a random fun fact: in Dutch, the game is most often referred to as “Butter-Cheese-and-Eggs” [2]. In 1952, computer scientist Alexander S. Douglas at the University of Cambridge turned it into one of the first computer games, featuring an AI player that could play perfect games against a human opponent.

Combinatorics has determined that whoever plays first will win 91 out of 138 possible board combinations. The second player will win in 44 boards. However, if both players play *optimally*, looking ahead through all the possible future outcomes, neither player should ever win and the game always ends in a draw, in one of only 3 board combinations.
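The claim that optimal classical play always ends in a draw can be checked directly with a short brute-force search. Here is a minimal minimax sketch (my own construction, not from the post; the board encoding is an illustrative choice):

```python
# Brute-force check that perfect-play classical Tic-Tac-Toe is a draw.
# Board: list of 9 cells, ' ' (empty), 'X', or 'O'.

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def winner(board):
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None

def minimax(board, player):
    """Game value under optimal play: +1 if X wins, -1 if O wins, 0 draw."""
    w = winner(board)
    if w:
        return 1 if w == 'X' else -1
    moves = [i for i, s in enumerate(board) if s == ' ']
    if not moves:
        return 0                      # full board, no winner: draw
    values = []
    for i in moves:
        board[i] = player
        values.append(minimax(board, 'O' if player == 'X' else 'X'))
        board[i] = ' '                # undo the move
    return max(values) if player == 'X' else min(values)

print(minimax([' '] * 9, 'X'))  # 0: perfect play always draws
```

The same search does not settle Quantum TiqTaqToe, since collapses make the game tree stochastic rather than deterministic.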

*In Quantum TiqTaqToe, with the current ruleset, we don’t yet know if a winning strategy exists.*

I explicitly refer to the *current ruleset* because we currently limit the amount of quantumness in the game. We want to make sure the game is fun to play and ‘graspable’ for now. In addition, it turns out there already is a game called Quantum TicTacToe, developed by Allan Goff [3]. That version of TicTacToe has similar concepts but has a different set of rules.

A typical game of Quantum TiqTaqToe will look very much like regular Tic-Tac-Toe until one of the players decides to make a quantum move:

At this point, the game board enters into a superposition. The X is in each position with 50/50 chance; in one universe the X is on the left and in the other it is on the right. Neither player knows how things will play out. And the game only gets more interesting from here. The opponent can choose to place their O in a superposition between an empty square and a square occupied by a quantum X.

Et voilà, player O has entangled his fate with his opponent’s. Once the two squares become entangled, the only outcomes are X-O or O-X, each with probability ½. Interestingly, since the game is fully quantum, the phase between the two entangled outcomes can in principle be leveraged to create interesting plays through destructive and constructive interference. The app features a simple tutorial (to be updated) that teaches you these moves and a few others. There are boards that classically result in a draw but are quantumly “winnable”.

The squares in TiqTaqToe are all fully quantum. I represent them as qutrits (like qubits, but instead of having states 0 and 1 my qutrits have states 0, 1 and 2), and moves made by the players are unitary operations acting on them. So the game consists of these essential elements:

- The squares of the 3×3 grid are turned into qutrits (Empty, X, O). Each move is a unitary gate operation on those qutrits. I’ll leave the details of the math out, but for the case of qubits check out Chris’ detailed writeup on Quantum Chess [4].
- Quantum TiqTaqToe allows you to select *two* squares in the grid, providing you with the option of creating a superposition or an entangled state. For the sake of simplicity (i.e. keeping the game fun to play and ‘graspable’ for now), no more than 3 squares can be involved in a given entangled state.

I chose to explicitly track sets of qutrits that share a Hilbert space. The entire quantum state of the game combines these sets with classical strings of the form “XEEOXEOXE”, indicating that the first square is an X, the second is Empty, etc.
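As a rough illustration of this bookkeeping (my own toy sketch, not the app's actual implementation; the function names and the amplitude tracking are assumptions), one can hold the board as a superposition of classical strings mapped to amplitudes:

```python
import math

# Toy model: the game state is a dict {classical board string: amplitude},
# with 'E' = empty, 'X', 'O', e.g. {'XEEOXEOXE': 1.0}. Real amplitudes only.

def superposition_move(state, symbol, sq1, sq2):
    """Quantum move: `symbol` goes to sq1 in one branch, sq2 in the other.
    Assumes both squares are empty in every branch."""
    new_state = {}
    for board, amp in state.items():
        for sq in (sq1, sq2):
            b = board[:sq] + symbol + board[sq + 1:]
            new_state[b] = new_state.get(b, 0.0) + amp / math.sqrt(2)
    return new_state

def entangling_move(state, symbol, sq1, sq2):
    """In each branch, `symbol` fills whichever of sq1/sq2 is still empty."""
    new_state = {}
    for board, amp in state.items():
        sq = sq1 if board[sq1] == 'E' else sq2
        b = board[:sq] + symbol + board[sq + 1:]
        new_state[b] = new_state.get(b, 0.0) + amp
    return new_state

state = {'E' * 9: 1.0}                        # empty board
state = superposition_move(state, 'X', 0, 1)  # quantum X over squares 0 and 1
state = entangling_move(state, 'O', 0, 1)     # O entangles with the quantum X
for board, amp in sorted(state.items()):
    print(board, round(amp ** 2, 2))          # X-O or O-X, each with prob 0.5
```

This reproduces the two-outcome entangled state described above; the real game also tracks phases, which this sketch's positive amplitudes do not exercise.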

So, when does the game end if these quantum states are in play? In Quantum TiqTaqToe, the board collapses to a single classical state as soon as it is full (i.e. every square is non-empty). The resulting state is randomly chosen from all the possible outcomes, with a probability that is equal to the (square of the) wave-function amplitude (basic quantum mechanics). If there is a winner after the collapse, the game ends. Otherwise, the game continues until either there *is* a winner or until there are no more moves to be made (ending in a draw). On top of this, players get the option to forfeit their move for the opportunity to cause a partial collapse of the state, by using the collapse-mode. Future versions may include other ways of collapse, including one that does not involve rolling dice! [5]
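The collapse rule just described — sample one classical board with probability equal to the squared amplitude — could be sketched like this (again a simplified version of my own, with real amplitudes only):

```python
import random

# Collapse: once the board is full, draw one classical board string with
# probability |amplitude|^2 (Born rule), discarding all other branches.

def collapse(state, rng=random):
    boards = list(state)
    weights = [state[b] ** 2 for b in boards]
    return rng.choices(boards, weights=weights, k=1)[0]

entangled = {'XOEEEEEEE': 2 ** -0.5, 'OXEEEEEEE': 2 ** -0.5}
print(collapse(entangled))  # one of the two boards, 50/50
```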

Due to quantum physics and the collapse of the state, the game is inherently statistical. So instead of asking: “Can I beat my opponent in a game of Quantum TiqTaqToe?” one should ask “If I play 100 games against my opponent, can I consistently win more than 50 of them?”

You can test your skill against the in-game quantum AI to see if you’ve indeed mastered Quantum TiqTaqToe yet. At the hardest setting, winning even 30% of the time after, say, 20 games may be extraordinary. The implementation of this AI, by the way, would have been a blog-post by itself. For the curious, I can say it is based on the ExpectiMiniMax algorithm. As of the moment of this writing, the hardest AI setting is not available in the app yet. Keep your eyes out for an update soon though!
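For the curious, the ExpectiMiniMax idea is standard: alternate max and min layers as in ordinary minimax, but average over chance nodes, which here would correspond to collapses. A generic toy sketch (my own, not the in-game AI's code):

```python
# Expectiminimax over a toy game tree. A node is either a numeric leaf,
# ('max', [children]), ('min', [children]), or ('chance', [(prob, child), ...]).

def expectiminimax(node):
    if isinstance(node, (int, float)):
        return node
    kind, children = node
    if kind == 'max':
        return max(expectiminimax(c) for c in children)
    if kind == 'min':
        return min(expectiminimax(c) for c in children)
    # chance node: probability-weighted average of the children
    return sum(p * expectiminimax(c) for p, c in children)

# A collapse is a chance node: 50/50 between two sub-games.
tree = ('max', [
    ('chance', [(0.5, ('min', [1, 3])), (0.5, ('min', [-2, 4]))]),  # 0.5*1 + 0.5*(-2) = -0.5
    ('min', [0, 2]),                                                # 0
])
print(expectiminimax(tree))  # prints 0: the deterministic branch is preferred
```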

“*Perhaps kids who grow up playing quantum games will acquire a visceral understanding of quantum phenomena that our generation lacks*.” – John Preskill, in his recent article [6].

From the get-go, Quantum TiqTaqToe (and Quantum Chess) have had outreach as a core motivation. Perhaps future quantum engineers and quantum programmers will look back on their youth and remember playing Quantum TiqTaqToe as I remember my Commodore 64 games. I am convinced that these small steps into the realm of Quantum Games are only just the beginning of an entirely new genre of fun and useful games.

In the meantime, we are hard at work implementing an Online mode so you can play with your fellow human friends remotely too. This online mode, plus the option of fighting a strong quantum AI, will be unlockable in-game through a small fee (unless you are an educator who wishes to introduce quantum physics in class through this game; those use cases are fee-free courtesy of IQIM and NSF). Each purchase will go towards supporting the future development of exciting new Quantum TiqTaqToe features, as well as other exciting Quantum Games (Tetris, anyone?)

Just in case you missed it: the app is available both for Android devices and iPhone/iPad *right now*:

I really hope you enjoy the game, and perhaps use it to get your friends and family excited about quantum physics. Oh, and start practicing! You never know if the online mode will bring along with it a real Quantum TiqTaqToe Tournament down the road.

[1] https://en.wikipedia.org/wiki/Tic-tac-toe

[2] The origin of this name in Dutch isn’t really certain as far as I know. Allegedly, it is a left-over from the period in which butter, cheese and eggs were sold at the door (so was milk, but that was done separately since it was sold daily). The salesman had a list with columns for each of these three products, and would jot down a cross or a zero whenever a customer at an address bought or declined a product. Three crosses in a row would earn them praise from the boss.

[3] https://en.wikipedia.org/wiki/Quantum_tic-tac-toe

[4] https://arxiv.org/abs/1906.05836

[5] https://en.wiktionary.org/wiki/God_does_not_play_dice_with_the_universe

I just returned from some travel, and I have quite a bit of writing I need to do, but here are a few items of interest:

- No matter how many times I see them (here I discussed a result from ten years ago), I'm still impressed by images taken of molecular orbitals, as in the work by IBM Zurich that has now appeared in Science. Here is the relevant video.
- Speaking of good videos, here is a talk by Tadashi Tokieda, presently at Stanford, titled "Science from a Sheet of Paper". Really nicely done, and it shows a great example of how surprising general behavior can emerge from simple building blocks.
- It's a couple of years old now, but this is a nice overview of the experimental state of the problem of high temperature superconductivity, particularly in the cuprates.
- Along those lines, here is a really nice article from SciAm by Greg Boebinger about achieving the promise of those materials.
- Arguments back and forth continue about the metallization of hydrogen.
- And Sean Carroll shows how remunerative it can be to be a science adviser for a Hollywood production.

A few months ago, I got to know Jerry Coyne, the recently-retired biologist at the University of Chicago who writes the blog “Why Evolution Is True.” The interaction started when Jerry put up a bemused post about my thoughts on predictability and free will, and I pointed out that if he wanted to engage me on those topics, there was more to go on than an 8-minute YouTube video. I told Coyne that it would be a shame to get off on the wrong foot with him, since perusal of his blog made it obvious that whatever he and I disputed, it was dwarfed by our areas of agreement. He and I exchanged more emails and had lunch in Chicago.

By way of explaining how he hadn’t read “The Ghost in the Quantum Turing Machine,” Coyne emphasized the difference in my and his turnaround times: while these days I update my blog only a couple times per month, Coyne often updates multiple times per *day*. Indeed the sheer volume of material he posts, on subjects from biology to culture wars to Chicago hot dogs, would take months to absorb.

Today, though, I want to comment on just two posts of Jerry’s.

The first post, from back in May, concerns David Gelernter, the computer science professor at Yale who was infamously injured in a 1993 attack by the Unabomber, and who’s now mainly known as a right-wing commentator. I don’t know Gelernter, though I did once attend a small interdisciplinary workshop in the south of France that Gelernter also attended, wherein I gave a talk about quantum computing and computational complexity in which Gelernter showed no interest. Anyway, Gelernter, in an essay in May for the *Claremont Review of Books*, argued that recent work has definitively disproved Darwinism as a mechanism for generating new species, and until something better comes along, Intelligent Design is the best available alternative.

Curiously, I think that Gelernter’s argument falls flat not for detailed reasons of biology, but mostly just because it indulges in *bad math and computer science*—in fact, in precisely the sorts of arguments that I was trying to answer in my segment on Morgan Freeman’s *Through the Wormhole* (see also Section 3.2 of Why Philosophers Should Care About Computational Complexity). Gelernter says that

- a random change to an amino acid sequence will pretty much always make it worse,
- the probability of finding a useful new such sequence by picking one at random is at most ~1 in 10^{77}, and
- there have only been maybe ~10^{40} organisms in earth’s history.

Since 10^{77} >> 10^{40}, Darwinism is thereby refuted—not in principle, but as an explanation for life on earth. QED.

The most glaring hole in the above argument, it seems to me, is that it simply ignores *intermediate* possible numbers of mutations. How hard would it be to change, not 1 or 100, but 5 amino acids in a given protein to get a usefully different one—as might happen, for example, with local optimization methods like simulated annealing run at nonzero temperature? And how many chances were there for *that* kind of mutation in the earth’s history?
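To make the local-optimization point concrete, here is a toy simulated-annealing run of my own construction (the target sequence, alphabet size, and fitness function are all hypothetical illustrations, not biology): single-letter mutations climb to a "useful" sequence without ever sampling the 20^6-element space uniformly at random.

```python
import math
import random

ALPHABET = 'ACDEFGHIKLMNPQRSTVWY'   # the 20 amino-acid letters
TARGET = 'MKVLHE'                   # hypothetical "useful" sequence

def fitness(seq):
    """Toy fitness: number of positions matching the target."""
    return sum(a == b for a, b in zip(seq, TARGET))

def anneal(steps=20000, temp=2.0, cooling=0.9995, seed=0):
    """Simulated annealing: mutate one letter at a time, always accept
    improvements, accept worsenings with probability exp(delta/temp)."""
    rng = random.Random(seed)
    seq = ''.join(rng.choice(ALPHABET) for _ in TARGET)
    for _ in range(steps):
        i = rng.randrange(len(seq))
        cand = seq[:i] + rng.choice(ALPHABET) + seq[i + 1:]
        delta = fitness(cand) - fitness(seq)
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            seq = cand
        temp *= cooling              # gradually become greedier
    return seq

print(anneal())  # converges to TARGET with overwhelming probability
```

The landscape here is deliberately smooth, which is exactly the possibility a uniform-sampling estimate like Gelernter's ignores: the relevant question is not the size of the space but whether incremental paths through it exist.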

Gelernter can’t personally see how a path could cut through the exponentially large solution space in a polynomial amount of time, so he asserts that it’s impossible. Many of the would-be P≠NP provers who email me every week do the same. But this particular kind of “argument from incredulity” has an abysmal track record: it would’ve applied equally well, for example, to problems like maximum matching that turned out to have efficient algorithms. This is why, in CS, we demand better evidence of hardness—like completeness results or black-box lower bounds—neither of which seem however to apply to the case at hand. Surely Gelernter understands all this, but had he not, he could’ve learned it from my lecture at the workshop in France!

Alas, online debate, as it’s wont to do, focused less on Gelernter’s actual arguments and the problems with them, than on the tiresome questions of “standing” and “status.” In particular: does Gelernter’s authority, as a noted computer science professor, somehow lend new weight to Intelligent Design? Or conversely: does the very fact that a computer scientist endorsed ID prove that computer science itself isn’t a real science at all, and that its practitioners should never be taken seriously in any statements about the real world?

It’s hard to say which of these two questions makes me want to bury my face deeper into my hands. Serge Lang, the famous mathematician and textbook author, spent much of his later life fervently denying the connection between HIV and AIDS. Lynn Margulis, the discoverer of the origin of mitochondria (and Carl Sagan’s first wife), died a 9/11 truther. What broader lesson should we draw from any of this? And anyway, what percentage of computer scientists actually do doubt evolution, and how does it compare to the percentage in other academic fields and other professions? Isn’t the question of how divorced we computer scientists are from the real world an … ahem … **empirical** matter, one hard to answer on the basis of armchair certainties and anecdotes?

Speaking of empiricism, if you check Gelernter’s publication list on DBLP and his Google Scholar page, you’ll find that he did influential work in programming languages, parallel computing, and other areas from 1981 through 1997, and then in the past 22 years published a grand total of … **two** papers in computer science. One with four coauthors, the other a review/perspective piece about his earlier work. So it seems fair to say that, some time after receiving tenure in a CS department, Gelernter pivoted (to put it mildly) away from CS and toward conservative punditry. His recent offerings, in case you’re curious, include the book America-Lite: How Imperial Academia Dismantled Our Culture (and Ushered In the Obamacrats).

Some will claim that this case underscores what’s wrong with the tenure system itself, while others will reply that it’s precisely what tenure was designed for, even if in this instance you happen to disagree with what Gelernter uses his tenured freedom to say. The point I wanted to make is different, though. It’s that the question “what kind of a field is computer science, anyway, that a guy can do high-level CS research on Monday, and then on Tuesday reject Darwinism and unironically use the word ‘Obamacrat’?”—well, even if I accepted the immense weight this question places on one atypical example (which I don’t), and even if I dismissed the power of compartmentalization (which I again don’t), the question *still* wouldn’t arise in Gelernter’s case, since getting from “Monday” to “Tuesday” seems to have taken him 15+ years.

Anyway, the second post of Coyne’s that I wanted to talk about is from just yesterday, and is about Jeffrey Epstein—the financier, science philanthropist, and confessed sex offender, whose appalling crimes you’ll have read all about this week if you weren’t on a long sea voyage without Internet or something.

For the benefit of my many fair-minded friends on Twitter, I should clarify that I’ve never met Jeffrey Epstein, let alone accepted any private flights to his sex island or whatever. I doubt he has any clue who I am either—even if he did once claim to be “intrigued” by quantum information.

I do know a few of the scientists who Epstein once hung out with, including Seth Lloyd and Steven Pinker. Pinker, in particular, is now facing vociferous attacks on Twitter, similar in magnitude perhaps to what I faced in the comment-171 affair, for having been photographed next to Epstein at a 2014 luncheon that was hosted by Lawrence Krauss (a physicist who later faced sexual harassment allegations of his own). By the evidentiary standards of social media, this photo suffices to convict Pinker as basically a child molester himself, and is *also* a devastating refutation of any data that Pinker might have adduced in his books about the Enlightenment’s contributions to human flourishing.

From my standpoint, what’s surprising is not that Pinker is up against this, but that it *took this long* to happen, given that Pinker’s pro-Enlightenment, anti-blank-slate views have had the effect of painting a giant red target on his back. Despite the near-inevitability, though, you can’t blame Pinker for wanting to defend himself, as I did when it was my turn for the struggle session.

Thus, in response to an emailed inquiry by Jerry Coyne, Pinker shared some detailed reflections about Epstein; Pinker then gave Coyne permission to post those reflections on his blog (though they were originally meant for Coyne only). Like everything Pinker writes, they’re worth reading in full. Here’s the opening paragraph:

The annoying irony is that I could never stand the guy [Epstein], never took research funding from him, and always tried to keep my distance. Friends and colleagues described him to me as a quantitative genius and a scientific sophisticate, and they invited me to salons and coffee klatches at which he held court. But I found him to be a kibitzer and a dilettante — he would abruptly change the subject ADD style, dismiss an observation with an adolescent wisecrack, and privilege his own intuitions over systematic data.

Pinker goes on to discuss his record of celebrating, and extensively documenting, the forces of modernity that led to dramatic reductions in violence against women and that have the power to continue doing so. On Twitter, Pinker had already written: “Needless to say I condemn Epstein’s crimes in the strongest terms.”

I probably should’ve predicted that Pinker would then be attacked again—this time, for having prefaced his condemnation with the phrase “needless to say.” The argument, as best I can follow, runs like this: given all the isms of which woke Twitter has already convicted Pinker—scientism, neoliberalism, biological determinism, etc.—how could Pinker’s being against Epstein’s crimes (which we recently learned probably include the rape, and not only statutorily, of a 15-year-old) *possibly* be assumed as a given?

For the record, just as Epstein’s friends and enablers weren’t confined to one party or ideology, so the public condemnation of Epstein strikes me as a matter that is (or should be) beyond ideology, with all reasonable dispute now confined to the space between “very bad” and “extremely bad,” between “lock away for years” and “lock away for life.”

While I didn’t need Pinker to tell me *that*, one reason I personally appreciated his comments is that they helped to answer a question that had bugged me, and that none of the mountains of other condemnations of Epstein had given me a clear sense about. Namely: supposing, hypothetically, that I’d met Epstein around 2002 or so—without, of course, knowing about his crimes—would I have been as taken with him as many other academics seem to have been? (Would *you* have been? How sure are you?)

Over the last decade, I’ve had the opportunity to meet some titans and semi-titans of finance and business, to discuss quantum computing and other nerdy topics. For a few (by no means all) of these titans, my overriding impression was *precisely* their unwillingness to concentrate on any one point for more than about 20 seconds—as though they wanted the crust of a deep intellectual exchange without the meat filling. My experience with them fit Pinker’s description of Epstein to a T (though I hasten to add that, as far as I know, none of these others ran teenage sex rings).

Anyway, given all the anger at Pinker for having intersected with Epstein, it’s ironic that I could easily imagine Pinker’s comments rattling Epstein the most of anyone’s, if Epstein hears of them from his prison cell. It’s like: Epstein must have developed a skin like a rhinoceros’s by this point about being called a child abuser, a creep, and a thousand similar (and similarly deserved) epithets. But “a kibitzer and a dilettante” who merely lured famous intellectuals into his living room, with wads of cash not entirely unlike the ones used to lure teenage girls to his massage table? Ouch!

OK, but what about Alan Dershowitz—the man who apparently used to be Epstein’s close friend, who still is Pinker’s friend, and who played a crucial role in securing Epstein’s 2008 plea bargain, the one now condemned as a travesty of justice? I’m not sure how I feel about Dershowitz. It’s like: I understand that our system requires attorneys willing to mount a vociferous defense even for clients who they privately know or believe to be guilty—and even to get those clients off on technicalities or bargaining whenever they can. I’m also incredibly grateful that I chose CS rather than law school, because I don’t think I could last an hour advocating causes that I knew to be unjust. Just like my fellow CS professor, the intelligent design advocate David Gelernter, I have the privilege and the burden of speaking only for myself.

Vox has just published an excellent article on open access scientific publishing, “The open access wars”. While the title is a bit too provocative, the article still manages to give an accurate assessment of the current state of play. Although I like the article, I can’t help but nitpick a few points.

From the article,

“We spoke with executives at both Elsevier and Springer Nature, and they maintain their companies still provide a lot of value in ensuring the quality of academic research.”

This is false. Publishers do not add any significant value in ensuring the quality of academic research. Peer reviewers do that, and even then not consistently. True, the publishers facilitate finding peer reviewers, but this has nothing to do with the actual publishing of the research. The role of the journal itself (sans peer review) is just to market and advertise the research, not to ensure quality. The journal also provides a signaling venue for the prestige-smitten researcher. How much these other things matter is a personal value judgement, but they certainly don’t affect the quality of the research.

Organizing the peer review (as opposed to actually reviewing) is the job of a middleman: it may provide a service, but it doesn’t add value to the product and it only drives up prices. This is why non-profit overlay journals like Quantum only cost between 0 and €200 to publish. The average cost per submission for hosting a preprint on the arxiv is less than $7.

Which brings me to my next point: I was also a little disappointed that the article failed to mention arxiv.org. They do mention the rise of preprints, but they only cite prepubmed.org, which apparently only came online in 2007. By contrast, the oldest arxiv preprint that I’m aware of is Paul Ginsparg’s notes on conformal field theory, which were posted in 1988!

That might be the oldest timestamp, but the arxiv only started having regular preprint service in 1991. Still, this means that essentially all physics research in the last 25+ years is available for free online. In practice, this means that any time you need a physics paper, you simply find the associated preprint and read that instead of the journal version. This is especially convenient for a field like quantum information, where all but a handful of papers are available on the arxiv.

Any article on open access should lead with a discussion of the arxiv. It’s one of the most important science-related developments of the last 30 years.

I’m delighted to announce that quantum computing theorist John Wright will be joining the computer science faculty at UT Austin in Fall 2020, after he finishes a one-year postdoc at Caltech.

John made an appearance on this blog a few months ago, when I wrote about the new breakthrough by him and Anand Natarajan: namely, that MIP* (multi-prover interactive proofs with entangled provers) contains NEEXP (nondeterministic double-exponential time). Previously, MIP* had only been known to contain NEXP (nondeterministic *single* exponential time). So, this is an exponential expansion in the power of entangled provers over what was previously known and believed, and the first proof that entanglement actually *increases* the power of multi-prover protocols, rather than decreasing it (as it could’ve done a priori). Even more strikingly, there seems to be no natural stopping point: MIP* might soon swallow up arbitrary towers of exponentials or even the halting problem (!). For more, see for example this *Quanta* article, or this post by Thomas Vidick, or this short story [sic] by Henry Yuen.

John grew up in Texas, so he’s no stranger to BBQ brisket or scorching weather. He did his undergrad in computer science at UT Austin—my colleagues remember him as a star—and then completed his PhD with Ryan O’Donnell at Carnegie Mellon, followed by a postdoc at MIT. Besides the work on MIP*, John is also well-known for his 2015 work with O’Donnell pinning down the sample complexity of quantum state tomography. Their important result, a version of which was independently obtained by Haah et al., says that if you want to learn an unknown d-dimensional quantum mixed state ρ to a reasonable precision, then ~d^{2} copies of ρ are both necessary and sufficient. This solved a problem that had personally interested me, and already plays a role in, e.g., my work on shadow tomography and gentle measurements.

Our little quantum information center at UT Austin is growing rapidly. Shyam Shankar, a superconducting qubits guy who previously worked in Michel Devoret’s group at Yale, will also be joining UT’s Electrical and Computer Engineering department this fall. I’ll have two new postdocs—Andrea Rocchetto and Yosi Atia—as well as new PhD students. We’ll continue recruiting this coming year, with potential opportunities for students, postdocs, faculty, and research scientists across the CS, physics, and ECE departments as well as the Texas Advanced Computing Center (TACC). I hope you’ll consider applying to join us.

With no evaluative judgment attached, I can honestly say that this is an unprecedented time for quantum computing as a field. Where once faculty applicants struggled to make a case for quantum computing (physics departments: “but isn’t this really CS?” / CS departments: “isn’t it really physics?” / everyone: “couldn’t this whole QC thing, like, all blow over in a year?”), today departments are vying with each other and with industry players and startups to recruit talented people. In such an environment, we’re fortunate to be doing as well as we are. We hope to continue to expand.

Meanwhile, this was an unprecedented year for CS hiring at UT Austin more generally. John Wright is one of at least four new faculty (probably more) who will be joining us. It’s a good time to be in CS.

A huge welcome to John, and hook ’em Hadamards!

(And for US readers: have a great 4^{th}! Though how could any fireworks match the proof of the Sensitivity Conjecture?)

Hard to believe it’s been 15 years since the publication of *Spacetime and Geometry: An Introduction to General Relativity*, my graduate-level textbook on everyone’s favorite theory of gravitation. The book has become quite popular, and is used as a text in courses around the world. There are a lot of great GR books out there, but I felt another one was needed that focused solely on teaching students general relativity. That might seem like an obvious goal, but many books also try to serve as reference books, or to put forward a particular idiosyncratic take on the subject. All I want to do is teach you GR.

And now I’m pleased to announce that the book is changing publishers, from Pearson to Cambridge University Press, complete with a new cover, shown above.

I must rush to note that it’s **exactly the same book**, just with a different publisher. Pearson was always good to me, I have no complaints there, but they are moving away from graduate physics texts, so it made sense to try to find S&G a safe permanent home.

Well, there is one change: it’s cheaper! You can order the book either from CUP directly, or from other outlets such as Amazon. Copies had been going for roughly $100, but the new version lists for only $65 — and if the Amazon page is to be believed, it’s currently on sale for an amazing $46. That’s a lot of knowledge for a minuscule price. I’d rush to snap up copies for you and your friends, if I were you.

My understanding is that copies of the new version are not quite in stores yet, but they’re being printed and should be there momentarily. Plenty of time for courses being taught this Fall. (Apologies to anyone who has been looking for the book over the past couple of months, when it’s been stuck between publishers while we did the handover.)

Again: it’s precisely the same book. I have thought about doing revisions to produce an actually new edition, but I think about many things, and that’s not a super-high priority right now. Maybe some day.

Thanks to everyone who has purchased *Spacetime and Geometry* over the years, and said such nice things about it. Here’s to the next generation!

My Orioles optimism from the end of April hasn’t held up too well. When I wrote that, the team was 10-18. Since then, they’ve won 16 more games, and lost — it kind of hurts to type this — 43.

Why so bad? It’s not obvious: the team ERA has actually dropped almost half a run since I wrote that post, from 6.15 to 5.75. Their RS and RA for June were about the same as they were for May, but they went 6-20 instead of 8-19.

I can’t believe I’m saying this, but — maybe the Orioles aren’t really *that* bad? Their Pythagorean record is 28-59, which is terrible, but not even worst in MLB right now. (That honor belongs to the Tigers.) John Means continues to be great and Andrew Cashner and Dylan Bundy have now been pretty consistently turning in utterly acceptable starts.
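For anyone who hasn’t seen the stat, the Pythagorean record comes from Bill James’s formula, which estimates a team’s winning percentage from runs scored and runs allowed. A minimal sketch, with made-up run totals (the exponent 2 is the classic choice; sabermetric sites often tune it closer to 1.83, and these are not the Orioles’ actual numbers):

```python
def pythagorean_wpct(rs, ra, exponent=2.0):
    """Bill James's Pythagorean expectation: estimated winning
    percentage from runs scored (rs) and runs allowed (ra)."""
    return rs ** exponent / (rs ** exponent + ra ** exponent)

# Hypothetical run totals over an 87-game stretch:
rs, ra = 360, 520
games = 87
wins = round(games * pythagorean_wpct(rs, ra))
print(f"expected record: {wins}-{games - wins}")
```

The point of a Pythagorean record is exactly the one made above: run totals are a less noisy signal than the win-loss column.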

The thing about baseball is, things happen suddenly. On Tuesday, September 5, 2017, less than two years ago, Manny Machado hit a 2-run homer in the bottom of the 9th to give the Orioles a 7-6 win against the Yankees. The Orioles were 71-68.

The next game after that, they lost 9-1. And then went 4-18 the rest of the way. They haven’t had a full month since then with a record better than .360. The Orioles became terrible *in an instant*. I don’t see why it can’t go the other way.

Back from nearly two weeks at the Institut Henri Poincare, where we were reinventing rational points, though they actually seem pretty much as they have always been. But lots of new ideas floating around and in particular lots of problems I see as potentially rich ones for students.

Last week featured the hottest temperatures ever recorded in France, reminding one that when you move the mean of a distribution even a little, the frequency of formerly rare events can jump quite a lot. Paris was spared the worst of the heat; after initial predictions of temperatures going over 100F, the hottest day of the conference was 97 and the rest of the week was in the mid-90s, regular old East Coast US summer weather. But of course France doesn’t have regular old East Coast US summer air-conditioning. Faiblement climatisé (“weakly air-conditioned”) is the order of the day. The word for heatwave in French is “canicule,” which comes from the Latin name for Sirius (canicula, “little dog”), thought to be a bringer of hot weather.
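That point about shifted means is easy to quantify with a toy model. A sketch under a standard-normal assumption (the threshold and the half-standard-deviation shift are illustrative choices, not climate data):

```python
from math import erf, sqrt

def tail_prob(threshold, mean=0.0, sd=1.0):
    """P(X > threshold) for X ~ Normal(mean, sd)."""
    z = (threshold - mean) / sd
    return 0.5 * (1 - erf(z / sqrt(2)))

# "Rare" event: more than 2 standard deviations above the old mean.
before = tail_prob(2.0, mean=0.0)
after = tail_prob(2.0, mean=0.5)   # shift the mean by just half an sd
print(f"before: {before:.4f}, after: {after:.4f}, ratio: {after / before:.1f}x")
```

A shift of only half a standard deviation roughly triples the frequency of the “2σ” event, which is the whole story of formerly-rare heat records.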

It’s also the Women’s World Cup. Tickets for the US-France quarterfinal, held the night before I left, were going at 350 euros for the very cheapest, but I don’t think I’d have wanted to go, anyway. The Orioles are the only team I love enough to really enjoy rooting for them as the visiting team. Instead I went to Scotland-Argentina, which looked like a laugher 70 minutes in with Scotland up 3-0, but ended in a controversial tie after Scotland’s apparent save of a last-minute penalty kick was called back when VAR showed the goalie jumping off the line a moment before the ball was kicked. The ref called end of time directly after the second kick went in to tie the game, to the confusion and dismay of the players on the field; both teams needed a win to have a real chance of advancing past the group stage, and the tie left them both out. Scottish forward Erin Cuthbert pulled something out of her sock and kissed it after her goal; later I found out it was a picture of herself as a baby. I like her style!

I ate well. I ate whelks. They’re OK. I ate thiebou djienne at this place near IHP which was much better than OK. I ate a watermelon-chevre salad that was so good I went to a spice store and bought the pepper they used, piment d’espelette, and now I have a spice Penzey’s doesn’t sell. Favorite new cheese I ate on this trip was Soumaintrain.

I went to the museum of Jewish history where I saw this campaign poster:

And I saw the computer that teenaged Blaise Pascal built for his dad in 1642, which is at the Musée des arts et métiers, along with a revolutionary 10-hour clock:

And right there at the museum, later that night, just by my good luck, there was a free Divine Comedy concert as part of the Fête de la Musique. It was sold out but, my good luck part deux, someone’s friend didn’t show up and in I went. Great set. Sort of a beautifully multinational moment to watch an Irish guy play a They Might Be Giants song in Paris in front of a cast of the Statue of Liberty:

I also learned on this trip that when French kids play Capture the Flag they use an actual French flag:

and that “Good Grief!” in French is “Bon sang!”

The Sensitivity Conjecture, which I blogged about here, says that, for every Boolean function f:{0,1}^{n}→{0,1}, the *sensitivity* of f—that is, the maximum, over all 2^{n} input strings x∈{0,1}^{n}, of the number of input bits such that flipping them changes the value of f—is at most polynomially smaller than a bunch of other complexity measures of f, including f’s block sensitivity, degree as a real polynomial, and classical and quantum query complexities. (For more, see for example this survey by Buhrman and de Wolf. Or for quick definitions of the relevant concepts, see here.)
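To make those definitions concrete, here is a brute-force sketch (my own toy code, not from any of the linked papers) that computes the sensitivity and the real polynomial degree of majority on 3 bits:

```python
from itertools import product

def sensitivity(f, n):
    """s(f): max over inputs x of the number of bits whose flip changes f(x)."""
    return max(
        sum(f(tuple(b ^ (j == i) for j, b in enumerate(x))) != f(x)
            for i in range(n))
        for x in product([0, 1], repeat=n)
    )

def degree(f, n):
    """deg(f) as a multilinear real polynomial, via Mobius inversion: the
    coefficient of prod_{i in S} x_i is sum over subsets T of S of
    (-1)^(|S|-|T|) * f(1_T)."""
    deg = 0
    for S in product([0, 1], repeat=n):
        coef = sum(
            (-1) ** (sum(S) - sum(T)) * f(T)
            for T in product([0, 1], repeat=n)
            if all(t <= s for t, s in zip(T, S))
        )
        if coef != 0:
            deg = max(deg, sum(S))
    return deg

maj3 = lambda x: int(sum(x) >= 2)              # majority on 3 bits
print(sensitivity(maj3, 3), degree(maj3, 3))   # prints: 2 3
```

Here s(maj3) = 2 while deg(maj3) = 3, so 2 ≥ √3: consistent with the polynomial relationship the conjecture asserts.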

Ever since it was posed by Nisan and Szegedy in 1989, this conjecture has stood as one of the most frustrating and embarrassing open problems in all of combinatorics and theoretical computer science. It seemed so easy, and so similar to other statements that had 5-line proofs. But a lot of the best people in the field sank months into trying to prove it. For whatever it’s worth, I also sank … well, at least weeks into it.

Now Hao Huang, a mathematician at Emory University, has posted a 6-page preprint on his homepage that finally proves the Sensitivity Conjecture, in the form s(f)≥√deg(f). (I thank Ryan O’Donnell for tipping me off to this.) Within the preprint, the proof itself is about a page and a half.

Whenever there’s an announcement like this, ~99% of the time either the proof is wrong, or at any rate it’s way too complicated for outsiders to evaluate it quickly. This is one of the remaining 1% of cases. I’m rather confident that the proof is right. Why? Because I read and understood it. It took me about half an hour. If you’re comfortable with concepts like *induced subgraph* and *eigenvalue*, you can do the same.

From pioneering work by Gotsman and Linial in 1992, it was known that to prove the Sensitivity Conjecture, it suffices to prove the following even simpler combinatorial conjecture:

Let S be any subset of the n-dimensional Boolean hypercube, {0,1}^{n}, which has size 2^{n-1}+1. Then there must be a point in S with at least ~n^{c} neighbors in S.

Here c>0 is some constant (say 1/2), and two points in S are “neighbors” if and only if they differ in a single coordinate. Note that if S had size 2^{n-1}, then the above statement would be false—as witnessed, for example, by the set of all n-bit strings with an even number of 1’s.
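Both halves of this can be checked by brute force for small n. A sketch for n = 4, so √n = 2 and 2^{n-1}+1 = 9 (my own verification code, not from the papers cited):

```python
from itertools import combinations

n = 4
verts = range(2 ** n)
nbrs = {v: [v ^ (1 << i) for i in range(n)] for v in verts}

# The 2^(n-1) = 8 even-parity vertices form an independent set: no vertex
# in the set has even one neighbor inside it.
even = {v for v in verts if bin(v).count("1") % 2 == 0}
assert all(w not in even for v in even for w in nbrs[v])

# But every subset of size 2^(n-1)+1 = 9 contains a vertex with at least
# sqrt(n) = 2 neighbors inside the subset (the statement with c = 1/2).
for S in map(set, combinations(verts, 2 ** (n - 1) + 1)):
    assert any(sum(w in S for w in nbrs[v]) >= 2 for v in S)
print("checked all size-9 subsets for n =", n)
```

Exhausting all C(16,9) = 11440 subsets takes a fraction of a second; of course the content of the conjecture is that this keeps working as n grows.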

Huang proceeds by proving the Gotsman-Linial Conjecture. And the way he proves Gotsman-Linial is … well, at this point maybe I should just let you read the damn preprint yourself. I can’t say it more simply than he does.

If I had to try anyway, I’d say: Huang constructs a 2^{n}×2^{n} matrix, called A_{n}, that has 0’s where there are no edges between the corresponding vertices of the Boolean hypercube, and either 1’s or -1’s where there *are* edges—with a simple, weird pattern of 1’s and -1’s that magically makes everything work. He then lets H be an induced subgraph of the Boolean hypercube of size 2^{n-1}+1. He lower-bounds the maximum degree of H by the largest eigenvalue of the corresponding (2^{n-1}+1)×(2^{n-1}+1) submatrix of A_{n}. Finally, he lower-bounds that largest eigenvalue by … no, I don’t want to spoil it! Read it yourself!
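For the numerically inclined, the construction can also be poked at directly. The recursive sign pattern below is my reading of the preprint (treat the exact details as an assumption rather than a quotation); the checks confirm the key algebraic fact A_{n}^{2} = nI, which forces the eigenvalues to be ±√n with multiplicity 2^{n-1} each, and the resulting interlacing bound on principal submatrices:

```python
import numpy as np

def huang_matrix(n):
    """Signed adjacency matrix of the n-cube (recursive sign pattern):
    A_1 = [[0,1],[1,0]],  A_n = [[A_{n-1}, I], [I, -A_{n-1}]]."""
    A = np.array([[0.0, 1.0], [1.0, 0.0]])
    for _ in range(n - 1):
        I = np.eye(len(A))
        A = np.block([[A, I], [I, -A]])
    return A

n = 5
A = huang_matrix(n)
assert np.allclose(A @ A, n * np.eye(2 ** n))   # A_n^2 = n*I, so eigenvalues are +-sqrt(n)

# Cauchy interlacing then forces every principal submatrix on 2^(n-1)+1
# vertices to have largest eigenvalue >= sqrt(n); spot-check random subsets.
rng = np.random.default_rng(0)
for _ in range(100):
    idx = rng.choice(2 ** n, size=2 ** (n - 1) + 1, replace=False)
    top = np.linalg.eigvalsh(A[np.ix_(idx, idx)])[-1]
    assert top >= np.sqrt(n) - 1e-9
print("all checks passed for n =", n)
```

Since the largest eigenvalue of a signed adjacency matrix is at most the maximum degree of the underlying graph, that last bound is exactly the Gotsman-Linial statement with c = 1/2.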

Paul Erdős famously spoke of a book, maintained by God, in which was written the simplest, most beautiful proof of each theorem. The highest compliment Erdős could give a proof was that it “came straight from the book.” In this case, I find it hard to imagine that even God knows how to prove the Sensitivity Conjecture in any simpler way than this.

Indeed, the question is: how could such an elementary 1.5-page argument have been overlooked for 30 years? I don’t have a compelling answer to that, besides noting that “short” and “elementary” often have little to do with “obvious.” Once you start looking at the spectral properties of this matrix A_{n}, the pieces snap together in precisely the right way—but how would you know to look at that?

By coincidence, earlier today I finished reading my first PG Wodehouse novel (*Right Ho, Jeeves!*), on the gushing recommendation of a friend. I don’t know how I’d missed Wodehouse for 38 years. His defining talent is his ability to tie together five or six plot threads in a way that feels perfect and inevitable even though you didn’t see it coming. This produces a form of pleasure that’s nearly indistinguishable from the pleasure one feels in reading a “proof from the book.” So my pleasure centers are pretty overloaded today—but in such depressing times for the world, I’ll take pleasure wherever I can get it.

Huge congratulations to Hao!

**Added thought:** What this really is, is one of the purest illustrations I’ve seen in my career of the power and glory of the P≠NP phenomenon. We talk all the time about how proofs are easier to verify than to find. In practice, though, it can be far from obvious that that’s true. Consider your typical STOC/FOCS paper: writing it probably took the authors several months, while fully understanding the thing from scratch would probably take … *also* several months! If there’s a gap, it’s only by a factor of 4 or 5 or something. Whereas in this case, I don’t know how long Huang spent searching for the proof, but the combined search efforts of the community add up to years or decades. The ratio of the difficulty of finding to the difficulty of completely grasping is in the hundreds of thousands or millions.

**Another added thought:** Because Hao actually proves a stronger statement than the original Sensitivity Conjecture, it has additional implications, a few of which Hao mentions in his preprint. Here’s one he didn’t mention: any randomized algorithm to guess the parity of an n-bit string, which succeeds with probability at least 2/3 on the majority of strings, must make at least ~√n queries to the string, while any such quantum algorithm must make at least ~n^{1/4} queries. For more, see the paper Weak Parity by me, Ambainis, Balodis, and Bavarian (Section 6).

**Important Update:** Hao Huang himself has graciously visited the comment section to satisfy readers’ curiosity by providing a detailed timeline of his work on the Sensitivity Conjecture. (tl;dr: he was introduced to the problem by Mike Saks in 2012, and had been attacking it on and off since then, until he finally had the key insight this past month while writing a grant proposal. Who knew that grant proposals could ever be useful for anything?!?)

**Another Update:** In the comments section, my former student Shalev Ben-David points out a simplification of Huang’s argument, which no longer uses Cauchy’s interlacing theorem. I thought there was no way this proof could possibly be made any simpler, and I was wrong!

This blog post introduces a newly published book, titled "Maxwell-Dirac Theory and Occam's Razor: Unified Field, Elementary Particles, and Nuclear Interactions".

*Les Atomes Dansants at CMS experimental site*

Many philosophers have held that *Desire* is the driving force of the world, on both microscopic and macroscopic scales: the same *Desire* that moves scientists in their demanding and endless quest for knowledge.

Let D be some domain (such as the real numbers). For any natural number k, let Sym^{k}(D) denote the space of symmetric real-valued functions f(x_{1},…,x_{k}) of k variables x_{1},…,x_{k} ∈ D, thus

f(x_{σ(1)},…,x_{σ(k)}) = f(x_{1},…,x_{k})

for any permutation σ of {1,…,k}. For instance, for any natural numbers k ≤ n, the elementary symmetric polynomials

e_{k}(x_{1},…,x_{n}) = Σ_{1≤i_{1}<…<i_{k}≤n} x_{i_{1}}⋯x_{i_{k}}

will be an element of Sym^{n}(D). With the pointwise product operation, Sym^{k}(D) becomes a commutative real algebra. We include the case k=0, in which case Sym^{0}(D) consists solely of the real constants.

Given two natural numbers k ≤ N, one can “lift” a symmetric function f ∈ Sym^{k}(D) of k variables to a symmetric function [f]_{k→N} ∈ Sym^{N}(D) of N variables by the formula

[f]_{k→N}(x_{1},…,x_{N}) = Σ_{1≤i_{1}<…<i_{k}≤N} f(x_{i_{1}},…,x_{i_{k}}) = (1/k!) Σ_{π} f(x_{π(1)},…,x_{π(k)}),

where π ranges over all injections from {1,…,k} to {1,…,N} (the latter formula making it clearer that [f]_{k→N} is symmetric). Thus for instance

[x_{1}]_{1→N} = x_{1}+…+x_{N}

and

[x_{1}x_{2}]_{2→N} = Σ_{1≤i<j≤N} x_{i}x_{j} = e_{2}(x_{1},…,x_{N}).

Also we have

With these conventions, we see that vanishes for , and is equal to if . We also have the transitivity

if .

The lifting map is a linear map from to , but it is not a ring homomorphism. For instance, when , one has

In general, one has the identity

for all natural numbers and , , where range over all injections , with . Combinatorially, the identity (2) follows from the fact that given any injections and with total image of cardinality , one has , and furthermore there exist precisely triples of injections , , such that and .

Example 1. When k = l = 1, one has [x_{1}]_{1→n} · [x_{1}]_{1→n} = [x_{1}^{2}]_{1→n} + 2[x_{1}x_{2}]_{2→n}, which is just a restatement of the identity (x_{1}+…+x_{n})^{2} = Σ_{i} x_{i}^{2} + 2 Σ_{i<j} x_{i}x_{j}.
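A quick numerical check of the k = l = 1 instance of the multiplication law: the product of two lifts of x_{1} equals the lift of x_{1}^{2} plus twice the lift of x_{1}x_{2}, i.e. (x_{1}+…+x_{n})^{2} = Σ_{i} x_{i}^{2} + 2 Σ_{i<j} x_{i}x_{j} (the function names below are my own, chosen for this illustration):

```python
import random
from itertools import combinations

def lift_x(xs):          # lift of f(x_1) = x_1 to len(xs) variables
    return sum(xs)

def lift_x_squared(xs):  # lift of f(x_1) = x_1^2
    return sum(x * x for x in xs)

def lift_xy(xs):         # lift of f(x_1, x_2) = x_1 * x_2
    return sum(xs[i] * xs[j] for i, j in combinations(range(len(xs)), 2))

# The lifting map is linear but NOT a ring homomorphism: the product of
# two lifts of x_1 picks up a cross term involving a lift from two variables.
xs = [random.random() for _ in range(6)]
assert abs(lift_x(xs) ** 2 - (lift_x_squared(xs) + 2 * lift_xy(xs))) < 1e-9
print("identity verified on random inputs")
```

This is the simplest illustration of why the multiplication law has to mix lifts from different numbers of variables.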

Note that the coefficients appearing in (2) do not depend on the final number of variables . We may therefore abstract the role of from the law (2) by introducing the real algebra of formal sums

where for each , is an element of (with only finitely many of the being non-zero), and with the formal symbol being formally linear, thus

and

for and scalars , and with multiplication given by the analogue

of (2). Thus for instance, in this algebra we have

and

Informally, is an abstraction (or “inverse limit”) of the concept of a symmetric function of an unspecified number of variables, formed by summing terms that each involve only a bounded number of these variables at a time. One can check (somewhat tediously) that is indeed a commutative real algebra, with a unit . (I do not know if this algebra has previously been studied in the literature; it is somewhat analogous to the abstract algebra of finite linear combinations of Schur polynomials, with multiplication given by a Littlewood-Richardson rule.)

For natural numbers , there is an obvious specialisation map from to , defined by the formula

Thus, for instance, maps to and to . From (2) and (3) we see that this map is an algebra homomorphism, even though the maps and are not homomorphisms. By inspecting the component of we see that the homomorphism is in fact surjective.

Now suppose that we have a measure on the space , which then induces a product measure on every product space . To avoid degeneracies we will assume that the integral is strictly positive. Assuming suitable measurability and integrability hypotheses, a function can then be integrated against this product measure to produce a number

In the event that arises as a lift of another function , then from Fubini’s theorem we obtain the formula

More generally, if is an element of the formal algebra , then

Note that by hypothesis, only finitely many terms on the right-hand side are non-zero.

Now for a key observation: whereas the left-hand side of (6) only makes sense when is a natural number, the right-hand side is meaningful when takes a fractional value (or even when it takes negative or complex values!), interpreting the binomial coefficient as a polynomial in . As such, this suggests a way to introduce a “virtual” concept of a symmetric function on a fractional power space for such values of , and even to integrate such functions against product measures , even if the fractional power does not exist in the usual set-theoretic sense (and similarly does not exist in the usual measure-theoretic sense). More precisely, for arbitrary real or complex , we now *define* to be the space of abstract objects

with and (and now interpreted as formal symbols, with the structure of a commutative real algebra inherited from , thus

In particular, the multiplication law (2) continues to hold for such values of , thanks to (3). Given any measure on , we formally define a measure on with regard to which we can integrate elements of by the formula (6) (provided one has sufficient measurability and integrability to make sense of this formula), thus providing a sort of “fractional dimensional integral” for symmetric functions. Thus, for instance, with this formalism the identities (4), (5) now hold for fractional values of , even though the formal space no longer makes sense as a set, and the formal measure no longer makes sense as a measure. (The formalism here is somewhat reminiscent of the technique of dimensional regularisation employed in the physics literature in order to assign values to otherwise divergent integrals. See also this post for an unrelated abstraction of the integration concept involving integration over supercommutative variables (and in particular over fermionic variables).)

Example 2. Suppose is a probability measure on , and is a random variable; on any power , we let be the usual independent copies of on , thus for . Then for any real or complex , the formal integral can be evaluated by first using the identity

(cf. (1)) and then using (6) and the probability measure hypothesis to conclude that

For a natural number, this identity has the probabilistic interpretation

whenever are jointly independent copies of , which reflects the well known fact that the sum has expectation and variance . One can thus view (7) as an abstract generalisation of (8) to the case when is fractional, negative, or even complex, despite the fact that there is no sensible way in this case to talk about independent copies of in the standard framework of probability theory.
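Spelling this out concretely (my reconstruction of the moment identity underlying Example 2; the Bernoulli choice is mine): for iid copies X_{1},…,X_{n} of X, E[(X_{1}+…+X_{n})^{2}] = n·E[X^{2}] + n(n-1)·E[X]^{2}, and the right-hand side is a polynomial in n that can be evaluated at fractional n even though “half an independent copy” makes no set-theoretic sense. A Monte Carlo sketch:

```python
import random

def mc_second_moment(n, trials=200_000):
    """Monte Carlo estimate of E[(X_1 + ... + X_n)^2] for iid Bernoulli(1/2) X_i."""
    total = 0.0
    for _ in range(trials):
        s = sum(random.randint(0, 1) for _ in range(n))
        total += s * s
    return total / trials

def formula(n, ex=0.5, ex2=0.5):
    """n*E[X^2] + n*(n-1)*E[X]^2: a polynomial in n, defined for ANY real n."""
    return n * ex2 + n * (n - 1) * ex ** 2

print(mc_second_moment(5), "vs", formula(5))   # Monte Carlo vs exact: both ~7.5
print(formula(0.5))                            # the "n = 1/2" evaluation: 0.1875
```

For integer n the two computations agree (up to sampling noise); only the polynomial survives the passage to fractional n.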

In this particular case, the quantity (7) is non-negative for every nonnegative , which looks plausible given the form of the left-hand side. Unfortunately, this sort of non-negativity does not always hold; for instance, if has mean zero, one can check that

and the right-hand side can become negative for . This is a shame, because otherwise one could hope to start endowing with some sort of commutative von Neumann algebra type structure (or the abstract probability structure discussed in this previous post) and then interpret it as a genuine measure space rather than as a virtual one. (This failure of positivity is related to the fact that the characteristic function of a random variable, when raised to the power, need not be a characteristic function of any random variable once is no longer a natural number: “fractional convolution” does not preserve positivity!) However, one vestige of positivity remains: if is non-negative, then so is

One can wonder what the point is to all of this abstract formalism and how it relates to the rest of mathematics. For me, this formalism originated implicitly in an old paper I wrote with Jon Bennett and Tony Carbery on the multilinear restriction and Kakeya conjectures, though we did not have a good language for working with it at the time, instead working first with the case of natural number exponents and appealing to a general extrapolation theorem to then obtain various identities in the fractional case. The connection between these fractional dimensional integrals and more traditional integrals ultimately arises from the simple identity

(where the right-hand side should be viewed as the fractional dimensional integral of the unit against ). As such, one can manipulate powers of ordinary integrals using the machinery of fractional dimensional integrals. A key lemma in this regard is

Lemma 3 (Differentiation formula). Suppose that a positive measure on depends on some parameter and varies by the formula for some function . Let be any real or complex number. Then, assuming sufficient smoothness and integrability of all quantities involved, we have

for all that are independent of . If we allow to now depend on also, then we have the more general total derivative formula

again assuming sufficient amounts of smoothness and regularity.

*Proof:* We just prove (10), as (11) then follows by the same argument used to prove the usual product rule. By linearity it suffices to verify this identity in the case for some symmetric function for a natural number . By (6), the left-hand side of (10) is then

Differentiating under the integral sign using (9) we have

and similarly

where are the standard copies of on :

By the product rule, we can thus expand (12) as

where we have suppressed the dependence on for brevity. Since , we can write this expression using (6) as

where is the symmetric function

But from (2) one has

and the claim follows.

Remark 4. It is also instructive to prove this lemma in the special case when is a natural number, in which case the fractional dimensional integral can be interpreted as a classical integral. In this case, the identity (10) is immediate from applying the product rule to (9) to conclude that

One could in fact derive (10) for arbitrary real or complex from the case when is a natural number by an extrapolation argument; see the appendix of my paper with Bennett and Carbery for details.

Let us give a simple PDE application of this lemma as illustration:

Proposition 5 (Heat flow monotonicity). Let be a solution to the heat equation with initial data a rapidly decreasing finite non-negative Radon measure, or more explicitly

for all . Then for any , the quantity

is monotone non-decreasing in for , constant for , and monotone non-increasing for .

*Proof:* By a limiting argument we may assume that is absolutely continuous, with Radon-Nikodym derivative a test function; this is more than enough regularity to justify the arguments below.

For any , let denote the Radon measure

Then the quantity can be written as a fractional dimensional integral

Observe that

and thus by Lemma 3 and the product rule

where we use for the variable of integration in the factor space of .

To simplify this expression we will take advantage of integration by parts in the variable. Specifically, in any direction , we have

Multiplying by and integrating by parts, we see that

where we use the Einstein summation convention in . Similarly, if is any reasonable function depending only on , we have

and hence on integration by parts

We conclude that

and thus by (13)

The choice of that then achieves the most cancellation turns out to be (this cancels the terms that are linear or quadratic in the ), so that . Repeating the calculations establishing (7), one has

and

where is the random variable drawn from with the normalised probability measure . Since , one thus has

Remark 6. As with Remark 4, one can also establish the identity (14) first for natural numbers by direct computation avoiding the theory of fractional dimensional integrals, and then extrapolate to the case of more general values of . This particular identity is also simple enough that it can be directly established by integration by parts without much difficulty, even for fractional values of .
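In that spirit, the monotonicity of Proposition 5 can also be probed numerically. A sketch under assumptions: dimension d = 1, exponent s = 2, initial data two unit Dirac masses at ±1, and the monotone quantity taken to be Q(t) = t^{d(s-1)/2} ∫ u(t,x)^s dx (this normalisation is an assumption of the sketch):

```python
import numpy as np

# 1D heat kernel: solves u_t = u_xx with a unit Dirac mass at the origin
def G(t, x):
    return np.exp(-x**2 / (4 * t)) / np.sqrt(4 * np.pi * t)

# initial data: two unit Dirac masses at +/- 1
def u(t, x):
    return G(t, x - 1.0) + G(t, x + 1.0)

x = np.linspace(-60.0, 60.0, 400_001)
dx = x[1] - x[0]
s = 2.0  # an exponent greater than 1

# Q(t) = t^{d(s-1)/2} * integral of u(t,x)^s dx, with d = 1
ts = [0.05, 0.2, 1.0, 5.0, 20.0]
Q = [t ** ((s - 1) / 2) * (u(t, x) ** s).sum() * dx for t in ts]
print(Q)  # non-decreasing in t for s > 1
```

For s = 2 this can be confirmed in closed form: Q(t) = 2(8π)^{-1/2}(1 + e^{-1/(2t)}), which indeed increases from 2·(8π)^{-1/2} towards 2²·(8π)^{-1/2} as the two heat bumps merge into one.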

A more complicated version of this argument establishes the non-endpoint multilinear Kakeya inequality (without any logarithmic loss in a scale parameter ); this was established in my previous paper with Jon Bennett and Tony Carbery, but using the “natural number first” approach rather than using the current formalism of fractional dimensional integration. However, the arguments can be translated into this formalism without much difficulty; we do so below the fold. (To simplify the exposition slightly we will not address issues of establishing enough regularity and integrability to justify all the manipulations, though in practice this can be done by standard limiting arguments.)

** — 1. Multilinear heat flow monotonicity — **

Before we give a multilinear variant of Proposition 5 of relevance to the multilinear Kakeya inequality, we first need to briefly set up the theory of finite products

of fractional powers of spaces , where are real or complex numbers. The functions to integrate here lie in the tensor product space

which is generated by tensor powers

with , with the usual tensor product identifications and algebra operations. One can evaluate fractional dimensional integrals of such functions against “virtual product measures” , with a measure on , by the natural formula

assuming sufficient measurability and integrability hypotheses. We can lift functions to an element of the space (15) by the formula

This is easily seen to be an algebra homomorphism.

Example 7. If and are functions and are measures on respectively, then (assuming sufficient measurability and integrability) the multiple fractional dimensional integral

is equal to

In the case that are natural numbers, one can view the “virtual” integrand here as an actual function on , namely

in which case the above evaluation of the integral can be achieved classically.

From a routine application of Lemma 3 and various forms of the product rule, we see that if each varies with respect to a time parameter by the formula

and is a time-varying function in (15), then (assuming sufficient regularity and integrability), the time derivative

Now suppose that for each space one has a non-negative measure , a vector-valued function , and a matrix-valued function taking values in real symmetric positive semi-definite matrices. Let be positive real numbers; we make the abbreviations

For any and , we define the modified measures

and then the product fractional power measure

If we then define the heat-type functions

(where we drop the normalising power of for simplicity) we see in particular that

hence we can interpret the multilinear integral in the left-hand side of (17) as a product fractional dimensional integral. (We remark that in my paper with Bennett and Carbery, a slightly different parameterisation is used, replacing with , and also replacing with .)

If the functions were constant in , then the functions would obey some heat-type partial differential equation, and the situation is now very analogous to Proposition 5 (and is also closely related to Brascamp-Lieb inequalities, as discussed for instance in this paper of Carlen, Lieb, and Loss, or this paper of mine with Bennett, Carbery, and Christ). However, for applications to the multilinear Kakeya inequality, we permit to vary slightly in the variable, and now the do not directly obey any PDE.

A naive extension of Proposition 5 would then seek to establish monotonicity of the quantity (17). While such monotonicity is available in the “Brascamp-Lieb case” of constant , as discussed in the above papers, this does not quite seem to be true for variable . To fix this problem, a weight is introduced in order to avoid having to take matrix inverses (which are not always available in this algebra). On the product fractional dimensional space , we have a matrix-valued function defined by

The determinant is then a scalar element of the algebra (15). We then define the quantity

Example 8. Suppose we take and let be natural numbers. Then can be viewed as the -matrix-valued function

By slight abuse of notation, we write the determinant of a matrix as , where and are the first and second rows of . Then

and after some calculation, one can then write as

By a polynomial extrapolation argument, this formula is then also valid for fractional values of ; this can also be checked directly from the definitions after some tedious computation. Thus we see that while the compact-looking fractional dimensional integral (18) can be expressed in terms of more traditional integrals, the formulae get rather messy, even in the case. As such, the fractional dimensional calculus (based heavily on derivative identities such as (16)) gives a more convenient framework to manipulate these otherwise quite complicated expressions.

Suppose the functions are close to constant matrices , in the sense that

We can therefore write

where and the coefficients of the matrix are some polynomial combination of the coefficients of , with all coefficients in this polynomial of bounded size. As a consequence, and on expanding out all the fractional dimensional integrals, one obtains a formula of the form

Thus, as long as is strictly positive definite and is small enough, this quantity is comparable to the classical integral

Now we compute the time derivative of . We have

where we use as the coordinate for the copy of that is being lifted to .

As before, we can take advantage of some cancellation in this expression using integration by parts. Since

where are the standard basis for , we see from (16) and integration by parts that

with the usual summation conventions on the index . Also, similarly to before, if we have an element of (15) for each that does not depend on , then by (16) and integration by parts

or, writing ,

We can thus write (20) as

where is the element of (15) given by

The terms in that are quadratic in cancel. The linear term can be rearranged as

To cancel this, one would like to set equal to

Now in the commutative algebra (15), the inverse does not necessarily exist. However, because of the weight factor , one can work instead with the adjugate matrix , which is such that where is the identity matrix. We therefore set equal to the expression

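The key property being used here is that the adjugate satisfies adj(A)·A = det(A)·I using only polynomial operations in the entries of A, with no division and hence no invertibility hypothesis. A concrete numerical sketch of this identity (over the reals, rather than in the algebra (15)):

```python
import numpy as np

def adjugate(A):
    """Adjugate via cofactors: adj(A)[i, j] = (-1)^(i+j) times the minor of A
    obtained by deleting row j and column i, i.e. the transposed cofactor
    matrix -- only determinants of submatrices, no division."""
    n = A.shape[0]
    adj = np.empty_like(A)
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, j, axis=0), i, axis=1)
            adj[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return adj

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

# adj(A) A = A adj(A) = det(A) I, valid even when A is singular
print(adjugate(A) @ A)  # ~ 8 * identity, since det(A) = 8
```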
and now the expression in (22) does not contain any linear or quadratic terms in . In particular it is completely independent of , and thus we can write

where is an arbitrary element of that we will select later to obtain a useful cancellation. We can rewrite this a little as

If we now introduce the matrix functions

and the vector functions

then this can be rewritten as

Similarly to (19), suppose that we have

uniformly on , where , thus we can write

where

and is some polynomial combination of the and (or more precisely, of the quantities , , , ) that is quadratic in the variables, with bounded coefficients. As a consequence, after expanding out the product fractional dimensional integrals and applying some Cauchy-Schwarz to control cross-terms, we have

Now we simplify . We let

be the average value of ; for each this is just a vector in . We then split , leading to the identities

and

The term is problematic, but we can eliminate it as follows. By construction one has (suppressing the dependence on )

By construction, one has

Thus if is positive definite and is small enough, this matrix is invertible, and we can choose so that the expression vanishes. Making this choice, we then have

Observe that the fractional dimensional integral of

or

for and arbitrary constant matrices against vanishes. As a consequence, we can now simplify the integral

as the sum of

and

The latter also integrates to zero by the mean zero nature of . Thus we have simplified (24) to

Now let us make the key hypothesis that the matrix

is strictly positive definite, or equivalently that

for all , where the ordering is in the sense of positive definite matrices. Then we have the pointwise bound

and thus

For small enough, the expression inside the is non-negative, and we conclude the monotonicity

We have thus proven the following statement, which is essentially Proposition 4.1 of my paper with Bennett and Carbery:

Proposition 9. Let , let be positive semi-definite real symmetric matrices, and let be such that

In particular, we have

for any .

A routine calculation shows that for reasonable choices of (e.g. discrete measures of finite support), one has

and hence (setting ) we have

If we choose the to be the sum of Dirac masses, and each to be the diagonal matrix , then the key condition (25) is obeyed for , and one arrives at the multilinear Kakeya inequality

whenever are infinite tubes in of width and oriented within of the basis vector , for a sufficiently small absolute constant . (The hypothesis on the directions can then be relaxed to a transversality hypothesis by applying some linear transformations and the triangle inequality.)

Barcelona refused to stay indoors this May.

Merchandise spilled outside shops onto the streets, restaurateurs parked diners under trees, and ice-cream cones begged to be eaten on park benches. People thronged the streets, markets filled public squares, and the scents of flowers wafted from vendors’ stalls. I couldn’t blame the city. Its sunshine could have drawn Merlin out of his crystal cave. Insofar as a city lives, Barcelona epitomized a quotation by thermodynamicist Ilya Prigogine: “The main character of any living system is openness.”

Prigogine (1917–2003), who won the Nobel Prize for chemistry, had brought me to Barcelona. I was honored to receive, at the Joint European Thermodynamics Conference (JETC) there, the Ilya Prigogine Prize for a thermodynamics PhD thesis. The JETC convenes and awards the prize biennially; the last conference had taken place in Budapest. Barcelona suited the legacy of a thermodynamicist who illuminated open systems.

Ilya Prigogine began his life in Russia, grew up partially in Germany, settled in Brussels, and worked at American universities. His nobelprize.org biography reveals a mind open to many influences and disciplines: Before entering university, his “interest was more focused on history and archaeology, not to mention music, especially piano.” Yet Prigogine pursued chemistry.

He helped extend thermodynamics outside equilibrium. Thermodynamics is the study of energy, order, and time’s arrow in terms of large-scale properties, such as temperature, pressure, and volume. Many physicists think that thermodynamics describes only equilibrium. Equilibrium is a state of matter in which (1) large-scale properties remain mostly constant and (2) stuff (matter, energy, electric charge, etc.) doesn’t flow in any particular direction much. Apple pies reach equilibrium upon cooling on a countertop. When I’ve described my research as involving nonequilibrium thermodynamics, some colleagues have asked whether I’ve used an oxymoron. But “nonequilibrium thermodynamics” appears in Prigogine’s Nobel Lecture.

Another Nobel laureate, Lars Onsager, helped extend thermodynamics a little outside equilibrium. He imagined poking a system gently, as by putting a pie on a lukewarm stovetop or a magnet in a weak magnetic field. (Experts: Onsager studied the linear-response regime.) You can read about his work in my blog post “Long live Yale’s cemetery.” Systems poked slightly out of equilibrium tend to return to equilibrium: Equilibrium is stable. Systems flung far from equilibrium, as Prigogine showed, can behave differently.

A system can stay far from equilibrium by interacting with other systems. Imagine placing an apple pie atop a blistering stove. Heat will flow from the stove through the pie into the air. The pie will stay out of equilibrium due to interactions with what we call a “hot reservoir” (the stove) and a “cold reservoir” (the air). Systems (like pies) that interact with other systems (like stoves and air), we call “open.”

You and I are open: We inhale air, ingest food and drink, expel waste, and radiate heat. Matter and energy flow through us; we remain far from equilibrium. A bumper sticker in my high-school chemistry classroom encapsulated our status: “Old chemists don’t die. They come to equilibrium.” We remain far from equilibrium—alive—because our environment provides food and absorbs heat. If I’m an apple pie, the yogurt that I ate at breakfast serves as my stovetop, and the living room in which I breakfasted serves as the air above the stove. We live because of our interactions with our environments, because we’re open. Hence Prigogine’s claim, “The main character of any living system is openness.”

JETC 2019 fostered openness. The conference sessions spanned length scales and mass scales, from quantum thermodynamics to biophysics to gravitation. One could arrive as an expert in cell membranes and learn about astrophysics.

I remain grateful for the prize-selection committee’s openness. The topics of earlier winning theses include desalination, colloidal suspensions, and falling liquid films. If you tipped those topics into a tube, swirled them around, and capped the tube with a kaleidoscope glass, you might glimpse my thesis’s topic, quantum steampunk. Also, of the nine foregoing Prigogine Prize winners, only one had earned his PhD in the US. I’m grateful for the JETC’s consideration of something completely different.

When Prigogine said, “openness,” he referred to exchanges of energy and mass. Humans can exhibit openness also to ideas. The JETC honored Prigogine’s legacy in more ways than one. Here’s hoping I live up to their example.

*Guest post by Daniela Brill Estrada*

*Lukas Heinrich presenting the keynote. In the background, you can see the Higgs boson mass peak emerge from the background in simulation and data as the jobs running in the cloud are completing.*

I have just received an email from Sergey Kitaev, one of the three combinatorialists at Strathclyde. As in many universities, they belong not to the mathematics department but to the computer science department. Kitaev informs me that the administrators of that department, in their infinite wisdom, have decided that the future of the department is best served by axing discrete mathematics. I won’t write a long post about this, but instead refer you to a post by Peter Cameron that says everything I would want to say about the decision, and does so extremely cogently. I recommend that you read it if this kind of decision worries you.

**[Note Added: Thanks to some great comments I’ve received, I’m continuing to add clarifying remarks to this post. You’ll find them in green.]**

It’s been a couple of months since the `photo’ *(a false-color image created to show the intensity of radio waves, not visible light) *of the black hole at the center of the galaxy M87, taken by the Event Horizon Telescope (EHT) collaboration, was made public. Before it was shown, I wrote an introductory post explaining what the ‘photo’ is and isn’t. There I cautioned readers that I thought it might be difficult to interpret the image, and controversies about it might erupt.

So far, the claim that the image shows the vicinity of M87’s black hole (which I’ll call `M87bh’ for short) has not been challenged, and I’m not expecting it to be. But what and where exactly is the material that is emitting the radio waves and thus creating the glow in the image? And what exactly determines the size of the dark region at the center of the image? These have been problematic issues from the beginning, but discussion is starting to heat up. **And it’s important: it has implications for the measurement of the black hole’s mass** (which EHT claims is that of 6.5 billion Suns, with an uncertainty of about 15%),** and for any attempt to estimate its rotation rate.**

Over the last few weeks I’ve spent some time studying the mathematics of spinning black holes, talking to my Harvard colleagues who are world’s experts on the relevant math and physics, and learning from colleagues who produced the `photo’ and interpreted it. So I think I can now clearly explain what most journalists and scientist-writers (including me) got wrong at the time of the photo’s publication, and clarify what the photo does and doesn’t tell us.

One note before I begin: this post is long. But it starts with a summary of the situation that you can read quickly, and then comes the long part: a step-by-step non-technical explanation of an important aspect of the black hole ‘photo’ that, to my knowledge, has not yet been given anywhere else.

*[I am heavily indebted to Harvard postdocs Alex Lupsasca and Shahar Hadar for assisting me as I studied the formulas and concepts relevant for fast-spinning black holes. Much of what I learned comes from early 1970s papers, especially those by my former colleague Professor Jim Bardeen (see this one written with Press and Teukolsky), and from papers written in the last couple of years, especially this one by my present and former Harvard colleagues.]*

**What Does the EHT Image Show?**

Scientists understand the black hole itself — the geometric dimple in space and time — pretty well. If one knows the mass and the rotation rate of the black hole, and assumes Einstein’s equations for gravity are mostly correct *(for which we have considerable evidence, for example from LIGO measurements and elsewhere), *then the equations tell us what the black hole does to space and time and how its gravity works.

But for the `photo’, that’s not enough information. We don’t get to observe the black hole itself (it’s black, after all!) What the `photo’ shows is a blurry ring of radio waves, emitted from hot material (a plasma of mostly electrons and protons) **somewhere around** the black hole — material whose location, velocity, and temperature we do not know. That material and its emission of radio waves are influenced by powerful gravitational forces (whose details depend on the rotation rate of the M87bh, which we don’t know yet) and powerful magnetic fields (whose details we hardly know at all.) The black hole’s gravity then causes the paths on which the radio waves travel to bend, even more than a glass lens will bend the path of visible light, so that where things appear in the ‘photo’ is not where they are actually located.

The only insights we have into this extreme environment come from computer simulations and a few other `photos’ at lower magnification. The simulations are based on well-understood equations, but the equations have to be solved approximately, using methods that may or may not be justified. And the simulations don’t tell you where the matter **is**; they tell you where the material might plausibly be.

However, there is one thing we can certainly say, and it has just been said publicly in a paper by Samuel Gralla, Daniel Holz and Robert Wald.

Two months ago, when the EHT `photo’ appeared, it was widely reported in the popular press and on blogs that the photo shows the image of a **photon sphere** at the edge of the **shadow** of the M87bh. *(Instead of `shadow’, I suggested the term ‘quasi-silhouette’, which I viewed as somewhat less misleading to a non-expert.)*

Unfortunately, it seems these statements are not true; and this was well-known to (but poorly communicated by, in my opinion) the EHT folks. This lack of clarity might perhaps annoy some scientists and science-loving non-experts; but does this issue also matter scientifically? Gralla et al., in their new preprint, suggest that it does (though they were careful to not yet make a precise claim.)

**The Photon Sphere Doesn’t Exist**…

Indeed, if you happened to be reading my posts carefully when the `photo’ first appeared, you probably noticed that I was quite vague about the photon-sphere — I never defined precisely what it was. You would have been right to read this as a warning sign, for indeed I wasn’t getting clear explanations of it from anyone. Studying the equations and conversing with expert colleagues, I soon learned why: **for a rotating black hole, the photon sphere doesn’t really exist.**

But let’s first define what the photon sphere is for a non-rotating black hole! Like the Earth’s equator, the photon sphere is a location, not an object. This location is the surface of an imaginary ball, lying well outside the black hole’s horizon. On the photon sphere, photons (the particles that make up light, radio waves, and all other electromagnetic waves) travel on special circular or spherical orbits around the black hole.
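To get a sense of scale: for a non-rotating black hole, general relativity places the photon sphere at r = 3GM/c², one and a half times the Schwarzschild radius. A back-of-envelope sketch using EHT’s quoted mass for the M87bh (rounded physical constants):

```python
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
c = 2.998e8        # speed of light, m/s
M_sun = 1.989e30   # solar mass, kg
AU = 1.496e11      # astronomical unit, m

M = 6.5e9 * M_sun  # EHT's quoted mass for the M87 black hole

r_s = 2 * G * M / c**2  # Schwarzschild radius (horizon of a non-rotating hole)
r_ph = 1.5 * r_s        # photon sphere radius, 3GM/c^2

print(r_s / AU, r_ph / AU)  # roughly 130 AU and 190 AU
```

So the photon sphere of a non-rotating black hole of this mass would be a couple of hundred times the Earth-Sun distance in radius, comparable to our solar system.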

By contrast, a rotating black hole has a larger, broader `photon-zone’ where photons can have special orbits. But you won’t ever see the whole photon zone in any image of a rotating black hole. Instead, a piece of the photon zone will appear as a `*photon ring*’, a bright and very thin loop of radio waves. However, the photon ring is

**… and the Photon Ring Isn’t What We See…
**

It seems likely that the M87bh is rotating quite rapidly, so it has a photon-zone rather than a photon-sphere, and images of it will have a photon ring. Ok, fine; but then, can we interpret EHT’s `photo’ simply as showing the photon ring, blurred by the imperfections in the `telescope’? Although some of the EHT folks have seemed to suggest the answer is “yes”, Gralla et al. suggest the answer is likely “no” *(and many of their colleagues have been pointing out the same thing in private.)* *The circlet of radio waves that appears in the EHT `photo’ is probably not simply a blurred image of M87bh’s photon ring; it probably shows a combination of the photon ring with something brighter (as explained below).* That’s where the controversy starts.

**…so the Dark Patch May Not Be the Full Shadow…**

The term `shadow’ is confusing (which is why I prefer `quasi-silhouette’ in describing it in public contexts, though that’s my own personal term) but no matter what you call it, in its ideal form it is supposed to be an absolutely dark area whose edge is the photon ring. But in reality the perfectly dark area need not appear so dark after all; it may be partly filled in by various effects. Furthermore, since the `photo’ may not show us the photon ring, it’s far from clear that the dark patch in the center is the full shadow anyway. The EHT folks are well aware of this, but at the time the photo came out, many science writers and scientist-writers (including me) were not.

**…so EHT’s Measurement of the M87bh’s Mass is Being Questioned**

It was wonderful that EHT could make a picture that could travel round the internet at the speed of light, and generate justifiable excitement and awe that human beings could indirectly observe such an amazing thing as a black hole with a mass of several billion Sun-like stars. Qualitatively, they achieved something fantastic in showing that yes, the object at the center of M87 really is as compact and dark as such a black hole would be expected to be! But the EHT telescope’s main **quantitative** achievement was a measurement of the mass of the M87bh, with a claimed precision of about 15%.

Naively, one could imagine that the mass is measured by looking at the diameter of the dark spot in the black hole ‘photo’, under the assumption that it is the black hole’s shadow. So here’s the issue: **Could interpreting the dark region incorrectly perhaps lead to a significant mistake in the mass measurement, and/or an underestimate of how uncertain the mass measurement actually is?**

I don’t know. The EHT folks are certainly aware of these issues; their simulations show them explicitly. The mass of the M87bh isn’t literally measured by putting a ruler on the ‘photo’ and measuring the size of the dark spot! The actual methods are much more sophisticated than that, and I don’t understand them well enough yet to explain, evaluate or criticize them. All I can say with confidence right now is that these are important questions that experts currently are debating, and consensus on the answer may not be achieved for quite a while.

———————————————————————-

**The Appearance of a Black Hole With Nearby Matter**

Ok, now I’m going to explain the most relevant points, step-by-step. Grab a cup of coffee or tea, find a comfy chair, and bear with me.

Because fast-rotating black holes are more complicated, I’m going to start illuminating the controversy by looking at a **non-rotating** black hole’s properties, which is also what Gralla et al. mainly do in their paper. It turns out the qualitative conclusion drawn from the non-rotating case largely applies in the rotating case too, at least in the case of the M87bh as seen from our perspective; that’s important because the M87bh may well be rotating at a very good clip.

A little terminology first: for a rotating black hole there’s a natural definition of the poles and the equator, just as there is for the Earth: there’s an axis of rotation, and the poles are where that axis intersects with the black hole horizon. The equator is the circle that lies halfway between the poles. For a non-rotating black hole, there’s no such axis and no such automatic definition, but it will be useful to define the north pole of the black hole to be the point on the horizon closest to us.

**A Single Source of Electromagnetic Waves**

Let’s imagine placing a bright light bulb on the same plane as the equator, outside the black hole horizon but rather close to it. *(The bulb could emit radio waves or visible light or any other form of electromagnetic waves, at any frequency; for what I’m about to say, it doesn’t matter at all, so I’ll just call it `light’.)* See Figure 1. Where will the light from the bulb go?

Some of it, heading inward, ends up in the black hole, while some of it heads outward toward distant observers. The gravity of the black hole will bend the path of the light. And here’s something remarkable: a small fraction of the light, aimed just so, can actually spiral around the black hole any number of times before heading out. As a result, you will see the bulb not once but multiple times!

There will be a direct image — light that comes directly to us — from near the bulb’s true location (displaced because gravity bends the light a bit, just as a glass lens will distort the appearance of what’s behind it.) That path of that light is the orange arrow in Figure 1. But then there will be an **indirect** image (the green arrow in Figure 1) from light that goes halfway around the black hole before heading in our direction; we will see that image of the bulb on the opposite side of the black hole. Let’s call that the `first indirect image.’ Then there will be a second indirect image from light that orbits the black hole once and comes out near the direct image, but further out; that’s the blue arrow in Figure 1. Then there will be a third indirect image from light that goes around one and a half times (not shown), and so on. In short, Figure 1 shows the paths of the direct, first indirect, and second indirect images of the bulb as they head toward our location at the top of the image.

In the case of a **rotating** black hole, something very similar happens, except that light can orbit the black hole at a range of distances: instead of a single photon-sphere, there is a `photon zone.’

*[If you’d like to learn a bit more about the photon zone, and you have a little bit of knowledge of black holes already, you can profit from exploring this demo by Professor Leo Stein: https://duetosymmetry.com/tool/kerr-circular-photon-orbits/ ]*

Back to the non-rotating case: What our camera will see, looking at what is emitted from the light bulb, is shown in Figure 2: an infinite number of increasingly squished `indirect’ images, half on one side of the black hole near the direct image, and the other half on the other side. What is not obvious, but true, is that only the first of the indirect images is large and bright; this is one of Gralla *et al.*‘s main points. We can, therefore, separate the images into the direct image, the first indirect image, and the remaining indirect images. The total amount of light coming from the direct image and the first indirect image can be large, but the total amount of light from the remaining indirect images is typically (according to Gralla *et al.*) less than 5% of the light from the first indirect image. And so, unless we have an extremely high-powered camera, we’ll never pick those other images up. **Let’s therefore focus our attention on the direct image and the first indirect image.**
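
That `less than 5%’ figure is easy to sanity-check. A standard strong-lensing result (not derived in this post, so take it as an assumption) is that, for a non-rotating black hole, each successive indirect image is demagnified by a factor of about e^(−π) ≈ 0.043. Summing the geometric series for all the images beyond the first indirect one:

```python
import math

# Assumed demagnification between successive indirect images of a
# non-rotating black hole: e^(-pi), a standard strong-lensing result.
ratio = math.exp(-math.pi)  # ~0.043

# Brightness of images 2, 3, 4, ... relative to the first indirect image:
# ratio + ratio^2 + ratio^3 + ... = ratio / (1 - ratio)
remaining_over_first = ratio / (1 - ratio)

print(f"{remaining_over_first:.1%}")  # 4.5%
```

So everything beyond the first indirect image carries only about 4.5% of that image’s light, consistent with Gralla *et al.*’s figure.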

**WARNING (since this seems to be a common confusion):**

*IN ALL MY FIGURES IN THIS POST, AS IN THE BLACK HOLE `PHOTO’ ITSELF, THE COLORS OF THE IMAGES ARE CHOSEN ARBITRARILY (as explained in my first blog post on this subject.) THE `PHOTO’ WAS TAKEN AT A SINGLE, NON-VISIBLE FREQUENCY OF ELECTROMAGNETIC WAVES: EVEN IF WE COULD SEE THAT TYPE OF RADIO WAVE WITH OUR EYES, IT WOULD BE A SINGLE COLOR, AND THE ONLY THING THAT WOULD VARY ACROSS THE IMAGE IS BRIGHTNESS. IN THIS SENSE, A BLACK AND WHITE IMAGE MIGHT BE CLEARER CONCEPTUALLY, BUT IT IS HARDER FOR THE EYE TO PROCESS.*

**A Circular Source of Electromagnetic Waves**

Proceeding step by step toward a more realistic situation, let’s replace our ordinary bulb by a circular bulb (Figure 3), again set somewhat close to the horizon, sitting in the plane that contains the equator. What would we see now?

That’s shown in Figure 4: the direct image is a circle (looking somewhat larger than it really is); outside it sits the first indirect image of the ring; and then come all the other indirect images, looking quite dim and all piling up at one radius. We’re going to call all those piled-up images the “photon ring”.

Importantly, if we consider circular bulbs of different diameter [yellow, red and blue in Figure 5], then although the direct images reflect the differences in the bulbs’ diameters (somewhat enlarged by lensing), the first indirect images **all are about the same diameter**, just a tad larger or smaller than the photon ring. The remaining indirect images all sit together at the radius of the photon ring.

These statements are also essentially true for a rotating black hole seen from the north or south pole; a circular bulb generates a series of circular images, and the indirect images all pile more or less on top of each other, forming a photon ring. Viewed away from the poles, a rotating black hole is a more complicated story, but as long as the viewing angle is small enough, the changes are relatively minor and the picture remains qualitatively similar.

**A Disk as a Source of Electromagnetic Waves**

And what if you replaced the circular bulb with a disk-shaped bulb, a sort of glowing pancake with a circular hole at its center, as in Figure 7? That’s relevant because black holes are thought to have `accretion disks’ made of material orbiting the black hole, and eventually spiraling in. The accretion disk may well be the dominant source emitting radio waves at the M87bh. *(I’m showing a very thin uniform disk for illustration, but a real accretion disk is not uniform, changes rapidly as clumps of material move within it and then spiral into the black hole, and may be quite thick — as thick as the black hole is wide, or even thicker.)*

Well, we can think of the disk as many concentric circles of light placed together. The direct images of the disk (shown in Figure 6 left, on one side of the disk, as an orange wash) would form a disk in your camera, the dim red region in Figure 6 right; the hole at its center would appear larger than it really is due to the bending caused by the black hole’s gravity, but the shape would be similar. However, the indirect images would all pile up in almost the same place from your perspective, forming a bright and quite thin ring, the bright yellow circle in Figure 6 right. (The path of the disk’s first indirect image is shown in Figure 6 left, going halfway about the black hole as a green wash; notice how it narrows as it travels, which is why it appears as a narrow ring in the image at right.) This circle — the full set of indirect images of the whole disk — is the edge of the photon-sphere for a non-rotating black hole, and the circular photon ring for a rotating black hole viewed from its north or south pole.

*[Gralla et al. call the first indirect image the `lensed ring’ and the remaining indirect images, currently unobservable at EHT, the `photon ring’, while EHT refers to all the indirect images as the `photon ring’. Just letting you know in case you hear `lensed ring’ referred to in future.]*

So the conclusion is that if we had a perfect camera, the direct image of a disk makes a disk, but the indirect images (mainly just the first one, as Gralla et al. emphasize) make a bright, thin ring that may be superposed upon the direct image of the disk, depending on the disk’s shape.

And this conclusion, with some important adjustments, applies also for a spinning black hole viewed from above its north or south pole — i.e., along its axis of rotation — or from near that axis; I’ll mention the adjustments in a moment.

But EHT is not a perfect camera. To make the black hole image, technology had to be pushed to its absolute limits. Someday we’ll see both the disk and the ring, but right now, they’re all blurred together. So which one is more important?

**From a Blurry Image to Blurry Knowledge**

What does a blurry camera do to this simple image? You might think that the disk is so dim and the ring so bright that the camera will mainly show you a blurry image of the bright photon ring. But that’s wrong. The ring isn’t bright enough. A simple calculation reveals that the photo will show mainly the disk, not the photon ring! This is shown in Figure 9, which you can compare with the Black Hole `photo’ (Figure 10). *(Figure 9 is symmetric around the ring, but the photo is not, for multiple reasons — Doppler-like effect from rotation, viewpoint off the rotation axis, etc. — which I’ll have to defer until another post.)*
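
Here’s a minimal sketch of that `simple calculation’, using a 1D cut through the image and entirely made-up brightness numbers, just to show the mechanism: the ring has the higher peak brightness, but once everything is blurred with a beam much wider than the ring, the disk dominates.

```python
import numpy as np

# Hypothetical 1D cut across the image; all numbers are illustrative.
x = np.linspace(-15, 15, 3001)

# A broad, dim "accretion disk" and a narrow, bright "photon ring".
disk = np.where((np.abs(x) > 3.0) & (np.abs(x) < 8.0), 1.0, 0.0)
ring = 8.0 * np.exp(-0.5 * ((np.abs(x) - 2.6) / 0.05) ** 2)

# With a perfect camera, the ring is by far the brightest feature...
assert ring.max() > disk.max()

# ...but now blur with a Gaussian "beam" much wider than the ring,
# mimicking a telescope of limited resolution.
sigma = 2.0
beam = np.exp(-0.5 * (x / sigma) ** 2)
beam /= beam.sum()
blurred_disk = np.convolve(disk, beam, mode="same")
blurred_ring = np.convolve(ring, beam, mode="same")

# After blurring, the disk dominates the brightness of the image.
assert blurred_disk.max() > 2 * blurred_ring.max()
```

The real comparison of course depends on the actual two-dimensional disk and ring profiles; this only illustrates why a narrow ring fares badly under blurring.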

More precisely, **the ring and disk blur together, but the brightness of the image is dominated by the disk, not the ring.**

Let’s say that again: the black hole `photo’ may mainly show the M87bh’s accretion disk, with the photon ring contributing only some of the light, **and therefore the photon ring does not completely and unambiguously determine the radius of the observed dark patch in the `photo.’ In general, the patch could be considerably smaller than what is usually termed the `shadow’ of the black hole.**

This is important. The photon ring’s diameter, and thus the width of the `shadow’ too, barely depend on the rotation rate of the black hole; they depend almost exclusively on the black hole’s mass. So *if the ring in the photo were simply the photon ring of the M87bh, you’d have a very simple way to measure the black hole’s mass without knowing its rotation rate*: you’d look at how large the dark patch is, or equivalently, the diameter of the blurry ring, and that would give you the answer to within 10%. But it’s nowhere near so simple if the blurry ring shows the accretion disk, because the accretion disk’s properties and appearance can vary much more than the photon ring; they can depend strongly on the black hole’s rotation rate, and also on magnetic fields and other details of the black hole’s vicinity.
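
To see why the photon ring is such a clean mass probe, here is the standard non-rotating estimate: the photon ring’s apparent diameter is 2√27 GM/c², and dividing by the distance gives the angular size. The mass and distance below are commonly quoted values for the M87bh, used here as assumptions:

```python
import math

# Physical constants (SI units).
G = 6.674e-11          # m^3 kg^-1 s^-2
c = 2.998e8            # m/s
M_sun = 1.989e30       # kg
Mpc = 3.086e22         # m

# Commonly quoted values for the M87 black hole -- assumptions here.
M = 6.5e9 * M_sun      # mass
D = 16.8 * Mpc         # distance

# Non-rotating photon ring: apparent diameter 2*sqrt(27)*GM/c^2.
diameter = 2 * math.sqrt(27) * G * M / c**2   # ~1e14 m
theta = diameter / D                           # angular size, radians
microarcsec = math.degrees(theta) * 3600 * 1e6

print(f"{microarcsec:.0f} microarcseconds")   # roughly 40
```

That is comparable to the size of the ring in the EHT image, which is exactly why the simple interpretation is so tempting.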

**The Important Role of Rotation**

If we conclude that EHT is seeing a mix of the accretion disk with the photon ring, with the former dominating the brightness, then this makes EHT’s measurement of the M87bh’s mass more confusing and even potentially suspect. Hence: controversy. Is it possible that EHT underestimated their uncertainties, and that their measurement of the black hole mass has more ambiguities, and is not as precise, as they currently claim?

Here’s where the rotation rate is important. Despite what I showed (for pedagogical simplicity) in Figure 7, for a non-rotating black hole the accretion disk’s central gap is actually expected to lie **outside** the photon ring; this is shown at the top of Figure 9. But the faster the black hole rotates, the smaller this central gap is expected to be, to the point that for a fast-rotating black hole the gap will lie **inside** the photon ring, as shown at the bottom of Figure 9. *(This tendency is not obvious; it requires understanding details of the black hole geometry.)* And if that is true, the dark patch in the EHT image may not be the black hole’s full shadow (i.e. quasi-silhouette), which is the region inside the photon ring. **It may be just the inner portion of it, with the outer portion obscured by emission from the accretion disk.**

The effect of blurring in the two cases of slow (or zero) and fast rotation is illustrated in Figure 9, where the photon ring’s size is taken to be the same in each case but the disk’s inner edge is close in or far out. (The black holes, not illustrated since they aren’t visible anyway, differ in mass by about 10% in order to have the photon ring the same size.) This shows why the size of the dark patch can be quite different, depending on the disk’s shape, even when the photon ring’s size is the same.

Gralla et al. subtly raise these questions but are careful not to overstate their case, perhaps because they have not yet completed their study of rotating black holes. But the question is now in the air.

I’m interested to hear what the EHT folks have to say about it, as I’m sure they have detailed arguments in favor of their procedures. In particular, **EHT’s simulations show all of the effects mentioned above; none of this is unknown to them.** (In fact, the reason I know my illustrations above are reasonable is partly because you can see similar pictures in the EHT papers.) As long as the EHT folks correctly accounted for all the issues, then they should have been able to properly measure the mass and estimate their uncertainties correctly. **In fact, they don’t really use the photo itself; they use more subtle techniques applied to their telescope data directly.** Thus it’s not enough to argue that the photo itself is ambiguous; one has to argue that EHT’s more subtle analysis methods are flawed. No one has argued that yet, as far as I am aware.

But the one thing that’s clear right now is that science writers almost uniformly got it wrong *[because the experts didn’t explain these points well]* when they tried to describe the image two months ago. The `photo’ probably does not show “a photon ring surrounding a shadow.” That would be nice and simple and impressive-sounding, since it refers to fundamental properties of the black hole’s warping effects on space. But it’s far too glib, as Figures 7 and 9 show. We’re probably seeing an accretion disk supplemented by a photon ring, all blurred together, and the dark region may well be smaller than the black hole’s shadow.

*(Rather than, or in addition to, the accretion disk, it is also possible that the dominant emission in the photo comes from the inner portion of one of the jets that emerges from the vicinity of the black hole; see Figure 8 above. This is another detail that makes the situation more difficult to interpret, but doesn’t change the main point I’m making.)*

Someday in the not distant future, improved imaging should allow EHT to separately image the photon ring and the disk, so both can be observed easily, as in the left side of Figure 9. Then all these questions will be answered definitively.

**Why the Gargantua Black Hole from Interstellar is Completely Different**

Just as a quick aside, what would you see if an accretion disk were edge-on rather than face-on? Then, in a perfect camera, you’d see something like the famous picture of Gargantua, the black hole from the movie *Interstellar* — a direct image of the front edge of the disk, and a strongly lensed indirect image of the back side of the disk, appearing both above and below the black hole, as illustrated in Figure 11. And that leads to the Gargantua image from the movie, also shown in Figure 11. Notice the photon ring (which is, as I cautioned you earlier, off-center!) *[Note added: this figure has been modified; in the original version I referred to the top and bottom views of the disk’s far side as the “1st indirect image”, but as pointed out by Professor Jean-Pierre Luminet, that’s not correct terminology here.]*

However, the movie image leaves out an important Doppler-like effect (which I’ll explain someday when I understand it 100%). This makes the part of the disk that is rotating toward us bright, and the part rotating away from us dim… and so a real image from this vantage point would be very asymmetric — bright on the left, dim on the right — unlike the movie image. At the suggestion of Professor Jean-Pierre Luminet I have added, at the bottom of Figure 11, a very early simulation by Jean-Alain Marck that shows this effect.

I mention this because a number of expert science journalists incorrectly explained the M87 image by referring to Gargantua — but that image has essentially *nothing* to do with the recent black hole `photo’. M87’s accretion disk is certainly not edge-on. The movie’s Gargantua image is taken from the equator, not from near the pole.

**Final Remarks: Where a Rotating Black Hole Differs from a Non-Rotating One**

Before I quit for the week, I’ll just summarize a few big differences for fast-rotating black holes compared to non-rotating ones.

1) As I’ve just emphasized, what a rotating black hole looks like to a distant observer depends not only on where the matter around the black hole is located but also on how the black hole’s rotation axis is oriented relative to the observer. A pole observer, an equatorial observer, and a near-pole observer see quite different things. (As noted in Figure 8, we are apparently near-south-pole observers for M87’s black hole.)

Let’s assume that the accretion disk lies in the same plane as the black hole’s equator — there are some reasons to expect this. Even then, the story is complex.

2) As I mentioned above, instead of a photon-sphere, there is a ‘photon-zone’ — a region where specially aimed photons can travel round the black hole multiple times. For high-enough spin (greater than about 80% of maximum as I recall), an accretion disk’s inner edge can lie within the photon zone, or even closer to the black hole than the photon zone; and this can cause a filling-in of the ‘shadow’.

3) Depending on the viewing angle, the indirect images of the disk that form the photon ring may not be a circle, and may not be concentric with the direct image of the disk. Only when viewed from along the rotation axis (i.e., above the north or south pole) will the direct and indirect images of the disk all be circular and concentric. We’re not viewing the M87bh on its axis, and that further complicates interpretation of the blurry image.

4) When the viewing angle is not along the rotation axis the image will be asymmetric, brighter on one side than the other. (This is true of EHT’s `photo’.) However, I know of at least four potential causes of this asymmetry, any or all of which might play a role, and the degree of asymmetry depends on properties of the accretion disk and the rotation rate of the black hole, both of which are currently unknown. Claims about the asymmetry made by the EHT folks seem, at least to me, to be based on certain assumptions that I, at least, cannot currently check.
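
As a quantitative illustration of point 2), the equatorial circular photon orbits of a rotating black hole have a closed-form radius (the classic formula going back to Bardeen; it is outside material, not derived in this post). In the equatorial plane, the photon zone spans the gap between the prograde and retrograde radii, and that gap widens as the spin grows:

```python
import math

def photon_orbit_radius(spin, prograde=True):
    """Equatorial circular photon-orbit radius, in units of GM/c^2,
    for dimensionless spin a/M in [0, 1] (Bardeen's formula)."""
    sign = -1.0 if prograde else 1.0
    return 2.0 * (1.0 + math.cos((2.0 / 3.0) * math.acos(sign * spin)))

# No rotation: both orbits sit at r = 3 GM/c^2 -- the photon-sphere.
assert abs(photon_orbit_radius(0.0, True) - 3.0) < 1e-9
assert abs(photon_orbit_radius(0.0, False) - 3.0) < 1e-9

# Maximal spin: the zone stretches from roughly r = 1 (prograde)
# out to r = 4 (retrograde), in units of GM/c^2.
print(photon_orbit_radius(1.0, True), photon_orbit_radius(1.0, False))
```

At zero spin the two radii coincide at the photon-sphere; at high spin the zone reaches far inward, which is why a fast-spinning black hole’s accretion disk can extend inside it.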

Each of these complexities is a challenge to explain, so I’ll give both you and me a substantial break while I figure out how best to convey what is known (at least to me) about these issues.