Planet Musings

January 17, 2022

John Baez: The Color of Infinite Temperature


This is the color of something infinitely hot. Of course you’d instantly be fried by gamma rays of arbitrarily high frequency, but this would be its spectrum in the visible range.

This is also the color of a typical neutron star. They’re so hot they look the same.

It’s also the color of the early Universe!

This was worked out by David Madore.

As a blackbody gets hotter and hotter, its spectrum approaches the classical Rayleigh–Jeans law. That is, its true spectrum as given by the Planck law approaches the classical prediction over a larger and larger range of frequencies.

So, for an extremely hot blackbody, the spectrum of light we can actually see with our eyes is governed by the Rayleigh–Jeans law. This law says the color doesn’t depend on the temperature: only the brightness does!
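The temperature independence can be checked with a short worked limit. This is a sketch using the standard textbook form of the Planck law; the symbols $B_\nu$, $h$, $k_B$ and $c$ are the usual ones and are not notation taken from the post.

```latex
B_\nu(T) \;=\; \frac{2h\nu^3}{c^2}\,\frac{1}{e^{h\nu/k_B T}-1}
\;\;\xrightarrow{\;h\nu \,\ll\, k_B T\;}\;\;
\frac{2h\nu^3}{c^2}\cdot\frac{k_B T}{h\nu}
\;=\; \frac{2\nu^2 k_B T}{c^2}
```

The step in the middle uses $e^x - 1 \approx x$ for small $x$. In the limit, $T$ appears only as an overall factor, so the shape of the spectrum across the visible band (proportional to $\nu^2$, or to $\lambda^{-4}$ per unit wavelength) stays fixed while the brightness scales with temperature.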

And this color is shown above.

This involves human perception, not just straight physics. So David Madore needed to work out the response of the human eye to the Rayleigh–Jeans spectrum — “by integrating the spectrum against the CIE XYZ matching functions and using the definition of the sRGB color space.”
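As a rough outline of what "integrating the spectrum against the CIE XYZ matching functions" means, here is a sketch in the usual colorimetry notation. The symbols $S$, $\bar{x}$, $\bar{y}$, $\bar{z}$ are the conventional ones, not Madore's own notation, and the details of his calculation may differ.

```latex
X = \int S(\lambda)\,\bar{x}(\lambda)\,d\lambda, \qquad
Y = \int S(\lambda)\,\bar{y}(\lambda)\,d\lambda, \qquad
Z = \int S(\lambda)\,\bar{z}(\lambda)\,d\lambda, \qquad
S(\lambda) \propto \lambda^{-4}
```

The resulting $(X, Y, Z)$ triple is then mapped to linear RGB with the matrix from the sRGB definition and passed through the sRGB transfer curve. Since the overall brightness is arbitrary here, some normalization has to be chosen; scaling so that the largest channel equals 255 is an assumption that fits the quoted value, not something the post states.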

The color he got is sRGB(148,177,255). And according to the experts who sip latte all day and make up names for colors, this color is called ‘Perano’.

Here is some background material Madore wrote on colors and visual perception. It doesn’t include the whole calculation that leads to this particular color, so somebody should check it, but it should help you understand how to convert the blackbody spectrum at a particular temperature into an sRGB color:

• David Madore, Colors and colorimetry.

In the comments you can see that Thomas Mansencal redid the calculation and got a slightly different color: sRGB(154,181,255). It looks quite similar to me:

Jordan Ellenberg: Tailgating

I was driving home from picking up sushi the other night, and another car was tailgating me. I was really annoyed. I was on a curvy road, it was icy out, and I was going the speed limit, 25 – and this guy was riding my bumper, with those new really bright halogen headlights shining right into my rear-view mirror. I was not going to speed up to satisfy him, and anyway I was just going a couple more blocks. But when I turned onto my block, the tailgater turned with me, and when I pulled into my driveway, he parked next to my house. Now I was kind of freaked out. Was the guy going to get out of his car and scream at me for slowing him down? He did get out of his car. No chance of avoiding a conversation. He came up to me and asked where a certain address on my street was. He was a DoorDash delivery guy. Tailgating me because his ability to make enough money to live on depends on getting a certain number of deliveries done per hour, and that means that it’s an economic necessity for him to drive too fast on icy roads.

Matt Strassler: Has a New Force of Nature Been Discovered?

There have been dramatic articles in the news media suggesting that a Nobel Prize has essentially already been awarded for the amazing discovery of a “fifth force.” I thought I’d better throw some cold water on that fire; it’s fine for it to smoulder, but we shouldn’t let it overheat.

There could certainly be as-yet unknown forces waiting to be discovered — dozens of them, perhaps.   So far, there are four well-studied forces: gravity, electricity/magnetism, the strong nuclear force, and the weak nuclear force.  Moreover, scientists are already fully confident there is a fifth force, predicted but not yet measured, that is generated by the Higgs field. So the current story would really be about a sixth force.

Roughly speaking, any new force comes with at least one new particle.  That’s because

  • every force arises from a type of field (for instance, the electric force comes from the electromagnetic field, and the predicted Higgs force comes from the Higgs field)
  • and ripples in that type of field are a type of particle (for instance, a minimal ripple in the electromagnetic field is a photon — a particle of light — and a minimal ripple in the Higgs field is the particle known as the Higgs boson.)

The current excitement, such as it is, arises because someone claims to have evidence for a new particle, whose properties would imply a previously unknown force exists in nature.  The force itself has not been looked for, much less discovered.

The new particle, if it really exists, would have a rest mass about 34 times larger than that of an electron — about 1/50th of a proton’s rest mass. In technical terms that means its E=mc² energy is about 17 million electron volts (MeV), and that’s why physicists are referring to it as the X17.  But the question is whether the two experiments that find evidence for it are correct.
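As a rough check of those ratios, assuming the standard rest energies of the electron (about 0.511 MeV) and the proton (about 938.3 MeV), which the post does not quote:

```latex
\frac{m_X c^2}{m_e c^2} \approx \frac{17\ \mathrm{MeV}}{0.511\ \mathrm{MeV}} \approx 33,
\qquad
\frac{m_X c^2}{m_p c^2} \approx \frac{17\ \mathrm{MeV}}{938.3\ \mathrm{MeV}} \approx \frac{1}{55}
```

Both are in line with the rounded figures above, given that the mass itself is only quoted as "about" 17 MeV.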

In the first experiment, whose results appeared in 2015, an experimental team mainly based in Debrecen, Hungary studied large numbers of nuclei of beryllium-8 atoms, which had been raised to an “excited state” (that is, with more energy than usual).  An excited nucleus inevitably disintegrates, and the experimenters studied the debris.  On rare occasions they observed electrons and positrons [a.k.a. anti-electrons], and these behaved in a surprising way, as though they were produced in the decay of a previously unknown particle.

In the newly reported experiment, whose results just appeared, the same team observed  the disintegration of excited nuclei of helium.  They again found evidence for what they hope is the X17, and therefore claim confirmation of their original experiments on beryllium.

When two qualitatively different experiments claim the same thing, they are less likely to be wrong, because it’s not likely that any mistakes in the two experiments would create fake evidence of the same type.  On the face of it, it does seem unlikely that both measurements, carried out on two different nuclei, could fake an X17 particle.

However, we should remain cautious, because both experiments were carried out by the same scientists. They, of course, are hoping for their Nobel Prize (which, if their experiments are correct, they will surely win) and it’s possible they could suffer from unconscious bias. It’s very common for individual scientists to see what they want to see; scientists are human, and hidden biases can lead even the best scientists astray.  Only collectively, through the process of checking, reproducing, and using each other’s work, do scientists create trustworthy knowledge.

So it is prudent to await efforts by other groups of experimenters to search for this proposed X17 particle.  If the X17 is observed by other experiments, then we’ll become confident that it’s real. But we probably won’t know until then.  I don’t currently know whether the wait will be months or a few years.

Why am I so skeptical? There are two distinct reasons.

First, there’s a conceptual, mathematical issue. It’s not easy to construct reasonable equations that allow the X17 to co-exist with all of the known types of elementary particles. That it has a smaller mass than a proton is not a problem per se.  But the X17 needs to have some unique and odd properties in order to (1)  be seen in these experiments, yet (2) not be seen in certain other previous experiments, some of which were explicitly looking for something similar.   To make equations that are consistent with these properties requires some complicated and not entirely plausible trickery.  Is it impossible? No.  But a number of the methods that scientists suggested were flawed, and the ones that remain are, to my eye, a bit contrived.

Of course, physics is an experimental science, and what theorists like me think doesn’t, in the end, matter. If the experiments are confirmed, theorists will accept the facts and try to understand why something that seems so strange might be true. But we’ve learned an enormous amount from mathematical thinking about nature in the last century. For instance, it was math that told us that the Higgs particle couldn’t be heavier than 1000 protons, and it was on the basis of that ‘advice’ that the Large Hadron Collider was built to look for it (and it found it, in 2012). Similar math led to the discoveries of the W and Z particles roughly where they were expected. So when the math tells you the X17 story doesn’t look good, it’s not reason enough for giving up, but it is reason for some pessimism.

Second, there are many cautionary tales in experimental physics. For instance, back in 2003 there were claims of evidence of a particle called a pentaquark with a rest mass about 1.6 times a proton’s mass — an exotic particle, made from quarks and gluons, that’s both like and unlike a proton.  Its existence was confirmed by multiple experimental groups!  Others, however, didn’t see it. It took several years for the community to come to the conclusion that this pentaquark, which looked quite promising initially, did not in fact exist.

The point is that mistakes do get made in particle hunts, sometimes in multiple experiments, and it can take some time to track them down. It’s far too early to talk about Nobel Prizes.

[Note that the Higgs boson’s discovery was accepted more quickly than most.  It was discovered simultaneously by two distinct experiments using two methods each, and confirmed by additional methods and in larger data sets soon thereafter.  Furthermore,  there were already straightforward equations that happily accommodated it, so it was much more plausible than the X17.] 

And just for fun, here’s a third reason I’m skeptical. It has to do with the number 17. I mean, come on, guys, seriously: 17 million electron volts? This just isn’t auspicious. Back when I was a student, in the late 1980s and early 90s, there was a set of experiments, by a well-regarded experimentalist, which showed considerable evidence for an additional neutrino with an E=mc² energy of 17 thousand electron volts. Other experiments tried to find it, but couldn’t. Yet no one could find a mistake in the experimenter’s apparatus or technique, and he had good arguments that the competing experiments had their own problems. Well, after several years, the original experimenter discovered that there was a piece of his equipment which unexpectedly could absorb about 17 keV of energy, faking a neutrino signal. It was a very subtle problem, and most people didn’t fault him since no one else had thought of it either. But that was the end of the 17 keV neutrino, and with it went hundreds of research papers by both experimental and theoretical physicists, along with one scientist’s dreams of a place in history.

In short, history is cruel to most scientists who claim important discoveries, and teaches us to be skeptical and patient. If there is a sixth force, we’ll know within a few years. Don’t expect to be sure anytime soon. The knowledge cycle in science runs much, much slower than the twittery news cycle, and that’s no accident; if you want to avoid serious errors that could confuse you for a long time to come, don’t rush to judgment.

Matt Strassler: Physics is Broken!!!

Last Thursday, an experiment reported that the magnetic properties of the muon, the electron’s middleweight cousin, are a tiny bit different from what particle physics equations say they should be. All around the world, the headlines screamed: PHYSICS IS BROKEN!!! And indeed, it’s been pretty shocking to physicists everywhere. For instance, my equations are working erratically; many of the calculations I tried this weekend came out upside-down or backwards. Even worse, my stove froze my coffee instead of heating it, I just barely prevented my car from floating out of my garage into the trees, and my desk clock broke and spilled time all over the floor. What a mess!

Broken, eh? When we say a coffee machine or a computer is broken, it means it doesn’t work. It’s unavailable until it’s fixed. When a glass is broken, it’s shattered into pieces. We need a new one. I know it’s cute to say that so-and-so’s video “broke the internet.” But aren’t we going a little too far now? Nothing’s broken about physics; it works just as well today as it did a month ago.

More reasonable headlines have suggested that “the laws of physics have been broken”. That’s better; I know what it means to break a law. (Though the metaphor is imperfect, since if I were to break a state law, I’d be punished, whereas if an object were to break a fundamental law of physics, that law would have to be revised!) But as is true in the legal system, not all physics laws, and not all violations of law, are equally significant.

What’s a physics law, anyway? Crudely, physics is a strategy for making predictions about the behavior of physical objects, based on a set of equations and a conceptual framework for using those equations. Sometimes we refer to the equations as laws; sometimes parts of the conceptual framework are referred to that way.

But that story has layers. Physics has an underlying conceptual foundation, which includes the pillar of quantum physics and its view of reality, and the pillar of Einstein’s relativity and its view of space and time. (There are other pillars too, such as those of statistical mechanics, but let me not complicate the story now.) That foundation supports many research areas of physics. Within particle physics itself, these two pillars are combined into a more detailed framework, with concepts and equations that go by the name of “quantum effective field theory” (“QEFT”). But QEFT is still very general; this framework can describe an enormous number of possible universes, most with completely different particles and forces from the ones we have in our own universe. We can start making predictions for real-world experiments only when we put the electron, the muon, the photon, and all the other familiar particles and forces into our equations, building up a specific example of a QEFT known as “The Standard Model of particle physics.”

All along the way there are equations and rules that you might call “laws.” They too come in layers. The Standard Model itself, as a specific QEFT, has few high-level laws: there are no principles telling us why quarks exist, why there is one type of photon rather than two, or why the weak nuclear force is so weak. The few laws it does have are mostly low-level, true of our universe but not essential to it.

I’m bringing attention to these layers because an experiment might cause a problem for one layer but not another. I think you could only fairly suggest that “physics is broken” if data were putting a foundational pillar of the entire field into question. And to say “the laws of physics have been violated”, emphasis on the word “the”, is a bit melodramatic if the only thing that’s been violated is a low-level, dispensable law.

Has physics, as a whole, ever broken? You could argue that Newton’s 17th century foundation, which underpinned the next two centuries of physics, broke at the turn of the 20th century. Just after 1900, Newton-style equations had to be replaced by equations of a substantially different type; the ways physicists used the equations changed, and the concepts, the language, and even the goals of physics changed. For instance, in Newtonian physics, you can predict the outcome of any experiment, at least in principle; in post-Newtonian quantum physics, you often can only predict the probability for one or another outcome, even in principle. And in Newtonian physics we all agree what time it is; in Einsteinian physics, different observers experience time differently and there is no universal clock that we all agree on. These were immense changes in the foundation of the field.

Conversely, you could also argue that physics didn’t break; it was just remodeled and expanded. No one who’d been studying steam engines or wind erosion or electrical circuit diagrams had to throw out their books and start again from scratch. In fact this “broken” Newtonian physics is still taught in physics classes, and many physicists and engineers never use anything else. If you’re studying the physics of weather, or building a bridge, Newtonian physics is just fine. The fact that Newton-style equations are an incomplete description of the world — that there are phenomena they can’t describe properly — doesn’t invalidate them when they’re applied within their wheelhouse.

No matter which argument you prefer, it’s hard to see how to justify the phrase “physics is broken” without a profound revolution that overthrows foundational concepts. It’s rare for a serious threat to foundations to arise suddenly, because few experiments can single-handedly put fundamental principles at risk. [The infamous case of the “faster-than-light neutrinos” provides an exception. Had that experiment been correct, it would have invalidated Einstein’s relativity principles. But few of us were surprised when a glaring error turned up.]

In the Standard Model, the electron, muon and tau particles (known as the “charged leptons”) are all identical except for their masses. (More fundamentally, they have different interactions with the Higgs field, from which their rest masses arise.) This almost-identity is sometimes stated as a “principle of lepton universality.” Oh, wow, a principle — a law! But here’s the thing. Some principles are enormously important; the principles of Einsteinian relativity determine how cause and effect work in our universe, and you can’t drop them without running into big paradoxes. Other principles are weak, and could easily be discarded without making a mess of any other part of physics. The principle of lepton universality is one of these. In fact, if you extend the Standard Model by adding new particles to its equations, it can be difficult to avoid ruining this fragile principle. [In a sense, the Higgs field has already violated the principle, but we don’t hold that against it.]

All the fuss is about a new experimental result which confirms an older one and slightly disagrees with the latest theoretical predictions, which are made using the Standard Model’s equations. What could be the cause of the discrepancy? One possibility is that it arises from a previously unknown difference between muons and electrons — from a violation of the principle of lepton universality. For those who live and breathe particle physics, breaking lepton universality would be a big deal; there’d be lots of adventure in trying to figure out which of the many possible extensions of the Standard Model could actually explain what broke this law. That’s why the scientists involved sound so excited.

But the failure of lepton universality wouldn’t come as a huge surprise. From certain points of view, the surprise is that the principle has survived this long! Since this low-level law is easily violated, its demise may not lead us to a profound new understanding of the world. It’s way too early for headlines that argue that what’s at stake is the existence of “forms of matter and energy vital to the nature and evolution of the cosmos.” No one can say how much is at stake; it might be a lot, or just a little.

In particular, there’s absolutely no evidence that physics is broken, or even that particle physics is broken. The pillars of physics and QEFT are not (yet) threatened. Even to say that “the Standard Model might be broken” seems a bit melodramatic to me. Does adding a new wing to a house require “breaking” the house? Typically you can still live in the place while it’s being extended. The Standard Model’s many successes suggest that it might survive largely intact as a recognizable part of a larger, more complete set of equations.

In any case, right now it’s still too early to say anything so loudly. The apparent discrepancy may not survive the heavy scrutiny it is coming under. There’s plenty of controversy about the theoretical prediction for muon magnetism; the required calculation is extraordinarily complex, elaborate and difficult.

So, from my perspective, the headlines of the past week are way over the top. The idea that a single measurement of the muon’s magnetism could “shake physics to its core”, as claimed in another headline I happened upon, is amusing at best. Physics, and its older subdisciplines, have over time become very difficult to break, or even shake. That’s the way it should be, when science is working properly. And that’s why we can safely base the modern global economy on scientific knowledge; it’s unlikely that a single surprise could instantly invalidate large chunks of its foundation.

Some readers may view the extreme, click-baiting headlines as harmless. Maybe I’m overly concerned about them. But don’t they implicitly suggest that one day we will suddenly find physics “upended”, and in need of a complete top-to-bottom overhaul? To imply physics can “break” so easily makes a mockery of science’s strengths, and obscures the process by which scientific knowledge is obtained. And how can it be good to claim “physics is broken” and “the laws of physics have been broken” over and over and over again, in stories that almost never merit that level of hype and eventually turn out to have been much ado about nada? The constant manufacturing of scientific crisis cannot possibly be lost on readers, who I suspect are becoming increasingly jaded. At some point readers may become as skeptical of science journalism, and the science it describes, as they are of advertising; it’s all lies, so caveat emptor. That’s not where we want our society to be. As we are seeing in spades during the current pandemic, there can be serious consequences when “TRUST IN SCIENCE IS BROKEN!!!”

A final footnote: Ironically, the Standard Model itself poses one of the biggest threats to the framework of QEFT. The discovery of the Higgs boson and nothing else (so far) at the Large Hadron Collider poses a conceptual challenge — the “naturalness” problem. There’s no sharp paradox, which is why I can’t promise you that the framework of QEFT will someday break if it isn’t resolved. But the breakdown of lepton universality might someday help solve the naturalness problem, by requiring a more “natural” extension of the Standard Model, and thus might actually save QEFT instead of “breaking” it.

Matt Strassler: Is Someone Making Artificial Earthquakes under La Palma?

There’s a plot afoot. It’s a plot that involves a grid of earthquake locations, under the island of La Palma.

Conspiracy theory would be hysterically funny if it weren’t so widespread and so incredibly dangerous. Today it threatens democracy, human health, and world peace, among many other things. In the internet age, scientists and rational bloggers will have no choice but to take up arms against it on a regular basis.

The latest conspiracy theory involves the ongoing eruption of the Cumbre Vieja volcanic system on the island of La Palma. This eruption, unlike the recent one in Iceland, is no fun and no joke; it is occurring above a populated area. Over the past month, thousands of homes have been destroyed by incessant lava flows, and many more are threatened. The only good news is that, because the eruption is relatively predictable and not overly explosive, no one has yet been injured.

The source of the latest conspiracy theory is a graph of earthquakes associated with the eruption. You can check this yourself by going to the emsc website and zooming in on the island of La Palma. You’ll see something like the plot below, which claims to show earthquake locations. You can see something is strange about it: the earthquakes are shown as occurring on a grid.

Earthquakes occurring under the island of La Palma, as plotted by the emsc website.

Clearly there’s something profoundly unnatural about this. That is exactly what thousands and thousands of people are concluding around the world. They are absolutely correct. There’s no way this could be natural.

When faced with something unnatural like this, there are two possible conclusions that a human can draw.

  1. The earthquakes are natural, but their positions appear on a grid because of something unnatural about the way the data is plotted.
  2. The data is plotted correctly, and the earthquakes really are happening on a grid — which suggests that the earthquakes can’t be made by nature, and must be human-made.

Now, when faced with these two options, what does a reasonable person guess is more likely? Option 2 requires a spectacular technology that can set off huge explosions five to twenty miles (10-30 km) underground without anyone noticing, by a group of people who are evil enough to want to set off earthquakes miles underground and clever enough to keep their super-high-tech methods secret, but dumb enough to set off the earthquakes in a grid so that a simple look at the earthquakes’ locations by non-experts perusing the internet reveals their dastardly plot. Option 1 requires a tiny amount of human error or computer error.

No research is needed to conclude Option 1 is more plausible, but five minutes’ research confirms it’s true. First, other websites plotting the same earthquakes do not show the grid pattern. Second, as this video pointed out and as you yourself can check, the same website, plotting earthquakes in other locations such as Hawaii, again shows the grid pattern — so it’s a fact of the emsc website, not of the La Palma earthquakes. Third, as pointed out by the excellent Volcano Discovery website, looking at the actual data that the emsc website uses, one sees that the latitude and longitude are rounded off to the nearest 1/100th, and thus north-south and east-west locations on the map are rounded off to (roughly) the nearest kilometer. This “rounding off” moves each earthquake location to the nearest point on a grid. That’s the cause. No conspiracy, no magical technology, just a plotting issue. There’s nothing more here than nature doing its thing: making earthquakes, just as it does with every volcanic eruption on Earth.

This effect, where writing numbers to a particular choice of significant figures leads to a plot with a grid pattern, is well known to every scientist. Here’s an example of how it works. Below are thirty points chosen at random in a small region, shown at left. I plotted them using a wide range at the top, and then zoomed in to make the lower plot.

Left: 30 random data points. Top: the data points plotted on a wide scale. Bottom: the same data, zoomed in.

Next, the points are rounded to one significant figure after the decimal point, using the same methods we are all taught in school, and the points are replotted. Instant grid.

Left: the same data as above, rounded to one significant figure after the decimal point. Right: the plot of the rounded data; the lower plot of the previous figure is shifted into a grid pattern.
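The rounding effect described above is easy to reproduce. The following is a minimal Python sketch, not the code used to make the figures: the region size, the seed, the helper names `snap` and `distinct_positions`, and the figure of about 111 km per degree of latitude are all assumptions chosen for illustration.

```python
import random

def snap(value, decimals):
    """Round a coordinate to a fixed number of decimal places."""
    return round(value, decimals)

def distinct_positions(points):
    """Count the distinct (x, y) pairs in a list of points."""
    return len(set(points))

random.seed(0)  # fixed seed so the run is repeatable

# 30 hypothetical points scattered over a 0.5 x 0.5 region
raw = [(random.uniform(0.0, 0.5), random.uniform(0.0, 0.5)) for _ in range(30)]

# The same points, rounded to one decimal place
rounded = [(snap(x, 1), snap(y, 1)) for x, y in raw]

xs = sorted({x for x, _ in rounded})
ys = sorted({y for _, y in rounded})
print("distinct x values after rounding:", xs)
print("distinct y values after rounding:", ys)
print("distinct raw positions:", distinct_positions(raw))
print("distinct rounded positions:", distinct_positions(rounded))

# Grid spacing on the ground for coordinates rounded to 0.01 degree,
# assuming about 111 km per degree of latitude.
KM_PER_DEGREE_LAT = 111.0
print("approximate grid spacing in km:", 0.01 * KM_PER_DEGREE_LAT)
```

After rounding, every coordinate can take only one of six values between 0.0 and 0.5, so the points have nowhere to sit except on the nodes of a 6 by 6 lattice. The last line is the same arithmetic behind the "(roughly) the nearest kilometer" remark about latitude and longitude rounded to the nearest 1/100th.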

In short, we’re not looking at a plot to destroy La Palma and set off a tsunami. We’re looking at a plot of rounded-off locations. I agree that’s not nearly as exciting; but as any scientist with some experience will tell you, boring explanations are usually true and conspiracies, especially wild ones, are usually not.

What’s the point of this post? Well, aside from being a source that you can send to any friends, relatives or acquaintances who are falling for this ridiculous conspiracy theory, it’s an apolitical context in which to contemplate the real problem.

The real problem is that we face an increasing flood of half-reasoned badly-researched pseudo-science, combined with irrational knee-jerk conspiratorialism, the whole thing driven by an unholy mixture of fear, maliciousness, narcissism and greed. It’s a war between calm reason and emotional darkness, a war in which people are actually dying, and in which nations are actually at risk. At this rate, the voices of rationality may soon be drowned. So perhaps we might consider this question: how can an apolitical conspiracy such as this one be used as an example, one from which we can learn lessons that we can apply more broadly, in territory that’s much more complex and dangerous?

p.s., predictably, someone questioned whether the statement about Hawaiian earthquakes on the emsc website appearing on a grid was true. Well, here’s the plot below — the earthquakes aren’t so many, so the grid isn’t full, but you can see every plotted earthquake lies on a grid point. And the same is true for earthquakes on the island of Crete, as shown in the second plot. All of it data that has been rounded off to the nearest 1/100th of latitude and longitude.

Robert Helling: You got me wordle!

For a few days now, I have been following the hype and playing Wordle. I think I got lucky the first few days, but I had already put in some strategy, such as starting with words where the possible results are most telling. I was thinking that getting the vowels right early is a good idea, so I tend to start with "HOUSE" (containing three vowels and an S), possibly followed by "FAINT" (containing the remaining vowels plus the important N and T).

With this start it has never taken me more than four guesses so far, and twice I managed to find the solution in three guesses.

Of course, over time you start thinking how to optimise this. I knew that Donald Knuth had written a paper solving the original Mastermind showing that five moves are sufficient to always find the answer. So today, I sat down and wrote a perl script to help. It does not do the full minimax (but that shouldn't be too hard from where I am) but at least tells you which of your possible next guesses leaves the best worst case in terms of number of remaining words after knowing the result of your guess. 

In that metric, it turns out "ARISE" is the optimal first guess (leaving at most 168 out of the possible 2314 words on this list after knowing the result). In any case, here is the source:

NB: Since I started playing, there has been no word that contained the same letter more than once, so I am not 100% sure how those cases are handled. For example, what colors do the two 'E's in "AGREE" receive if the solution is "AISLE"? (In Mastermind logic, the second would be green and the other grey, not yellow.) And what if the solution were "EARLY"? So my script probably does not handle those cases correctly (for EARLY it would color both yellow).

#!/usr/local/bin/perl -w

use strict;

# Load the word list of possible answers
my @words = ();
open (IN, "answers.txt") || die "Cannot open answers: $!\n";
while(<IN>) {
  chomp;    # drop the trailing newline so it does not end up in the words
  push @words, uc($_);
}
close IN;

my %letters = ();
my @appears = ();

# Positions at which letter $l can still appear
foreach my $c (0..25) {
  my $l = chr(65 + $c);
  $letters{$l} = [1,1,1,1,1];
}

# Running without an initial guess shows that ARISE is the best guess as it leaves 168 words.

&filter("ARISE", &bewerten("ARISE", "SOLAR"));
#&filter("SMART", &bewerten("SMART", "SOLAR"));

# Find the remaining words
my @remain = @words;
# Only keep words containing the letters in @appears
foreach my $letter (@appears) {
  @remain = grep {/$letter/} @remain;
}
my $re = &makeregex();

# Apply positional constraints
@remain = grep {/$re/} @remain;

my $min = @remain;
my $best = '';

# Loop over all possible guesses and targets and count how often a potential result appears for a guess
foreach my $g (@remain) {
  my %results = ();
  foreach my $t (@remain) {
    ++$results{&bewerten($g, $t)};
  }
  # The worst case for this guess is its most frequent result
  my $max = 0;
  foreach my $res (keys %results) {
    $max = $results{$res} if $results{$res} > $max;
  }
  #print "$g leaves at most $max.\n";
  if ($min > $max) {
    $min = $max;
    $best = $g;
  }
}

print "Best guess: $best leaves at most $min.\n";

# Assemble a regex for the positional information
sub makeregex {
  my $rem = '';
  foreach my $p (0..4) {
    $rem .= '[';
    foreach my $l (sort keys %letters) {
      $rem .= $l if $letters{$l}->[$p];
    }
    $rem .= ']';
  }
  return $rem;
}

# Find new constraints arising from the result of a guess
sub filter {
  my ($guess, $result) = @_;

  my @a = split //, $result;
  my @w = split //, uc($guess);
  foreach my $p (0..4) {
    my $l = $w[$p];
    if ($a[$p] == 0) {
      # Grey: the letter appears nowhere
      $letters{$l} = [0,0,0,0,0];
    } elsif ($a[$p] == 1) {
      # Yellow: the letter appears, but not at this position
      &setletter($l, $p, 0);
      push @appears, $l;
    } else {
      # Green: only this letter is allowed at this position
      foreach my $o (sort keys %letters) {
        &setletter($o, $p, 0);
      }
      &setletter($l, $p, 1);
    }
  }
}

# Update the positional information for letter $l at position $p with value $v
sub setletter {
  my ($l, $p, $v) = @_;
  my @a = @{$letters{$l}};
  $a[$p] = $v;
  $letters{$l} = \@a;
}

# Find the result for $guess given the $target
# (0 = grey, 1 = yellow, 2 = green; repeated letters are not treated specially, see the NB above)
sub bewerten {
  my ($guess, $target) = @_;
  my @g = split //, $guess;
  my @t = split //, $target;

  my @result = (0,0,0,0,0);
  # First pass: exact matches; blank them out so they are not counted again
  foreach my $p (0..4) {
    if ($g[$p] eq $t[$p]) {
      $result[$p] = 2;
      $t[$p] = '';
      $g[$p] = 'x';
    }
  }
  $target = join('', @t);
  # Second pass: letters that occur elsewhere in the target
  foreach my $p (0..4) {
    if ($target =~ /$g[$p]/) {
      $result[$p] = 1;
    }
  }
  return join('', @result);
}
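The open question in the NB above the script (how repeated letters are colored) can be made concrete with a small sketch. This is in Python rather than Perl, and the function name `score` is made up for illustration. It assumes the "Mastermind logic" described in the note: greens are assigned first, and each remaining letter of the solution can justify at most one yellow. Whether the real game does exactly this is the assumption being made here.

```python
from collections import Counter


def score(guess: str, target: str) -> str:
    """Return a string of 0 (grey), 1 (yellow), 2 (green), one digit per letter.

    Assumed rule: exact matches are marked first; every unmatched letter of
    the target can then pay for at most one yellow in the guess.
    """
    guess, target = guess.upper(), target.upper()
    result = [0] * len(guess)
    unmatched = Counter()  # target letters not used up by a green

    # First pass: greens.
    for i, (g, t) in enumerate(zip(guess, target)):
        if g == t:
            result[i] = 2
        else:
            unmatched[t] += 1

    # Second pass: yellows, consuming unmatched target letters left to right.
    for i, g in enumerate(guess):
        if result[i] == 0 and unmatched[g] > 0:
            result[i] = 1
            unmatched[g] -= 1

    return "".join(str(r) for r in result)


if __name__ == "__main__":
    print(score("AGREE", "AISLE"))  # second E green, first E grey
    print(score("AGREE", "EARLY"))  # only the first unmatched E turns yellow
```

Under this rule, AGREE against EARLY colors only one E yellow, whereas `bewerten` as written colors both, which is the discrepancy the note anticipates.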

January 16, 2022

Matt StrasslerMapping the Unknown, Times Three

I took a short break from other projects this weekend. Poking around, I found three particularly lovely science stories, which I thought some readers might enjoy as well. They all involve mapping, but the distances involved are amazingly different.

Starbirth Near to Home

The first one has been widely covered in the media (although a lot of the articles were confused as to what is new news and what is old news). Our galaxy is full of stars, but also of gas (mainly hydrogen and helium) and dust (tiny grains of material made mainly from heavier elements forged in stars, such as silicon). That gas and dust, referred to as the “interstellar medium”, is by no means uniform; it is particularly thin in certain regions which are roughly in the shape of bubbles. These bubbles were presumably “blown” by the force of large stellar explosions, i.e. supernovas, whose blast waves cleared out the gas and dust nearby.

It’s been known for several decades that the Sun sits near the middle of such a bubble. The bubble and the Sun are moving relative to one another, so the Sun’s probably only been inside the bubble for a few million years; since we’re just passing through, it’s an accident that right now we’re near its center. Called simply the “Local Bubble”, it’s an irregularly shaped region where the density of gas and dust is 1% of its average across the galaxy. If you orient the galaxy in your mind so that its disk, where most of the stars lie, is horizontal, then the bubble stretches several hundred light-years across in the horizontal direction, and is elongated vertically. [For scale: a light-second is 186,000 miles or 300,000 km; the Sun is 8 light-minutes from Earth; the next-nearest star is 4 light-years away; and our Milky Way galaxy is about 100,000 light years across.] It’s been thought for some time that this bubble was created some ten to twenty million years ago by the explosion of one or more stars, probably siblings that were born close by in time and in space, and which at their deaths were hundreds of light-years from the Sun, far enough away to do no harm to Earthly life. [For scale: recall the Sun and Earth are about 5 billion years old.]

Meanwhile, it’s long been suggested that explosion debris from such supernovas can sweep up gas and dust like a snowplow, and that the compression of the gas can lead it to start forming stars. It’s a beautiful story; large stars live fast and hot, and die young, but perhaps as they expire they create the conditions for the next generation. [A star with 40 times as much raw material as the Sun will burn so hot that it will last only a million years, and most such stars die by explosion, unlike Sun-like stars.]

If this is true, then in the region near the Sun, most if not all the gas clouds where stars are currently forming should lie on the current edge of this bubble, and moreover, all relatively young stars, less than 10 million years old, should have formed on the past edge of the bubble. Unfortunately, this has been hard to prove, because measuring the locations, motions and ages of all the stars and clouds isn’t easy. But the extraordinary Gaia satellite has made this possible. Using Gaia’s data as well as other observations, a team of researchers (Catherine Zucker, Alyssa A. Goodman, João Alves, Shmuel Bialy, Michael Foley, Joshua S. Speagle, Josefa Großschedl, Douglas P. Finkbeiner, Andreas Burkert, Diana Khimey & Cameren Swiggum) has claimed here that indeed the star-forming regions lying within a few hundred light-years of the Sun all lie on the Local Bubble’s surface, and that nearby stars younger than ten million years were born on the then-smaller shell of the expanding bubble. Moreover they claim that the bubble probably formed 14-15 million years ago, at a time when the Sun was about 500 light-years distant.

The one exception to this rule appears to be a star-forming region on the backside of a second bubble, adjoining the Local Bubble. The discovery of this “Per-Tau superbubble,” and the fact that two star-forming regions lie on opposite sides of it, was announced by the (nearly) same team just a few months ago; in fact this earlier paper gives the first direct evidence of the explosion-snowplow-to-star-formation effect. (You can see a 3d image of that adjoining bubble, created by the research team, here, and you can see its relation to the Local Bubble in the right-hand side of this image, where the Local Bubble is marked in purple and the adjoining “Per-Tau” bubble is marked in green. Other visuals and captions are at this site.) The supernova or supernovas that may have created this bubble also probably went off between 5 and 20 million years ago, based on the size of the bubble and the slow motion of the bubble walls.

That stars can reproduce by generating new ones from the debris of the old is reminiscent of a similar effect in thunderstorms; when a thunderstorm complex is mature and loses its powerful updrafts, its cold air drops downward and then flows outward, and the resulting outflow can generate new storms. (See this blog post for some detailed discussion.) Perhaps there are other examples of natural engines generating offspring in a similar way; do you know of any? I have long wondered whether life could have gotten its start through a mechanism like this one, a rudimentary form of imperfect physical reproduction on which evolution might then have begun to act.

Galaxies Across the Cosmos

Meanwhile the Dark Energy Spectroscopic Instrument (DESI), seeking to understand the history of the universe with precision, has been engaged in a similar mission to Gaia but on a different scale and with different methods. It is mapping distances not to nearby stars in our own galaxy but to other galaxies, covering a very substantial fraction of the visible universe. Its data has now allowed the creation of the largest ever 3D map of a region of the cosmos, extending across part of the sky out to distances reaching five billion light years from Earth. (I’m particularly fond of this type of map, because my first scientific paper was an attempt, led by Jerry Ostriker, to make sense of the structures seen in the first 2D slice map of the universe, made by Valerie de Lapparent, Margaret Geller and John Huchra, scarcely reaching out 200 million light-years.) The 3D map, seen as a movie that sweeps through 2D slices of the map, can be found here. Each dot on the map is a galaxy; our own galaxy is located at the lower left corner. Within the map (ignoring dark wedges which presumably arise from regions that are blocked from view by nearby objects in the sky) you can see emptier regions threaded by dense filaments and other features.

And DESI has much more to tell us; it has measured far more galaxies than appear in this map; it has measured the distance to many galaxies extending out to 10 billion light years, most of the distance across the visible universe; and it’s just getting started, with many more years of mapping left to do. By the way, the technology behind DESI is pretty amazing; I leave it to you to read about it on the DESI website.

The Ocean Deep

Finally I bring us closer to home, to another frontier where humans still have remote mapping to do: the bottom of the sea. The story begins with partial failure in an experiment to explore the deepest region of the ocean, the Challenger Deep, which extends below sea level 25% further than Mount Everest extends above it. In 2014 a research team put two thick glass spheres full of scientific instruments into the water above the Challenger Deep, expecting one to reach the very bottom and the other to remain just above it. Unfortunately the first one must have had a small but fatal flaw in its construction, because when still roughly 2 km (1.2 miles) above the sea floor, it imploded under the immense pressure. One can only begin to imagine the disappointment of those who built the instrument and were expecting reams of data from it. At least the other sphere survived.

But now comes a story of scientists at their most creative, making lemonade from this lemon. The implosion that destroyed the first instrument created a shock wave which reverberated all across the ocean floor, back to the ocean surface, yet again back to the floor, and so on; and those reverberations were detected by the surviving instrument. By carefully investigating all these echoes from the disaster, a team of three scientists has recently claimed to have obtained the most precise measurement ever of the depth of the Challenger Deep: 10,983 meters (36,033 feet) with an uncertainty of only 6 meters (20 feet) up or down, one fourth the uncertainty of any previous measurement. While I’m in no position to evaluate the validity of this claim, it represents an inspiring effort to make the very best of bad news, and to treat an experiment’s failure as an experiment in its own right.

Now if someone could just boldly go and map the internet

January 15, 2022

Doug NatelsonBrief items - papers, packings, books

 It's a very busy time, so no lengthy content, but here are a few neat things I came across this week.

  •  A new PRL came out this week that seems to have a possible* analytic solution to Hilbert's 18th problem, about the density of random close-packed spheres in 2D and 3D.  This is a physics problem because it's closely related to the idea of jamming and the onset of mechanical rigidity of a collection of solid objects.  (*I say possible only because I don't know any details about any subtle constraints in the statement of the problem.)
  • The Kasevich group at Stanford has an atom interferometric experiment that they claim is a gravitational analog of the Aharonov-Bohm effect.  This is a cool experiment, where there is a shift in the quantum phase of propagating atomic clouds due to the local gravitational potential caused by a nearby massive object.  (Phase goes like the argument of \(\exp(-i S(x(t))/\hbar)\), where the action can include a term related to the gravitational potential, \(m \times \Phi_{G}(x(t))\).)  At a quick read, though, I don't see how this is really analogous to the AB effect.  In the AB case, there is a relative phase due to magnetic flux enclosed by the interfering paths even when the magnetic field is arbitrarily small at the actual location of the path.  I need to read this more closely, or perhaps someone can explain in the comments.
  • A colleague pointed out to me this great review article all about charge shot noise in mesoscopic electronic systems.  
  • Speaking of gravity, there has been interest in recent years about "warp drives", geometries of space-time allowed by general relativity that seem to permit superluminal travel for an observer in some particular region.  One main objection to these has been that past proposed incarnations violate various energy conditions in GR - requiring enormous quantities of "negative matter", for example, which does not seem to exist.  Interestingly, people have been working on normal-matter-only ideas for these, and making some progress as in this preprint.  Exercises like this can be really important for illuminating subtle issues with GR, just like worrying about "fast light" experiments can make us refine arguments about causality and signaling.  
  • Thomas Wong from Creighton University has a free textbook (link on that page) to teach about quantum computing, where the assumed starting math knowledge is trig.  It looks very accessible!
  • People recommended two other books to me recently that I have not yet had time to read.  The Alchemy of Us is a materials-and-people focused book from Ainissa Ramirez, and Sticky: The Secret Science of Surfaces is all about surfaces and friction, by Laurie Winkless.  Gotta make time for these once the semester craziness is better in hand....

David Hoggdetecting shifts with cross-correlations

Lily Zhao (Flatiron) has been looking at the variability of the SDSS-IV BOSS spectrograph calibration frames, with the thought that if we can understand them well enough, we might be able to substantially reduce the observing overheads for the robotic phase of SDSS-V. To start, we are doing simple cross-correlations between calibration images. After all, the dominant calibration changes from exposure to exposure appear to be shifts. But these cross-correlations have weird features in them that I don't understand: In general, if you are cross-correlating two images that are very similar, the cross-correlation function should look very symmetric about its peak, I think? But Zhao's cross-correlation functions look asymmetric in weird ways. We ended our session puzzled.
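To make the expectation concrete, here is a toy sketch of shift detection by cross-correlation. It is one-dimensional, pure Python, and has nothing to do with the actual BOSS calibration code; the function names and the test signal are invented for illustration. It assumes the second signal is an exact, untruncated, shifted copy of the first.

```python
def cross_correlate(a, b, max_lag):
    """Return {lag: sum_i a[i] * b[i + lag]} for lags in [-max_lag, max_lag].

    Samples that fall outside b are treated as zero.
    """
    n = len(b)
    out = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0
        for i, ai in enumerate(a):
            j = i + lag
            if 0 <= j < n:
                total += ai * b[j]
        out[lag] = total
    return out


def estimate_shift(a, b, max_lag):
    """Lag at which the cross-correlation peaks."""
    cc = cross_correlate(a, b, max_lag)
    return max(cc, key=cc.get)


if __name__ == "__main__":
    a = [0, 0, 1, 3, 1, 0, 0, 0, 0, 0]
    b = [0, 0, 0, 0, 1, 3, 1, 0, 0, 0]  # a moved right by two samples
    cc = cross_correlate(a, b, max_lag=4)
    print(estimate_shift(a, b, max_lag=4), cc)
```

In this idealized case the cross-correlation is just the autocorrelation of the signal moved to the true shift, so it is exactly symmetric about its peak. Any change in shape between the two inputs (not just a shift) would break that symmetry.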

January 14, 2022

Matt von HippelThe Unpublishable Dirty Tricks of Theoretical Physics

As the saying goes, it is better not to see laws or sausages being made. You’d prefer to see the clean package on the outside than the mess behind the scenes.

The same is true of science. A good paper tells a nice, clean story: a logical argument from beginning to end, with no extra baggage to slow it down. That story isn’t a lie: for any decent paper in theoretical physics, the conclusions will follow from the premises. Most of the time, though, it isn’t how the physicist actually did it.

The way we actually make discoveries is messy. It involves looking for inspiration in all the wrong places: pieces of old computer code and old problems, trying to reproduce this or that calculation with this or that method. In the end, once we find something interesting enough, we can reconstruct a clearer, cleaner, story, something actually fit to publish. We hide the original mess partly for career reasons (easier to get hired if you tell a clean, heroic story), partly to be understood (a paper that embraced the mess of discovery would be a mess to read), and partly just due to that deep human instinct to not let others see us that way.

The trouble is, some of that “mess” is useful, even essential. And because it’s never published or put into textbooks, the only way to learn it is word of mouth.

A lot of these messy tricks involve numerics. Many theoretical physics papers derive things analytically, writing out equations in symbols. It’s easy to make a mistake in that kind of calculation, either writing something wrong on paper or as a bug in computer code. To correct mistakes, many things are checked numerically: we plug in numbers to make sure everything still works. Sometimes this means using an approximation, trying to make sure two things cancel to some large enough number of decimal places. Sometimes instead it’s exact: we plug in prime numbers, and can much more easily see if two things are equal, or if something is rational or contains a square root. Sometimes numerics aren’t just used to check something, but to find a solution: exploring many options in an easier numerical calculation, finding one that works, and doing it again analytically.
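As a minimal illustration of the exact kind of check, here is a sketch in Python. The identity being tested, the helper name `agree_at_primes`, and the choice of primes are all invented for this example; real checks in the field would typically be run inside a computer algebra system instead.

```python
from fractions import Fraction

PRIMES = [2, 3, 5, 7, 11, 13]


def agree_at_primes(lhs, rhs, points=PRIMES):
    """True if lhs(x) == rhs(x) exactly at every test point.

    Exact rational arithmetic means there is no rounding to argue about:
    the two sides either agree or they do not.
    """
    return all(lhs(Fraction(p)) == rhs(Fraction(p)) for p in points)


def lhs(x):
    return 1 / (x * (x + 1))


def good_rhs(x):
    # Correct partial-fraction decomposition of lhs.
    return 1 / x - 1 / (x + 1)


def bad_rhs(x):
    # Deliberate slip of the kind a hand calculation might produce.
    return 1 / x - 1 / (x + 2)


if __name__ == "__main__":
    print(agree_at_primes(lhs, good_rhs))  # the identity holds at every point
    print(agree_at_primes(lhs, bad_rhs))   # the mistake is caught
```

Agreement at a handful of points is evidence, not proof, which is why the analytic derivation still has to be done afterwards.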

“Ansatze” are also common: our fancy word for an educated guess. These we sometimes admit, when they’re at the core of a new scientific idea. But the more minor examples go un-mentioned. If a paper shows a nice clean formula and proves it’s correct, but doesn’t explain how the authors got it…probably, they used an ansatz. This trick can go hand-in-hand with numerics as well: make a guess, check it matches the right numbers, then try to see why it’s true.

The messy tricks can also involve the code itself. In my field we often use “computer algebra” systems, programs to do our calculations for us. These systems are programming languages in their own right, and we need to write computer code for them. That code gets passed around informally, but almost never standardized. Mathematical concepts that come up again and again can be implemented very differently by different people, some much more efficiently than others.

I don’t think it’s unreasonable that we leave “the mess” out of our papers. They would certainly be hard to understand otherwise! But it’s a shame we don’t publish our dirty tricks somewhere, even in special “dirty tricks” papers. Students often start out assuming everything is done the clean way, and start doubting themselves when they notice it’s much too slow to make progress. Learning the tricks is a big part of learning to be a physicist. We should find a better way to teach them.

David Hoggother people's code

I spent much of the day understanding Other People's Code (tm). One piece of code is legacy code from Aaron Dotter (Harvard) that computes magnitudes from flux densities. I was looking at it to confirm my frantic writing of yesterday. Another piece of code is the code by Weichi Yao (NYU) that implemented our group-equivariant machine learning. I was looking at it to see if we can modify it to impose units equivariance (dimensional scaling symmetries). I think we can, and I confirmed that it Just Works (tm).

David Hoggmagnitudes are (logarithmic) ratios of signals

A couple of weeks ago I had intense arguments with Belokurov (Cambridge) and Farr (Stony Brook) about the definition of a photometric bandpass and a photometric magnitude. And a few months ago I had long conversations with Breivik (Flatiron) about the bolometric correction. For some reason these things took over my mind today and I couldn't stop myself from starting a short pedagogical note about how magnitudes are related to spectral energy distributions. It's not trivial! Indeed, the integral isn't the integral you naively think it might be, because most photometric systems count photons (they don't integrate energy). People often say that a magnitude is negative-2.5 times the log of a flux. But that's not right! It is negative-2.5 times the log of a ratio of signals measured in two experiments.
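A rough sketch of what that statement can look like in code, under assumptions that are mine and not taken from the note: the detector counts photons, so the signal is taken to be proportional to the integral of f_λ(λ) R(λ) λ dλ (the constant 1/hc cancels in the ratio), the bandpass R is a toy top-hat, and the reference "experiment" is the same instrument observing a reference spectrum.

```python
import math


def photon_signal(wavelengths, f_lambda, response):
    """Trapezoid-rule integral of f_lambda * R * lambda over wavelength.

    The extra factor of lambda converts energy per wavelength into a
    photon count rate, up to constants that cancel in a ratio.
    """
    integrand = [f * r * w for w, f, r in zip(wavelengths, f_lambda, response)]
    total = 0.0
    for k in range(len(wavelengths) - 1):
        dw = wavelengths[k + 1] - wavelengths[k]
        total += 0.5 * (integrand[k] + integrand[k + 1]) * dw
    return total


def magnitude(wavelengths, f_source, f_reference, response):
    """-2.5 log10 of the ratio of two signals measured through the same bandpass."""
    s = photon_signal(wavelengths, f_source, response)
    s_ref = photon_signal(wavelengths, f_reference, response)
    return -2.5 * math.log10(s / s_ref)


if __name__ == "__main__":
    wavelengths = [400.0 + 10.0 * k for k in range(31)]  # 400 to 700, arbitrary units
    response = [1.0 if 500.0 <= w <= 600.0 else 0.0 for w in wavelengths]
    reference = [1.0] * len(wavelengths)
    source = [100.0 * f for f in reference]  # same shape, 100 times brighter
    print(magnitude(wavelengths, source, reference, response))
```

Only the ratio of the two integrals enters, so any overall calibration constant drops out; a source 100 times brighter than the reference with the same spectral shape comes out 5 magnitudes brighter.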

January 13, 2022

Scott Aaronson On tardigrades, superdeterminism, and the struggle for sanity

(Hopefully no one has taken that title yet!)

I waste a large fraction of my existence just reading about what’s happening in the world, or discussion and analysis thereof, in an unending scroll of paralysis and depression. On the first anniversary of the January 6 attack, I read the recent revelations about just how close the seditionists actually came to overturning the election outcome (e.g., by pressuring just one Republican state legislature to “decertify” its electors, after which the others would likely follow in a domino effect), and how hard it now is to see a path by which democracy in the United States will survive beyond 2024. Or I read about Joe Manchin, who’s already entered the annals of history as the man who could’ve halted the slide to the abyss and decided not to. Of course, I also read about the wokeists, who correctly see the swing of civilization getting pushed terrifyingly far out of equilibrium to the right, so their solution is to push the swing terrifyingly far out of equilibrium to the left, and then they act shocked when their own action, having added all this potential energy to the swing, causes it to swing back even further to the right, as swings tend to do. (And also there’s a global pandemic killing millions, and the correct response to it—to authorize and distribute new vaccines as quickly as the virus mutates—is completely outside the Overton Window between Obey the Experts and Disobey the Experts, advocated by no one but a few nerds. When I first wrote this post, I forgot all about the global pandemic.) And I see all this and I am powerless to stop it.

In such a dark time, it’s easy to forget that I’m a theoretical computer scientist, mainly focused on quantum computing. It’s easy to forget that people come to this blog because they want to read about quantum computing. It’s like, who gives a crap about that anymore? What doth it profit a man, if he gaineth a few thousand fault-tolerant qubits with which to calculateth chemical reaction rates or discrete logarithms, and he loseth civilization?

Nevertheless, in the rest of this post I’m going to share some quantum-related debunking updates—not because that’s what’s at the top of my mind, but in an attempt to find my way back to sanity. Picture that: quantum mechanics (and specifically, the refutation of outlandish claims related to quantum mechanics) as the part of one’s life that’s comforting, normal, and sane.

There’s been lots of online debate about the claim to have entangled a tardigrade (i.e., water bear) with a superconducting qubit; see also this paper by Vlatko Vedral, this from CNET, this from Ben Brubaker on Twitter. So, do we now have Schrödinger’s Tardigrade: a living, “macroscopic” organism maintained coherently in a quantum superposition of two states? How could such a thing be possible with the technology of the early 21st century? Hasn’t it been a huge challenge to demonstrate even Schrödinger’s Virus or Schrödinger’s Bacterium? So then how did this experiment leapfrog (or leaptardigrade) over those vastly easier goals?

Short answer: it didn’t. The experimenters couldn’t directly measure the degree of freedom in the tardigrade that’s claimed to be entangled with the qubit. But it’s consistent with everything they report that whatever entanglement is there, it’s between the superconducting qubit and a microscopic part of the tardigrade. It’s also consistent with everything they report that there’s no entanglement at all between the qubit and any part of the tardigrade, just boring classical correlation. (Or rather that, if there’s “entanglement,” then it’s the Everett kind, involving not merely the qubit and the tardigrade but the whole environment—the same as we’d get by just measuring the qubit!) Further work would be needed to distinguish these possibilities. In any case, it’s of course cool that they were able to cool a tardigrade to near absolute zero and then revive it afterwards.

I thank the authors of the tardigrade paper, who clarified a few of these points in correspondence with me. Obviously the comments section is open for whatever I’ve misunderstood.

People also asked me to respond to Sabine Hossenfelder’s recent video about superdeterminism, a theory that holds that quantum entanglement doesn’t actually exist, but the universe’s initial conditions were fine-tuned to stop us from choosing to measure qubits in ways that would make its nonexistence apparent: even when we think we’re applying the right measurements, we’re not, because the initial conditions messed with our brains or our computers’ random number generators. (See, I tried to be as non-prejudicial as possible in that summary, and it still came out sounding like a parody. Sorry!)

Sabine sets up the usual dichotomy that people argue against superdeterminism only because they’re attached to a belief in free will. She rejects Bell’s statistical independence assumption, which she sees as a mere dogma rather than a prerequisite for doing science. Toward the end of the video, Sabine mentions the objection that, without statistical independence, a demon could destroy any randomized controlled trial, by tampering with the random number generator that decides who’s in the control group and who isn’t. But she then reassures the viewer that it’s no problem: superdeterministic conspiracies will only appear when quantum mechanics would’ve predicted a Bell inequality violation or the like. Crucially, she never explains the mechanism by which superdeterminism, once allowed into the universe (including into macroscopic devices like computers and random number generators), will stay confined to reproducing the specific predictions that quantum mechanics already told us were true, rather than enabling ESP or telepathy or other mischief. This is stipulated, never explained or derived.

To say I’m not a fan of superdeterminism would be a super-understatement. And yet, nothing I’ve written previously on this blog—about superdeterminism’s gobsmacking lack of explanatory power, or about how trivial it would be to cook up a superdeterministic “mechanism” for, e.g., faster-than-light signaling—none of it seems to have made a dent. It’s all come across as obvious to the majority of physicists and computer scientists who think as I do, and it’s all fallen on deaf ears to superdeterminism’s fans.

So in desperation, let me now try another tack: going meta. It strikes me that no one who saw quantum mechanics as a profound clue about the nature of reality could ever, in a trillion years, think that superdeterminism looked like a promising route forward given our current knowledge. The only way you could think that, it seems to me, is if you saw quantum mechanics as an anti-clue: a red herring, actively misleading us about how the world really is. To be a superdeterminist is to say:

OK, fine, there’s the Bell experiment, which looks like Nature screaming the reality of ‘genuine indeterminism, as predicted by QM,’ louder than you might’ve thought it even logically possible for that to be screamed. But don’t listen to Nature, listen to us! If you just drop what you thought were foundational assumptions of science, we can explain this away! Not explain it, of course, but explain it away. What more could you ask from us?

Here’s my challenge to the superdeterminists: when, in 400 years from Galileo to the present, has such a gambit ever worked? Maxwell’s equations were a clue to special relativity. The Hamiltonian and Lagrangian formulations of classical mechanics were clues to quantum mechanics. When has a great theory in physics ever been grudgingly accommodated by its successor theory in a horrifyingly ad-hoc way, rather than gloriously explained and derived?

Update: Oh right, and the QIP’2022 list of accepted talks is out! And I was on the program committee! And they’re still planning to hold QIP in person, in March at Caltech, will you fancy that! actually I have no idea—but if they’re going to move to virtual, I’m awaiting an announcement just like everyone else.

David Hoggbuilding dimensionless monomials

I got stuck today on a problem that seems trivial but is in fact not at all trivial: Given N inputs, how to make all possible dimensionless monomials of those N inputs, at (or less than) degree d. For our purposes (which are the units-equivariant machine-learning projects I have with Soledad Villar, JHU), a dimensionless monomial of degree d is a product of integer powers of the inputs, in which those powers can be positive or negative, such that the dimensions cancel out completely, and for which the max (or sum or some norm) of the absolute values of the powers is ≤ d. We have a complete basis of dimensionless monomials, such that any valid dimensionless monomial can be expressed as a monomial of the basis monomials. Because of this, the dimensionless monomials can be seen as the vertices of a Bravais lattice, technically. The problem is just to traverse the entire lattice within some kind of ball. Why is this hard? Or am I just dull? I feel like there are fill algorithms that should do this correctly.
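For small N and d the problem can at least be stated in code by brute force. The sketch below is not the lattice traversal asked for above: it simply tries every exponent vector with max |power| ≤ d and keeps those whose dimensions cancel, which costs (2d+1)^N tests. The function name and the example inputs (a length, a time, and a velocity) are made up for illustration.

```python
from itertools import product


def dimensionless_monomials(dims, d):
    """Enumerate exponent vectors of dimensionless monomials by brute force.

    dims: one tuple per input, giving its exponents in each base unit.
    d:    maximum absolute value allowed for any single power (max-norm).
    Returns the non-trivial integer power vectors whose units cancel.
    """
    n_inputs = len(dims)
    n_units = len(dims[0])
    found = []
    for powers in product(range(-d, d + 1), repeat=n_inputs):
        if not any(powers):
            continue  # skip the trivial monomial (all powers zero)
        if all(
            sum(p * dim[u] for p, dim in zip(powers, dims)) == 0
            for u in range(n_units)
        ):
            found.append(powers)
    return found


if __name__ == "__main__":
    # Base units: (length, time). Inputs: l [L], t [T], v [L T^-1].
    dims = [(1, 0), (0, 1), (1, -1)]
    print(dimensionless_monomials(dims, 1))  # v t / l and its inverse
```

Swapping the max-norm for a sum-norm only changes the range of candidates tested. The exponential cost in N is what a proper traversal of the lattice would avoid.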

January 11, 2022

David Hoggrefereeing can be very valuable

Christina Eilers (MIT) and I discussed our referee report today, on our paper on re-calibrating abundance ratios as measured by APOGEE to remove log-g-dependent systematics. The referee report came quickly! And it was very useful: The referee found an assumption that we are making that we had not explicitly stated in the paper. And this is important: As my loyal reader knows, I believe that a data-analysis paper is correct only insofar as it is consistent with its explicitly stated (and hopefully tested and justified) assumptions. So if a paper is missing an assumption, it is wrong!

Tommaso DorigoDeep Learning Boosts The Performance Of Particle Detectors

The title of this post is no news for particle physicists - particle detectors are complex instruments and they work by interpreting the result of stochastic phenomena taking place when radiation interacts with the matter of which detectors are built, and it looks only natural that deep learning algorithms can help improve our measurements in such a complex environment.

However, in this post I will give an example of something qualitatively different to providing an improvement of a measurement: one where a deep convolutional network model may extract information that we were simply incapable of making sense of. This means that the algorithm allows us to employ our detector in a new way.


January 09, 2022

Richard EastherDo Look Up

It’s always nice to wake up to good news, and it is not necessarily a common experience as we reach the end of the second year of a pandemic. But for astronomers, two decades of anxiety were laid to rest over the weekend as the last mirror segments of the James Webb Space Telescope, or JWST, were locked into place.

Part of me always responds a little cynically when NASA’s public relations machine hypes the difficulty of forthcoming space-based feats, whether it is the 7 Minutes of Terror as a rover lands on Mars or the 344 single points of failure that could each spell doom for this wonderful telescope. There is some butt-covering in there if things do go wrong, but it is also the patter of an old-school magician as he prepares to saw his assistant in half. NASA is, at heart, a bureaucracy and does not set out to do things it does not believe it can accomplish. 

But things do go wrong in space. The probe that drilled a hundred-million-dollar hole in the side of Mars because metric and imperial units were accidentally combined when computing its trajectory. Or the Hubble telescope itself, whose mirror was mis-configured in a way that an old-school lens-grinder polishing a glass blank would be embarrassed to reveal. And rockets do sometimes fail. So there is always room for worry. 

In fact, many failures result from organisational complexity – miscommunications that plant the seeds for a catastrophe which only become apparent when everything is fitted together and hurled into the heavens. And the JWST had a remarkably baroque bureaucratic journey.  Its origin is often traced to a 1989 workshop on successors to the Hubble Space Telescope – held before Hubble itself was in space. The green light for NASA to build it followed a decade later, with a US$1.6 billion budget and 10-year roadmap to launch. 

By 2010 it was clear that this budget and timeline had been somewhere between a shared, consensual hallucination and an active untruth. Launch was still years away and the budget had swelled to something like $10 billion; at one point the joke was the JWST slipped a year every year. Cynics argued that this reflected a belief that it is easier to ask for forgiveness than permission, trusting Congress to chase sunk-cost rather than scuttling a program with billions already spent. 

The telescope was the subject of Congressional hearings and became something close to a public scandal, suffering a “near death experience” when its future budget was briefly erased during the wrangling on Capitol Hill. Likewise, the management of the JWST itself was the subject of a blunt report that drove changes in its leadership and process. 

The following decade saw the project stay (mostly) inside the new budget cap but more schedule slippages during testing meant that it finally made it to space on Christmas Day 2021. By then, the project’s many travails had ratcheted up the anxiety-levels of everyone in the field to the point that you could almost hear your colleagues gently buzzing from stress. 

So it was a massive relief and perhaps something of a surprise that a pitch-perfect Christmas-day flight was followed by an almost flawless deployment of the massive sunshade and the unfolding of the telescope mirror. The next few months will be spent aligning the individual segments of the mirror and cooling the instruments to their operating temperature, but the JWST now seems to promise two decades’ worth of unprecedented observations of the universe.

So I’m happy.

Not just happy in fact, as watching a group of people come together to do something this hard and this unprecedentedly complex – often in the face of administrative inertia – reminds me that it is not only possible to reach for our dreams but that sometimes we manage to take hold of them.

Jordan Ellenberg To A Crackpot

I still have a lot of text files from when I was in college and even high school, sequentially copied from floppy to floppy to hard drive to hard drive over the decades. I used to write poems and they were not good and neither is this one, but to my surprise it had some lines in it that I remembered but did not remember that I wrote myself. What was I doing with the line breaks though? I am pretty sure this would have been written in my junior year of college, maybe spring of 1992. Around this same time I submitted a short story to a magazine and the editor wrote back to me saying “free-floating anxiety cannot be what drives a narrative,” but I disagreed, obviously.

To a Crackpot

He eschews the shoulders
of giants. He chooses instead
the company of thin men, coffee-stained,
stooped with knowledge. They huddle
on the sidewalk, nodding, like crows
or rabbis. He speaks:
the world is hollow and we live
on the inside. (Murmurs of assent.) There
is a hole at the top where the water runs in. The sun
is smaller than my hand, and the stars
are smaller than the sun.

A woman walks by, drawing
his eye. She has no idea. Beneath their feet,
out in the dark, secret engines. The Earth turns like milk.

January 08, 2022

Doug Natelson Condensed matter and a sense of wonder

I had an interesting conversation with a colleague last week about the challenges of writing a broadly appealing, popular book about condensed matter.  This is a topic I've been mulling for (too many) years - see this post from the heady days of 2010.

He made a case that condensed matter is inherently less wondrous to the typical science-interested person than, e.g., "the God Particle" (blech) or black holes.  This is basically my first point in the old post linked above.  He was arguing that people have a hard time ever seeing something that captures the imagination in items or objects that they have around them all the time.  The smartphone is an incredible piece of technology and physics, but what people care about is how to get better download speeds, not how or why any of it works.  

I'm curious:  Do readers think this is on-target?  Is "lack of wonder" the main issue, or one of many?  

Jacques Distler Spinor Helicity Variables in QED

I’m teaching Quantum Field Theory this year. One of the things I’ve been trying to emphasize is the usefulness of spinor-helicity variables in dealing with massless particles. This is well-known to the “Amplitudes” crowd, but hasn’t really trickled down to the textbooks yet. Mark Srednicki’s book comes close, but doesn’t (IMHO) quite do a satisfactory job of it.

Herewith are some notes.

The first step in constructing perturbation theory is to quantize the free fields. Following Weinberg and Srednicki, I’m using the “mostly-plus” signature convention (my 2-component spinor conventions are those of Dreiner et al if you define the macro \def\signofmetric{1} in the LaTeX file). For $k^2=0$, we can define helicity spinors

\[
(k\cdot\sigma)_{\alpha\dot\beta}= -\lambda_\alpha\lambda^\dagger_{\dot\beta},\qquad (k\cdot\overline{\sigma})^{\dot\alpha\beta} = -\lambda^{\dagger\dot\alpha}\lambda^\beta \tag{1}
\]

which allow us to straightforwardly canonically-quantize.
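As a quick sanity check on (1), here is a minimal numerical sketch in plain Python. It assumes $\sigma^\mu=(1,\vec\sigma)$ together with the mostly-plus signature, so that $k\cdot\sigma=-k^0+\vec{k}\cdot\vec\sigma$ as a $2\times 2$ matrix. The explicit parametrisation of the spinor and all function names are illustrative choices, not something taken from the notes, and the overall phase of $\lambda$ is arbitrary.

```python
import math

def helicity_spinor(kx, ky, kz):
    """Return (k0, (a, b)), where lambda = (a, b) for the null vector (|k|, kx, ky, kz).

    Assumes k0 + kz > 0, i.e. the momentum is not along the negative z-axis.
    """
    k0 = math.sqrt(kx * kx + ky * ky + kz * kz)
    b = math.sqrt(k0 + kz)
    if b == 0.0:
        raise ValueError("this parametrisation is singular for momenta along -z")
    a = -complex(kx, -ky) / b
    return k0, (a, complex(b, 0.0))

def outer(lam):
    """The 2x2 matrix with entries lambda_alpha * conj(lambda_beta)."""
    return [[x * y.conjugate() for y in lam] for x in lam]

def minus_k_dot_sigma(k0, kx, ky, kz):
    """-(k.sigma) = k0*1 - kvec.sigma_vec under the assumed conventions."""
    return [[k0 - kz, -complex(kx, -ky)],
            [-complex(kx, ky), k0 + kz]]

def max_abs_diff(m1, m2):
    """Largest entrywise difference between two 2x2 matrices."""
    return max(abs(m1[i][j] - m2[i][j]) for i in range(2) for j in range(2))

# Check lambda lambda^dagger = -(k.sigma) for one sample null momentum.
k0, lam = helicity_spinor(1.0, 2.0, 2.0)
residual = max_abs_diff(outer(lam), minus_k_dot_sigma(k0, 1.0, 2.0, 2.0))
print(residual < 1e-12)
```

The check works because $k^0-\vec{k}\cdot\vec\sigma$ has eigenvalues $0$ and $2|\vec{k}|$ when $k^2=0$, so it is a rank-one positive matrix and can be written as an outer product.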


For a Weyl fermion,

\[
\mathcal{L}= i\psi^\dagger\overline{\sigma}\cdot\partial \psi
\]

the general solution to the equations of motion is

\[
\begin{aligned} \psi_\alpha(x)&=\int\frac{d^3\vec{k}}{{(2\pi)}^3 2|\vec{k}|}\lambda_\alpha \left(\xi^\dagger_{\vec{k}}e^{-ik\cdot x}+\eta_{\vec{k}}e^{ik\cdot x}\right)\\ \psi^\dagger_{\dot\alpha}(x)&=\int\frac{d^3\vec{k}}{{(2\pi)}^3 2|\vec{k}|}\lambda^\dagger_{\dot\alpha}\left(\eta^\dagger_{\vec{k}}e^{-ik\cdot x}+\xi_{\vec{k}}e^{ik\cdot x}\right) \end{aligned}
\]

The Equal-Time Anti-Commutation Relations

\[
\{\psi_\alpha(\vec{x},0),\psi^\dagger_{\dot\beta}(\vec{x}',0)\}=\sigma^0_{\alpha\dot\beta}\delta^{(3)}(\vec{x}-\vec{x}')
\]

become the canonical anti-commutation relations

\[
\begin{aligned} \{\xi_{\vec{k}},\xi^\dagger_{\vec{k}'}\}&= {(2\pi)}^3 2|\vec{k}| \delta^{(3)}(\vec{k}-\vec{k}')\\ \{\eta_{\vec{k}},\eta^\dagger_{\vec{k}'}\}&= {(2\pi)}^3 2|\vec{k}| \delta^{(3)}(\vec{k}-\vec{k}') \end{aligned}
\]

for creation and annihilation operators for fermions of definite helicity.

The upshot, after tracking this through the LSZ reduction formula, is that external fermion lines are contracted with the corresponding helicity spinor ($\lambda_i$ or $\lambda^\dagger_i$) depending on the helicity of the $i^{\text{th}}$ incoming/outgoing particle. When we take the absolute square of the amplitude, we use (1) to rewrite $\lambda_i\lambda^\dagger_i=-k_i\cdot\sigma$, etc.


There’s a certain amount of hand-wringing associated to quantizing the free Maxwell Lagrangian,

\[
\mathcal{L} = -\tfrac{1}{4}F_{\mu\nu}F^{\mu\nu}
\]

If we take the canonical variables to be $A^\mu$ and $\pi_\mu =\frac{\delta \mathcal{L}}{\delta\partial_0 A^\mu}$, then the gauge-invariance entails that the symplectic structure is degenerate ($\pi_0$ vanishes identically). The usual approach is to fix a gauge (Weinberg and Srednicki use Coulomb gauge) and then work very hard (replacing Poisson brackets with Dirac brackets, because the constraints are 2nd class, …).

On the other hand, if we

  1. realize that the phase space is the space of classical solutions and
  2. introduce spinor helicity variables, as before,

it’s easy to write down the general solution to the equations of motion

\[
\begin{aligned} F^{\mu\nu}(x)= \frac{1}{\sqrt{2}}\int\frac{d^3\vec{k}}{{(2\pi)}^3 2|\vec{k}|}&\left(\lambda\sigma^{\mu\nu}\lambda\varepsilon^\dagger_{-}(\vec{k})+ \lambda^\dagger\overline{\sigma}^{\mu\nu}\lambda^\dagger \varepsilon^\dagger_{+}(\vec{k})\right)e^{-ik\cdot x}+\\ &+\left(\lambda\sigma^{\mu\nu}\lambda\varepsilon_{+}(\vec{k})+ \lambda^\dagger\overline{\sigma}^{\mu\nu}\lambda^\dagger \varepsilon_{-}(\vec{k})\right)e^{ik\cdot x} \end{aligned} \tag{2}
\]

The (non-degenerate) symplectic structure on the space of classical solutions leads to the Equal-Time Commutation Relations

\[
[F_{0i}(\vec{x},0),F_{j k}(\vec{x}',0)]=i\left(\delta_{i k}\frac{\partial}{\partial x^j}-\delta_{i j}\frac{\partial}{\partial x^k}\right)\delta^{(3)}(\vec{x}-\vec{x}') \tag{3}
\]

which, in turn, give the canonical commutation relations

\[
\begin{aligned} [\varepsilon_+(\vec{k}),\varepsilon^\dagger_+(\vec{k}')]&={(2\pi)}^3 2|\vec{k}| \delta^{(3)}(\vec{k}-\vec{k}')\\ [\varepsilon_-(\vec{k}),\varepsilon^\dagger_-(\vec{k}')]&={(2\pi)}^3 2|\vec{k}| \delta^{(3)}(\vec{k}-\vec{k}') \end{aligned}
\]

of the creation and annihilation operators for photons of definite helicity.

Unfortunately, to couple to charged matter fields, we need an expression for $A^\mu$, not just $F^{\mu\nu}$, so (2) does not quite suffice for our purposes. But, again, helicity spinors come to the rescue.

Introduce a fixed fiducial null vector $\check{k}^2=0$ and the corresponding helicity spinors

\[
(\check{k}\cdot\sigma)_{\alpha\dot\beta}= -\mu_\alpha\mu^\dagger_{\dot\beta},\qquad (\check{k}\cdot\overline{\sigma})^{\dot\alpha\beta} = -\mu^{\dagger\dot\alpha}\mu^\beta
\]

We then can write

\[
\begin{aligned} A^\mu(x)&= \frac{1}{\sqrt{2}}\int\frac{d^3\vec{k}}{{(2\pi)}^3 2|\vec{k}|}\left(\frac{\mu^\dagger\overline{\sigma}^\mu\lambda}{\mu^\dagger\lambda^\dagger}\varepsilon^\dagger_{-}(\vec{k})+ \frac{\mu\sigma^\mu\lambda^\dagger}{\mu\lambda} \varepsilon^\dagger_{+}(\vec{k})\right)e^{-ik\cdot x}+\\ &\qquad+\left(\frac{\lambda^\dagger\overline{\sigma}^\mu\mu}{\lambda\mu}\varepsilon_{-}(\vec{k})+ \frac{\lambda\sigma^\mu\mu^\dagger}{\lambda^\dagger\mu^\dagger} \varepsilon_{+}(\vec{k})\right)e^{ik\cdot x}\\ &=\frac{1}{\sqrt{2}}\int\frac{d^3\vec{k}}{{(2\pi)}^3 2|\vec{k}|}\left(\frac{\mu^\dagger\overline{\sigma}^\mu\lambda}{\mu^\dagger\lambda^\dagger}\varepsilon^\dagger_{-}(\vec{k})+ \frac{\mu\sigma^\mu\lambda^\dagger}{\mu\lambda} \varepsilon^\dagger_{+}(\vec{k})\right)e^{-ik\cdot x}\\ &\qquad-\left(\frac{\mu\sigma^\mu\lambda^\dagger}{\mu\lambda}\varepsilon_{-}(\vec{k})+ \frac{\mu^\dagger\overline{\sigma}^\mu\lambda}{\mu^\dagger\lambda^\dagger} \varepsilon_{+}(\vec{k})\right)e^{ik\cdot x} \end{aligned} \tag{4}
\]

which satisfies $\partial\cdot A=0$ and (exercise for the reader)

\[
\begin{aligned} \partial^\mu A^\nu-\partial^\nu A^\mu&= \frac{1}{\sqrt{2}}\int\frac{d^3\vec{k}}{{(2\pi)}^3 2|\vec{k}|}\left(\lambda\sigma^{\mu\nu}\lambda\varepsilon^\dagger_{-}(\vec{k})+ \lambda^\dagger\overline{\sigma}^{\mu\nu}\lambda^\dagger \varepsilon^\dagger_{+}(\vec{k})\right)e^{-ik\cdot x}+\\ &\qquad+\left(\lambda\sigma^{\mu\nu}\lambda\varepsilon_{+}(\vec{k})+ \lambda^\dagger\overline{\sigma}^{\mu\nu}\lambda^\dagger \varepsilon_{-}(\vec{k})\right)e^{ik\cdot x}\\ &=F^{\mu\nu}(x) \end{aligned}
\]

as before. Together, these ensure that changing the reference momentum $\check{k}$ changes $A^\mu(x)$ by a harmonic gauge transformation.
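One way to see the first of these claims, $\partial\cdot A=0$, is sketched here under two assumptions: that the helicity spinors are commuting, and that the index-free contractions are the standard ones of Dreiner et al. ($\mu\lambda=\mu^\alpha\lambda_\alpha$ and $\lambda^\dagger\lambda^\dagger=\lambda^\dagger_{\dot\alpha}\lambda^{\dagger\dot\alpha}$). Acting with $\partial_\mu$ on a plane wave brings down a factor proportional to $k_\mu$, so it is enough to check that each polarization vector is orthogonal to $k$. For the $\varepsilon^\dagger_+$ term, using (1),

\[
k_\mu\,\frac{\mu\sigma^\mu\lambda^\dagger}{\mu\lambda}
=\frac{\mu^\alpha (k\cdot\sigma)_{\alpha\dot\beta}\lambda^{\dagger\dot\beta}}{\mu\lambda}
=-\frac{(\mu\lambda)(\lambda^\dagger\lambda^\dagger)}{\mu\lambda}
=0,
\]

since $\lambda^\dagger\lambda^\dagger$ vanishes for a commuting spinor by the antisymmetry of the $\epsilon$ tensor. The other terms should work the same way, with $\lambda\lambda=0$ playing the same role for the $\overline{\sigma}$ contractions.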

To completely justify (4), we choose R-$\xi$ gauge, and use BV-BRST quantization, but that’s the subject for another blog post.

Here, it suffices to say that the Feynman rules contract every external photon line with a $\frac{\mu\sigma^\mu\lambda^\dagger}{\mu\lambda}$ or a $\frac{\mu^\dagger\overline{\sigma}^\mu\lambda}{\mu^\dagger\lambda^\dagger}$, depending on the helicity of the incoming/outgoing photon. We’re free to make any choice of reference momentum $\check{k}$ that we want, but verifying that the final answer is independent of $\check{k}$ is a nice check on our calculations.

Notoriously, Lorentz gauge $\partial\cdot A = 0$ does not completely fix the gauge: we can still shift $A_\mu\to A_\mu+\partial_\mu f$, where $f$ is any solution to the scalar wave equation, $\square f = 0$.

January 07, 2022

Matt von Hippel The arXiv SciComm Challenge

Fellow science communicators, think you can explain everything that goes on in your field? If so, I have a challenge for you. Pick a day, and go through all the new papers on arXiv in a single area. For each one, try to give a general-audience explanation of what the paper is about. To make it easier, you can ignore cross-listed papers. If your field doesn’t use arXiv, consider if you can do the challenge with another appropriate site.

I’ll start. I’m looking at papers in the “High Energy Physics – Theory” area, announced 6 Jan, 2022. I’ll warn you in advance that I haven’t read these papers, just their abstracts, so apologies if I get your paper wrong!

arXiv:2201.01303 : Holographic State Complexity from Group Cohomology

This paper says it is a contribution to a Proceedings. That means it is based on a talk given at a conference. In my field, a talk like this usually won’t be presenting new results, but instead summarizes results in a previous paper. So keep that in mind.

There is an idea in physics called holography, where two theories are secretly the same even though they describe the world with different numbers of dimensions. Usually this involves a gravitational theory in a “box”, and a theory without gravity that describes the sides of the box. The sides turn out to fully describe the inside of the box, much like a hologram looks 3D but can be printed on a flat sheet of paper. Using this idea, physicists have connected some properties of gravity to properties of the theory on the sides of the box. One of those properties is complexity: the complexity of the theory on the sides of the box says something about gravity inside the box, in particular about the size of wormholes. The trouble is, “complexity” is a bit subjective: it’s not clear how to give a good definition for it for this type of theory. In this paper, the author studies a theory with a precise mathematical definition, called a topological theory. This theory turns out to have mathematical properties that suggest a well-defined notion of complexity for it.

arXiv:2201.01393 : Nonrelativistic effective field theories with enhanced symmetries and soft behavior

We sometimes describe quantum field theory as quantum mechanics plus relativity. That’s not quite true though, because it is possible to define a quantum field theory that doesn’t obey special relativity, a non-relativistic theory. Physicists do this if they want to describe a system moving much slower than the speed of light: it gets used sometimes for nuclear physics, and sometimes for modeling colliding black holes.

In particle physics, a “soft” particle is one with almost no momentum. We can classify theories based on how they behave when a particle becomes more and more soft. In normal quantum field theories, if they have special behavior when a particle becomes soft it’s often due to a symmetry of the theory, where the theory looks the same even if something changes. This paper shows that this is not true for non-relativistic theories: they have more requirements to have special soft behavior, not just symmetry. They “bootstrap” a few theories, using some general restrictions to find them without first knowing how they work (“pulling them up by their own bootstraps”), and show that the theories they find are in a certain sense unique, the only theories of that kind.

arXiv:2201.01552 : Transmutation operators and expansions for 1-loop Feynman integrands

In recent years, physicists in my sub-field have found new ways to calculate the probability that particles collide. One of these methods describes ordinary particles in a way resembling string theory, and from this physicists discovered a whole “web” of theories that were linked together by small modifications of the method. This method originally worked only for the simplest Feynman diagrams, the “tree” diagrams that correspond to classical physics, but was extended to the next-simplest diagrams, diagrams with one “loop” that start incorporating quantum effects.

This paper concerns a particular spinoff of this method, that can find relationships between certain one-loop calculations in a particularly efficient way. It lets you express calculations of particle collisions in a variety of theories in terms of collisions in a very simple theory. Unlike the original method, it doesn’t rely on any particular picture of how these collisions work, either Feynman diagrams or strings.

arXiv:2201.01624 : Moduli and Hidden Matter in Heterotic M-Theory with an Anomalous U(1) Hidden Sector

In string theory (and its more sophisticated cousin M theory), our four-dimensional world is described as a world with more dimensions, where the extra dimensions are twisted up so that they cannot be detected. The shape of the extra dimensions influences the kinds of particles we can observe in our world. That shape is described by variables called “moduli”. If those moduli are stable, then the properties of particles we observe would be fixed, otherwise they would not be. In general it is a challenge in string theory to stabilize these moduli and get a world like what we observe.

This paper discusses shapes that give rise to a “hidden sector”, a set of particles that are disconnected from the particles we know so that they are hard to observe. Such particles are often proposed as a possible explanation for dark matter. This paper calculates, for a particular kind of shape, what the masses of different particles are, as well as how different kinds of particles can decay into each other. For example, a particle that causes inflation (the accelerating expansion of the universe) can decay into effects on the moduli and dark matter. The paper also shows how some of the moduli are made stable in this picture.

arXiv:2201.01630 : Chaos in Celestial CFT

One variant of the holography idea I mentioned earlier is called “celestial” holography. In this picture, the sides of the box are an infinite distance away: a “celestial sphere” depicting the angles particles go after they collide, in the same way a star chart depicts the angles between stars. Recent work has shown that there is something like a sensible theory that describes physics on this celestial sphere, that contains all the information about what happens inside.

This paper shows that the celestial theory has a property called quantum chaos. In physics, a theory is said to be chaotic if it depends very precisely on its initial conditions, so that even a small change will result in a large change later (the usual metaphor is a butterfly flapping its wings and causing a hurricane). This kind of behavior appears to be present in this theory.

arXiv:2201.01657 : Calculations of Delbrück scattering to all orders in αZ

Delbrück scattering is an effect where the nuclei of heavy elements like lead can deflect high-energy photons, as a consequence of quantum field theory. This effect is apparently tricky to calculate, and previous calculations have involved approximations. This paper finds a way to calculate the effect without those approximations, which should let it match better with experiments.

(As an aside, I’m a little confused by the claim that they’re going to all orders in αZ when it looks like they just consider one-loop diagrams…but this is probably just my ignorance, this is a corner of the field quite distant from my own.)

arXiv:2201.01674 : On Unfolded Approach To Off-Shell Supersymmetric Models

Supersymmetry is a relationship between two types of particles: fermions, which typically make up matter, and bosons, which are usually associated with forces. In realistic theories this relationship is “broken” and the two types of particles have different properties, but theoretical physicists often study models where supersymmetry is “unbroken” and the two types of particles have the same mass and charge. This paper finds a new way of describing some theories of this kind that reorganizes them in an interesting way, using an “unfolded” approach in which aspects of the particles that would normally be combined are given their own separate variables.

(This is another one I don’t know much about, this is the first time I’d heard of the unfolded approach.)

arXiv:2201.01679 : Geometric Flow of Bubbles

String theorists have conjectured that only some types of theories can be consistently combined with a full theory of quantum gravity; others live in a “swampland” of non-viable theories. One set of conjectures characterizes this swampland in terms of “flows” in which theories with different geometry can flow into each other. The properties of these flows are supposed to be related to which theories are or are not in the swampland.

This paper writes down equations describing these flows, and applies them to some toy model “bubble” universes.

arXiv:2201.01697 : Graviton scattering amplitudes in first quantisation

This paper is a pedagogical one, introducing graduate students to a topic rather than presenting new research.

Usually in quantum field theory we do something called “second quantization”, thinking about the world not in terms of particles but in terms of fields that fill all of space and time. However, sometimes one can instead use “first quantization”, which is much more similar to ordinary quantum mechanics. There you think of a single particle traveling along a “world-line”, and calculate the probability it interacts with other particles in particular ways. This approach has recently been used to calculate interactions of gravitons, particles related to the gravitational field in the same way photons are related to the electromagnetic field. The approach has some advantages in terms of simplifying the results, which are described in this paper.

n-Category Café Optimal Transport and Enriched Categories IV: Examples of Kan-type Centres

Last time we were thinking about categories enriched over $\bar{\mathbb{R}}_+$, the extended non-negative reals; such enriched categories are sometimes called generalized or Lawvere metric spaces. In the context of optimal transport with cost matrix $k$, thought of as a $\bar{\mathbb{R}}_+$-profunctor $k\colon \mathcal{S}\rightsquigarrow\mathcal{R}$ between suppliers and receivers, we were interested in the centre of the ‘Kan-type adjunction’ between enriched functor categories, which is the following:

In this post I want to give some examples of the Kan-type centre in low dimension to try to give a sense of what they look like over $\bar{\mathbb{R}}_+$. Here’s the simplest kind of example we will see.


Quick Recap

It’s been a while since my last post on this and I don’t want to do a big recap but certainly a small recap is in order! I’m interested in transporting goods from a set $\{S_i\}_{i=1}^s$ of suppliers to a set $\{R_j\}_{j=1}^r$ of receivers. (Think of transporting loaves of bread from bakeries to cafés.) We have a supply of $\sigma_i$ units of goods at supplier $S_i$ and a demand of $\rho_j$ at receiver $R_j$ and the cost of transporting one unit of goods from $S_i$ to $R_j$ is $k_{ij}\in [0,\infty)$.

We saw that in the dual problem we want to find prices $v=(v_1,\dots,v_s)$ and $u=(u_1,\dots, u_r)$ at the suppliers and receivers respectively that maximize the total revenue

\[
\sum_j u_j\rho_j -\sum_i v_i\sigma_i
\]

subject to the competitivity constraint $k_{ij}\ge u_j-v_i$ for all $i,j$.

I’ll model this in the language of $\bar{\mathbb{R}}_+$-categories. Recall that an $\bar{\mathbb{R}}_+$-category is a generalization of a metric space: in an $\bar{\mathbb{R}}_+$-category $X$ we have a ‘hom-object’ $X(a,b)\in [0,\infty]$ between each pair of objects $a,b\in X$. This hom-object should be thought of as a distance which does not have to be finite, nor symmetric, and you can have distinct objects having zero distance between them. The right notion of ‘functor’ here is that of ‘distance non-increasing map’; the term ‘short map’ is used for short.

I model the situation in the following way.

Suppliers: $\mathcal{S}$, a discrete $\bar{\mathbb{R}}_+$-category
Receivers: $\mathcal{R}$, a discrete $\bar{\mathbb{R}}_+$-category
Transport cost: $k\colon \mathcal{S}\rightsquigarrow \mathcal{R}$, an $\bar{\mathbb{R}}_+$-profunctor
Prices at suppliers: $v\colon \mathcal{S}\to \bar{\mathbb{R}}_+$, an $\bar{\mathbb{R}}_+$-functor
Prices at receivers: $u\colon \mathcal{R}\to \bar{\mathbb{R}}_+$, an $\bar{\mathbb{R}}_+$-functor

Note that I didn’t mention how to model the supply and demand. That is not of immediate interest to us, so I won’t say anything about that now.

We also saw that in finding prices which maximize total revenue we can restrict our attention to ‘tight price plans’, that is pairs of prices $(v; u)$ which lie in the centre $Z_{\mathrm{K}}(k)$ of the following Kan-type adjunction.

The Kan centre is the full subcategory of $\bar{\mathbb{R}}_+^\mathcal{S}\times \bar{\mathbb{R}}_+^\mathcal{R}$ consisting of objects $(v,u)$ such that $k\circ v = u$ and $v=k\triangleright u$; in the notation of previous posts I would write $\hat{v} =u$ and $v=\tilde{u}$.

Note that the Kan-type adjunction is not the same as the Isbell-type (*) adjunction I’ve discussed in previous posts on profunctor nuclei and Isbell tight-spans. I will discuss a relationship between them in a future post.

(*) I don’t like using the terms ‘Kan’ and ‘Isbell’ for these adjunctions, I’d much rather use more descriptive adjectives. If anyone has any ideas…

In this post I want to visualize the Kan centre $Z_{\mathrm{K}}(k)$ for you, and as the Kan centre is a subset of $\bar{\mathbb{R}}_+^\mathcal{S}\times \bar{\mathbb{R}}_+^\mathcal{R}\cong \bar{\mathbb{R}}_+^{\mathcal{S}\sqcup\mathcal{R}}$, it would be wise to start in a case with a very small number of suppliers and receivers, namely where $\left| \mathcal{S}\sqcup\mathcal{R}\right|\le 3$. We will see that we can in fact visualize, in a certain way, higher dimensional cases, but we will start with a slightly silly example — from an optimal transport perspective — to illustrate some basic features.

A first example

Let’s start with two suppliers and one receiver, $|\mathcal{S}|=2$ and $|\mathcal{R}|=1$. I will think of these as bakeries and cafés and I’ll pick some simpler numbers than I did in the old café example I used. The numbers on the arrows represent the cost of transporting one loaf of bread. The numbers in circles represent the supply and demand in loaves of bread.

The profunctor in this situation can be represented by the matrix $k=\left(\begin{smallmatrix}3\\2\end{smallmatrix}\right)$. (I might be regretting my choice of conventions and I might have preferred to have the transpose of this matrix.)

It is clear what the solution to the optimal transport problem is in this case as there’s only one feasible solution, namely both bakeries send all their bread to the one café. But this should be a useful example none-the-less.

We have the adjunction

\[
k\circ(v_1, v_2)=(\min\{3+v_1, 2+v_2\}) \qquad k\triangleright(u_1) = (u_1\,\dot{{}-{}}\,3, u_1\,\dot{{}-{}}\,2).
\]

You can check that the centre is

\[
\{((\lambda\,\dot{{}-{}}\,1, \lambda), (2+\lambda))\mid \lambda\in [0,\infty]\} \subset \bar{\mathbb{R}}_+^2\times \bar{\mathbb{R}}_+^1
\]

and this (or at least a finite portion of it) can be drawn as follows.


You can observe that this is the union of two closed line segments, or, as might be more useful, that this is the union of two open line segments (one bounded, one unbounded) and two points. In fact, in general the Kan centre is a cell complex of open, piece-wise affine cells.
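The description of the centre in this first example can be spot-checked numerically. The sketch below assumes the general formulas $(k\circ v)_j=\min_i(k_{ij}+v_i)$ and $(k\triangleright u)_i=\max_j(u_j\,\dot{-}\,k_{ij})$, which are extrapolated from the one-receiver formulas above rather than quoted from the post, and it only handles finite prices. The function names are hypothetical.

```python
def monus(a, b):
    """Truncated subtraction: a - b, but never below zero."""
    return max(a - b, 0)

def k_compose(k, v):
    """(k o v)_j = min over suppliers i of k[i][j] + v[i]."""
    return [min(k[i][j] + v[i] for i in range(len(k))) for j in range(len(k[0]))]

def k_triangle(k, u):
    """(k |> u)_i = max over receivers j of u[j] monus k[i][j]."""
    return [max(monus(u[j], k[i][j]) for j in range(len(u))) for i in range(len(k))]

def in_centre(k, v, u):
    """A pair (v, u) is in the Kan centre when k o v = u and k |> u = v."""
    return k_compose(k, v) == u and k_triangle(k, u) == v

k = [[3], [2]]  # rows: suppliers, columns: receivers

# Points of the claimed centre, parametrised by lambda.
for lam in [0, 0.5, 1, 2, 10]:
    v = [monus(lam, 1), lam]
    u = [2 + lam]
    print(lam, in_centre(k, v, u))

# A pair that satisfies neither equation, for contrast.
print(in_centre(k, [0, 0], [3]))
```

The values of $\lambda$ on either side of $1$ correspond to the two line segments: for $\lambda<1$ only the second supplier's price moves, while for $\lambda\ge 1$ both do.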

Another thing to observe here is that if $(v,u)$ is a pair in the Kan centre then $v$ determines $u$ and $u$ determines $v$, so we can project onto $\bar{\mathbb{R}}_+^2$ or $\bar{\mathbb{R}}_+^1$ without losing any information. Here are the projections. I’ve drawn the boundary of the view box that I’ve picked, but the picture should be unbounded.


Here they are, drawn in a more standard way.


Some generalities

Projecting in this way gives us two views of the Kan centre and will allow us to visualize some higher dimensional examples. You need to bear in mind, however, that neither picture is the ‘true’ centre as this naturally sits in the higher dimensional space. For instance, the total revenue function can be restricted to the tight price plans, i.e. the centre; this is the restriction of a linear function when the centre is considered as a subset of $(v,u)$-space, but it is not the restriction of a linear function when the centre is considered as a subset of $v$-space or $u$-space. Remember that we are hoping to maximize the total revenue function on the Kan centre.

There’s a very general statement about adjunctions when enriching over commutative quantales like $\bar{\mathbb{R}}_+$. I won’t go into the details, many of which can be found in my Legendre-Fenchel transform paper. Anyway, roughly speaking, for an adjunction $L\dashv R$ between $\bar{\mathbb{R}}_+$-categories we have a counit $L R\Rightarrow\mathrm{id}$ and a unit $\mathrm{id}\Rightarrow R L$, thus we have $R L R\Rightarrow R$ and $R\Rightarrow R L R$ (which means $0\ge \mathcal{C}(R L R(d), R(d))$ and $0\ge \mathcal{C}(R(d), R L R(d))$ for all $d\in \mathcal{D}$) and so $R \cong R L R$. Similarly $L \cong L R L$. We saw that manifested earlier as $\tilde{\hat{\tilde{v}}}= \tilde{v}$. This gives us that any monad on an $\bar{\mathbb{R}}_+$-category is idempotent and also the following.
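To make the last remark concrete for the first example, here is a small sketch that evaluates the total revenue along the centre found above. The supplies $\sigma=(4,6)$ and demand $\rho=(10)$ are made-up numbers, chosen only so that supply equals demand; they are not the values from the figure.

```python
def monus(a, b):
    """Truncated subtraction: a - b, but never below zero."""
    return max(a - b, 0)

def revenue(v, u, sigma, rho):
    """Total revenue: sum_j u_j * rho_j - sum_i v_i * sigma_i."""
    income = sum(uj * rj for uj, rj in zip(u, rho))
    outlay = sum(vi * si for vi, si in zip(v, sigma))
    return income - outlay

sigma = [4, 6]  # hypothetical supplies at the two bakeries
rho = [10]      # hypothetical demand at the cafe (equal to total supply)

def revenue_on_centre(lam):
    """Revenue at the tight price plan ((lam monus 1, lam), (2 + lam))."""
    v = [monus(lam, 1), lam]
    u = [2 + lam]
    return revenue(v, u, sigma, rho)

values = [revenue_on_centre(lam) for lam in [0, 0.5, 1, 2, 5]]

# Cost of the only feasible plan: every loaf goes to the one cafe.
transport_cost = 3 * sigma[0] + 2 * sigma[1]
print(values, transport_cost)
```

With these numbers the revenue rises along the bounded segment ($\lambda<1$) and is constant at 24 on the unbounded one, which matches the cost $3\cdot 4+2\cdot 6=24$ of the only feasible transport plan, as duality would lead one to expect.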

Lemma 1. For an adjunction $L\dashv R$ between $\bar{\mathbb{R}}_+$-categories $\mathcal{C}$ and $\mathcal{D}$ there is the following identification of categories where $\mathrm{Im}$ denotes the image of a functor and $\mathrm{Fix}$ denotes the fixed category of an endofunctor.

So in our case this means that the left-hand view or the space of tight price plans on the suppliers can be thought of as the image of the functor $k\triangleright{-}$, or as the fixed space of the monad $k\triangleright(k\circ{-})$.
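The idempotency behind this can be tested on random inputs. The sketch below again assumes the general formulas $(k\circ v)_j=\min_i(k_{ij}+v_i)$ and $(k\triangleright u)_i=\max_j(u_j\,\dot{-}\,k_{ij})$, with rows of the matrix indexing suppliers, and uses a $2\times 2$ cost matrix with integer prices so that equality is exact. The name `tighten` is a hypothetical label for $k\triangleright(k\circ{-})$.

```python
import random

def monus(a, b):
    """Truncated subtraction: a - b, but never below zero."""
    return max(a - b, 0)

def k_compose(k, v):
    """(k o v)_j = min over suppliers i of k[i][j] + v[i]."""
    return [min(k[i][j] + v[i] for i in range(len(k))) for j in range(len(k[0]))]

def k_triangle(k, u):
    """(k |> u)_i = max over receivers j of u[j] monus k[i][j]."""
    return [max(monus(u[j], k[i][j]) for j in range(len(u))) for i in range(len(k))]

def tighten(k, v):
    """Apply v -> k |> (k o v)."""
    return k_triangle(k, k_compose(k, v))

k = [[2, 1], [3, 5]]  # rows: suppliers, columns: receivers (assumed orientation)

rng = random.Random(0)
samples = [[rng.randint(0, 10), rng.randint(0, 10)] for _ in range(1000)]

# Applying the operation twice should give the same result as applying it once.
all_fixed = all(tighten(k, tighten(k, v)) == tighten(k, v) for v in samples)
print(all_fixed)
```

Every output of `tighten` is therefore a fixed point, so sampling supplier prices and tightening them is one crude way to generate points of the left-hand view.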


Here are some more examples. In each case I’ll give the profunctor and then pictures of the two views of the Kan-type centre, or, if you prefer, the tight price plans.

I computed these (as cell complexes) using some rather inefficient SageMath code; however, having seen what they look like I now realise there will be better algorithms out there. I’ll maybe say more on this next time.

\[
\begin{pmatrix}2&1\\3&5\end{pmatrix}
\]


\[
\begin{pmatrix}2&7\\3&5\end{pmatrix}
\]


\[
\begin{pmatrix} 4 & 6 \\ 10 & 5 \\ 11 & 11 \\ 8 & 14 \end{pmatrix}
\]


In the above example I couldn’t actually show you the picture of the left-hand view as I’m not good at putting 4d objects on the screen. However in the next example, by the magic of three.js, I can (hopefully!) put some fairly basic 3d objects on the screen which you can navigate and move with your mouse/touchpad/fingers. By clicking on the border you should be able to open models in fullscreen if you desire.

Due to laziness on my part, I’ll only show the one-cells; however, the two-cells and three-cells fill in the obvious holes (or at least I think they are obvious).

\[
\begin{pmatrix} 1 & 4 & 6 \\ 2 & 7 & 1 \\ 8 & 7 & 8 \end{pmatrix}
\]


So there’s some examples to get one thinking.

Next time

I hope that next time I will explain some of the structure of these examples and relate these to the notion of the semi-tropical convex hull. (The tropically aware amongst you might have noticed something convex hully going on.) And later on I would like to compare and contrast these with Isbell-type centres.

n-Category Café Intercats

The Topos Institute has a new seminar:

The talks will be streamed and also recorded on YouTube.

It’s a new seminar series on the mathematics of interacting systems, their composition, and their behavior. Split in equal parts theory and applications, we are particularly interested in category-theoretic tools to make sense of information-processing or adaptive systems, or those that stand in a ‘bidirectional’ relationship to some environment. We aim to bring together researchers from different communities, who may already be using similar-but-different tools, in order to improve our own interaction.


Although by no means an exhaustive or prescriptive list, Intercats seminar topics are likely to be related to the following mathematical topics:

  • Generalized lenses, optics, and related structures, such as Chu spaces or Dialectica categories;

  • polynomial functors, and their associated ecosystem;

  • applications of the foregoing to:

    • bidirectional processes,

    • dynamical systems,

    • wiring diagrams,

    • open games,

    • database theory,

    • automatic differentiation,

    • computational statistics,

    • machine learning.


Jules Hedges, University of Strathclyde

David Spivak, Topos Institute

Toby St Clere Smithe, Topos Institute

January 03, 2022

Scott Aaronson The demise of Scientific American: Guest post by Ashutosh Jogalekar

Scott’s foreword

One week ago, E. O. Wilson—the legendary naturalist and conservationist, and man who was universally acknowledged to know more about ants than anyone else in human history—passed away at age 92. A mere three days later, Scientific American—or more precisely, the zombie clickbait rag that now flaunts that name—published a shameful hit-piece, smearing Wilson for his “racist ideas” without, incredibly, so much as a single quote from Wilson, or any other attempt to substantiate its libel (see also this response by Jerry Coyne). SciAm‘s Pravda-like attack included the following extraordinary sentence, which I thought worthy of Alan Sokal’s Social Text hoax:

The so-called normal distribution of statistics assumes that there are default humans who serve as the standard that the rest of us can be accurately measured against.

There are intellectually honest people who don’t know what the normal distribution is. There are no intellectually honest people who, not knowing what it is, figure that it must be something racist.

On Twitter, Laura Helmuth, the editor-in-chief now running SciAm into the ground, described her magazine’s calumny against Wilson as “insightful” (the replies, including from Richard Dawkins, are fun to read). I suppose it was as “insightful” as SciAm‘s disgraceful attack last year on Eric Lander, President Biden’s ultra-competent science advisor and a leader in the war on COVID, for … being a white male, which appears to have been E. O. Wilson’s crime as well. (Think I must be misrepresenting the “critique” of Lander? Read it!)

Anyway, in response to Scientific American‘s libel of Wilson, I wrote on my Facebook that I’ll no longer agree to write for or be interviewed by them (you can read my old stuff free of charge here or here), unless and until there’s a complete change of editorial direction. I encourage all other scientists to commit likewise, thereby making it common knowledge that the entity that now calls itself “Scientific American” bears the same relation to the legendary home of Martin Gardner as does a corpse to a living being. Fortunately, there are high-quality online venues (e.g., Quanta) that partly fill the role that Scientific American abdicated.

After reading my Facebook post, my friend Ashutosh Jogalekar was inspired to post an essay of his own. Ashutosh used to write regularly for Scientific American, until he was fired seven years ago over a column in which he advocated acknowledging Richard Feynman’s flaws, including his arrogance and casual sexism, but also understanding those flaws within the context of Feynman’s whole life, including the tragic death of his first wife Arlene. (Yes, that was really it! Read the piece!) Below, I’m sharing Ashutosh’s moving essay about E. O. Wilson with Ashutosh’s very generous permission. —Scott Aaronson

Guest Post by Ashutosh Jogalekar

As some know, I was “fired” from Scientific American in 2014 for three “controversial” posts (among 200 that I had written for the magazine). When I parted from the magazine I chalked up my departure to an unfortunate misunderstanding more than anything else. I still respected some of the writers at the publication, and while I wore my separation as a badge of honor and in retrospect realized its liberating utility in enabling me to greatly expand my topical range, I occasionally still felt bad and wished things had gone differently.

No more. Now the magazine has done me a great favor by allowing me to wipe the slate of my conscience clean. What happened seven years ago was not just a misunderstanding but clearly one of many first warning signs of a calamitous slide into a decidedly unscientific, irrational and ideology-ridden universe of woke extremism. Its logical culmination two days ago was an absolutely shameless, confused, fact-free and purely ideological hit job on someone who wasn’t just a great childhood hero of mine but a leading light of science, literary achievement, humanism and biodiversity. While Ed (E. O.) Wilson’s memory was barely getting cemented only days after his death, the magazine published an op-ed calling him a racist, a hit job endorsed and cited by the editor-in-chief as “insightful”. One of the first things I did after reading the piece was buy a few Wilson books that weren’t part of my collection.

Ed Wilson was one of the gentlest, most eloquent, most brilliant and most determined advocates for both human and natural preservation you could find. Under Southern charm lay hidden unyielding doggedness and immense stamina combined with a missionary zeal to communicate the wonders of science to both his fellow biologists and the general public. His autobiography, “Naturalist”, is perhaps the finest, most literary statement of the scientific life I have read; it was one of a half dozen books that completely transported me when I read it in college. In book after book of wide-ranging intellectual treats threading through a stunning diversity of disciplines, he sent out clarion calls for saving the planet, for enabling dialogue between the natural and the social sciences, for understanding each other better. In the face of unprecedented challenges to our fragile environment and continued barriers to interdisciplinary communication, this is work that likely will make him go down in history as one of the most important human beings who ever lived, easily of the same caliber and achievement as John Muir or Thoreau. Even in terms of achievement strictly defined by accolades – the National Medal of Science, the Crafoord Prize which recognizes fields excluded by the Nobel Prize, and not just one but two Pulitzer Prizes – few scientists from any field in the 20th century can hold a candle to Ed Wilson. My friend Richard Rhodes who knew Wilson for decades as a close and much-admired friend said that there wasn’t a racist bone in his body; Dick should know since he just came out with a first-rate biography of Wilson weeks before his passing.

The writer who wrote that train wreck is a professor of nursing at UCSF named Monica McLemore. That itself is a frightening fact and should tell everyone how much ignorance has spread itself in our highest institutions. She not only maligned and completely misrepresented Wilson but did not say a word about his decades-long, heroic effort to preserve the planet and our relationship with it; it was clear that she had little acquaintance with Wilson’s words since she did not cite any. It’s also worth noting the gaping moral blindness in her article which completely misses the most moral thing Wilson did – spend decades advocating for saving our planet and averting a catastrophe of extinction, climate change and divisiveness – and instead focuses completely on his non-existent immorality. This is a pattern that is consistently found among those urging “social justice” or “equity” or whatever else: somehow they seem to spend all their time talking about fictional, imagined immorality while missing the real, flesh-and-bones morality that is often the basis of someone’s entire life’s work.

In the end, the simple fact is that McLemore didn’t care about any of this. She didn’t care because she had a political agenda and the facts did not matter to her, even facts as basic as the definition of the normal distribution in statistics. For her, Wilson was some obscure white male scientist who was venerated, and that was reason enough for a supposed “takedown”. And the editor of Scientific American supported and lauded this ignorant, ideology-driven tirade.

Ironically, Wilson would have found this ideological hit job all too familiar. After he wrote his famous book Sociobiology in the 1970s, a volume in which, in a single chapter about human beings, he had the temerity to suggest that maybe, just maybe, human beings operate with the same mix of genes that other creatures do, the book was met by a disgraceful, below-the-belt, ideological response from Wilson’s far left colleagues Richard Lewontin and Stephen Jay Gould who hysterically compared his arguments to thinking that was well on its way down the slippery slope to that dark world where lay the Nazi gas chambers. The gas chamber analogy is about the only thing that’s missing from the recent hit job, but the depressing thing is that we are fighting the same battles in 2021 that Wilson fought forty years before, although turbocharged this time by armies of faithful zombies on social media. The sad thing is that Wilson is no longer around to defend himself, although I am not sure he would have bothered with a piece as shoddy as this one.

The complete intellectual destruction of a once-great science magazine is now clear as day. No more should Scientific American be regarded as a vehicle for sober scientific views and liberal causes but as a political magazine with clearly stated ideological biases and an aversion to facts, an instrument of a blinkered woke political worldview that brooks no dissent. Scott Aaronson has taken a principled stand and said that after this proverbial last straw on the camel’s back, he will no longer write for the magazine or do interviews for them. I applaud Scott’s decision, and with his expertise it’s a decision that actually matters. As far as I am concerned, I now mix smoldering fury at the article with immense relief: the last seven years have clearly shown that leaving Scientific American in 2014 was akin to leaving the Soviet Union in the 1930s just before Stalin appointed Lysenko head biologist. I could not have asked for a happier expulsion and now feel completely vindicated and free of any modicum of regret I might have felt.

To my few friends and colleagues who still write for the magazine and whose opinions I continue to respect, I really wish to ask: Why? Is writing for a magazine which has sacrificed facts and the liberal voice of real science at the altar of political ideology and make believe still worth it? What would it take for you to say no more? As Oscar Wilde would say, one mistake like this is a mistake, two seems more like carelessness; in the roster of the last few years, this is “mistake” 100+, signaling that it’s now officially approved policy. Do you think that being an insider will allow you to salvage the reputation of the magazine? If you think that way, you are no different from the one or two moderate Republicans who think they can still salvage the once-great party of Lincoln and Eisenhower. Both the GOP and Scientific American are beyond redemption from where I stand. Get out, start your own magazine or join another, one which actually respects liberal, diverse voices and scientific facts; let us applaud you for it. You deserve better, the world deserves better. And Ed Wilson’s memory sure as hell deserves better.

Update (from Scott): See here for the Hacker News thread about this post. I was amused by the conjunction of two themes: (1) people who were uncomfortable with my and Ashutosh’s expression of strong emotions, and (2) people who actually clicked through to the SciAm hit-piece, and then reported back to the others that the strong emotions were completely, 100% justified in this case.

John Preskill Space-time and the city

I felt like a gum ball trying to squeeze my way out of a gum-ball machine. 

I was one of 50-ish physicists crammed into the lobby—and in the doorway, down the stairs, and onto the sidewalk—of a Manhattan hotel last December. Everyone had received a COVID vaccine, and the omicron variant hadn’t yet begun chewing up North America. Everyone had arrived on the same bus that evening, feeding on the neon-bright views of Fifth Avenue through dinnertime. Everyone wanted to check in and offload suitcases before experiencing firsthand the reason for the nickname “the city that never sleeps.” So everyone was jumbled together in what passed for a line.

We’d just passed the halfway point of the week during which I was pretending to be a string theorist. I do that whenever my research butts up against black holes, chaos, quantum gravity (the attempt to unify quantum physics with Einstein’s general theory of relativity), and alternative space-times. These topics fall under the heading “It from Qubit,” which calls for understanding puzzling physics (“It”) by analyzing how quantum systems process information (“Qubit”). The “It from Qubit” crowd convenes for one week each December, to share progress and collaborate.1 The group spends Monday through Wednesday at Princeton’s Institute for Advanced Study (IAS), dogged by photographs of Einstein, busts of Einstein, and roads named after Einstein. A bus ride later, the group spends Thursday and Friday at the Simons Foundation in New York City.

I don’t usually attend “It from Qubit” gatherings, as I’m actually a quantum information theorist and quantum thermodynamicist. Having admitted as much during the talk I presented at the IAS, I failed at pretending to be a string theorist. Happily, I adore being the most ignorant person in a roomful of experts, as the experience teaches me oodles. At lunch and dinner, I’d plunk down next to people I hadn’t spoken to and ask what they see as trending in the “It from Qubit” community. 

One buzzword, I’d first picked up on shortly before the pandemic had begun (replicas). Having lived a frenetic life, that trend seemed to be declining. Rising buzzwords (factorization and islands), I hadn’t heard in black-hole contexts before. People were still tossing around terms from when I’d first forayed into “It from Qubit” (scrambling and out-of-time-ordered correlator), but differently from then. Five years ago, the terms identified the latest craze. Now, they sounded entrenched, as though everyone expected everyone else to know and accept their significance.

One buzzword labeled my excuse for joining the workshops: complexity. Complexity wears as many meanings as the stereotypical New Yorker wears items of black clothing. Last month, guest blogger Logan Hillberry wrote about complexity that emerges in networks such as brains and social media. To “It from Qubit,” complexity quantifies the difficulty of preparing a quantum system in a desired state. Physicists have conjectured that a certain quantum state’s complexity parallels properties of gravitational systems, such as the length of a wormhole that connects two black holes. The wormhole’s length grows steadily for a time exponentially large in the gravitational system’s size. So, to support the conjecture, researchers have been trying to prove that complexity typically grows similarly. Collaborators and I proved that it does, as I explained in my talk and as I’ll explain in a future blog post. Other speakers discussed experimental complexities, as well as the relationship between complexity and a simplified version of Einstein’s equations for general relativity.

Inside the Simons Foundation on Fifth Avenue in Manhattan

I learned a bushel of physics, moonlighting as a string theorist that week. The gum-ball-machine lobby, though, retaught me something I’d learned long before the pandemic. Around the time I squeezed inside the hotel, a postdoc struck up a conversation with the others of us who were clogging the doorway. We had a decent fraction of an hour to fill; so we chatted about quantum thermodynamics, grant applications, and black holes. I asked what the postdoc was working on, he explained a property of black holes, and it reminded me of a property of thermodynamics. I’d nearly reached the front desk when I realized that, out of the sheer pleasure of jawing about physics with physicists in person, I no longer wanted to reach the front desk. The moment dangles in my memory like a crystal ornament from the lobby’s tree—pendant from the pandemic, a few inches from the vaccines suspended on one side and from omicron on the other. For that moment, in a lobby buoyed by holiday lights, wrapped in enough warmth that I’d forgotten the December chill outside, I belonged to the “It from Qubit” community as I hadn’t belonged to any community in 22 months.

Happy new year.

Presenting at the IAS was a blast. Photo credit: Jonathan Oppenheim.

1In person or virtually, pandemic-dependently.

Thanks to the organizers of the IAS workshop—Ahmed Almheiri, Adam Bouland, Brian Swingle—for the invitation to present and to the organizers of the Simons Foundation workshop—Patrick Hayden and Matt Headrick—for the invitation to attend.

January 02, 2022

Scott Aaronson Book Review: “Viral” by Alina Chan and Matt Ridley

Happy New Year, everyone!

It was exactly two years ago that it first became publicly knowable—though most of us wouldn’t know for at least two more months—just how freakishly horrible is the branch of the wavefunction we’re on. I.e., that our branch wouldn’t just include Donald Trump as the US president, but simultaneously a global pandemic far worse than any in living memory, and a world-historically bungled response to that pandemic.

So it’s appropriate that I just finished reading Viral: The Search for the Origin of COVID-19, by Broad Institute genetics postdoc Alina Chan and science writer Matt Ridley. Briefly, I think that this is one of the most important books so far of the twenty-first century.

Of course, speculation and argument about the origin of COVID goes back all the way to that fateful January of 2020, and most of this book’s information was already available in fragmentary form elsewhere. And by their own judgment, Chan and Ridley don’t end their search with a smoking-gun: no Patient Zero, no Bat Zero, no security-cam footage of the beaker dropped on the Wuhan Institute of Virology floor. Nevertheless, as far as I’ve seen, this is the first analysis of COVID’s origin to treat the question with the full depth, gravity, and perspective that it deserves.

Viral is essentially a 300-page plea to follow every lead as if we actually wanted to get to the bottom of things, and in particular, yes, to take the possibility of a lab leak a hell of a lot more seriously than was publicly permitted in 2020. (Fortuitously, much of this shift already happened as the authors were writing the book, but in June 2021 I was still sneered at for discussing the lab leak hypothesis on this blog.) Viral is simultaneously a model of lucid, non-dumbed-down popular science writing and of cogent argumentation. The authors never once come across like tinfoil-hat-wearing conspiracy theorists, railing against the sheeple with their conventional wisdom: they’re simply investigators carefully laying out what they’re confident should become conventional wisdom, with the many uncertainties and error bars explicitly noted. If you read the book and your mind works anything like mine, be forewarned that you might come out agreeing with a lot of it.

I would say that Viral proves the following propositions beyond reasonable doubt:

  • Virologists, including at Shi Zhengli’s group at WIV and at Peter Daszak’s EcoHealth Alliance, were engaged in unbelievably risky work, including collecting virus-laden fecal samples from thousands of bats in remote caves, transporting them to the dense population center of Wuhan, and modifying them to be more dangerous, e.g., through serial passage through human cells and the insertion of furin cleavage sites. Years before the COVID-19 outbreak, there were experts remarking on how risky this research was and trying to stop it. Had they known just how lax the biosecurity was in Wuhan—dangerous pathogens experimented on in BSL-2 labs, etc. etc.—they would have been louder.
  • Even if it didn’t cause the pandemic, the massive effort to collect and enhance bat coronaviruses now appears to have been of dubious value. It did not lead to an actionable early warning about how bad COVID-19 was going to be, nor did it lead to useful treatments, vaccines, or mitigation measures, all of which came from other sources.
  • There are multiple routes by which SARS-CoV2, or its progenitor, could’ve made its way, otherwise undetected, from the remote bat caves of Yunnan province or some other southern location to the city of Wuhan a thousand miles away, as it has to do in any plausible origin theory. Having said that, the regular Yunnan→Wuhan traffic in scientific samples of precisely these kinds of viruses, sustained over a decade, does stand out a bit! On the infamous coincidence of the pandemic starting practically next door to the world’s center for studying SARS-like coronaviruses, rather than near where the horseshoe bats live in the wild, Chan and Ridley memorably quote Humphrey Bogart’s line from Casablanca: “Of all the gin joints in all the towns in all the world, she walks into mine.”
  • The seafood market was probably “just” an early superspreader site, rather than the site of the original spillover event. No bats or pangolins at all, and relatively few mammals of any kind, appear to have been sold at that market, and no sign of SARS-CoV2 was ever found in any of the animals despite searching.
  • Most remarkably, Shi and Daszak have increasingly stonewalled, refusing to answer 100% reasonable questions from fellow virologists. They’ve acted more and more like defendants exercising their right to remain silent than like participants in a joint search for the truth. That might be understandable if they’d already answered ad nauseam and wearied of repeating themselves, but with many crucial questions, they haven’t answered even once. They’ve refused to make available a key database of all the viruses WIV had collected, which WIV inexplicably took offline in September 2019. When, in January 2020, Shi disclosed to the world that WIV had collected a virus called RaTG13, which was 96% identical to SARS-CoV2, she didn’t mention that it was collected from a mine in Mojiang, which the WIV had sampled from over and over because six workers had gotten a SARS-like pneumonia there in 2012 and three had died from it. She didn’t let on that her group had been studying RaTG13 for years—giving, instead, the false impression that they’d just noticed it recently, when searching WIV’s records for cousins of SARS-CoV2. And she didn’t see fit to mention that WIV had collected eight other coronaviruses resembling SARS-CoV2 from the same mine (!). Shi’s original papers on SARS-CoV2 also passed in silence over the virus’s furin cleavage site—even though SARS-CoV2 was the first sarbecoronavirus with that feature, and Shi herself had recently demonstrated adding furin cleavage sites to other viruses to make them more transmissible, and the cleavage site would’ve leapt out immediately to any coronavirus researcher as the most interesting feature of SARS-CoV2 and as key to its transmissibility. Some of these points had to be uncovered by Internet sleuths, poring over doctoral theses and the like, after which Shi would glancingly acknowledge the points in talks without ever explaining her earlier silences. 
    Shi and Daszak refused to cooperate with Chan and Ridley’s book, and have stopped answering questions more generally. When people politely ask Daszak about these matters on Twitter, he blocks them.
  • The Chinese regime has been every bit as obstructionist as you might expect: destroying samples, blocking credible investigations, censoring researchers, and preventing journalists from accessing the Mojiang mine. So Shi at least has the excuse that, even if she’d wanted to come clean with everything relevant she knows about WIV’s bat coronavirus work, she might not be able to do so without endangering herself or loved ones. Daszak has no such excuse.

It’s important to understand that, even in the worst case—that (1) there was a lab leak, and (2) Shi and Daszak are knowingly withholding information relevant to it—they’re far from monsters. Even in Viral‘s relentlessly unsparing account, they come across as genuine believers in their mission to protect the world from the next pandemic.

And it’s like: imagine devoting your life to that mission, having most of the world refuse to take you seriously, and then the calamity happens exactly like you said … except that, not only did your efforts fail to prevent it, but there’s a live possibility that they caused it. It’s conceivable that your life’s work managed to save minus 15 million lives and create minus $50 trillion in economic value.

Very few scientists in history have faced that sort of psychic burden, perhaps not even the ones who built the atomic bomb. I hope I’d maintain my scientific integrity under such an astronomical weight, but I’m doubtful that I would. Would you?

Viral very wisely never tries to psychoanalyze Shi and Daszak. I fear that one might need a lot of conceptual space between “knowing” and “not knowing,” “suspecting” and “not suspecting,” to do justice to the planet-sized enormity of what’s at stake here. Suppose, for example, that an initial investigation in January 2020 reassured you that SARS-CoV2 probably hadn’t come from your lab: would you continue trying to get to the bottom of things, or would you thereafter decide the matter was closed?

For all that, I agree with Chan and Ridley that COVID-19 might well have had a zoonotic origin after all. And one point Viral makes abundantly clear is that, if our goal is to prevent the next pandemic, then resolving the mystery of COVID-19 actually matters less than one might think. This is because, whichever possibility—zoonotic spillover or lab leak—turns out to be the truth of this case, the other possibility would remain absolutely terrifying and would demand urgent action as well. Read the book and see for yourself.

Searching my inbox, I found an email from April 16, 2020 where I told someone who’d asked me that the lab-leak hypothesis seemed perfectly plausible to me (albeit no more than plausible), that I couldn’t understand why it wasn’t being investigated more, but that I was hesitant to blog about these matters. As I wrote seven months ago, I now see my lack of courage on this as having been a personal failing. Obviously, I’m just a quantum computing theorist, not a biologist, so I don’t have to have any thoughts whatsoever about the origin of COVID-19 … but I did have some, and I didn’t share them here only because of the likelihood that I’d be called an idiot on social media. Having now read Chan and Ridley, though, I think I’d take being called an idiot for this book review more as a positive signal about my courage than as a negative signal about my reasoning skills!

At one level, Viral stands alongside, I dunno, Eichmann in Jerusalem among the saddest books I’ve ever read. It’s 300 pages of one of the great human tragedies of our lifetime balancing on a hinge between happening and not happening, and we all know how it turns out. On another level, though, Viral is optimistic. Like with Richard Feynman’s famous “personal appendix” about the Space Shuttle Challenger explosion, the very act of writing such a book reflects a view that you’re still allowed to ask questions; that one or two people armed with nothing but arguments can run rings around governments, newspapers, and international organizations; that we don’t yet live in a post-truth world.

n-Category Café Adjoint School 2022

Every year since 2018 we’ve been running courses on applied category theory where you can do research with experts. It’s called the Adjoint School.

You can apply to be a student at the 2022 Adjoint School now, and applications are due January 29th! Go here:

Read on for more about how this works!

The school will be run online from February to June, 2022, and then — coronavirus permitting — there will be in-person research at the University of Strathclyde in Glasgow, Scotland the week of July 11–15, 2022. This is also the location of the applied category theory conference ACT2022.

The 2022 Adjoint School is organized by Angeline Aguinaldo, Elena Di Lavore, Sophie Libkind, and David Jaz Myers. You can read more about how it works here:

There are four topics to work on, and you can see descriptions of them below.

Who should apply?

Anyone, from anywhere in the world, who is interested in applying category-theoretic methods to problems outside of pure mathematics. This is emphatically not restricted to math students, but one should be comfortable working with mathematics. Knowledge of basic category-theoretic language — the definition of monoidal category for example — is encouraged.

We will consider advanced undergraduates, PhD students, post-docs, as well as people working outside of academia. Members of groups which are underrepresented in the mathematics and computer science communities are especially encouraged to apply.

Also check out our inclusivity statement.

Topic 1: Compositional Thermodynamics

Mentors: Spencer Breiner and Joe Moeller

TA: Owen Lynch

Description: Thermodynamics is the study of the relationships between heat, energy, work, and matter. In category theory, we model flows in physical systems using string diagrams, allowing us to formalize physical axioms as diagrammatic equations. The goal of this project is to establish such a compositional framework for thermodynamical networks. A first goal will be to formalize the laws of thermodynamics in categorical terms. Depending on the background and interest of the participants, further topics may include the Carnot and Otto engines, more realistic modeling for real-world systems, and software implementation within the AlgebraicJulia library.


Topic 2: Fuzzy Type Theory for Opinion Dynamics

Mentor: Paige North

TA: Hans Reiss

Description: When working in type theory (or most logics), one is interested in proving propositions by constructing witnesses to their incontrovertible truth. In the real world, however, we can often only hope to understand how likely something is to be true, and we look for evidence that something is true. For example, when a doctor is trying to determine if a patient has a certain condition, they might ask certain questions and perform certain tests, each of which constitutes a piece of evidence that the patient does or does not have that condition. This suggests that a fuzzy version of type theory might be appropriate for capturing and analyzing real-world situations. In this project, we will explore the space of fuzzy type theories which can be used to reason about the fuzzy propositions of disease and similar dynamics.
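To give a feel for what "fuzzy" can mean in the doctor example, here is a small sketch of classical fuzzy logic, where each piece of evidence has a degree of truth between 0 and 1 instead of being simply true or false. This is not the type theory the project will develop: the function names, the evidence labels, and the numbers are all hypothetical, and the two ways of combining evidence shown (the Gödel and product t-norms) are just standard choices from fuzzy logic.

```python
def godel_and(a, b):
    """Gödel t-norm: a conjunction is only as true as its weakest part."""
    return min(a, b)


def product_and(a, b):
    """Product t-norm: degrees of truth multiply."""
    return a * b


def fuzzy_or(a, b):
    """Maximum t-conorm: a disjunction is as true as its strongest part."""
    return max(a, b)


# Hypothetical degrees of truth for two pieces of evidence about a condition.
symptom_reported = 0.75
test_positive = 0.5

print(godel_and(symptom_reported, test_positive))    # 0.5
print(product_and(symptom_reported, test_positive))  # 0.375
print(fuzzy_or(symptom_reported, test_positive))     # 0.75
```

Different choices of conjunction give different answers for the same evidence, which is one reason there is a whole space of fuzzy logics to explore rather than a single one.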


Topic 3: A Compositional Theory of Timed and Probabilistic Processes: CospanSpan(Graph)

Mentor: Nicoletta Sabadini

TA: Mario Román

Description coming soon!

Topic 4: Algebraic Structures in Logic and Relations

Mentor: Filippo Bonchi

Description coming soon!

John Baez Adjoint School 2022

Every year since 2018 we’ve been running courses on applied category theory where you can do research with experts. It’s called the Adjoint School.

You can apply to be a student at the 2022 Adjoint School now, and applications are due January 29th! Go here:

2022 Adjoint School: application.

The school will be run online from February to June, 2022, and then—coronavirus permitting—there will be in-person research at the University of Strathclyde in Glasgow, Scotland the week of July 11 – 15, 2022. This is also the location of the applied category theory conference ACT2022.

The 2022 Adjoint School is organized by Angeline Aguinaldo, Elena Di Lavore, Sophie Libkind, and David Jaz Myers. You can read more about how it works here:

About the Adjoint School.

There are four topics to work on, and you can see descriptions of them below.

Who should apply?

Anyone, from anywhere in the world, who is interested in applying category-theoretic methods to problems outside of pure mathematics. This is emphatically not restricted to math students, but one should be comfortable working with mathematics. Knowledge of basic category-theoretic language—the definition of monoidal category for example—is encouraged.

We will consider advanced undergraduates, PhD students, post-docs, as well as people working outside of academia. Members of groups which are underrepresented in the mathematics and computer science communities are especially encouraged to apply.

Also check out our inclusivity statement.

Topic 1: Compositional Thermodynamics

Mentors: Spencer Breiner and Joe Moeller

TA: Owen Lynch

Description: Thermodynamics is the study of the relationships between heat, energy, work, and matter. In category theory, we model flows in physical systems using string diagrams, allowing us to formalize physical axioms as diagrammatic equations. The goal of this project is to establish such a compositional framework for thermodynamical networks. A first goal will be to formalize the laws of thermodynamics in categorical terms. Depending on the background and interest of the participants, further topics may include the Carnot and Otto engines, more realistic modeling for real-world systems, and software implementation within the AlgebraicJulia library.


• John C. Baez, Owen Lynch, and Joe Moeller, Compositional thermostatics.

• F. William Lawvere, State categories, closed categories and the existence of semi-continuous entropy functions.

Topic 2: Fuzzy Type Theory for Opinion Dynamics

Mentor: Paige North

TA: Hans Reiss

Description: When working in type theory (or most logics), one is interested in proving propositions by constructing witnesses to their incontrovertible truth. In the real world, however, we can often only hope to understand how likely something is to be true, and we look for evidence that something is true. For example, when a doctor is trying to determine if a patient has a certain condition, they might ask certain questions and perform certain tests, each of which constitutes a piece of evidence that the patient does or does not have that condition. This suggests that a fuzzy version of type theory might be appropriate for capturing and analyzing real-world situations. In this project, we will explore the space of fuzzy type theories which can be used to reason about the fuzzy propositions of disease and similar dynamics.


• Daniel R. Grayson, An introduction to univalent foundations for mathematicians.

• Jakob Hansen and Robert Ghrist, Opinion dynamics on discourse sheaves.

Topic 3: A Compositional Theory of Timed and Probabilistic Processes: CospanSpan(Graph)

Mentor: Nicoletta Sabadini

TA: Mario Román

Description coming soon!

Topic 4: Algebraic Structures in Logic and Relations

Mentor: Filippo Bonchi

Description coming soon!

January 01, 2022

Tommaso DorigoTetra-Neutronium!

The neutron, discovered in 1932 by Chadwick, is a fascinating particle whose existence allows for the stability of heavy nuclei and a wealth of atoms of different properties. Without neutrons, Hydrogen would be the only stable element: protons cannot be brought together and bound in a stable system, so e.g. Helium-2 (an atom made of two protons with two electrons) is very short-lived, as are atoms with more protons and no neutrons. So our Universe would be a very dull place.


Doug NatelsonA book review, and wishes for a happy new year

 I was fortunate enough to receive a copy of Andy Zangwill's recent biography of Phil Anderson, A Mind Over Matter:  Philip Anderson and the Physics of the Very Many.  It's a great book that I would recommend to any physics student or graduate interested in learning about one of the great scientists of the 20th century.  Zangwill does an excellent job with the difficult task of describing (in a way accessible to scientists, if not necessarily always the lay-public) the rise of solid-state physics in the last century and its transformation, with significant guidance from Anderson, into what we now call condensed matter.  This alone is reason to read the book - it's more accessible than the more formally historical (also excellent) Out of the Crystal Maze and a good pairing with Solid State Insurrection (which I discussed here).  

This history seamlessly provides context for the portrait of Anderson, a brilliant, intuitive theorist who prized profound, essential models over computational virtuosity, and who had a litany of achievements that is difficult to list in its entirety.  The person described in the book jibes perfectly with my limited direct interactions with him and the stories I heard from my thesis advisor and other Bell Labs folks.  Some lines ring particularly true (with all that says about the culture of our field):  "Anderson never took very long to decide if a physicist he had just met was worth his time and respect."

On a separate note:  Thanks for reading, and I wish you a very happy new year!  I hope that you and yours have a safe, healthy, and fulfilling 2022.

December 31, 2021

Jordan EllenbergThe year of not reading long books

  • 30 Dec 2021:  Project Hail Mary, by Andy Weir.
  • 5 Dec 2021: The Green Futures of Tycho, by William Sleator.
  • 30 Nov 2021:  The Cup of Fury, by Upton Sinclair.
  • 28 Nov 2021: Horse Walks Into A Bar, by David Grossman (Jessica Cohen, trans.)
  • 5 Nov 2021: All Of The Marvels, by Douglas Wolk.
  • 13 Oct 2021: Great Days, by Donald Barthelme.
  • 30 Sep 2021:  Beautiful World, Where Are You? by Sally Rooney.
  • 18 Sep 2021:  Because Internet, by Gretchen McCulloch.
  • 10 Sep 2021:  Hidden Valley Road, by Robert Kolker.
  • 26 Aug 2021:  The Hairdresser of Harare, by Tendai Huchu.
  • 7 Aug 2021:  Max Beckmann at the St. Louis Art Museum:  The Paintings, by Lynette Roth.
  • 2 Aug 2021:  To Live, by Yu Hua. (Michael Berry, trans.)
  • 1 Aug 2021:  Blackman’s Burden, by Mack Reynolds.
  • 29 Jul 2021:  Highly Irregular, by Arika Okrent.
  • 21 Jul 2021: Sleeping Beauties, by Stephen King and Owen King.
  • 12 Jul 2021:  Giovanni’s Room, by James Baldwin.
  • 7 Jul 2021:  Daisy Miller, by Henry James.
  • 29 Jun 2021:  Parable of the Sower, by Octavia Butler.
  • 6 Jun 2021:  Walkman, by Michael Robbins.
  • 4 Jun 2021:  Darryl, by Jackie Ess.
  • 25 May 2021:  Likes, by Sarah Shun-Lien Bynum.
  • 18 May 2021:  Subdivision, by J. Robert Lennon.
  • 20 Apr 2021:  Big Trouble, by J. Anthony Lukas (not finished yet, will I finish it?)
  • 15 Apr 2021:  No More Parades, by Ford Madox Ford (not finished yet, will I finish it?) (Finished Sep 2021)
  • 26 Mar 2021:  Some Do Not…, by Ford Madox Ford.
  • 12 Jan 2021:  Metropolitan Life, by Fran Lebowitz.

My plan was that 2021 was going to be the year of reading long books. Why not? When the year started, there was no vaccine and I was teaching on the screen and so I was at home a lot of the time. Perfect time to really sink into a long book. I had gotten through and really valued some long ones in recent years — Bleak House, Vanity Fair, 1Q84, Hermione Lee’s Wharton biography — so why not Ford Madox Ford’s four-book Parade’s End, which I have wanted to read ever since I read and fell for The Good Soldier (aka the next book you have to read if you think Gatsby is the perfect novel and want “something else like that” but don’t think there could be something else like that — there is and this is it.)

But I didn’t read Parade’s End in 2021. I’ve read most of it! Two of the novels and half of the third. I will probably finish it. But — too high-modernist, too stream-of-consciousness, too hard under the circumstances. I have read a lot of it but I have never really been in it. It has made an impression but I could tell you only fragments about what happened in it. So my plan to read this, and The Man Without Qualities, and USA by Dos Passos, fell away. Sorry, early 20th century high modernism. And sorry too to Kevin Young’s Bunk and the 800-page history of Maryland and The Rest Is Noise and all the other I’ll-get-to-this bricks that didn’t even get taken off the shelf, as I’d imagined they might.

I read what I usually read. Stuff that caught my eye on the shelf. New books by people I know. Paperbacks small enough to fit in the pocket of my cargo shorts, for long walks — in 2021, we tried to spend a lot of time outside. (The small ones were: Blackman’s Burden and Great Days.)

I found I didn’t really read for pleasure this year. There were a lot of these books that I liked and admired; I don’t think there was one that gave me the sensation of “the thing I feel most like doing right now is reading this book.” But I’m not sure when a book (or a TV show, or movie) last made me feel that way, so it may be a mode of reading I’m done with! Anyway, books I liked: Douglas Wolk’s All of the Marvels — he read every Marvel comic ever written and wrote about what he found there. Nobody is better than Douglas at digging insight out of pieces of culture other people may see as low, or simple, or disposable. I’ve known that since I started listening to the mixtapes he was making in 1987. Of course one cannot help feeling that I enjoy reading Douglas reading those comics more than I would actually enjoy reading the comics. Darryl, by Jackie Ess, a perversely funny book about money that very cleverly disguises itself as a perversely funny book about sex.

Subdivision and Sleeping Beauties are an interesting matched set. Both involve dreamworlds. Lennon’s book is clearly marked as “literary fiction,” and rightly, but has the virtues of a great horror novel — authentic scariness, disorientation. Sleeping Beauties — well, you know I love Stephen King, and that I think he invented the virtues of a great modern horror novel, and I will keep reading these the same way I kept buying R.E.M. albums in the 21st century, but they are now kind of about themselves, gesturing at the virtues instead of really inhabiting them. The things people like to scold “literary fiction” for — shapeless plot & preoccupation with the married lives of middle-aged people — are here, and the presence of bizarrely-powered superbeings, gun battles, and rent flesh can’t change that.

The Green Futures of Tycho was as disturbing as I remembered. I loved it as a kid, but do I want my kids to read it? I think so? I wonder if there is fiction marketed as YA now which presents such a bleak picture of human nature. My sense is they make this stuff sunnier now. Here’s the first line of Sleator’s memoir, Oddballs:

When my sister Vicky and I were teenagers we talked a lot about hating
people. Hating came easily to us. We would be walking down the street,
notice a perfect stranger, and be suddenly struck by how much we hated
that person. And at the dinner table we would go on and on about all
the popular kids we hated at high school. Our father, who has a very logical
mind, sometimes cautioned us about this. “Don’t waste your hate,” he would
say. “Save it up for important things, like your family, or the President.”
We responded by quoting the famous line from Medea: “Loathing is endless.
Hate is a bottomless cup; I pour and pour.”

Yu Hua’s To Live was simple and incredibly moving. The Cup of Fury, Upton Sinclair’s gossipy memoir about how all the writers he knew, himself excepted, drank too much, was strangely entertaining. I think I liked Daisy Miller but I can’t remember a thing about it now. Horse Walks Into a Bar was annoying, built on a single conceit (entire novel as stand-up routine) that got old after, well, less than an entire novel. Beautiful World, Where Are You I already blogged about.

What should I read next year?

Update: Wait, I read Susanna Clarke’s Piranesi this year! I somehow didn’t put it on the list. It actually might have been the book that came closest to pleasure reading for me. Strangely, it was shortlisted for a Hugo this year; I say strangely because it doesn’t read at all to me like SF or fantasy, but rather as fiction that has taken some of the good things from SF without at all feeling like a creature of that genre. Maybe anything with fantastic elements is Hugo-nominable, but in practice, I’ll bet most people don’t see it that way. (Then again: The Good Place won a Hugo!)

Matt von HippelClassicality Has Consequences

Last week, I mentioned some interesting new results in my corner of physics. I’ve now finally read the two papers and watched the recorded talk, so I can satisfy my frustrated commenters.

Quantum mechanics is a very cool topic and I am much less qualified than you would expect to talk about it. I use quantum field theory, which is based on quantum mechanics, so in some sense I use quantum mechanics every day. However, most of the “cool” implications of quantum mechanics don’t come up in my work. All the debates about whether measurement “collapses the wavefunction” are irrelevant when the particles you measure get absorbed in a particle detector, never to be seen again. And while there are deep questions about how a classical world emerges from quantum probabilities, they don’t matter so much when all you do is calculate those probabilities.

They’ve started to matter, though. That’s because quantum field theorists like me have recently started working on a very different kind of problem: trying to predict the output of gravitational wave telescopes like LIGO. It turns out you can do almost the same kind of calculation we’re used to: pretend two black holes or neutron stars are sub-atomic particles, and see what happens when they collide. This trick has grown into a sub-field in its own right, one I’ve dabbled in a bit myself. And it’s gotten my kind of physicists to pay more attention to the boundary between classical and quantum physics.

The thing is, the waves that LIGO sees really are classical. Any quantum gravity effects there are tiny, undetectably tiny. And while this doesn’t have the implications an expert might expect (we still need loop diagrams), it does mean that we need to take our calculations to a classical limit.

Figuring out how to do this has been surprisingly delicate, and full of unexpected insight. A recent example involves two papers, one by Andrea Cristofoli, Riccardo Gonzo, Nathan Moynihan, Donal O’Connell, Alasdair Ross, Matteo Sergola, and Chris White, and one by Ruth Britto, Riccardo Gonzo, and Guy Jehu. At first I thought these were two groups happening on the same idea, but then I noticed Riccardo Gonzo on both lists, and realized the papers were covering different aspects of a shared story. There is another group who happened upon the same story: Paolo Di Vecchia, Carlo Heissenberg, Rodolfo Russo and Gabriele Veneziano. They haven’t published yet, so I’m basing this on the Gonzo et al papers.

The key question each group asked was, what does it take for gravitational waves to be classical? One way to ask the question is to pick something you can observe, like the strength of the field, and calculate its uncertainty. Classical physics is deterministic: if you know the initial conditions exactly, you know the final conditions exactly. Quantum physics is not. What should happen is that if you calculate a quantum uncertainty and then take the classical limit, that uncertainty should vanish: the observation should become certain.

Another way to ask is to think about the wave as made up of gravitons, particles of gravity. Then you can ask how many gravitons are in the wave, and how they are distributed. It turns out that you expect them to be in a coherent state, like a laser, one with a very specific distribution called a Poisson distribution: a distribution in some sense right at the border between classical and quantum physics.

The results of both types of questions were as expected: the gravitational waves are indeed classical. To make this work, though, the quantum field theory calculation needs to have some surprising properties.

If two black holes collide and emit a gravitational wave, you could depict it like this:

All pictures from arXiv:2112.07556

where the straight lines are black holes, and the squiggly line is a graviton. But since gravitational waves are made up of multiple gravitons, you might ask, why not depict it with two gravitons, like this?

It turns out that diagrams like that are a problem: they mean your two gravitons are correlated, which is not allowed in a Poisson distribution. In the uncertainty picture, they also would give you non-zero uncertainty. Somehow, in the classical limit, diagrams like that need to go away.

And at first, it didn’t look like they do. You can try to count how many powers of Planck’s constant show up in each diagram. The authors do that, and it certainly doesn’t look like it goes away:

An example from the paper with Planck’s constants sprinkled around

Luckily, these quantum field theory calculations have a knack for surprising us. Calculate each individual diagram, and things look hopeless. But add them all together, and they miraculously cancel. In the classical limit, everything combines to give a classical result.

You can do this same trick for diagrams with more graviton particles, as many as you like, and each time it ought to keep working. You get an infinite set of relationships between different diagrams, relationships that have to hold to get sensible classical physics. From thinking about how the quantum and classical are related, you’ve learned something about calculations in quantum field theory.

That’s why these papers caught my eye. A chunk of my sub-field is needing to learn more and more about the relationship between quantum and classical physics, and it may have implications for the rest of us too. In the future, I might get a bit more qualified to talk about some of the very cool implications of quantum mechanics.

John BaezThe Kepler Problem (Part 2)

I’m working on a math project involving the periodic table of elements and the Kepler problem—that is, the problem of a particle moving in an inverse square force law. That’s one reason I’ve been blogging about chemistry lately! I hope to tell you all about this project sometime—but right now I just want to say some very basic stuff about the ‘eccentricity vector’.

This vector is a conserved quantity for the Kepler problem. It was named the ‘Runge–Lenz vector’ after Lenz used it in 1924 to study the hydrogen atom in the framework of the ‘old quantum mechanics’ of Bohr and Sommerfeld: Lenz cited Runge’s popular German textbook on vector analysis from 1919, which explains this vector. But Runge never claimed any originality: he attributed this vector to Gibbs, who wrote about it in his book on vector analysis in 1901!

Nowadays many people call it the ‘Laplace–Runge–Lenz vector’, honoring Laplace’s discussion of it in his famous treatise on celestial mechanics in 1799. But in fact this vector goes back at least to Jakob Hermann, who wrote about it in 1710, triggering further work on this topic by Johann Bernoulli in the same year.

Nobody has seen signs of this vector in work before Hermann. So, we might call it the Hermann–Bernoulli–Laplace–Gibbs–Runge–Lenz vector, or just the Hermann vector. But I prefer to call it the eccentricity vector, because for a particle moving under an inverse square force law its magnitude is the eccentricity of the particle’s orbit!

Let’s suppose we have a particle whose position \vec q \in \mathbb{R}^3 obeys this version of the inverse square force law:

\ddot{\vec q} = - \frac{\vec q}{q^3}

where I remove the arrow from a vector when I want to talk about its magnitude. So, I’m setting the mass of this particle equal to 1, along with the constant saying the strength of the force. That’s because I want to keep the formulas clean! With these conventions, the momentum of the particle is

\vec p = \dot{\vec q}

For this system it’s well-known that the following energy is conserved:

H = \frac{1}{2} p^2 - \frac{1}{q}

as well as the angular momentum vector:

\vec L = \vec q \times \vec p

But the interesting thing for me today is the eccentricity vector:

\vec e = \vec p \times \vec L - \frac{\vec q}{q}

Let’s check that it’s conserved! Taking its time derivative,

\dot{\vec e} = \dot{\vec p} \times \vec L + \vec p \times \dot{\vec L} - \frac{\vec p}{q} + \frac{\dot q}{q^2} \,\vec q

But angular momentum is conserved so the second term vanishes, and

\dot q = \frac{d}{dt} \sqrt{\vec q \cdot \vec q} =  \frac{\vec p \cdot \vec q}{\sqrt{\vec q \cdot \vec q}} = \frac{\vec p \cdot \vec q}{q}

so we get

\dot{\vec e} = \dot{\vec p} \times \vec L - \frac{\vec p}{q} +  \frac{\vec p \cdot \vec q}{q^3}\, \vec q

But the inverse square force law says

\dot{\vec p} = - \frac{\vec q}{q^3}

so

\dot{\vec e} = - \frac{1}{q^3} \, \vec q \times \vec L - \frac{\vec p}{q} +  \frac{\vec p \cdot \vec q}{q^3}\, \vec q

How can we see that this vanishes? Mind you, there are various geometrical ways to think about this, but today I’m in the mood for checking that my skills in vector algebra are sufficient for a brute-force proof—and I want to record this proof so I can see it later!

To get anywhere we need to deal with the cross product in the above formula:

\vec q \times \vec L = \vec q \times (\vec q \times \vec p)

There’s a nice identity for the vector triple product:

\vec a \times (\vec b \times \vec c) = (\vec a \cdot \vec c) \vec b - (\vec a \cdot \vec b) \vec c

I could have fun talking about why this is true, but I won’t now! I’ll just use it:

\vec q \times \vec L = \vec q \times (\vec q \times \vec p) = (\vec q \cdot \vec p) \vec q - q^2 \, \vec p

and plug this into our formula

\dot{\vec e} = - \frac{1}{q^3} \, \vec q \times \vec L - \frac{\vec p}{q} +  \frac{\vec p \cdot \vec q}{q^3}\, \vec q

to get

\dot{\vec e} = -\frac{1}{q^3} \Big((\vec q \cdot \vec p) \vec q - q^2 \vec p \Big) - \frac{\vec p}{q} +  \frac{\vec p \cdot \vec q}{q^3}\, \vec q

But look—everything cancels! So

\dot{\vec e} = 0

and the eccentricity vector is conserved!
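The brute-force proof can also be checked numerically. Here is a minimal sketch, not part of the original calculation: it integrates the equations of motion with a standard fourth-order Runge–Kutta step and tracks how far \vec e drifts from its initial value. The initial conditions, step size, and all function names are illustrative choices made here.

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def norm(a):
    return math.sqrt(sum(x*x for x in a))

def eccentricity_vector(q, p):
    """e = p x L - q/|q| with L = q x p (unit mass, unit force constant)."""
    L = cross(q, p)
    pxL = cross(p, L)
    r = norm(q)
    return tuple(pxL[i] - q[i]/r for i in range(3))

def accel(q):
    """Inverse square force law: dp/dt = -q/|q|^3."""
    r3 = norm(q)**3
    return tuple(-x/r3 for x in q)

def axpy(a, b, s):
    """Return a + s*b componentwise."""
    return tuple(a[i] + s*b[i] for i in range(3))

def rk4_step(q, p, dt):
    """One classical Runge-Kutta step for dq/dt = p, dp/dt = accel(q)."""
    k1q, k1p = p, accel(q)
    k2q, k2p = axpy(p, k1p, dt/2), accel(axpy(q, k1q, dt/2))
    k3q, k3p = axpy(p, k2p, dt/2), accel(axpy(q, k2q, dt/2))
    k4q, k4p = axpy(p, k3p, dt), accel(axpy(q, k3q, dt))
    q_new = tuple(q[i] + dt/6*(k1q[i] + 2*k2q[i] + 2*k3q[i] + k4q[i]) for i in range(3))
    p_new = tuple(p[i] + dt/6*(k1p[i] + 2*k2p[i] + 2*k3p[i] + k4p[i]) for i in range(3))
    return q_new, p_new

def max_drift(q, p, dt=1e-3, steps=20000):
    """Largest |e(t) - e(0)| seen along the numerically integrated orbit."""
    e0 = eccentricity_vector(q, p)
    worst = 0.0
    for _ in range(steps):
        q, p = rk4_step(q, p, dt)
        e = eccentricity_vector(q, p)
        worst = max(worst, norm(axpy(e, e0, -1.0)))
    return worst

# Arbitrary bound orbit (H < 0), chosen only for illustration.
q0 = (1.0, 0.0, 0.0)
p0 = (0.0, 1.2, 0.3)
drift = max_drift(q0, p0)
print("max drift of e over about one orbit:", drift)
```

The drift should be tiny compared with the size of \vec e itself. What remains comes from the integrator's error, not from any failure of the conservation law.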

So, it seems that the inverse square force law has 7 conserved quantities: the energy H, the 3 components of the angular momentum \vec L, and the 3 components of the eccentricity vector \vec e. But they can’t all be independent, since the particle only has 6 degrees of freedom: 3 for position and 3 for momentum. There can be at most 5 independent conserved quantities, since something has to change. So there have to be at least two relations between the conserved quantities we’ve found.

The first of these relations is pretty obvious: \vec e and \vec L are at right angles, so

\vec e \cdot \vec L = 0

But wait, why are they at right angles? Because

\vec e = \vec p \times \vec L - \frac{\vec q}{q}

The first term is orthogonal to \vec L because it’s a cross product of \vec p and \vec L; the second is orthogonal to \vec L because \vec L is a cross product of \vec q and \vec p.

The second relation is a lot less obvious, but also more interesting. Let’s take the dot product of \vec e with itself:

e^2 = \left(\vec p \times \vec L - \frac{\vec q}{q}\right) \cdot \left(\vec p \times \vec L - \frac{\vec q}{q}\right)

or in other words,

e^2 = (\vec p \times \vec L) \cdot (\vec p \times \vec L) - \frac{2}{q} \vec q \cdot (\vec p \times \vec L) + 1

But remember this nice cross product identity:

(\vec a \times \vec b) \cdot (\vec a \times \vec b) + (\vec a \cdot \vec b)^2 = a^2 b^2

Since \vec p and \vec L are at right angles this gives

(\vec p \times \vec L) \cdot (\vec p \times \vec L) = p^2 L^2

so

e^2 = p^2 L^2 - \frac{2}{q} \vec q \cdot (\vec p \times \vec L) + 1

Then we can use the cyclic identity for the scalar triple product:

\vec a \cdot (\vec b \times \vec c) = \vec c \cdot (\vec a \times \vec b)

to rewrite this as

e^2 = p^2 L^2 - \frac{2}{q} \vec L \cdot (\vec q \times \vec p) + 1

or simply

e^2 = p^2 L^2 - \frac{2}{q} L^2 + 1

or even better,

e^2 = 2 \left(\frac{1}{2} p^2 - \frac{1}{q}\right) L^2 + 1

But this means that

e^2 = 2HL^2 + 1

which is our second relation between conserved quantities for the Kepler problem!
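Both relations are purely algebraic identities in \vec q and \vec p, so they can be spot-checked at random points in phase space without solving the equations of motion. This is only a sketch: the sampling ranges and the helper names are assumptions made here for illustration.

```python
import math
import random

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

def conserved_quantities(q, p):
    """Return (H, L, e) for unit mass and unit force constant."""
    r = math.sqrt(dot(q, q))
    H = 0.5*dot(p, p) - 1.0/r
    L = cross(q, p)
    pxL = cross(p, L)
    e = tuple(pxL[i] - q[i]/r for i in range(3))
    return H, L, e

random.seed(0)  # fixed seed so the check is reproducible
worst = 0.0
for _ in range(1000):
    # Keep q away from the origin so 1/q stays moderate (illustrative ranges).
    q = tuple(random.uniform(0.5, 2.0) for _ in range(3))
    p = tuple(random.uniform(-2.0, 2.0) for _ in range(3))
    H, L, e = conserved_quantities(q, p)
    first = abs(dot(e, L))                           # e . L = 0
    second = abs(dot(e, e) - (2*H*dot(L, L) + 1))    # e^2 = 2 H L^2 + 1
    worst = max(worst, first, second)

print("largest residual:", worst)
```

The residual should sit at the level of floating-point rounding.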

This relation makes a lot of sense if you know that e is the eccentricity of the orbit. Then it implies:

• if H > 0 then e > 1 and the orbit is a hyperbola.

• if H = 0 then e = 1 and the orbit is a parabola.

• if H < 0 then 0 ≤ e < 1 and the orbit is an ellipse (or circle, when e = 0).
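The three cases above can be turned into a small classifier. This is a sketch under the post’s conventions (unit mass, unit force constant). The function name, the tolerance used to decide when H counts as zero, and the sample momenta are choices made here, and degenerate radial orbits with L = 0 are not treated separately.

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

def orbit_type(q, p, tol=1e-12):
    """Classify the orbit by the sign of H and return (name, eccentricity)."""
    H = 0.5*dot(p, p) - 1.0/math.sqrt(dot(q, q))
    L = cross(q, p)
    # e^2 = 2 H L^2 + 1; clamp at 0 to guard against rounding.
    e = math.sqrt(max(0.0, 2*H*dot(L, L) + 1))
    if H > tol:
        return "hyperbola", e
    if abs(H) <= tol:
        return "parabola", e
    return "ellipse", e

q = (1.0, 0.0, 0.0)
print(orbit_type(q, (0.0, 1.0, 0.0)))             # circular speed at radius 1
print(orbit_type(q, (0.0, math.sqrt(2.0), 0.0)))  # escape speed at radius 1
print(orbit_type(q, (0.0, 2.0, 0.0)))             # faster than escape speed
```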

But why is e the eccentricity? And why does the particle move in a hyperbola, parabola or ellipse in the first place? We can show both of these things by taking the dot product of \vec q and \vec e:

\begin{array}{ccl}   \vec q \cdot \vec e &=&   \vec q \cdot \left(\vec p \times \vec L - \frac{\vec q}{q} \right)  \\ \\    &=& \vec q \cdot (\vec p \times \vec L) - q   \end{array}

Using the cyclic property of the scalar triple product we can rewrite this as

\begin{array}{ccl}   \vec q \cdot \vec e &=&   \vec L \cdot (\vec q \times \vec p) - q  \\ \\  &=& L^2 - q  \end{array}

Now, we know that \vec q moves in the plane orthogonal to \vec L. In this plane, which contains the vector \vec e, the equation \vec q \cdot \vec e = L^2 - q defines a conic of eccentricity e. I won’t show this from scratch, but it may seem more familiar if we rotate the whole situation so this plane is the xy plane and \vec e points in the x direction. Then in polar coordinates this equation says

er \cos \theta = L^2 - r

or

r = \frac{L^2}{1 + e \cos \theta}

This is well-known, at least among students of physics who have solved the Kepler problem, to be the equation of a conic of eccentricity e.
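As a quick consistency check for elliptical orbits (0 ≤ e < 1, L ≠ 0), the polar equation gives the closest and farthest distances at \theta = 0 and \theta = \pi, and the relation e^2 = 2HL^2 + 1 then expresses the semi-major axis in terms of the energy alone. The symbols r_{\min}, r_{\max} and a are labels introduced here, not notation from the argument above.

```latex
r_{\min} = \frac{L^2}{1 + e}, \qquad r_{\max} = \frac{L^2}{1 - e}

a = \frac{r_{\min} + r_{\max}}{2} = \frac{L^2}{1 - e^2} = \frac{L^2}{-2HL^2} = -\frac{1}{2H}
```

So the size of the ellipse depends only on H, while its shape depends on both H and L through e.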

Another thing that’s good to do is define a rescaled eccentricity vector. In the case of elliptical orbits, where H < 0, we define this by

\vec M = \frac{\vec e}{\sqrt{-2H}}

Then we can take our relation

e^2 = 2HL^2 + 1

and rewrite it as

1 = e^2 - 2H L^2

and then divide by -2H getting

- \frac{1}{2H} = \frac{e^2}{-2H} + L^2

or

- \frac{1}{2H} = L^2 + M^2

This suggests an interesting similarity between \vec L and \vec M, which turns out to be very important in a deeper understanding of the Kepler problem. And with more work, you can use this idea to show that -1/4H is the Hamiltonian for a free particle on the 3-sphere. But more about that some other time, I hope!
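A circular orbit makes a simple sanity check for this relation. The radius R is a symbol introduced here for illustration. On a circle of radius R the centripetal acceleration p^2/R must equal the force 1/R^2, and the orbit has e = 0, so \vec M = 0:

```latex
p^2 = \frac{1}{R}, \qquad H = \frac{1}{2R} - \frac{1}{R} = -\frac{1}{2R}, \qquad L^2 = R^2 p^2 = R

L^2 + M^2 = R + 0 = -\frac{1}{2H}
```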

For now, you might try this:

• Wikipedia, Laplace–Runge–Lenz vector.

and of course this:

The Kepler problem (part 1).

December 29, 2021

n-Category Café Spaces of Extremal Magnitude

Mark Meckes and I have a new paper on magnitude!

Tom Leinster and Mark Meckes, Spaces of extremal magnitude. arXiv:2112.12889, 2021.

It’s a short one: 7 pages. But it answers two questions that have been lingering since the story of magnitude began.

In the beginning, there was the magnitude of finite metric spaces, a special case of the magnitude of finite enriched categories. Simon Willerton and I did some early investigation of how to extend magnitude from finite to infinite metric spaces — or more specifically, compact metric spaces. Interesting as the results of that investigation were — and we found out lots of cool stuff about the magnitude of spheres, Cantor sets, and so on — there was something ad hoc at its heart: the very definition of the magnitude of a compact space.

That foundational problem was largely settled by Mark in a pair of papers about ten years ago. They show definitively how to generalize magnitude from finite to compact metric spaces, at least under the mild technical condition that the spaces concerned are positive definite (which I won’t go into here). There are several different ways that one might imagine extending the definition of magnitude from finite to compact spaces, and Mark proved that under this hypothesis, they all give exactly the same result.

But some questions remained. A particularly prominent one was this. By definition, the magnitude of a nonempty positive definite compact metric space lies in the interval [1, \infty]. Given any m \in [1, \infty), it’s easy to find a space with magnitude m. But is there anything with magnitude \infty?

This question has been open since 2010, but we settle it in our new paper. The answer is yes. One might have guessed that no such space exists: after all, compactness is a kind of finiteness condition, so perhaps it wouldn’t be surprising if it implied finiteness of magnitude. But our counterexample reveals the fuzziness in that thinking.

The counterexample is an infinite-dimensional simplex in sequence space \ell^1, as follows.

First imagine that we’ve chosen two positive real numbers, a_1 and a_2, and consider the convex hull of the points

(0, 0), \ (a_1, 0), \ (0, a_2)

in \mathbb{R}^2. It’s a 2-simplex. Similarly, given positive reals a_1, a_2, a_3, the convex hull of

(0, 0, 0), \ (a_1, 0, 0), \ (0, a_2, 0), \ (0, 0, a_3)

in \mathbb{R}^3 is a 3-simplex.

Now do the same thing in \ell^1 for an infinite sequence a_1, a_2, \ldots. Write X for the convex hull, with the metric from the \ell^1 norm. (Or to be precise, it’s the closed convex hull.) Now:

  • X is compact \iff a_n \to 0 as n \to \infty

  • X has finite magnitude \iff \sum_n a_n < \infty.

So we get a compact space of infinite magnitude by taking any sequence (a_n) of positive reals that converges to 0 but has infinite sum.

In other words, the gap between compactness and finite magnitude is the same as the gap we emphasize when we teach a first course on sequences and series: for a series to converge is a stronger condition than for its sequence of entries to converge to 0.
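The standard example of such a sequence is a_n = 1/n. That choice is an illustration picked here, not necessarily the one used in the paper. A small sketch shows the terms shrinking to 0 while the partial sums keep growing, staying above \log N:

```python
import math

def harmonic_partial_sum(n_terms):
    """Sum of a_n = 1/n for n = 1, ..., n_terms."""
    return sum(1.0/n for n in range(1, n_terms + 1))

for n_terms in (10, 1000, 100000):
    last_term = 1.0/n_terms
    partial = harmonic_partial_sum(n_terms)
    # The terms tend to 0, but the partial sums exceed log(n_terms), which is unbounded.
    print(n_terms, last_term, partial, math.log(n_terms))
```

By the two criteria stated above, the corresponding simplex would be compact but of infinite magnitude.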

The second question we settle in our paper is about spaces of minimal magnitude. I mentioned earlier that the magnitude of a nonempty positive definite compact metric space (let me just say “space”) is in the interval [1, \infty]. We now know that there are spaces with magnitude exactly \infty. It’s also trivial — once you have the definitions! — that for any infinite space X, the magnitude |t X| of the scaled-up space t X converges to \infty as t \to \infty. But what about the other end of the scale: spaces of magnitude close to or equal to 1?

As it turns out, the situation for 1 is the opposite way round from the one for \infty. What I mean is this. It’s easy to say which spaces have magnitude exactly 1: it’s just the one-point space. Now given any space X, the rescaled space t X looks more and more like a point as t \to 0, so you might expect that

\lim_{t \to 0} |t X| = 1.

But in fact, this is where the complexity lies. Long ago, Simon found an example of a 6-point space X such that \lim_{t \to 0} |t X| = 6/5. So we have a nontrivial situation on our hands.

Say that a space X has the one-point property if \lim_{t \to 0} |t X| = 1. Although not every space has the one-point property, there are lots that do. For instance, it’s not so hard to show that every compact subset of \mathbb{R}^n with the taxicab metric (the metric induced by the 1-norm) has the one-point property. And with much more effort, it was shown that compact subsets of \mathbb{R}^n with the Euclidean metric have the one-point property too, first by Juan Antonio Barceló and Tony Carbery, then, by different arguments, by Simon and by Mark.

In our new paper, we find a single sufficient condition that unifies these results:

Let V be a finite-dimensional normed vector space that is positive definite as a metric space. Then every nonempty compact subset of V has the one-point property.

For instance, we could take V to be \mathbb{R}^N with either the taxicab or Euclidean metric, or more generally, we could give it the metric induced by the p-norm for any p \in [1, 2]. An equivalent condition on V is that it’s isometrically isomorphic to a linear subspace of L^1[0, 1], which gives a 1-norm flavour to this theorem too.

So our paper consists of these two theorems: one on spaces with magnitude \infty, and one on spaces with magnitude close to 1.

The proofs have something in common. Longtime Café readers will remember that around the time when magnitude got going, Simon and I made a conjecture about the magnitude of convex subsets of Euclidean space. That conjecture turned out to be false, although several aspects of it were correct.

But less publicized was an analogous conjecture for convex subsets of \mathbb{R}^N with the taxicab metric. And this conjecture turned out to be true! Or at least, true when the convex set X has nonempty interior. The result is that

|X| = \sum_{i = 0}^N 2^{-i} V'_i(X)

(Theorem 4.6 here), where V'_i is a 1-norm analogue of the i-th intrinsic volume. For example, this implies that the magnitude function |t X| of X is a polynomial:

|t X| = \sum_{i = 0}^N 2^{-i} V'_i(X) \cdot t^i.

And even if X has empty interior, we still have an inequality one way round: |X| \leq \sum \ldots. (It may be an equality for all we know; that remains unsettled.)
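As a sanity check on the polynomial formula, here is a small worked example under assumptions that are mine rather than the post's: take N = 2 and the box X = [0,a] \times [0,b], and assume that for such a box the 1-norm intrinsic volumes agree with the usual ones, V'_0(X) = 1, V'_1(X) = a + b and V'_2(X) = ab.

```latex
|t X| = V'_0(X) + \tfrac{1}{2} V'_1(X)\, t + \tfrac{1}{4} V'_2(X)\, t^2
      = 1 + \frac{(a + b)\, t}{2} + \frac{a b\, t^2}{4}
      = \Bigl( 1 + \frac{t a}{2} \Bigr) \Bigl( 1 + \frac{t b}{2} \Bigr).
```

The factorization is what one would expect from two known facts: an interval of length L has magnitude 1 + L/2, and magnitude is multiplicative for products with the taxicab metric. Letting t \to 0 gives 1, in line with the one-point property.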

In any case, this result gets used in the proofs of both theorems in our new paper: first, to find the magnitude of that infinite-dimensional simplex, and second, to bound the magnitude of polytopes, which turns out to be the key to the whole thing.

By now, I feel like this post must already be nearly as long as the paper itself, so I’ll link to it once more and stop.

December 28, 2021

Terence TaoThe inverse theorem for the U^3 Gowers uniformity norm on arbitrary finite abelian groups: Fourier-analytic and ergodic approaches

Asgar Jamneshan and myself have just uploaded to the arXiv our preprint “The inverse theorem for the {U^3} Gowers uniformity norm on arbitrary finite abelian groups: Fourier-analytic and ergodic approaches”. This paper, which is a companion to another recent paper of ourselves and Or Shalom, studies the inverse theory for the third Gowers uniformity norm

\displaystyle  \| f \|_{U^3(G)}^8 = {\bf E}_{h_1,h_2,h_3,x \in G} \Delta_{h_1} \Delta_{h_2} \Delta_{h_3} f(x)

on an arbitrary finite abelian group {G}, where {\Delta_h f(x) := f(x+h) \overline{f(x)}} is the multiplicative derivative. Our main result is as follows:

Theorem 1 (Inverse theorem for {U^3(G)}) Let {G} be a finite abelian group, and let {f: G \rightarrow {\bf C}} be a {1}-bounded function with {\|f\|_{U^3(G)} \geq \eta} for some {0 < \eta \leq 1/2}. Then:
  • (i) (Correlation with locally quadratic phase) There exists a regular Bohr set {B(S,\rho) \subset G} with {|S| \ll \eta^{-O(1)}} and {\exp(-\eta^{-O(1)}) \ll \rho \leq 1/2}, a locally quadratic function {\phi: B(S,\rho) \rightarrow {\bf R}/{\bf Z}}, and a function {\xi: G \rightarrow \hat G} such that

    \displaystyle  {\bf E}_{x \in G} |{\bf E}_{h \in B(S,\rho)} f(x+h) e(-\phi(h)-\xi(x) \cdot h)| \gg \eta^{O(1)}.

  • (ii) (Correlation with nilsequence) There exists an explicit degree two filtered nilmanifold {H/\Lambda} of dimension {O(\eta^{-O(1)})}, a polynomial map {g: G \rightarrow H/\Lambda}, and a Lipschitz function {F: H/\Lambda \rightarrow {\bf C}} of constant {O(\exp(\eta^{-O(1)}))} such that

    \displaystyle  |{\bf E}_{x \in G} f(x) \overline{F}(g(x))| \gg \exp(-\eta^{-O(1)}).
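To make the definition of the {U^3} norm concrete, here is a brute-force sketch for the cyclic group {{\bf Z}/N{\bf Z}}. The choice of group, the function names and the two test functions are mine, and the {O(N^4)} loop is only usable for tiny {N}. A quadratic phase has {U^3} norm exactly {1}, which is the extreme case of the structure that the inverse theorem detects.

```python
import cmath
from itertools import product

def mult_derivative(f, h):
    """Multiplicative derivative: (Delta_h f)(x) = f(x + h) * conj(f(x)) on Z/NZ."""
    n = len(f)
    return [f[(x + h) % n] * f[x].conjugate() for x in range(n)]

def u3_norm_pow8(f):
    """Average of Delta_{h1} Delta_{h2} Delta_{h3} f(x) over all h1, h2, h3, x."""
    n = len(f)
    total = 0
    for h1, h2, h3 in product(range(n), repeat=3):
        g = mult_derivative(mult_derivative(mult_derivative(f, h3), h2), h1)
        total += sum(g)
    return (total / n**4).real

N = 7
# Quadratic phase e(3 x^2 / N): its third multiplicative derivative is identically 1.
quadratic_phase = [cmath.exp(2j * cmath.pi * 3 * x * x / N) for x in range(N)]
# Indicator function of the origin: only h1 = h2 = h3 = x = 0 contributes.
point_mass = [1.0 + 0j] + [0j] * (N - 1)

print(u3_norm_pow8(quadratic_phase))  # equals 1 up to rounding error
print(u3_norm_pow8(point_mass))       # equals 1 / N**4
```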

Such a theorem was proven by Ben Green and myself in the case when {|G|} was odd, and by Samorodnitsky in the {2}-torsion case {G = {\bf F}_2^n}. In all cases one uses the “higher order Fourier analysis” techniques introduced by Gowers. After some now-standard manipulations (using for instance what is now known as the Balog-Szemerédi-Gowers lemma), one arrives (for arbitrary {G}) at an estimate that is roughly of the form

\displaystyle  |{\bf E}_{x \in G} {\bf E}_{h,k \in B(S,\rho)} f(x+h+k) b(x,k) b(x,h) e(-B(h,k))| \gg \eta^{O(1)}

where {b} denotes various {1}-bounded functions whose exact values are not too important, and {B: B(S,\rho) \times B(S,\rho) \rightarrow {\bf R}/{\bf Z}} is a symmetric locally bilinear form. The idea is then to “integrate” this form by expressing it in the form

\displaystyle  B(h,k) = \phi(h+k) - \phi(h) - \phi(k) \ \ \ \ \ (1)

for some locally quadratic {\phi: B(S,\rho) \rightarrow {\bf R}/{\bf Z}}; this then allows us to write the above correlation as

\displaystyle  |{\bf E}_{x \in G} {\bf E}_{h,k \in B(S,\rho)} f(x+h+k) e(-\phi(h+k)) b(x,k) b(x,h)| \gg \eta^{O(1)}

(after adjusting the {b} functions suitably), and one can now conclude part (i) of the above theorem using some linear Fourier analysis. Part (ii) follows by encoding locally quadratic phase functions as nilsequences; for this we adapt an algebraic construction of Manners.

So the key step is to obtain a representation of the form (1), possibly after shrinking the Bohr set {B(S,\rho)} a little if needed. This has been done in the literature in two ways:

  • When {|G|} is odd, one has the ability to divide by {2}, and on the set {2 \cdot B(S,\frac{\rho}{10}) = \{ 2x: x \in B(S,\frac{\rho}{10})\}} one can establish (1) with {\phi(h) := B(\frac{1}{2} h, h)}. (This is similar to how in single variable calculus the function {x \mapsto \frac{1}{2} x^2} is a function whose second derivative is equal to {1}.)
  • When {G = {\bf F}_2^n}, then after a change of basis one can take the Bohr set {B(S,\rho)} to be {{\bf F}_2^m} for some {m}, and the bilinear form can be written in coordinates as

    \displaystyle  B(h,k) = \sum_{1 \leq i,j \leq m} a_{ij} h_i k_j / 2 \hbox{ mod } 1

    for some {a_{ij} \in {\bf F}_2} with {a_{ij}=a_{ji}}. The diagonal terms {a_{ii}} cause a problem, but by subtracting off the rank one form {(\sum_{i=1}^m a_{ii} h_i) (\sum_{i=1}^m a_{ii} k_i) / 2} one can write

    \displaystyle  B(h,k) = \sum_{1 \leq i,j \leq m} b_{ij} h_i k_j / 2 \hbox{ mod } 1

    on the orthogonal complement of {(a_{11},\dots,a_{mm})} for some coefficients {b_{ij}=b_{ji}} which now vanish on the diagonal: {b_{ii}=0}. One can now obtain (1) on this complement by taking

    \displaystyle  \phi(h) := \sum_{1 \leq i < j \leq m} b_{ij} h_i h_j / 2 \hbox{ mod } 1.
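The {{\bf F}_2^n} recipe can be checked by brute force. In this sketch the symmetric matrix with zero diagonal is a made-up example with {m = 3}, and values in {{\bf R}/{\bf Z}} are stored as exact fractions reduced mod 1.

```python
from fractions import Fraction
from itertools import product

# Hypothetical symmetric matrix over F_2 with zero diagonal (m = 3).
b = [[0, 1, 1],
     [1, 0, 0],
     [1, 0, 0]]
m = len(b)

def bilinear(h, k):
    """B(h, k) = sum_{i,j} b_ij h_i k_j / 2 mod 1."""
    s = sum(b[i][j] * h[i] * k[j] for i in range(m) for j in range(m))
    return Fraction(s, 2) % 1

def phi(h):
    """phi(h) = sum_{i<j} b_ij h_i h_j / 2 mod 1."""
    s = sum(b[i][j] * h[i] * h[j] for i in range(m) for j in range(i + 1, m))
    return Fraction(s, 2) % 1

def add(h, k):
    """Addition in F_2^m."""
    return tuple((x + y) % 2 for x, y in zip(h, k))

vectors = list(product((0, 1), repeat=m))
# Check B(h, k) = phi(h + k) - phi(h) - phi(k) for every pair of vectors.
ok = all(bilinear(h, k) == (phi(add(h, k)) - phi(h) - phi(k)) % 1
         for h in vectors for k in vectors)
print(ok)
```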

In our paper we can now treat the case of arbitrary finite abelian groups {G}, by means of the following two new ingredients:

  • (i) Using some geometry of numbers, we can lift the group {G} to a larger (possibly infinite, but still finitely generated) abelian group {G_S} with a projection map {\pi: G_S \rightarrow G}, and find a globally bilinear map {\tilde B: G_S \times G_S \rightarrow {\bf R}/{\bf Z}} on the latter group, such that one has a representation

    \displaystyle  B(\pi(x), \pi(y)) = \tilde B(x,y) \ \ \ \ \ (2)

    of the locally bilinear form {B} by the globally bilinear form {\tilde B} when {x,y} are close enough to the origin.
  • (ii) Using an explicit construction, one can show that every globally bilinear map {\tilde B: G_S \times G_S \rightarrow {\bf R}/{\bf Z}} has a representation of the form (1) for some globally quadratic function {\tilde \phi: G_S \rightarrow {\bf R}/{\bf Z}}.

To illustrate (i), consider the Bohr set {B(S,1/10) = \{ x \in {\bf Z}/N{\bf Z}: \|x/N\|_{{\bf R}/{\bf Z}} < 1/10\}} in {G = {\bf Z}/N{\bf Z}} (where {\|\|_{{\bf R}/{\bf Z}}} denotes the distance to the nearest integer), and consider a locally bilinear form {B: B(S,1/10) \times B(S,1/10) \rightarrow {\bf R}/{\bf Z}} of the form {B(x,y) = \alpha x y \hbox{ mod } 1} for some real number {\alpha} and all integers {x,y \in (-N/10,N/10)} (which we identify with elements of {G}). For generic {\alpha}, this form cannot be extended to a globally bilinear form on {G}; however if one lifts {G} to the finitely generated abelian group

\displaystyle  G_S := \{ (x,\theta) \in {\bf Z}/N{\bf Z} \times {\bf R}: \theta = x/N \hbox{ mod } 1 \}

(with projection map {\pi: (x,\theta) \mapsto x}) and introduces the globally bilinear form {\tilde B: G_S \times G_S \rightarrow {\bf R}/{\bf Z}} by the formula

\displaystyle  \tilde B((x,\theta),(y,\sigma)) = N^2 \alpha \theta \sigma \hbox{ mod } 1

then one has (2) when {\theta,\sigma} lie in the interval {(-1/10,1/10)}. A similar construction works for higher rank Bohr sets.

To illustrate (ii), the key case turns out to be when {G_S} is a cyclic group {{\bf Z}/N{\bf Z}}, in which case {\tilde B} will take the form

\displaystyle  \tilde B(x,y) = \frac{axy}{N} \hbox{ mod } 1

for some integer {a}. One can then check by direct construction that (1) will be obeyed with

\displaystyle  \tilde \phi(x) = \frac{a \binom{x}{2}}{N} - \frac{a x \binom{N}{2}}{N^2} \hbox{ mod } 1

regardless of whether {N} is even or odd. A variant of this construction also works for {{\bf Z}}, and the general case follows from a short calculation verifying that the claim (ii) for any two groups {G_S, G'_S} implies the corresponding claim (ii) for the product {G_S \times G'_S}.
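The cyclic case is easy to test by machine. The sketch below is my own check, with exact fractions and nonnegative integer representatives. It verifies for small {N} that the formula for {\tilde \phi} does not depend on the choice of representative mod {N} and that it satisfies (1) with {\tilde B(x,y) = axy/N}.

```python
from fractions import Fraction
from math import comb

def b_tilde(a, n, x, y):
    """The bilinear form a*x*y/N mod 1."""
    return Fraction(a * x * y, n) % 1

def phi_tilde(a, n, x):
    """a*C(x,2)/N - a*x*C(N,2)/N^2 mod 1, for an integer representative x >= 0."""
    return (Fraction(a * comb(x, 2), n) - Fraction(a * x * comb(n, 2), n * n)) % 1

def check(a, n):
    for x in range(n):
        # Well defined on Z/NZ: shifting the representative by N changes nothing.
        if phi_tilde(a, n, x) != phi_tilde(a, n, x + n):
            return False
        for y in range(n):
            lhs = b_tilde(a, n, x, y)
            rhs = (phi_tilde(a, n, x + y) - phi_tilde(a, n, x) - phi_tilde(a, n, y)) % 1
            if lhs != rhs:
                return False
    return True

# Covers both even and odd N.
print(all(check(a, n) for n in range(1, 13) for a in range(n)))
```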

This concludes the Fourier-analytic proof of Theorem 1. In this paper we also give an ergodic theory proof of (a qualitative version of) Theorem 1(ii), using a correspondence principle argument adapted from this previous paper of Ziegler and myself. Basically, the idea is to randomly generate a dynamical system on the group {G}, by selecting an infinite number of random shifts {g_1, g_2, \dots \in G}, which induces an action of the infinitely generated free abelian group {{\bf Z}^\omega = \bigcup_{n=1}^\infty {\bf Z}^n} on {G} by the formula

\displaystyle  T^h x := x + \sum_{i=1}^\infty h_i g_i.

Much as the law of large numbers ensures the almost sure convergence of Monte Carlo integration, one can show that this action is almost surely ergodic (after passing to a suitable Furstenberg-type limit {X} where the size of {G} goes to infinity), and that the dynamical Host-Kra-Gowers seminorms of that system coincide with the combinatorial Gowers norms of the original functions. One is then well placed to apply an inverse theorem for the third Host-Kra-Gowers seminorm {U^3(X)} for {{\bf Z}^\omega}-actions, which was accomplished in the companion paper to this one. After doing so, one almost gets the desired conclusion of Theorem 1(ii), except that after undoing the application of the Furstenberg correspondence principle, the map {g: G \rightarrow H/\Lambda} is merely an almost polynomial rather than a polynomial, which roughly speaking means that instead of certain derivatives of {g} vanishing, they instead are merely very small outside of a small exceptional set. To conclude we need to invoke a “stability of polynomials” result, which at this level of generality was first established by Candela and Szegedy (though we also provide an independent proof here in an appendix), which roughly speaking asserts that every approximate polynomial is close in measure to an actual polynomial. (This general strategy is also employed in the Candela-Szegedy paper, though in the absence of the ergodic inverse theorem input that we rely upon here, the conclusion is weaker in that the filtered nilmanifold {H/\Lambda} is replaced with a general space known as a “CFR nilspace”.)

This transference principle approach seems to work well for the higher step cases (for instance, the stability of polynomials result is known in arbitrary degree); the main difficulty is to establish a suitable higher step inverse theorem in the ergodic theory setting, which we hope to do in future research.

December 27, 2021

Michael SchmittATLAS empirical electron-muon test

A search for an unexpected asymmetry in the production of e⁺μ⁻ and e⁻μ⁺ pairs in proton-proton collisions recorded by the ATLAS detector at √s = 13 TeV

This paper (arXiv:2112.08090), written by the ATLAS Collaboration, is original and interesting. The numbers of events with (e⁺μ⁻) and with (e⁻μ⁺) should be the same, according to the standard model (SM). So, if they are observed not to be the same, then new physics is at play. This specific idea was first proposed by Lester and Brunt (arXiv:1612.02697), who focused on R-parity violating supersymmetric models. I will focus on the experimental analysis and skip the theoretical models.

The results are formulated in terms of a ratio, ρ, which is N(e⁺μ⁻) / N(e⁻μ⁺). Interestingly, the case ρ > 1 is easier to investigate than ρ < 1, and ATLAS does not derive results for the latter case. The problem is that non-prompt (a.k.a. “fake”) electrons are much more common than non-prompt muons, and W⁺ are more common than W⁻, so W⁺ → μ⁺ and fake e⁻ is more common than W⁺ → e⁺ and fake μ⁻. This bias is partially corrected for, and any new physics leading to ρ > 1 needs to be strong enough to overcome the remaining bias. The selection of electrons and muons is not unusual, and kinematically, they must have pT > 25 GeV and |η| < 2.47.

The bias (i.e., prevalence) of e⁻(fake) μ⁺(real) over e⁺(fake) μ⁻(real) is handled by subtracting the appropriate number of background events (i.e., those with fake leptons) using a data-driven method. The contamination of the main signal region by fake leptons is small, only 2%. A very small effect due to the toroidal magnetic fields is corrected for by applying weights to events.

The event selection is rather simple, which is part of what makes this analysis appealing. An opposite-sign, electron-muon pair is required. In addition, ΣmT, the sum of the transverse masses for the two leptons, must be at least 200 GeV. A subset of such events containing at least one jet (with pT > 20 GeV) is also analyzed. (Additional signal regions that are more model-inspired are defined, but I won’t be discussing them here.) The event yields are studied as functions of either MT2 or Hp, where Hp is just the scalar sum of the pT of the electron, muon, and leading jet.

Roughly 100k events are selected. The main contribution comes from top quark pair events — a process that is very well understood at the LHC. Diboson production is also important at the edge of the kinematic range. This nice plot shows the yield as a function of Hp.

By eye there is no difference in yields for the two final states e⁻μ⁺ and e⁺μ⁻. A careful statistical treatment based on the likelihood for Poisson distributions in each bin leads to the following quantitative result: ρ = 0.987 ± 0.022 where I have neglected a small asymmetry in the error bar. Clearly, this measurement of ρ is consistent with unity and therefore with SM expectations. There is no evidence for ρ > 1. Statistical p-values for all bins are quite reasonable: the largest deviation from unity is a downward fluctuation (3.1σ) in a single bin.
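The binned likelihood fit is more than I can reproduce here, but the purely statistical part of the uncertainty on a ratio of two counts can be sketched with first-order error propagation. The counts below are invented for illustration and are not ATLAS data. The sketch also ignores the background subtraction, event weights and systematic uncertainties that enter the real measurement.

```python
import math

def ratio_with_poisson_error(n_num, n_den):
    """Ratio of two independent Poisson counts with first-order error propagation."""
    rho = n_num / n_den
    sigma = rho * math.sqrt(1.0 / n_num + 1.0 / n_den)
    return rho, sigma

# Hypothetical event counts for the e+mu- and e-mu+ final states (not ATLAS data).
n_eplus_muminus = 49_500
n_eminus_muplus = 50_500

rho, sigma = ratio_with_poisson_error(n_eplus_muminus, n_eminus_muplus)
print(f"rho = {rho:.3f} +/- {sigma:.3f} (statistical only)")
print(f"deviation from unity: {(rho - 1) / sigma:.1f} sigma")
```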

The authors go on to derive upper limits on the number of new physics signal events as a function of how those events are shared between the two final states (i.e., the fraction z that enter e⁻μ⁺). I find it striking that in some regions this analysis is sensitive to mere tens of events.

Of course, it would be wonderful to have a deviation from the SM expectation. Unfortunately, we do not have that here.

I like this analysis for its relatively inclusive nature and straightforward simplicity. There are many SM analyses which hope to see a deviation in a kinematic quantity such as jet pT or ST (which is similar to Hp here). What is relatively new is that this analysis looks at charge and flavor in the hopes of observing a telltale deviation. The two specific models considered in this paper (RPV-supersymmetry and leptoquarks) serve to demonstrate that new physics could lead to an observable deviation. I’m sure the authors will expand their final state and maybe tease out hints for new physics in a more sophisticated analysis.

Doug NatelsonUS News graduate program rankings - bear this in mind

The US News rankings of graduate programs have a surprisingly out-sized influence.  Prospective graduate students seem to pay a lot of attention to these, as do some administrators.  All ranking schemes have issues:  Trying to encapsulate something as complex and multi-variate as research across a whole field + course offerings + student life + etc. in a single number is inherently an oversimplification. 

The USNWR methodology is not secret - here is how they did their 2018 rankings.   As I wrote over a decade ago, it's a survey.  That's all.  No detailed metrics about publications or research impact or funding or awards or graduation rates or post-graduation employment.  It's purely a reputational survey of department chairs/heads and "deans, other administrators and/or senior faculty at schools and programs of Ph.D. Physics programs", to quote the email I received earlier this month.   (It would be nice to know who gets the emails besides chairs - greater transparency would be appreciated.)

This year for physics, they appear to have sent the survey to 188 departments (the ones in the US that granted PhDs in the last five years), and historically the response rate is about 22%.  This implies that the opinions of a distressingly small number of people are driving these rankings, which are going to have a non-perturbative effect on graduate recruiting (for example) for the next several years.   I wish people would keep that in mind when they look at these numbers as if they are holy writ.  

(You also have to be careful about rough analytics-based approaches.  If you ranked departments based purely on publications-per-faculty-member, for example, you would select for departments that are largely made up of particle physics experimentalists.  Also, as the NRC found out the last time they did their decadal rankings, the quality of data entry is incredibly important.)

My advice to students:  Don't place too much emphasis on any particular ranking scheme, and actually look closely at department and research group websites when considering programs.  

December 26, 2021

John BaezClemens non Papa

As I’ve explored more music from the Franco-Flemish school, I’ve gotten to like some of the slightly less well-known composers—though usually famous in their day—such as Jacobus Clemens non Papa, who lived in Flanders from roughly 1510 to 1555. I enjoy his clear, well-balanced counterpoint. It’s peppy, well-structured, but unromantic: no grand gestures or strong emotions, just lucid clarity. That’s quite appealing to me these days.

On a website about Flemish music I read that:

The style of his work stayed “northern”, without any Italian influences. As far as is known Clemens never ventured out of the Low Countries to pursue a career at a foreign court or institution, unlike many of his contemporaries. This is reflected in most of his religious pieces, where the style is generally reliant on counterpoint arrangements where every voice is independently formed.

Not much is known of his life. The name ‘Clemens non Papa’ may be a bit of a joke, since his last name was Clemens, but there was also a pope of that name, so it may have meant ‘Clemens — not the Pope’.

That makes it all the more funny that if you look for a picture of Clemens non Papa, you’ll quickly be led to Classical, which has a nice article about him—with this picture:

Yes, this is Pope Clement VII.

Clemens non Papa was one of the best musicians of the fourth generation of the Franco-Flemish school, along with Nicolas Gombert, Thomas Crequillon and my personal favorite, Pierre de Manchicourt. He was extremely prolific! He wrote 233 motets, 15 masses, 15 Magnificats, 159 settings of the Psalms in Dutch, and a bit over 100 secular pieces, including 89 chansons.

But unfortunately, he doesn’t seem to have inspired the tireless devotion among modern choral groups that more famous Franco-Flemish composers have. I’m talking about projects like The Clerks’ complete recordings of the sacred music of Ockeghem in five CDs, The Sixteen’s eight CDs of Palestrina, or the Tallis Scholars’ nine CDs of masses by Josquin. There’s something about early music that incites such massive projects! I think I know what it is: it’s beautiful, and a lot has been lost or forgotten, so when you fall in love with it you start wanting to preserve and share it.

Maybe someday we’ll see complete recordings of the works of Clemens non Papa! But right now all we have are small bits—and let me list some.

A great starting-point is Clemens non Papa: Missa Pastores quidnam vidistis by the Tallis Scholars. This whole album is currently available as a YouTube playlist:

Another important album is Behold How Joyful – Clemens non Papa: Mass and Motets by the Brabant Ensemble. It too is available as a playlist on YouTube:

The Brabant Ensemble have another album of Clemens non Papa’s music, Clemens non Papa: Missa pro defunctis, Penitential Motets. I haven’t heard it.

Next, the Egidius Kwartet has a wonderful set of twelve CDs called De Leidse Koorboeken—yet another of the massive projects I mentioned—in which they sing everything in the Leiden Choirbooks. These were six volumes of polyphonic Renaissance music of the Franco-Flemish school copied for a church in Leiden sometime in the 15th or 16th century, which somehow survived an incident in 1566 when a mob burst into that church and ransacked it.

You can currently listen to the Egidius Kwartet’s performances of the complete Leiden Choirbooks on YouTube playlists:

Volume 2 contains these pieces by Clemens non Papa—click to listen to them:

Heu mihi Domine, a4. Anima mea turbata est, a4.

Maria Magdalena, a5. Cito euntes, a5.

Jherusalem surge, a5. Leva in circuitu, a5.

Magnificat quarti thoni, a4.

Magnificat sexti thoni, a4.

Magnificat octavi toni, a4-5.

Volume 3 contains these:

Cum esset anna, a5.

Domine probasti, a5.

Advenit ignis divinus, a5.

Volume 4 contains these:

Angelus domini ad pastores, a4 – Secunda pars: Parvulus filius, a4.

Pastores loquebantur, a5 – Secunda pars: Et venerunt festinantes, a5.

Congratulamini mihi omnes, a4.

Sancti mei qui in carne – Secunda pars: Venite benedicti patris.

Pater peccavi, a4 – Secunda pars: Quanti mercenarii, a4.

Volume 5 contains this:

Ave Maria.

And finally, the group Henry’s Eight has a nice album Pierre de la Rue: Missa cum iucunditate, currently available as a YouTube playlist, which includes two pieces by Clemens non Papa:

Here are those pieces—click to hear them:

Ego flos campi.

Pater peccavi.

Here also is a live performance of Ego flos campi by the Choir of St James, in Winchester Cathedral:

Happy listening! And if you know a big trove of recordings of music by Clemens non Papa, let me know. I just know what’s on Discogs.

December 25, 2021

John BaezThe Binary Octahedral Group (Part 2)

Part 1 introduced the ‘binary octahedral group’. This time I just want to show you some more pictures related to this group. I’ll give just enough explanation to hint at what’s going on. For more details, check out this webpage:

• Greg Egan, Symmetries and the 24-cell.

Okay, here goes!

You can inscribe two regular tetrahedra in a cube:

Each tetrahedron has 4! = 24 symmetries permuting its 4 vertices.

The cube thus has 48 symmetries, twice as many. Half map each tetrahedron to itself, and half switch the two tetrahedra.

If we consider only rotational symmetries, not reflections, we have to divide by 2. The tetrahedron has 12 rotational symmetries. The cube has 24.

But the rotation group SO(3) has a double cover SU(2). So the rotational symmetry groups of tetrahedron and cube have double covers too, with twice as many elements: 24 and 48, respectively.

But these 24-element and 48-element groups are different from the ones mentioned before! They’re called the binary tetrahedral group and binary octahedral group—since we could have used the symmetries of an octahedron instead of a cube.

Now let’s think about these groups using quaternions. We can think of SU(2) as consisting of the ‘unit quaternions’—that is, quaternions of length 1. That will connect what we’re doing to 4-dimensional geometry!

The binary tetrahedral group

Viewed this way, the binary tetrahedral group consists of 24 unit quaternions. 8 of them are very simple:

\pm 1, \; \pm i, \; \pm j, \; \pm k

These form a group called the quaternion group, and they’re the vertices of a shape that’s the 4d analogue of a regular octahedron. It’s called the 4-dimensional cross-polytope and it looks like this:

The remaining 16 elements of the binary tetrahedral group are these:

\displaystyle{ \frac{\pm 1 \pm i \pm j \pm k}{2} }

They form the vertices of a 4-dimensional hypercube:

Putting the vertices of the hypercube and the cross-polytope together, we get all 8 + 16 = 24 elements of the binary tetrahedral group. These are the vertices of a 4-dimensional shape called the 24-cell:

This shape is called the 24-cell not because it has 24 vertices, but because it also has 24 faces, which happen to be regular octahedra. You can see one if you slice the 24-cell like this:

The slices here have real part 1, ½, 0, -½, and -1 respectively. Note that the slices with real part ±½ contain the vertices of a hypercube, while the rest contain the vertices of a cross-polytope.
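These claims are easy to check by machine. The sketch below is my own illustration, using exact fractions and quaternions stored as (real, i, j, k) tuples. It builds the 24 elements, confirms that they are closed under quaternion multiplication, and counts how many have each real part, which should match the five slices just described.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def qmul(p, q):
    """Hamilton product of quaternions given as (real, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

# The 8 elements +-1, +-i, +-j, +-k (vertices of the cross-polytope).
cross_polytope = [tuple(s if n == m else 0 for n in range(4))
                  for m in range(4) for s in (1, -1)]
# The 16 elements (+-1 +-i +-j +-k)/2 (vertices of the hypercube).
half = Fraction(1, 2)
hypercube = list(product((half, -half), repeat=4))

group = set(cross_polytope) | set(hypercube)
# A finite set of unit quaternions closed under multiplication is a group.
closed = all(qmul(p, q) in group for p in group for q in group)
# Number of elements in each slice of constant real part.
slices = Counter(q[0] for q in group)

print(len(group), closed)
print(sorted(slices.items()))
```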

And here’s another great way to think about the binary tetrahedral group. We’ve seen that if you take every other vertex of a cube you get the vertices of a regular tetrahedron. Similarly, if you take every other vertex of a 4d hypercube you get a 4d cross-polytope. So, you can take the vertices of a 4d hypercube and partition them into the vertices of two cross-polytopes.

As a result, the 24 elements of the binary tetrahedral group can be partitioned into three cross-polytopes! Greg Egan shows how it looks:

The binary octahedral group

Now that we understand the binary tetrahedral group pretty well, we’re ready for our actual goal: understanding the binary octahedral group! We know this forms a group of 48 unit quaternions, and we know it acts as symmetries of the cube—with elements coming in pairs that act on the cube in the same way, because it’s a double cover of the rotational symmetry group of the cube.

So, we can partition its 48 elements into two kinds: those that preserve each tetrahedron in this picture, and those that switch these two tetrahedra:

The first 24 form a copy of the binary tetrahedral group and thus a 24-cell, as we have discussed. The second form another 24-cell! And these two separate 24-cells are ‘dual’ to each other: the vertices of each one hover above the centers of the other’s faces.

Greg has nicely animated the 48 elements of the binary octahedral group here:

He’s colored them according to the rotations of the cube they represent:

• black: identity
• red: ±120° rotation around a V axis
• yellow: 180° rotation around an F axis
• blue: ±90° rotation around an F axis
• cyan: 180° rotation around an E axis

Here ‘V, F, and E axes’ join opposite vertices, faces, and edges of the cube.

Finally, note that because

• we can partition the 48 vertices of the binary octahedral group into two 24-cells
• we can partition the 24 vertices of the 24-cell into three cross-polytopes

it follows that we can partition the 48 vertices of the binary octahedral group into six cross-polytopes.

I don’t know the deep meaning of this fact. I know that the vertices of the 24-cell correspond to the 24 roots of the Lie algebra \mathfrak{so}(8). I know that the famous ‘triality’ symmetry of \mathfrak{so}(8) permutes the three cross-polytopes in the 24-cell, which are in some rather sneaky way related to the three 8-dimensional irreducible representations of \mathfrak{so}(8). I also know that if we take the two 24-cells in the binary octahedral group, and expand one by a factor of \sqrt{2}, so the vertices of the other lie exactly at the center of its faces, we get the 48 roots of the Lie algebra \mathfrak{f}_4. But I don’t know how to extend this story to get a nice story about the six cross-polytopes in the binary octahedral group.

All I know is that if you pick a quaternion group sitting in the binary octahedral group, it will have 6 cosets, and these will be six cross-polytopes.
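Here is a quick numerical check of that last statement. It is a sketch of my own that assumes the explicit list of 48 unit quaternions: the 24 in the binary tetrahedral group, plus the 24 with exactly two nonzero coordinates, each equal to ±1/\sqrt{2}. Floating-point coordinates are rounded before comparison, and the quaternion group used is the standard one \pm 1, \pm i, \pm j, \pm k.

```python
import math
from itertools import combinations, product

def qmul(p, q):
    """Hamilton product of quaternions given as (real, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def binary_octahedral():
    """Return (quaternion group, all 48 elements) as lists of float tuples."""
    r = 1 / math.sqrt(2)
    q8 = [tuple(s if n == m else 0.0 for n in range(4))
          for m in range(4) for s in (1.0, -1.0)]
    cube = list(product((0.5, -0.5), repeat=4))
    other = []
    for a, b in combinations(range(4), 2):
        for sa, sb in product((r, -r), repeat=2):
            q = [0.0] * 4
            q[a], q[b] = sa, sb
            other.append(tuple(q))
    return q8, q8 + cube + other

def key(q):
    """Round coordinates so floating-point noise does not split equal points."""
    return tuple(round(x, 9) for x in q)

def is_cross_polytope(points):
    """Eight unit vectors in which any two distinct ones are antipodal or orthogonal."""
    dots = [round(sum(x * y for x, y in zip(p, q)), 6)
            for p, q in combinations(points, 2)]
    return len(points) == 8 and all(d in (0.0, -1.0) for d in dots)

q8, group = binary_octahedral()
# Left cosets g * Q of the quaternion group Q, collected as sets of points.
cosets = {frozenset(key(qmul(g, h)) for h in q8) for g in group}

print(len(group), len(cosets), all(is_cross_polytope(c) for c in cosets))
```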

n-Category Café The Binary Octahedral Group

It’s been pretty quiet around the n-Café lately! I’ve been plenty busy myself: Lisa and I just drove back from DC to Riverside with stops at Roanoke, Nashville, Hot Springs, Oklahoma City, Santa Rosa (a small town in New Mexico), Gallup, and Flagstaff. A lot of great places! Hot Springs claims to have the world’s shortest street, but I’m curious what the contenders are. Tomorrow I’m supposed to talk with James Dolan about hyperelliptic curves. And I’m finally writing a paper about the number 24.

But for now, here’s a little Christmas fun with Platonic solids and their symmetries. For more details, see:

All the exciting animations in my post here were created by Greg. And if you click on any of the images in my post here, you’ll learn more.

The complex numbers together with infinity form a sphere called the Riemann sphere. The 6 simplest numbers on this sphere lie at points we could call the north pole, the south pole, the east pole, the west pole, the front pole and the back pole. They’re the corners of an octahedron!

On the Earth, we all know where the north pole and south pole are. I’d say the “front pole” is where the prime meridian meets the equator at 0°N 0°E. It’s called Null Island, but there’s no island there—just a buoy. Here it is:

Where’s the back pole, the east pole and the west pole? I’ll leave two of these as puzzles, but I discovered that in Singapore I’m fairly close to the east pole:

If you think of the octahedron’s corners as the quaternions \pm i, \pm j, \pm k, you can look for unit quaternions q such that whenever x is one of these corners, so is q x q^{-1}. There are 48 of these! They form a group called the binary octahedral group.

By how we set it up, the binary octahedral group acts as rotational symmetries of the octahedron: any transformation sending x to q x q^{-1} is a rotation. But this group is a double cover of the octahedron’s rotational symmetry group! That is, pairs of elements of the binary octahedral group describe the same rotation of the octahedron.
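Here is a small numerical sketch of the conjugation action, written by me as an illustration. It takes the explicit list of 48 unit quaternions given further down in this post as input, so it checks that each of them permutes the six corners and that they give only 24 distinct rotations. It does not prove that no other unit quaternions work.

```python
import math
from itertools import combinations, product

def qmul(p, q):
    """Hamilton product of quaternions given as (real, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def conj(q):
    """Conjugate, which is the inverse for a unit quaternion."""
    return (q[0], -q[1], -q[2], -q[3])

def binary_octahedral():
    """Return (quaternion group, all 48 elements) as lists of float tuples."""
    r = 1 / math.sqrt(2)
    q8 = [tuple(s if n == m else 0.0 for n in range(4))
          for m in range(4) for s in (1.0, -1.0)]
    cube = list(product((0.5, -0.5), repeat=4))
    other = []
    for a, b in combinations(range(4), 2):
        for sa, sb in product((r, -r), repeat=2):
            q = [0.0] * 4
            q[a], q[b] = sa, sb
            other.append(tuple(q))
    return q8, q8 + cube + other

def key(q):
    """Round coordinates so floating-point noise does not split equal points."""
    return tuple(round(x, 9) for x in q)

q8, group = binary_octahedral()
corners = q8[2:]  # the six corners +-i, +-j, +-k

def action(q):
    """Images of the six corners under x -> q x q^{-1}, in a fixed order."""
    return tuple(key(qmul(qmul(q, x), conj(q))) for x in corners)

corner_set = {key(x) for x in corners}
preserves = all(set(action(q)) == corner_set for q in group)
rotations = {action(q) for q in group}  # q and -q give the same entry

print(len(group), preserves, len(rotations))
```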

If we go back and think of the Earth’s 6 poles as points 0, \pm 1, \pm i, \infty on the Riemann sphere instead of \pm i, \pm j, \pm k, we can think of the binary octahedral group as a subgroup of \mathrm{SL}(2,\mathbb{C}), since this acts as conformal transformations of the Riemann sphere!

If we do this, the binary octahedral group is actually a subgroup of \mathrm{SU}(2), the double cover of the rotation group, which is isomorphic to the group of unit quaternions. So it all hangs together.

It’s fun to actually see the unit quaternions in the binary octahedral group. First we have 8 that form a group on their own, called the quaternion group:

\pm 1, \pm i, \pm j, \pm k

These are the vertices of the 4d analogue of an octahedron, called a cross-polytope. It looks like this:

Then we have 16 that form the corners of a hypercube (the 4d analogue of a cube, also called a tesseract or 4-cube):

\displaystyle{ \frac{\pm 1 \pm i \pm j \pm k}{2} }

They look like this:

These don’t form a group, but if we take them together with the 8 previous ones we get a 24-element subgroup of the unit quaternions called the binary tetrahedral group. These 24 elements are also the vertices of the 24-cell, which is the one regular polytope in 4 dimensions that doesn’t have a 3d analogue. It looks like this:

This shape is called the 24-cell not because it has 24 vertices, but because it also has 24 faces, which happen to be regular octahedra. You can see one if you slice the 24-cell like this:

The slices here have real part 1, ½, 0, -½, and -1 respectively. Note that the slices with real part ±½ contain the vertices of a hypercube, while the rest contain the vertices of a cross-polytope.

And here’s another great way to think about the binary tetrahedral group. We’ve seen that if you take every other vertex of a cube you get the vertices of a regular tetrahedron. Similarly, if you take every other vertex of a 4d hypercube you get a 4d cross-polytope. So, you can take the vertices of a 4d hypercube and partition them into the vertices of two cross-polytopes. As a result, the 24 elements of the binary tetrahedral group can be partitioned into three cross-polytopes! Greg Egan shows how it looks:

So far we’ve accounted for half the quaternions in the binary octahedral group! Here are the other 24:

$\displaystyle{ \frac{\pm 1 \pm i}{\sqrt{2}}, \frac{\pm 1 \pm j}{\sqrt{2}}, \frac{\pm 1 \pm k}{\sqrt{2}}, }$

$\displaystyle{ \frac{\pm i \pm j}{\sqrt{2}}, \frac{\pm j \pm k}{\sqrt{2}}, \frac{\pm k \pm i}{\sqrt{2}} }$

These form the vertices of another 24-cell! So the binary octahedral group can be built by taking the vertices of two separate 24-cells. And in fact, these two 24-cells are ‘dual’ to each other: the vertices of each one hover right above the centers of the faces of the other!

The first 24 quaternions, those in the binary tetrahedral group, give rotations that preserve each one of the two tetrahedra that you can fit around an octahedron—or in a cube:

while the second 24 switch these tetrahedra.

Greg has nicely animated the 48 elements of the binary octahedral group here:

He’s colored them according to the rotations of the octahedron they represent. Remember: in the binary octahedral group, two elements $q$ and $-q$ describe each rotational symmetry of the octahedron. And here they are:

Black: the 2 elements

$\pm 1$

act as the identity.

Blue: the 12 elements

$\displaystyle{ \frac{\pm 1 \pm i}{\sqrt{2}}, \frac{\pm 1 \pm j}{\sqrt{2}}, \frac{\pm 1 \pm k}{\sqrt{2}} }$

describe ±90° rotations around the octahedron’s 3 axes.

Red: the 16 elements

$\displaystyle{ \frac{\pm 1 \pm i \pm j \pm k}{2} }$

describe 120° rotations of the octahedron’s 8 triangles.

Yellow: The 6 elements

$\pm i, \pm j, \pm k$

describe 180° rotations around the octahedron’s 3 axes.

Cyan: the 12 elements

$\displaystyle{ \frac{\pm i \pm j}{\sqrt{2}}, \frac{\pm j \pm k}{\sqrt{2}}, \frac{\pm k \pm i}{\sqrt{2}} }$

describe 180° rotations of the octahedron’s 6 opposite pairs of edges.
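The color classes above can be cross-checked numerically. This is a rough sketch, assuming the standard fact that a unit quaternion with real part $w$ acts as a rotation by angle $2\arccos(w)$, and that $q$ and $-q$ give the same rotation (so we can use $|w|$). The function names are illustrative. It tallies the 48 elements by rotation angle.

```python
import math
from collections import Counter
from itertools import product


def binary_octahedral():
    """Return the 48 unit quaternions listed in the text as (w, x, y, z) floats."""
    elems = []
    # ±1, ±i, ±j, ±k
    for axis in range(4):
        for sign in (1.0, -1.0):
            v = [0.0] * 4
            v[axis] = sign
            elems.append(tuple(v))
    # (±1 ± i ± j ± k)/2
    for signs in product((0.5, -0.5), repeat=4):
        elems.append(signs)
    # Exactly two nonzero coordinates, each ±1/sqrt(2)
    r = 1 / math.sqrt(2)
    for a in range(4):
        for b in range(a + 1, 4):
            for sa, sb in product((r, -r), repeat=2):
                v = [0.0] * 4
                v[a], v[b] = sa, sb
                elems.append(tuple(v))
    return elems


def rotation_angle_degrees(q):
    """Rotation angle in [0, 180] degrees for the rotation given by q (or -q)."""
    w = min(1.0, abs(q[0]))  # clamp guards against tiny rounding overshoot
    return round(math.degrees(2 * math.acos(w)))


elements = binary_octahedral()
angle_counts = Counter(rotation_angle_degrees(q) for q in elements)

for angle in sorted(angle_counts):
    print(f"{angle:3d} degrees: {angle_counts[angle]} quaternions")
```

The 180° bucket combines the yellow and cyan classes (6 + 12 elements), since both have real part 0; telling them apart requires looking at the rotation axis, not just the angle.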

Finally, here’s another fun fact about the binary octahedral group that follows from what we’ve already seen. Note that because

  • we can partition the 48 vertices of the binary octahedral group into two 24-cells


  • we can partition the 24 vertices of the 24-cell into three cross-polytopes

it follows that we can partition the 48 vertices of the binary octahedral group into six cross-polytopes!

I don’t know the deep meaning of this fact. I know that the vertices of the 24-cell correspond to the 24 roots of the Lie algebra $\mathfrak{so}(8)$. I know that the famous ‘triality’ symmetry of $\mathfrak{so}(8)$ permutes the three cross-polytopes in the 24-cell, which are in some rather sneaky way related to the three 8-dimensional irreducible representations of $\mathfrak{so}(8)$. I also know that if we take the two 24-cells in the binary octahedral group, and expand one by a factor of $\sqrt{2}$, so the vertices of the other lie exactly at the center of its faces, we get the 48 roots of the Lie algebra $\mathfrak{f}_4$. But I don’t know how to extend this story to get a nice story about the six cross-polytopes in the binary octahedral group.

All I know is that if you pick a quaternion group sitting in the binary octahedral group, it will have 6 cosets, and these will be six cross-polytopes.
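Here is one way to check the coset statement by computer. It is only a sketch under stated assumptions: quaternions are floating-point `(w, x, y, z)` tuples, products are rounded to 6 decimals before comparison (the helper `key` is my own device for that), and "cross-polytope" is tested as "8 unit vectors whose pairwise dot products are all 0 or ±1".

```python
import math
from itertools import product


def qmul(p, q):
    """Hamilton product of two quaternions given as (w, x, y, z) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def key(q):
    """Round coordinates so equal quaternions compare equal despite float error."""
    return tuple(round(c, 6) for c in q)


def binary_octahedral():
    """The 48 elements from the text; the first 8 are the quaternion group."""
    elems = []
    for axis in range(4):
        for sign in (1.0, -1.0):
            v = [0.0] * 4
            v[axis] = sign
            elems.append(tuple(v))
    for signs in product((0.5, -0.5), repeat=4):
        elems.append(signs)
    r = 1 / math.sqrt(2)
    for a in range(4):
        for b in range(a + 1, 4):
            for sa, sb in product((r, -r), repeat=2):
                v = [0.0] * 4
                v[a], v[b] = sa, sb
                elems.append(tuple(v))
    return elems


def is_cross_polytope(points):
    """True if the points are 8 unit vectors of the form ±(orthonormal basis)."""
    pts = list(points)
    if len(pts) != 8:
        return False
    for p in pts:
        for q in pts:
            dot = sum(a * b for a, b in zip(p, q))
            if round(abs(dot), 4) not in (0.0, 1.0):
                return False
    return True


elements = binary_octahedral()
keys = {key(q) for q in elements}
q8 = elements[:8]  # ±1, ±i, ±j, ±k

# The 48 elements are closed under multiplication.
closed = all(key(qmul(p, q)) in keys for p in elements for q in elements)

# Left cosets g * Q8, collected as sets of rounded quaternions.
cosets = {frozenset(key(qmul(g, h)) for h in q8) for g in elements}

print("closed under multiplication:", closed)
print("number of cosets:", len(cosets))
print("all cosets are cross-polytopes:", all(is_cross_polytope(c) for c in cosets))
```

Left multiplication by a unit quaternion is a rotation of 4d space, which is why each coset of the quaternion group is again a cross-polytope.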

December 24, 2021

Alexey PetrovScience of Santa

How does Santa manage to visit so many children’s homes in such a short time? While this question has been used to doubt the existence of Santa himself, different sciences might provide answers to it. Here are some of them:

Astrophysics: Santa is nonluminous material that is postulated to exist in space that could take any of several forms (including the big fat guy in a red suit) that clumps in the locations of children’s houses.

Biology: Santa is a biological entity (such as Dolly the sheep) that produces multiple clones of itself once a year.

Botany: Santa is a mycelium-like vegetative part of a fungus, consisting of a mass of branching, thread-like hyphae that propagate around most of the planet. Once a year it develops multiple fruiting bodies whose locations coincide with the locations of human children’s houses. In simpler terms, Santa is a mushroom.

Quantum Mechanics: Santa consists of multiple quantum objects in an entangled state. His existence signifies the fact that most accepted formulations of quantum mechanics are incomplete.

Cosmology: in its evolution, Santa goes through the epoch of rapid exponential superluminal expansion.

Computer science/Hollywood/sometimes even particle physics: since we all live in a computer simulation, Santa is one of the swarming bots created by the Analyst to control the inhabitants of the Matrix.

Economics: we don’t care how he does it. It positively affects the GDP of many countries.

Matt von HippelScience, Gifts Enough for Lifetimes

Merry Newtonmas, Everyone!

In past years, I’ve compared science to a gift: the ideal gift for the puzzle-fan, one that keeps giving new puzzles. I think people might not appreciate the scale of that gift, though.

Bigger than all the creative commons Wikipedia images

Maybe you’ve heard the old joke that studying for a PhD means learning more and more about less and less until you know absolutely everything about nothing at all. This joke is overstating things: even when you’ve specialized down to nothing at all, you still won’t know everything.

If you read the history of science, it might feel like there are only a few important things going on at a time. You notice the simultaneous discoveries, like calculus from Newton and Leibniz and natural selection from Darwin and Wallace. You can get the impression that everyone was working on a few things, the things that would make it into the textbooks. In fact, though, there was always a lot to research, always many interesting things going on at once. As a scientist, you can’t escape this. Even if you focus on your own little area, on a few topics you care about, even in a small field, there will always be more going on than you can keep up with.

This is especially clear around the holiday season. As everyone tries to get results out before leaving on vacation, there is a tidal wave of new content. I have five papers open on my laptop right now (after closing four or so), and some recorded talks I keep meaning to watch. Two of the papers are the kind of simultaneous discovery I mentioned: two different groups noticing that what might seem like an obvious fact – that in classical physics, unlike in quantum, one can have zero uncertainty – has unexpected implications for our kind of calculations. (A third group got there too, but hasn’t published yet.) It’s a link I would never have expected, and with three groups coming at it independently you’d think it would be the only thing to pay attention to: but even in the same sub-sub-sub-field, there are other things going on that are just as cool! It’s wild, and it’s not some special quirk of my area: that’s science, for all us scientists. No matter how much you expect it to give you, you’ll get more, lifetimes and lifetimes worth. That’s a Newtonmas gift to satisfy anyone.

December 23, 2021

Peter Rohde The Logarithmic Ape Index

The Ape Index (aka Gorilla Index) is a well-known term in the climbing community, and one that is often quoted, albeit primarily by compulsive, male, beta-sprayers trying to impress in compensation for not knowing how to climb.

While it is often speculated that higher Ape Indices correlate with better climbing performance, the actual merits of the figure are highly questionable. Nonetheless, it has become such a household term that it’s likely here to stay.

Vitruvian Man by Leonardo da Vinci (circa 1490) popularised the notion of the ‘ideal man’ fitting perfectly into a square, with matching height and wingspan.

There are multiple, inconsistent definitions for the Ape Index, one based on the difference between height and wingspan, the other based on their ratio.

These inconsistent definitions complicate objective interpretation. Others, on the other hand, like Billy Jackson here, are unaware of the distinction between division and subtraction altogether, which brings into question whether I should be attending their firearm safety course.

If we are to continue referring to the Ape Index, it is important to employ a standardised formalism with consistent interpretation.

There are three desirable mathematical properties we would like an intuitive and consistent measure for the Ape Index to exhibit:

  • Scale-invariance: the measure is preserved across individuals exhibiting the same physical ratios but of different physical sizes, such that someone preserves their index if their body ratios are preserved as they grow.
  • Unitless: the measure is independent of the choice of measurement units, thereby mitigating culture wars between the United States and everyone else because the Americans are offended at being perceived less apey for refusing to use the metric system.
  • Anti-symmetric: negating the measure’s value corresponds to interchanging height and wingspan, equivalently rotating the frame by 90°, i.e. $A(w,h)=-A(h,w)$. Hence, the Vitruvian Man fitting a perfect square must identically have value 0.

The two alternate definitions of the Ape Index previously employed exhibit some of these characteristics, but neither exhibits all.

The first method for defining the Ape Index is as the wingspan-to-height ratio,

$A_1(w,h) = \frac{w}{h}.$

This definition exhibits scale-invariance and is unitless; however, it suffers the drawback that it is not anti-symmetric. Using this measure, interchanging height and wingspan implements the transformation $A_1\to A_1^{-1}$, as opposed to $A_1\to -A_1$.

The second method for defining the Ape Index is as the difference between wingspan and height,

$A_2(w,h) = w-h.$

This definition is clearly anti-symmetric, but neither scale-invariant nor unitless.

We would ideally like a measure that exhibits all three desirable properties: scale-invariance, unitlessness and anti-symmetry. We achieve this by introducing the Logarithmic Ape Index (LAI), defined as follows,

$A_L(w,h) = \beta\,\mathrm{log}\left(\frac{w}{h}\right) = \beta\,[\mathrm{log}(w)-\mathrm{log}(h)],$

where $\beta>0$ is a calibration parameter, and by convention we employ the natural logarithm. Upon inspection this inherits the structural properties of both previously defined measures. From the left hand side it is both scale-free and unitless, and from the right hand side anti-symmetric. This algebraic form naturally satisfies all three desired properties based on the characteristics of the logarithmic function.

The parameter $\beta$ is specified by calibrating against a chosen point of reference. We will do so by adopting the convention that an LAI of $A_L=10$ corresponds to an adult gorilla, where maximum heights and wingspans of $h=1.8\mathrm{m}$ and $w=2.6\mathrm{m}$ have been reported respectively. This yields the Gorilla coefficient,

$\beta_G = \frac{10}{\mathrm{log}(w)-\mathrm{log}(h)}= \frac{10}{\mathrm{log}(2.6)-\mathrm{log}(1.8)}\approx 27.19.$

Gorillas, however, are an unfair point of comparison for humans, given how exceedingly apey they are. We therefore also provide a second parameterisation for the LAI using Alex Honnold — arguably the world’s most successful ape — as a human baseline, similarly defined to yield the Honnold coefficient,

$\beta_H = \frac{10}{\mathrm{log}(w)-\mathrm{log}(h)}= \frac{10}{\mathrm{log}(188)-\mathrm{log}(180)}\approx 229.96,$

where $h=180\mathrm{cm}$ and $w=188\mathrm{cm}$.
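For readers who want to check the numbers, here is a minimal sketch of the LAI in Python. The function names `beta_from_reference` and `log_ape_index` are my own labels, not part of any existing calculator; the reference measurements are the ones quoted above.

```python
import math


def beta_from_reference(wingspan, height, target=10.0):
    """Calibration constant beta such that the reference ape scores `target`."""
    return target / (math.log(wingspan) - math.log(height))


def log_ape_index(wingspan, height, beta):
    """Logarithmic Ape Index A_L(w, h) = beta * log(w / h), natural logarithm."""
    return beta * math.log(wingspan / height)


# Reference measurements quoted in the text.
beta_G = beta_from_reference(2.6, 1.8)   # gorilla, in metres
beta_H = beta_from_reference(188, 180)   # Alex Honnold, in centimetres

print(f"Gorilla coefficient: {beta_G:.2f}")
print(f"Honnold coefficient: {beta_H:.2f}")

# Honnold on the gorilla scale, and the gorilla on the Honnold scale.
print(f"Honnold, gorilla scale: {log_ape_index(188, 180, beta_G):.2f}")
print(f"Gorilla, Honnold scale: {log_ape_index(2.6, 1.8, beta_H):.2f}")

# Swapping wingspan and height flips the sign (anti-symmetry).
print(f"Swapped Honnold:        {log_ape_index(180, 188, beta_H):.2f}")
```

Because only the ratio $w/h$ enters, it does not matter whether both lengths are given in metres, centimetres or inches, as long as they share a unit.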

You can use my free online calculator below to calculate your Logarithmic Ape Index.

*The beta coefficient is not to be confused with the beta-spray coefficient, which is inversely proportional to the beta coefficient, $\beta_S=1/\beta$.


Below we reproduce the LAIs of some notable apes using both the Gorilla and Honnold coefficients.

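| Ape | LAI (Gorilla coefficient) | LAI (Honnold coefficient) |
|---|---|---|
| Gorilla | 10 | 84.56 |
| Kai Harada | 1.71 | 14.5 |
| Tomoa Narasaki | 1.55 | 13.14 |
| Jan Hojer | 1.41 | 11.92 |
| Alex Honnold | 1.18 | 10 |
| Emily Harrington | 1.02 | 8.62 |
| Jongwon Chon | 0.91 | 7.67 |
| Chris Sharma | 0.88 | 7.42 |
| Alex Puccio | 0.84 | 7.08 |
| Stefano Ghisolfi | 0.63 | 5.35 |
| Adam Ondra | 0.15 | 1.23 |
| Perfect human | 0 | 0 |
| Alex Megos | 0 | 0 |
| Hazel Findlay | 0 | 0 |
| Lynn Hill | 0 | 0 |
| Tommy Caldwell | 0 | 0 |
| Babsi Zangerl | -0.34 | -2.86 |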

For your information, the author of this article is a perfect Vitruvian man. Woof.

December 22, 2021

Scott Aaronson My values, howled into the wind

I’m about to leave for a family vacation—our first such since before the pandemic, one planned and paid for literally the day before the news of Omicron broke. On the negative side, staring at the case-count graphs that are just now going vertical, I estimate a ~25% chance that at least one of us will get Omicron on this trip. On the positive side, I estimate a ~60% chance that in the next 6 months, at least one of us would’ve gotten Omicron or some other variant even without this trip—so maybe it’s just as well if we get it now, when we’re vaxxed to the maxx and ready and school and university are out.

If, however, I do end this trip dead in an ICU, I wouldn’t want to do so without having clearly set out my values for posterity. So with that in mind: in the comments of my previous post, someone asked me why I identify as a liberal or a progressive, if I passionately support educational practices like tracking, ability grouping, acceleration, and (especially) encouraging kids to learn advanced math whenever they’re ready for it. (Indeed, that might be my single stablest political view, having been held, for recognizably similar reasons, since I was about 5.)

Incidentally, that previous post was guest-written by my colleagues Edith Cohen and Boaz Barak, and linked to an open letter that now has almost 1500 signatories. Our goal was, and is, to fight the imminent dumbing-down of precollege math education in the United States, spearheaded by the so-called “California Mathematics Framework.” In our joint efforts, we’ve been careful with every word, making sure to maintain the assent of our entire list of signatories, to attract broad support, to stay narrowly focused on the issue at hand, and to bend over backwards to concede as much as we could. Perhaps because of those cautions, we, amazingly, got some actual traction, reaching people in government (such as Rep. Ro Khanna, D – Silicon Valley) and technology leaders, and forcing the “no one’s allowed to take Algebra in 8th grade” faction to respond to us.

This was disorienting to me. On this blog, I’m used just to howling into the wind, having some agree, some disagree, some take to Twitter to denounce me, but in any case, having no effect of any kind on the real world.

So let me return to howling into the wind. And return to the question of what I “am” in ideology-space, which doesn’t have an obvious answer.

It’s like, what do you call someone who’s absolutely terrified about global warming, and who thinks the best response would’ve been (and actually, still is) a historic surge in nuclear energy, possibly with geoengineering to tide us over?

… who wants to end world hunger … and do it using GMO crops?

… who wants to smash systems of entrenched privilege in college admissions … and believes that the SAT and other standardized tests are the best tools ever invented for that purpose?

… who feels a personal distaste for free markets, for the triviality of what they so often elevate and the depth of what they let languish, but tolerates them because they’ve done more than anything else to lift up the world’s poor?

… who’s happiest when telling the truth for the cause of social justice … but who, if told to lie for the cause of social justice, will probably choose silence or even, if pushed hard enough, truth?

… who wants to legalize marijuana and psychedelics, and also legalize all the promising treatments currently languishing in FDA approval hell?

… who feels little attraction to the truth-claims of the world’s ancient religions, except insofar as they sometimes serve as prophylactics against newer and now even more virulent religions?

… who thinks the covid response of the CDC, FDA, and other authorities was a historic disgrace—not because it infringed on the personal liberties of antivaxxers or anything like that, but on the contrary, because it was weak, timid, bureaucratic, and slow, where it should’ve been like that of a general at war?

… who thinks the Nazi Holocaust was even worse than the mainstream holds it to be, because in addition to the staggering, one-lifetime-isn’t-enough-to-internalize-it human tragedy, the Holocaust also sent up into smoke whatever cultural process had just produced Einstein, von Neumann, Bohr, Szilard, Born, Meitner, Wigner, Haber, Pauli, Cantor, Hausdorff, Ulam, Tarski, Erdös, and Noether, and with it, one of the wellsprings of our technological civilization?

… who supports free speech, to the point of proudly tolerating views that really, actually disgust them at their workplace, university, or online forum?

… who believes in patriotism, the police, the rule of law, to the extent that they don’t understand why all the enablers of the January 6 insurrection, up to and including Trump, aren’t currently facing trial for treason against the United States?

… who’s (of course) disgusted to the core by Trump and everything he represents, but who’s also disgusted by the elite virtue-signalling hypocrisy that made the rise of a Trump-like backlash figure predictable?

… who not only supports abortion rights, but also looks forward to a near future when parents, if they choose, are free to use embryo selection to make their children happier, smarter, healthier, and free of life-crippling diseases (unless the “bioethicists” destroy that future, as a previous generation of Deep Thinkers destroyed our nuclear future)?

… who, when reading about the 1960s Sexual Revolution, instinctively sides with free-loving hippies and against the scolds … even if today’s scolds are themselves former hippies, or intellectual descendants thereof, who now clothe their denunciations of other people’s gross, creepy sexual desires in the garb of feminism and social justice?

What, finally, do you call someone whose image of an ideal world might include a young Black woman wearing a hijab, an old Orthodox man with black hat and sidecurls, a broad-shouldered white guy from the backwoods of Alabama, and a trans woman with purple hair, face tattoos and a nose ring … all of them standing in front of a blackboard and arguing about what would happen if Alice and Bob jumped into opposite ends of a wormhole?

Do you call such a person “liberal,” “progressive,” “center-left,” “centrist,” “Pinkerite,” “technocratic,” “neoliberal,” “libertarian-ish,” “classical liberal”? Why not simply call them “correct”? 🙂

December 20, 2021

Terence TaoThe structure of arbitrary Conze-Lesigne systems

Asgar Jamneshan, Or Shalom, and myself have just uploaded to the arXiv our preprint “The structure of arbitrary Conze–Lesigne systems“. As the title suggests, this paper is devoted to the structural classification of Conze-Lesigne systems, which are a type of measure-preserving system that are “quadratic” or of “complexity two” in a certain technical sense, and are of importance in the theory of multiple recurrence. There are multiple ways to define such systems; here is one. Take a countable abelian group {\Gamma} acting in a measure-preserving fashion on a probability space {(X,\mu)}, thus each group element {\gamma \in \Gamma} gives rise to a measure-preserving map {T^\gamma: X \rightarrow X}. Define the third Gowers-Host-Kra seminorm {\|f\|_{U^3(X)}} of a function {f \in L^\infty(X)} via the formula

\displaystyle  \|f\|_{U^3(X)}^8 := \lim_{n \rightarrow \infty} {\bf E}_{h_1,h_2,h_3 \in \Phi_n} \int_X \prod_{\omega_1,\omega_2,\omega_3 \in \{0,1\}}

\displaystyle {\mathcal C}^{\omega_1+\omega_2+\omega_3} f(T^{\omega_1 h_1 + \omega_2 h_2 + \omega_3 h_3} x)\ d\mu(x)

where {\Phi_n} is a Folner sequence for {\Gamma} and {{\mathcal C}: z \mapsto \overline{z}} is the complex conjugation map. One can show that this limit exists and is independent of the choice of Folner sequence, and that the {\| \|_{U^3(X)}} seminorm is indeed a seminorm. A Conze-Lesigne system is an ergodic measure-preserving system in which the {U^3(X)} seminorm is in fact a norm, thus {\|f\|_{U^3(X)}>0} whenever {f \in L^\infty(X)} is non-zero. Informally, this means that when one considers a generic parallelepiped in a Conze–Lesigne system {X}, the location of any vertex of that parallelepiped is more or less determined by the location of the other seven vertices. These are the important systems to understand in order to study “complexity two” patterns, such as arithmetic progressions of length four. While not all systems {X} are Conze-Lesigne systems, it turns out that they always have a maximal factor {Z^2(X)} that is a Conze-Lesigne system, known as the Conze-Lesigne factor or the second Host-Kra-Ziegler factor of the system, and this factor controls all the complexity two recurrence properties of the system.

The analogous theory in complexity one is well understood. Here, one replaces the {U^3(X)} norm by the {U^2(X)} norm

\displaystyle  \|f\|_{U^2(X)}^4 := \lim_{n \rightarrow \infty} {\bf E}_{h_1,h_2 \in \Phi_n} \int_X \prod_{\omega_1,\omega_2 \in \{0,1\}} {\mathcal C}^{\omega_1+\omega_2} f(T^{\omega_1 h_1 + \omega_2 h_2} x)\ d\mu(x)

and the ergodic systems for which {U^2} is a norm are called Kronecker systems. These systems are completely classified: a system is Kronecker if and only if it arises from a compact abelian group {Z} equipped with Haar probability measure and a translation action {T^\gamma \colon z \mapsto z + \phi(\gamma)} for some homomorphism {\phi: \Gamma \rightarrow Z} with dense image. Such systems can then be analyzed quite efficiently using the Fourier transform, and this can then be used to satisfactorily analyze “complexity one” patterns, such as length three progressions, in arbitrary systems (or, when translated back to combinatorial settings, in arbitrary dense sets of abelian groups).
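To illustrate how Fourier analysis controls the {U^2} norm, here is a small sketch in the finite combinatorial analogue, where the system is replaced by the cyclic group {{\bf Z}/N{\bf Z}} with uniform measure (an assumption made for illustration; this is not the ergodic setting of the post). In that setting the standard identity {\|f\|_{U^2}^4 = \sum_\xi |\hat f(\xi)|^4} holds, and the code compares both sides directly. All function names are hypothetical.

```python
import cmath


def u2_fourth_power_direct(f):
    """E_{x,h1,h2} f(x) conj(f(x+h1)) conj(f(x+h2)) f(x+h1+h2) on Z/N."""
    n = len(f)
    total = 0
    for x in range(n):
        for h1 in range(n):
            for h2 in range(n):
                total += (
                    f[x]
                    * f[(x + h1) % n].conjugate()
                    * f[(x + h2) % n].conjugate()
                    * f[(x + h1 + h2) % n]
                )
    return (total / n**3).real


def fourier_coefficients(f):
    """Normalised Fourier coefficients: hat f(xi) = E_x f(x) e(-x xi / N)."""
    n = len(f)
    return [
        sum(f[x] * cmath.exp(-2j * cmath.pi * x * xi / n) for x in range(n)) / n
        for xi in range(n)
    ]


def u2_fourth_power_fourier(f):
    """Sum of fourth powers of the Fourier coefficient magnitudes."""
    return sum(abs(c) ** 4 for c in fourier_coefficients(f))


N = 7
# A linear phase (a character) and a quadratic phase on Z/7.
linear = [cmath.exp(2j * cmath.pi * 3 * x / N) for x in range(N)]
quadratic = [cmath.exp(2j * cmath.pi * x * x / N) for x in range(N)]

for name, f in (("linear phase", linear), ("quadratic phase", quadratic)):
    print(name, u2_fourth_power_direct(f), u2_fourth_power_fourier(f))
```

The linear phase has {U^2} norm 1, while the quadratic phase has {\|f\|_{U^2}^4 = 1/N}: quadratic structure is nearly invisible to the {U^2} norm, which is one way to see why a separate theory is needed at complexity two.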

We return now to the complexity two setting. The most famous examples of Conze-Lesigne systems are (order two) nilsystems, in which the space {X} is a quotient {G/\Lambda} of a two-step nilpotent Lie group {G} by a lattice {\Lambda} (equipped with Haar probability measure), and the action is given by a translation {T^\gamma x = \phi(\gamma) x} for some group homomorphism {\phi: \Gamma \rightarrow G}. For instance, the Heisenberg {{\bf Z}}-nilsystem

\displaystyle  \begin{pmatrix} 1 & {\bf R} & {\bf R} \\ 0 & 1 & {\bf R} \\ 0 & 0 & 1 \end{pmatrix} / \begin{pmatrix} 1 & {\bf Z} & {\bf Z} \\ 0 & 1 & {\bf Z} \\ 0 & 0 & 1 \end{pmatrix}

with a shift of the form

\displaystyle  Tx = \begin{pmatrix} 1 & \alpha & 0 \\ 0 & 1 & \beta \\ 0 & 0 & 1 \end{pmatrix} x

for {\alpha,\beta} two real numbers with {1,\alpha,\beta} linearly independent over {{\bf Q}}, is a Conze-Lesigne system. As the base case of a well known result of Host and Kra, it is shown in fact that all Conze-Lesigne {{\bf Z}}-systems are inverse limits of nilsystems (previous results in this direction were obtained by Conze-Lesigne, Furstenberg-Weiss, and others). Similar results are known for {\Gamma}-systems when {\Gamma} is finitely generated, thanks to the thesis work of Griesmer (with further proofs by Gutman-Lian and Candela-Szegedy). However, this is not the case once {\Gamma} is not finitely generated; as a recent example of Shalom shows, Conze-Lesigne systems need not be the inverse limit of nilsystems in this case.
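One can see the “quadratic” nature of the Heisenberg shift concretely by computing powers of the shift matrix: the top-right entry of the {n}-th power is {\binom{n}{2}\alpha\beta}, a quadratic polynomial in {n}. The sketch below checks this with exact arithmetic. It uses rational stand-ins for {\alpha,\beta} purely so the arithmetic is exact (the theorem itself needs {1,\alpha,\beta} linearly independent over {{\bf Q}}), and the helper names are illustrative.

```python
from fractions import Fraction
from math import comb


def heisenberg(x, y, z):
    """Upper unitriangular 3x3 matrix with entries x, y above the diagonal and z in the corner."""
    return [[1, x, z], [0, 1, y], [0, 0, 1]]


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


def power(g, n):
    """n-th power of g by repeated left multiplication (n >= 0)."""
    result = heisenberg(0, 0, 0)  # identity matrix
    for _ in range(n):
        result = matmul(g, result)
    return result


# Rational stand-ins for alpha and beta, chosen only to keep arithmetic exact.
alpha, beta = Fraction(1, 3), Fraction(1, 5)
g = heisenberg(alpha, beta, 0)

for n in range(7):
    gn = power(g, n)
    expected = heisenberg(n * alpha, n * beta, comb(n, 2) * alpha * beta)
    print(n, gn[0][1], gn[1][2], gn[0][2], gn == expected)
```

Since {T^n x = g^n x}, the corner coordinate of the orbit picks up this quadratic term in {n}, in contrast with the purely linear behaviour of a Kronecker rotation.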

Our main result is that even in the infinitely generated case, Conze-Lesigne systems are still inverse limits of a slight generalisation of the nilsystem concept, in which {G} is a locally compact Polish group rather than a Lie group:

Theorem 1 (Classification of Conze-Lesigne systems) Let {\Gamma} be a countable abelian group, and {X} an ergodic measure-preserving {\Gamma}-system. Then {X} is a Conze-Lesigne system if and only if it is the inverse limit of translational systems {G/\Lambda}, where {G} is a nilpotent locally compact Polish group of nilpotency class two, and {\Lambda} is a lattice in {G} (and also a lattice in the commutator group {[G,G]}), with {G/\Lambda} equipped with the Haar probability measure and a translation action {T^\gamma x = \phi(\gamma) x} for some homomorphism {\phi: \Gamma \rightarrow G}.

In a forthcoming companion paper to this one, Asgar Jamneshan and I will use this theorem to derive an inverse theorem for the Gowers norm {U^3(G)} for an arbitrary finite abelian group {G} (with no restrictions on the order of {G}, in particular our result handles the case of even and odd {|G|} in a unified fashion). In principle, having a higher order version of this theorem will similarly allow us to derive inverse theorems for {U^{s+1}(G)} norms for arbitrary {s} and finite abelian {G}; we hope to investigate this further in future work.

We sketch some of the main ideas used to prove the theorem. The existing machinery developed by Conze-Lesigne, Furstenberg-Weiss, Host-Kra, and others allows one to describe an arbitrary Conze-Lesigne system as a group extension {Z \rtimes_\rho K}, where {Z} is a Kronecker system (a rotational system on a compact abelian group {Z = (Z,+)} and translation action {\phi: \Gamma \rightarrow Z}), {K = (K,+)} is another compact abelian group, and the cocycle {\rho = (\rho_\gamma)_{\gamma \in \Gamma}} is a collection of measurable maps {\rho_\gamma: Z \rightarrow K} obeying the cocycle equation

\displaystyle  \rho_{\gamma_1+\gamma_2}(x) = \rho_{\gamma_1}(T^{\gamma_2} x) + \rho_{\gamma_2}(x) \ \ \ \ \ (1)

for almost all {x \in Z}. Furthermore, {\rho} is of “type two”, which means in this concrete setting that it obeys an additional equation

\displaystyle  \rho_\gamma(x + z_1 + z_2) - \rho_\gamma(x+z_1) - \rho_\gamma(x+z_2) + \rho_\gamma(x) \ \ \ \ \ (2)

\displaystyle  = F(x + \phi(\gamma), z_1, z_2) - F(x,z_1,z_2)

for all {\gamma \in \Gamma} and almost all {x,z_1,z_2 \in Z}, and some measurable function {F: Z^3 \rightarrow K}; roughly speaking this asserts that {\rho_\gamma} is “linear up to coboundaries”. For technical reasons it is also convenient to reduce to the case where {Z} is separable. The problem is that the equation (2) is unwieldy to work with. In the model case when the target group {K} is a circle {{\bf T} = {\bf R}/{\bf Z}}, one can use some Fourier analysis to convert (2) into the more tractable Conze-Lesigne equation

\displaystyle  \rho_\gamma(x+z) - \rho_\gamma(x) = F_z(x+\phi(\gamma)) - F_z(x) + c_z(\gamma) \ \ \ \ \ (3)

for all {\gamma \in \Gamma}, all {z \in Z}, and almost all {x \in Z}, where for each {z}, {F_z: Z \rightarrow K} is a measurable function, and {c_z: \Gamma \rightarrow K} is a homomorphism. (For technical reasons it is often also convenient to enforce that {F_z, c_z} depend in a measurable fashion on {z}; this can always be achieved, at least when the Conze-Lesigne system is separable, but verifying that this is possible actually requires a certain amount of effort, to which we devote an appendix in our paper.) It is not difficult to see that (3) implies (2) for any group {K} (as long as one has the measurability in {z} mentioned previously), but the converse turns out to fail for some groups {K}, such as solenoid groups (e.g., inverse limits of {{\bf R}/2^n{\bf Z}} as {n \rightarrow \infty}), as was essentially shown by Rudolph. However, in our paper we were able to find a separate argument that also derived the Conze-Lesigne equation in the case of a cyclic group {K = \frac{1}{N}{\bf Z}/{\bf Z}}. Putting together the {K={\bf T}} and {K = \frac{1}{N}{\bf Z}/{\bf Z}} cases, one can then derive the Conze-Lesigne equation for arbitrary compact abelian Lie groups {K} (as such groups are isomorphic to direct products of finitely many tori and cyclic groups). As has been known for some time (see e.g., this paper of Host and Kra), once one has a Conze-Lesigne equation, one can more or less describe the system {X} as a translational system {G/\Lambda}, where the Host-Kra group {G} is the set of all pairs {(z, F_z)} that solve an equation of the form (3) (with these pairs acting on {X \equiv Z \rtimes_\rho K} by the law {(z,F_z) \cdot (x,k) := (x+z, k+F_z(x))}), and {\Lambda} is the stabiliser of a point in this system. This then establishes the theorem in the case when {K} is a Lie group, and the general case basically comes from the fact (from Fourier analysis or the Peter-Weyl theorem) that an arbitrary compact abelian group is an inverse limit of Lie groups.
(There is a technical issue here in that one has to check that the space of translational system factors of {X} form a directed set in order to have a genuine inverse limit, but this can be dealt with by modifications of the tools mentioned here.)
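For the reader’s convenience, here is a sketch of why the Conze-Lesigne equation (3) implies the type two equation (2), assuming the measurability of {F_z} in {z}. The particular choice of {F} below is one natural candidate, written out here as an illustration rather than quoted from the paper: apply (3) with {z = z_2} at the two base points {x+z_1} and {x}, and subtract.

```latex
\begin{align*}
\rho_\gamma(x+z_1+z_2) - \rho_\gamma(x+z_1)
  &= F_{z_2}(x+z_1+\phi(\gamma)) - F_{z_2}(x+z_1) + c_{z_2}(\gamma), \\
\rho_\gamma(x+z_2) - \rho_\gamma(x)
  &= F_{z_2}(x+\phi(\gamma)) - F_{z_2}(x) + c_{z_2}(\gamma).
\end{align*}
% Subtracting the second line from the first cancels c_{z_2}(\gamma):
\begin{align*}
&\rho_\gamma(x+z_1+z_2) - \rho_\gamma(x+z_1) - \rho_\gamma(x+z_2) + \rho_\gamma(x) \\
&\quad = \bigl[ F_{z_2}(x+\phi(\gamma)+z_1) - F_{z_2}(x+\phi(\gamma)) \bigr]
       - \bigl[ F_{z_2}(x+z_1) - F_{z_2}(x) \bigr] \\
&\quad = F(x+\phi(\gamma), z_1, z_2) - F(x, z_1, z_2),
\qquad \text{where } F(x,z_1,z_2) := F_{z_2}(x+z_1) - F_{z_2}(x).
\end{align*}
```

The homomorphism term {c_{z_2}(\gamma)} drops out because it does not depend on the base point, and the measurability of {F_z} in {z} is what makes {F: Z^3 \rightarrow K} a measurable function.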

There is an additional technical issue worth pointing out here (which unfortunately was glossed over in some previous work in the area). Because the cocycle equation (1) and the Conze-Lesigne equation (3) are only valid almost everywhere instead of everywhere, the action of {G} on {X} is technically only a near-action rather than a genuine action, and as such one cannot directly define {\Lambda} to be the stabiliser of a point without running into multiple problems. To fix this, one has to pass to a topological model of {X} in which the action becomes continuous, and the stabilizer becomes well defined, although one then has to work a little more to check that the action is still transitive. This can be done via Gelfand duality; we proceed using a mixture of a construction from this book of Host and Kra, and the machinery in this recent paper of Asgar and myself.

Now we discuss how to establish the Conze-Lesigne equation (3) in the cyclic group case {K = \frac{1}{N}{\bf Z}/{\bf Z}}. As this group embeds into the torus {{\bf T}}, it is easy to use existing methods to obtain (3) but with the homomorphism {c_z} and the function {F_z} taking values in {{\bf R}/{\bf Z}} rather than in {\frac{1}{N}{\bf Z}/{\bf Z}}. The main task is then to fix up the homomorphism {c_z} so that it takes values in {\frac{1}{N}{\bf Z}/{\bf Z}}, that is to say that {Nc_z} vanishes. This only needs to be done locally near the origin, because the claim is easy when {z} lies in the dense subgroup {\phi(\Gamma)} of {Z}, and also because the claim can be shown to be additive in {z}. Near the origin one can leverage the Steinhaus lemma to make {c_z} depend linearly (or more precisely, homomorphically) on {z}, and because the cocycle {\rho} already takes values in {\frac{1}{N}{\bf Z}/{\bf Z}}, {N\rho} vanishes and {Nc_z} must be an eigenvalue of the system {Z}. But as {Z} was assumed to be separable, there are only countably many eigenvalues, and by another application of Steinhaus and linearity one can then make {Nc_z} vanish on an open neighborhood of the identity, giving the claim.

Terence TaoVenn and Euler type diagrams for vector spaces and abelian groups

A popular way to visualise relationships between some finite number of sets is via Venn diagrams, or more generally Euler diagrams. In these diagrams, a set is depicted as a two-dimensional shape such as a disk or a rectangle, and the various Boolean relationships between these sets (e.g., that one set is contained in another, or that the intersection of two of the sets is equal to a third) are represented by the Boolean algebra of these shapes; Venn diagrams correspond to the case where the sets are in “general position” in the sense that all non-trivial Boolean combinations of the sets are non-empty. For instance to depict the general situation of two sets {A,B} together with their intersection {A \cap B} and union {A \cup B} one might use a Venn diagram such as


(where we have given each region depicted a different color, and moved the edges of each region a little away from each other in order to make them all visible separately), but if one wanted to instead depict a situation in which the intersection {A \cap B} was empty, one could use an Euler diagram such as


One can use the area of various regions in a Venn or Euler diagram as a heuristic proxy for the cardinality {|A|} (or measure {\mu(A)}) of the set {A} corresponding to such a region. For instance, the above Venn diagram can be used to intuitively justify the inclusion-exclusion formula

\displaystyle  |A \cup B| = |A| + |B| - |A \cap B|

for finite sets {A,B}, while the above Euler diagram similarly justifies the special case

\displaystyle  |A \cup B| = |A| + |B|

for finite disjoint sets {A,B}.
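
The two counting formulas above can be checked directly on small examples. The following is a minimal sketch in Python; the particular sets `A` and `B` are hypothetical choices made only for illustration.

```python
# Hypothetical example sets; any finite sets would do.
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7}

union = A | B
intersection = A & B

# Inclusion-exclusion: the overlap is counted twice in |A| + |B|,
# so it is subtracted once.
lhs = len(union)
rhs = len(A) + len(B) - len(intersection)

# Disjoint special case: remove the overlap from B first.
B_disjoint = B - A
lhs_disjoint = len(A | B_disjoint)
rhs_disjoint = len(A) + len(B_disjoint)

print(lhs == rhs, lhs_disjoint == rhs_disjoint)
```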

While Venn and Euler diagrams are traditionally two-dimensional in nature, there is nothing preventing one from using one-dimensional diagrams such as


or even three-dimensional diagrams such as this one from Wikipedia:


Of course, in such cases one would use length or volume as a heuristic proxy for cardinality or measure, rather than area.

With the addition of arrows, Venn and Euler diagrams can also accommodate (to some extent) functions between sets. Here for instance is a depiction of a function {f: A \rightarrow B}, the image {f(A)} of that function, and the image {f(A')} of some subset {A'} of {A}:


Here one can illustrate surjectivity of {f: A \rightarrow B} by having {f(A)} fill out all of {B}; one can similarly illustrate injectivity of {f} by giving {f(A)} exactly the same shape (or at least the same area) as {A}. So here for instance might be how one would illustrate an injective function {f: A \rightarrow B}:


Cartesian product operations can be incorporated into these diagrams by appropriate combinations of one-dimensional and two-dimensional diagrams. Here for instance is a diagram that illustrates the identity {(A \cup B) \times C = (A \times C) \cup (B \times C)}:


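The distributive identity for Cartesian products can also be confirmed on a small example. This is only a sketch; the sets `A`, `B`, `C` below are hypothetical.

```python
from itertools import product

# Hypothetical example sets.
A = {1, 2}
B = {2, 3}
C = {"x", "y"}

# (A ∪ B) × C, built as a set of ordered pairs.
lhs = set(product(A | B, C))

# (A × C) ∪ (B × C).
rhs = set(product(A, C)) | set(product(B, C))

print(lhs == rhs)
```
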
In this blog post I would like to propose a similar family of diagrams to illustrate relationships between vector spaces (over a fixed base field {k}, such as the reals) or abelian groups, rather than sets. The categories of ({k}-)vector spaces and abelian groups are quite similar in many ways; the former consists of modules over a base field {k}, while the latter consists of modules over the integers {{\bf Z}}; also, both categories are basic examples of abelian categories. The notion of a dimension in a vector space is analogous in many ways to that of cardinality of a set; see this previous post for an instance of this analogy (in the context of Shannon entropy). (UPDATE: I have learned that an essentially identical notation has also been proposed in an unpublished manuscript of Ravi Vakil.)

As with Venn and Euler diagrams, the diagrams I propose for vector spaces (or abelian groups) can be set up in any dimension. For simplicity, let’s begin with one dimension, and restrict attention to vector spaces (the situation for abelian groups is basically identical). In this one-dimensional model we will be able to depict the following relations and operations between vector spaces:
  • The inclusion {W \leq V} of one vector space {W} in another {V} (here I prefer to use the group notation {\leq} for inclusion rather than the set notation {\subseteq}).
  • The quotient {V/W} of a vector space {V} by a subspace {W}.
  • A linear transformation {T: V \rightarrow W} between vector spaces, as well as the kernel {\mathrm{ker}(T)}, image {\mathrm{im}(T)}, cokernel {\mathrm{coker}(T) = W/\mathrm{im}(T)}, and the coimage {\mathrm{coim}(T) = V/\mathrm{ker}(T)}.
  • A single short or long exact sequence between vector spaces.
  • (A heuristic proxy for) the dimension of a vector space.
  • Direct sum {V \oplus W} of two spaces.

The idea is to use half-open intervals {[a,b)} in the real line for any {a<b} to model vector spaces. In fact we can make an explicit correspondence: let us identify each half-open interval {[a,b)} with the (infinite-dimensional) vector space

\displaystyle  [a,b) \equiv \{ f \in C([a,b]): f(b) = 0 \},

that is {[a,b)} is identified with the space of continuous functions {f:[a,b] \rightarrow {\bf R}} on the interval {[a,b]} that vanish at the right-endpoint {b}. Such functions can be continuously extended by zero to the half-line {[a,+\infty)}.

Note that if {a < b < c} then the vector space {[a,b)} is a subspace of {[a,c)}, if we extend the functions in both spaces by zero to the half-line {[a,+\infty)}; furthermore, the quotient of {[a,c)} by {[a,b)} is naturally identifiable with {[b,c)}. Thus, an inclusion {W \leq V}, as well as the quotient space {V/W}, can be modeled here as follows:


In contrast, if {a < b < c < d}, it is significantly less “natural” to view {[b,c)} as a subspace of {[a,d)}; one could do it by extending functions in {[b,c)} to the right by zero and to the left by constants, but in this notational convention one should view such an identification as “artificial” and to be avoided.

All of the spaces {[a,b)} are infinite dimensional, but morally speaking the dimension of the vector space {[a,b)} is “proportional” to the length {b-a} of the corresponding interval. Intuitively, if we try to discretise this vector space by sampling at some mesh of spacing {\varepsilon}, one gets a finite-dimensional vector space of dimension roughly {(b-a)/\varepsilon}. Already the above diagram now depicts the basic identity

\displaystyle  \mathrm{dim}(V) = \mathrm{dim}(W) + \mathrm{dim}(V/W)

between a finite-dimensional vector space {V}, a subspace {W} of that space, and a quotient of that space.
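
The discretisation heuristic can be made concrete. The sketch below assumes a mesh of spacing {\varepsilon} starting at the left endpoint of each interval, with endpoints that are multiples of {\varepsilon}; the function name `mesh_dimension` and the values of `a`, `b`, `c`, `eps` are hypothetical. Since a function in {[a,b)} vanishes at {b}, its samples at the mesh points inside {[a,b)} are the only free parameters, and counting them gives the discretised dimension.

```python
from fractions import Fraction
from math import ceil

def mesh_dimension(a, b, eps):
    """Count the mesh points a, a + eps, a + 2*eps, ... that lie in [a, b)."""
    return ceil((Fraction(b) - Fraction(a)) / Fraction(eps))

# Hypothetical endpoints a < b < c and mesh spacing.
a, b, c = 0, 2, 5
eps = Fraction(1, 10)

dim_V = mesh_dimension(a, c, eps)         # V   = [a, c)
dim_W = mesh_dimension(a, b, eps)         # W   = [a, b), a subspace of V
dim_quotient = mesh_dimension(b, c, eps)  # V/W = [b, c)

# Lengths of intervals add, and so do the discretised dimensions.
print(dim_V, dim_W, dim_quotient, dim_V == dim_W + dim_quotient)
```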

Note that if {a < b < c < d}, then there is a linear transformation {T} from the vector space {[a,c)} to the vector space {[b,d)} which takes a function {f} in {[a,c)}, restricts it to {[b,c)}, then extends it by zero to {[b,d)}. The kernel of this transformation is {[a,b)}, the image is (isomorphic to) {[b,c)}, the cokernel is (isomorphic to) {[c,d)}, and the coimage is (isomorphic to) {[b,c)}. With this in mind, we can now depict a general linear transformation {T: V \rightarrow W} and its associated spaces by the following diagram:


Note how the first isomorphism theorem and the rank-nullity theorem are heuristically illustrated by this diagram. One can specialise to the case of injective, surjective, or bijective transformations {T} by degenerating one or more of the half-open intervals in the above diagram to the empty interval. A left shift on {[a,b)} gives rise to a nilpotent linear transformation {T} from {[a,b)} to itself:


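As a sanity check on the two transformations just described, one can discretise on an integer mesh and compute ranks. This is a sketch under stated assumptions: mesh spacing 1, hypothetical endpoints {a,b,c,d = 0,2,5,9}, and helper names (`rank`, `restriction_matrix`, `left_shift_matrix`, `matmul`) invented for this illustration. The kernel dimension is obtained from the rank via rank-nullity, and all three dimensions are then compared against the lengths predicted by the interval picture.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def restriction_matrix(a, b, c, d):
    """Matrix of T: [a,c) -> [b,d) on the integer mesh.

    Column j is the mesh point a + j of [a,c); row i is the mesh point b + i of [b,d).
    T keeps the samples at points common to both intervals (those in [b,c))
    and sets all other output samples to zero.
    """
    return [[1 if b + i == a + j else 0 for j in range(c - a)]
            for i in range(d - b)]

def left_shift_matrix(n, s):
    """Left shift by s mesh steps on n samples: (Sf)(x_i) = f(x_{i+s}), zero past the end."""
    return [[1 if j == i + s else 0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

a, b, c, d = 0, 2, 5, 9
T = restriction_matrix(a, b, c, d)
dim_V, dim_W = c - a, d - b
dim_image = rank(T)
dim_kernel = dim_V - dim_image    # rank-nullity
dim_cokernel = dim_W - dim_image

# Compare with the lengths of [a,b), [b,c), [c,d).
print(dim_kernel == b - a, dim_image == c - b, dim_cokernel == d - c)

# A left shift by 2 on 6 samples is nilpotent: its cube vanishes.
S = left_shift_matrix(6, 2)
S_squared = matmul(S, S)
S_cubed = matmul(S, S_squared)
print(all(x == 0 for row in S_cubed for x in row))
```
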
In a similar spirit, a short exact sequence {0 \rightarrow U \rightarrow V \rightarrow W \rightarrow 0} of vector spaces (or abelian groups) can now be depicted by the diagram


and a long exact sequence {V_1 \rightarrow V_2 \rightarrow V_3 \rightarrow \dots} can similarly be depicted by the diagram


UPDATE: As I have learned from an unpublished manuscript of Ravi Vakil, this notation can also easily depict the cohomology groups {H^1,H^2,H^3,\dots} of a cochain complex {A^0 \rightarrow A^1 \rightarrow A^2 \rightarrow A^3 \rightarrow \dots} by the diagram


and similarly depict the homology groups {H_1, H_2, H_3, \dots} of a chain complex {A_0 \leftarrow A_1 \leftarrow A_2 \leftarrow \dots} by the diagram


One can associate the disjoint union of half-open intervals to the direct sum of the associated vector spaces, giving a way to depict direct sums via this notation:


To increase the expressiveness of this notation we now move to two dimensions, where we can also depict the following additional relations and operations:

  • The intersection {U \cap V} and sum {U+V} of two subspaces {U,V \leq W} of an ambient space {W};
  • Multiple short or long exact sequences;
  • The tensor product {U \otimes V} of two vector spaces {U,V}.

Here, we replace half-open intervals by half-open sets: geometric shapes {S}, such as polygons or disks, which contain some portion of the boundary (drawn using solid lines) but omit some other portion of the boundary (drawn with dashed lines). Each such shape can be associated with a vector space, namely the continuous functions on {\overline{S}} that vanish on the omitted portion of the boundary. All of the relations that were previously depicted using one-dimensional diagrams can now be similarly depicted using two-dimensional diagrams. For instance, here is a two-dimensional depiction of a vector space {V}, a subspace {W}, and its quotient {V/W}:


(In this post I will try to consistently make the lower and left boundaries of these regions closed, and the upper and right boundaries open, although this is not essential for this notation to be applicable.)

But now we can depict some additional relations. Here for instance is one way to depict the intersection {U \cap V} and sum {U+V} of two subspaces {U,V \leq W}:


Note how this illustrates the identity

\displaystyle  \mathrm{dim}(U + V) = \mathrm{dim}(U) + \mathrm{dim}(V) - \mathrm{dim}(U \cap V)

between finite-dimensional vector spaces {U, V}, as well as some standard isomorphisms such as {(U+V)/U \equiv V/(U \cap V)}.
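
One way to test the dimension identity without assuming it is to work over the two-element field, where a subspace can be listed element by element and its dimension read off from its cardinality. The sketch below makes that assumption (base field {{\bf F}_2}, vectors encoded as integers with XOR as addition); the helper names and the choice of generators are hypothetical.

```python
def span(generators):
    """All elements of the F_2-subspace spanned by the given bit-vectors."""
    elements = {0}
    for g in generators:
        # Adjoin g: the new span is the old one together with its translate by g.
        elements |= {e ^ g for e in elements}
    return elements

def dim(subspace):
    """Dimension over F_2: a subspace with 2**k elements has dimension k."""
    return len(subspace).bit_length() - 1

def subspace_sum(U, V):
    """The sum U + V, i.e. all u + v with u in U and v in V."""
    return {u ^ v for u in U for v in V}

# Two hypothetical 3-dimensional subspaces of F_2^4.
U = span([0b1000, 0b0100, 0b0010])
V = span([0b0010, 0b0001, 0b1100])

meet = U & V             # intersection, computed as a set
join = subspace_sum(U, V)

print(dim(join) == dim(U) + dim(V) - dim(meet))

# Dimension count behind (U+V)/U ≡ V/(U ∩ V).
print(dim(join) - dim(U) == dim(V) - dim(meet))
```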

Two subgroups {H,K} of an abelian group {G} are said to be commensurable if {H \cap K} is a finite index subgroup of {H+K}. One can depict this by making the area of the region between {H \cap K} and {H+K} small and/or colored with some specific color:


Here the commensurability of {H,K} is equivalent to the finiteness of the groups {H / (H \cap K) \equiv (H+K)/K} and {K / (H \cap K) \equiv (H+K)/H}, which correspond to the gray triangles in the above diagram. Now for instance it becomes intuitively clear why commensurability should be an equivalence relation.
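
For a concrete instance, take {G = {\bf Z}} with {H = 4{\bf Z}} and {K = 6{\bf Z}} (a hypothetical choice; any two nonzero subgroups of {{\bf Z}} are commensurable). Then {H \cap K = 12{\bf Z}} and {H + K = 2{\bf Z}}, and the indices can be computed with elementary arithmetic. The helper names below are invented for this sketch.

```python
from math import gcd

def lcm(m, n):
    return m * n // gcd(m, n)

def index(big, small):
    """Index [big*Z : small*Z], assuming big divides small so that small*Z <= big*Z."""
    assert small % big == 0
    return small // big

h, k = 4, 6
meet = lcm(h, k)   # hZ ∩ kZ = lcm(h, k) Z
join = gcd(h, k)   # hZ + kZ = gcd(h, k) Z

# The two pairs of isomorphic quotients from the text have matching orders.
order_H_mod_meet = index(h, meet)     # |H / (H ∩ K)|
order_join_mod_K = index(join, k)     # |(H + K) / K|
order_K_mod_meet = index(k, meet)     # |K / (H ∩ K)|
order_join_mod_H = index(join, h)     # |(H + K) / H|

# The total index of H ∩ K in H + K is the product of the two.
total_index = index(join, meet)

print(order_H_mod_meet, order_join_mod_K, order_K_mod_meet, order_join_mod_H, total_index)
```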

To illustrate how this notation can support multiple short exact sequences, I gave myself the exercise of using this notation to depict the snake lemma, as labeled by the following diagram taken from the just linked Wikipedia page:


This turned out to be remarkably tricky to accomplish without introducing degeneracies (e.g., one of the kernels or cokernels vanishing). Here is one solution I came up with; perhaps there are more elegant ones. In particular, there should be a depiction that more explicitly captures the duality symmetry of the snake diagram.


Here, the maps between the six spaces {A,B,C,A',B',C'} are the obvious restriction maps (and one can visually verify that the two squares in the snake diagram actually commute). Each of the kernel and cokernel spaces of the three vertical restriction maps {a,b,c} is then associated to the union of two of the subregions as indicated in the diagram. Note how the overlaps between these kernels and cokernels generate the long exact “snake”.

UPDATE: by modifying a similar diagram in an unpublished manuscript of Ravi Vakil, I can now produce a more symmetric version of the above diagram, again with a very visible “snake”:


With our notation, the (algebraic) tensor product of an interval {[a,b)} and another interval {[c,d)} is not quite {[a,b) \times [c,d)}, but this becomes true if one uses the {C^*}-algebra version of the tensor product, thanks to the Stone-Weierstrass theorem. So one can plausibly use Cartesian products as a proxy for the vector space tensor product. For instance, here is a depiction of the relation {(U \otimes W) / (V \otimes W) \equiv (U/V) \otimes W} when {V} is a subspace of {U}:


There are unfortunately some limitations to this notation: for instance, no matter how many dimensions one uses for one’s diagrams, these diagrams would suggest the incorrect identity

\displaystyle \mathrm{dim}(U +V + W) = \mathrm{dim} U + \mathrm{dim} V + \mathrm{dim} W - \mathrm{dim} (U \cap V) - \mathrm{dim} (U \cap W) - \mathrm{dim} (V \cap W) + \mathrm{dim}(U \cap V \cap W),

(which incidentally is, at this time of writing, the highest-voted answer to the MathOverflow question “Examples of common false beliefs in mathematics”). (See also this previous blog post for a similar phenomenon when using sets or vector spaces to model entropy of information variables.) Nevertheless it seems accurate enough to be of use in illustrating many common relations between vector spaces and abelian groups. With appropriate grains of salt it might also be usable for further categories beyond these two, though for non-abelian categories one should proceed with caution, as the diagram may suggest relations that are not actually true in this category. For instance, in the category of topological groups one might use the diagram


to describe the fact that an arbitrary topological group splits into a connected subgroup and a totally disconnected quotient, or in the category of finite-dimensional Lie algebras over the reals one might use the diagram


to describe the fact that such algebras split into the solvable radical and a semisimple quotient.
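
Returning to the three-subspace formula flagged above as incorrect, the standard counterexample is three distinct lines through the origin in a plane. The sketch below checks it over the two-element field, where the plane has exactly three such lines; the encoding of vectors as integers with XOR as addition, and the helper names, are assumptions of this illustration.

```python
def span(generators):
    """All elements of the F_2-subspace spanned by the given bit-vectors."""
    elements = {0}
    for g in generators:
        elements |= {e ^ g for e in elements}
    return elements

def dim(subspace):
    """Dimension over F_2: a subspace with 2**k elements has dimension k."""
    return len(subspace).bit_length() - 1

# The three lines through the origin in the plane F_2^2.
U = span([0b01])
V = span([0b10])
W = span([0b11])

total = {u ^ v ^ w for u in U for v in V for w in W}   # U + V + W

lhs = dim(total)
rhs = (dim(U) + dim(V) + dim(W)
       - dim(U & V) - dim(U & W) - dim(V & W)
       + dim(U & V & W))

# The sum is the whole plane, but the inclusion-exclusion count overshoots.
print(lhs, rhs)
```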

December 18, 2021

Doug NatelsonNo, a tardigrade was not meaningfully entangled with a qubit

This week this paper appeared on the arxiv, claiming to have entangled a tardigrade with a superconducting transmon qubit system.  My readers know that I very rarely call out a paper in a negative way here, because that's not the point of this blog, but this seems to be getting a lot of attention, including in Physics World and New Scientist.  I also don't know how seriously the authors were about this - it could be a tongue-in-cheek piece.  That said, it's important to point out that the authors did not entangle a tardigrade with a qubit in any meaningful sense.  This is not "quantum biology".

Tardigrades are amazingly robust.  We now have a demonstration that you can cool a tardigrade in high vacuum down to millikelvin temperatures, and if you are sufficiently gentle with the temperature and pressure changes, it is possible to revive the little creature.  

What the authors did here was put a tardigrade on top of the capacitive parts of one of two coupled transmon qubits.  The tardigrade is mostly (frozen) water, and here it acts like a dielectric, shifting the resonance frequency of the one qubit that it sat on.   (It is amazing deep down that one can approximate the response of all the polarizable bits of the tardigrade as a dielectric function, but the same could be said for any material.)

This is not entanglement in any meaningful sense. If it were, you could say by the same reasoning that the qubits are entangled with the macroscopic silicon chip substrate.  The tardigrade does not act as a single quantum object with a small number of degrees of freedom.  The dynamics of the tardigrade's internal degrees of freedom do not act to effectively decohere the qubit (which is what happens when a qubit is entangled with many dynamical degrees of freedom that are then traced over).  

Atoms and molecules in our bodies are constantly entangling at a quantum level with each other and with the environment around us.  Decoherence means that trying to look at these tiny constituents and see coherent quantum processes related to entanglement generally becomes hopeless on very short timescales.  People still argue over exactly how the classical world seems to emerge from this constant churning of entanglement - it is fascinating.  Just nothing to do with the present paper. 

December 17, 2021

Matt von HippelCalculations of the Past

Last week was a birthday conference for one of the pioneers of my sub-field, Ettore Remiddi. I wasn’t there, but someone who was pointed me to some of the slides, including a talk by Stefano Laporta. For those of you who didn’t see my post a few weeks back, Laporta was one of Remiddi’s students, who developed one of the most important methods in our field and then vanished, spending ten years on an amazingly detailed calculation. Laporta’s talk covers more of the story, about what it was like to do precision calculations in that era.

“That era”, the 90’s through 2000’s, witnessed an enormous speedup in computers, and a corresponding speedup in what was possible. Laporta worked with Remiddi on the three-loop electron anomalous magnetic moment, something Remiddi had been working on since 1969. When Laporta joined in 1989, twenty-one of the seventy-two diagrams needed had still not been computed. They would polish them off over the next seven years, before Laporta dove into four loops. Twenty years later, he had that four-loop result to over a thousand digits.

One fascinating part of the talk is seeing how the computational techniques change over time, as language replaces language and computer clusters get involved. As a student, Laporta learns a lesson we all often need: that to avoid mistakes, we need to do as little by hand as possible, even for something as simple as copying a one-line formula. Looking at his review of others’ calculations, it’s remarkable how many theoretical results had to be dramatically corrected a few years down the line, and how much still might depend on theoretical precision.

Another theme was one of Remiddi suggesting something and Laporta doing something entirely different, and often much more productive. Whether it was using the arithmetic-geometric mean for an elliptic integral instead of Gaussian quadrature, or coming up with his namesake method, Laporta spent a lot of time going his own way, and Remiddi quickly learned to trust him.

There’s a lot more in the slides that’s worth reading, including a mention of one of this year’s Physics Nobelists. The whole thing is an interesting look at what it takes to press precision to the utmost, and dedicate years to getting something right.

Scott Aaronson An alarming trend in K-12 math education: a guest post and an open letter

Updates: Our open letter made the WSJ, and has been tweeted by Matt Yglesias and others. See also Boaz Barak’s thread, and a supportive tweet from Rep. Ro Khanna (D-Silicon Valley). If you’re just arriving here, try TodayMag for an overview of some of the issues at stake. Added: Newsweek. See also this post in Spanish. And see a post by Greg Ashman for more details on what’s been happening in California.

Today, I’m turning over Shtetl-Optimized to an extremely important guest post by theoretical computer scientists Boaz Barak of Harvard and Edith Cohen of Google (cross-posted on the windows on theory blog). In addition to the post below, please read—and if relevant, consider signing—our open letter about math education in the US, which now has 1415 signatories, including Fields Medalists, Turing Award winners, MacArthur Fellows, and Nobel laureates. Finally, check out our fuller analysis of what the California Mathematics Framework is poised to do and why it’s such an urgent crisis for math education. I’m particularly grateful to my colleagues for their writing efforts, since I would never have been able to discuss what’s happening in such relatively measured words. –Scott Aaronson

Mathematical education at the K-12 level is critical for preparation for STEM careers. An ongoing challenge to the US K-12 system is to improve the preparation of students for advanced mathematics courses and expand access and enrollment in these courses. As stated by a Department of Education report “taking Algebra I before high school … can set students up for a strong foundation of STEM education and open the door for various college and career options.” The report states that while 80% of all students have access to Algebra I in middle school, only 24% enroll. This is also why the goal of Bob Moses’ Algebra Project is to ensure that “every child must master algebra, preferably by eighth grade, for algebra is the gateway to the college-prep curriculum, which in turn is the path to higher education.”

The most significant potential for growth is among African American or Latino students, among whom only 12% enroll in Algebra before high school. This untapped potential has longer-term implications for both society and individuals. For example, although African Americans and Latinos comprise 13% and 18% (respectively) of the overall US population, they only account for 4% and 11% of engineering degrees. There is also a gap in access by income: Calculus is offered in 92% of schools serving the top income quartile but only in 77% of schools serving the bottom quartile (as measured by the share of students eligible for free or reduced-price lunch). Thus minority and low income students have less access to STEM jobs, which yield more than double the median salary of non-STEM jobs, and are projected to grow at a 50% higher rate over the next decade.

Given these disparities, we applaud efforts such as the Algebra Project, the Calculus Project, and Bridge to Enter Advanced Mathematics that increase access to advanced mathematical education to underserved students. However, we are concerned by recent approaches, such as the proposed California Math Framework (CMF) revisions,  that take the opposite direction.

See this document for a detailed analysis and critique of the CMF, but the bottom line is that rather than trying to elevate under-served students, such “reforms” reduce access and options for all students. In particular, the CMF encourages schools to stop offering Algebra I in middle school, while placing obstacles (such as doubling-up, compressed courses, or outside-of-school private courses) in the way of those who want to take advanced math in higher grades. When similar reforms were implemented in San Francisco, they resulted in an “inequitable patchwork scheme” of workarounds that affluent students could access but that their less privileged counterparts could not. The CMF also promotes trendy and shallow courses (such as a nearly math-free version of  “data science”) as recommended alternatives  to foundational mathematical courses such as Algebra and Calculus. These courses do not prepare students even for careers in data science itself!

As educators and practitioners, we have seen first-hand the value of solid foundations in mathematics for pursuing college-level STEM education and a productive STEM career. 

We believe that many of the changes proposed by the CMF, while well-intentioned, are deeply misguided and will disproportionately harm under-resourced students. Adopting them would result in a student population that is less prepared to succeed in STEM and other 4-year quantitative degrees in college. The CMF states that “many students, parents, and teachers encourage acceleration beginning in grade eight (or sooner) because of mistaken beliefs that Calculus is an important high school goal.” Students, parents, and teachers are not mistaken. Neither is the National Society of Black Engineers (NSBE), which in 2015 set a goal to double the number of African American students taking calculus by 2025. Calculus is not the only goal of K-12 math education, but it is important for students who wish to prepare for STEM in college and beyond.

We agree that calculus is not the “be-all and end-all” of high-school mathematics education. In particular, we encourage introducing options such as logic, probability, discrete mathematics, and algorithms design in the K-12 curriculum, as they can be valuable foundations for STEM education in college. However, when taught beyond a superficial level (which unfortunately is not the case in the CMF “data science” proposals), these topics are not any easier  than calculus. They require the same foundations of logic, algebra, and functions, and fluency with numbers and calculations. Indeed, the career paths with the highest potential for growth require more and deeper mathematical preparation than ever before. Calculus and other mathematical foundations are not important because they are admission requirements for colleges, or because they are relics of the “Sputnik era”. They are important because they provide fundamental knowledge and ways of thinking that are necessary for success in these fast growing and in-demand fields.

We also fully support incorporating introductory data analysis and coding skills in the K-12 curriculum (and there are some good approaches for doing so).  But we strongly disagree with marketing such skills as replacing foundational skills in algebra and calculus when preparing for 4-year college degrees in STEM and other quantitative fields. These topics are important and build on basic math foundations but are not a replacement for such foundations any more than social media skills can replace reading and writing foundations. 

Given the above, we, together with more than 150 scientists, educators, and practitioners in STEM, have signed an open letter expressing our concerns with such trends. The signatories include STEM faculty in public and private universities and practitioners from industry. They include educators with decades of experience teaching students at all levels, as well as researchers that won the highest awards in their fields, including the Fields Medal and the Turing Award. Signers also include people vested in mathematical high-school education, such as Adrian Mims (founder of The Calculus Project) and Jelani Nelson (UC Berkeley EECS professor and founder of AddisCoder) who have spearheaded projects to increase access to underserved populations.

We encourage you to read the letter, and if you are a US-based STEM professional or educator, consider signing it as well:

Unfortunately, in recent years, debates on US education have become politicized. The letter is not affiliated with any political organization, and we believe that the issues we highlight transcend current political debates. After all, expanding access to mathematical education is both socially just and economically wise.

Note: The above guest post reflects the views of its authors, Boaz Barak and Edith Cohen. Any comments below by them, me, or other signatories reflect their own views, not necessarily those of the entire group. –SA

December 15, 2021

Tommaso DorigoRadiation Zero

Interference is a fascinating effect, and one which can be observed in a wide variety of physical systems - any system that involves the propagation of waves from different sources. We can observe interference between waves in the sea or in a lake, or even in our bathtub; we can hear the effect of interference between sound waves; or we can observe the fascinating patterns created by interference effects in light propagation. In addition to all that, we observe interference between the amplitudes of quantum phenomena by studying particle physics processes.

read more

December 13, 2021

John PreskillA quantum-steampunk photo shoot

Shortly after becoming a Fellow of QuICS, the Joint Center for Quantum Information and Computer Science, I received an email from a university communications office. The office wanted to take professional photos of my students and postdocs and me. You’ve probably seen similar photos, in which theoretical physicists are writing equations, pointing at whiteboards, and thinking deep thoughts. No surprise there. 

A big surprise followed: Tom Ventsias, the director of communications at the University of Maryland Institute for Advanced Computer Studies (UMIACS), added, “I wanted to hear your thoughts about possibly doing a dual photo shoot for you—one more ‘traditional,’ one ‘quantum steampunk’ style.”

Steampunk, as Quantum Frontiers regulars know, is a genre of science fiction. It combines futuristic technologies, such as time machines and automata, with Victorian settings. I call my research “quantum steampunk,” as it combines the cutting-edge technology of quantum information science with the thermodynamics—the science of energy—developed during the 1800s. I’ve written a thesis called “Quantum steampunk”; authored a trade nonfiction book with the same title; and presented enough talks about quantum steampunk that, strung together, they’d give one laryngitis. But I don’t own goggles, hoop skirts, or petticoats. The most steampunk garb I’d ever donned before this autumn, I wore for a few minutes at age six or so, for dress-up photos at a theme park. I don’t even like costumes.

But I earned my PhD under the auspices of fellow Quantum Frontiers blogger John Preskill,1 whose career suggests a principle to live by: While unravelling the universe’s nature and helping to shape humanity’s intellectual future, you mustn’t take yourself too seriously. This blog has exhibited a photo of John sitting in Caltech’s information-sciences building, exuding all the gravitas of a Princeton degree, a Harvard degree, and a world-impacting career—sporting a baseball glove you’d find in a high-school gym class, as though it were a Tag Heuer watch. John adores baseball, and the photographer who documented Caltech’s Institute for Quantum Information and Matter brought out the touch of whimsy like the ghost of a smile.

Let’s try it, I told Tom.

One rust-colored November afternoon, I climbed to the top of UMIACS headquarters—the Iribe Center—whose panoramic view of campus begs for photographs. Two students were talking in front of a whiteboard, and others were lunching on the sandwiches, fruit salad, and cheesecake ordered by Tom’s team. We took turns brandishing markers, gesturing meaningfully, and looking contemplative.

Then, the rest of my team dispersed, and the clock rewound 150 years.

The professionalism and creativity of Tom’s team impressed me. First, they’d purchased a steampunk hat, complete with goggles and silver wires. Recalling the baseball-glove photo, I suggested that I wear the hat while sitting at a table, writing calculations as I ordinarily would.

What hat? Quit bothering me while I’m working.

Then, the team upped the stakes. Earlier that week, Maria Herd, a member of the communications office, had driven me to the University of Maryland performing-arts center. We’d sifted through the costume repository until finding skirts, vests, and a poofy white shirt reminiscent of the 1800s. I swapped clothes near the photo-shoot area, while the communications team beamed a London street in from the past. Not really, but they nearly did: They’d found a backdrop suitable for the 2020 Victorian-era Netflix hit Enola Holmes and projected the backdrop onto a screen. I stood in front of the screen, and a sheet of glass stood in front of me. I wrote equations on the glass while the photographer, John Consoli, snapped away.

The final setup, I would never have dreamed of. Days earlier, the communications team had located an elevator lined, inside, with metal links. They’d brought colorful, neon-lit rods into the elevator and experimented with creating futuristic backdrops. On photo-shoot day, they positioned me in the back of the elevator and held the light-saber-like rods up. 

But we couldn’t stop anyone from calling the elevator. We’d ride up to the third or fourth floor, and the door would open. A student would begin to step in; halt; and stare at my floor-length skirt, the neon lights, and the photographer’s back.

“Feel free to get in.” John’s assistant, Gail Marie Rupert, would wave them inside. The student would shuffle inside—in most cases—and the door would close.

“What floor?” John would ask.


John would twist around, press the appropriate button, and then turn back to his camera.

Once, when the door opened, the woman who entered complimented me on my outfit. Another time, the student asked if he was really in the Iribe Center. I regard that question as evidence of success.

John Consoli took 654 photos. I found the process fascinating, as a physicist. I have a domain of expertise; and I know the feeling of searching for—working toward—pushing for—a theorem or a conceptual understanding that satisfies me, in that domain. John’s area of expertise differs from mine, so I couldn’t say what he was searching for. But I recognized his intent and concentration, as Gail warned him that time had run out and he then made an irritated noise, inched sideways, and stole a few more snapshots. I felt like I was seeing myself in a reflection—not in the glass I was writing on, but in another sphere of the creative life.

The communications team’s eagerness to engage in quantum steampunk—to experiment with it, to introduce it into photography, to make it their own—bowled me over. Quantum steampunk isn’t just a stack of papers by one research group; it’s a movement. Seeing a team invest its time, energy, and imagination in that movement felt like receiving a deep bow or curtsy. Thanks to the UMIACS communications office for bringing quantum steampunk to life.

The Quantum-Steampunk Lab. Not pictured: Shayan Majidy.

1Who hasn’t blogged in a while. How about it, John?

December 12, 2021

Tommaso DorigoPhotons And Neutral Pions

A bit over a half into my course of particle physics for Masters students in Statistical Sciences I usually find myself describing the CMS detector in some detail, and that is what happened last week.
The course

My course has a duration of 64 hours, and is structured in four parts. In the first part, which usually takes about 24 hours to complete, I go over the most relevant parts of 20th Century physics. We start from the old quantum theory and then we look at special relativity, the fundamentals of quantum mechanics, the theory of scattering, the study of hadrons and the symmetries that lead to the quark model, to finish with the Higgs mechanism and the Standard Model.


December 07, 2021

Clifford JohnsonCompleting a Story

[A rather technical post follows.]

Sample image from paper. This figure will make more sense later in the post. It is here for decoration. Sit tight.

For curious physicists following certain developments over the last two years, I'll put below one or two thoughts about the new paper I posted on the arXiv a few days ago. It is called "Consistency Conditions for Non-Perturbative Completions of JT Gravity". (Actually, I was writing a different paper, but a glorious idea popped into my head and took over, so this one emerged and jumped out in front of the other. A nice aspect of this story is that I get to wave back at myself from almost 30 years ago, writing my first paper in Princeton, waving to myself 30 years in the future. See my last post about where I happen to be visiting now.) Anyway here are the thoughts:

Almost exactly two years ago I wrote a paper that explained how to define and construct a non-perturbatively stable completion of JT gravity. It had been defined earlier that year as a perturbative [...] Click to continue reading this post

The post Completing a Story appeared first on Asymptotia.

December 06, 2021

Clifford JohnsonA Return

Well, I'm back.

It has been very quiet on the blog recently because I've been largely occupied with the business of moving. Where have I moved to? For the next academic year I'll be on the faculty at Princeton University (as a Presidential Visiting Scholar) in the Physics department. It's sort of funny because, as part of the business of moving forward in my research, I've been looking back a lot on earlier eras of my work recently (as you know from my last two years' exciting activity in non-perturbative matrix models), and rediscovering and re-appreciating (and then enhancing and building on) a lot of the things I was doing decades ago... So now it seems that I'm physically following myself back in time too.

Princeton was in a sense my true physical first point of entry into the USA: My first postdoc was here (at the Institute for Advanced Study, nearby), and I really began [...] Click to continue reading this post

The post A Return appeared first on Asymptotia.

December 05, 2021

Clifford JohnsonFull Circle

snapshot of paper

Yesterday I submitted (with collaborators Felipe Rosso and Andrew Svesko) a new paper to the arXiv that I'm very excited about! It came from one of those lovely moments when a warm flash of realisation splashed through my mind, and several fragments of (seemingly separate things) that had been floating around in my head for some time suddenly all fit together. The fit was so tight and compelling that I had a feeling of certainty that it just "had to be right". It is a great feeling, when that happens. Of course, the details had to be worked out, and everything checked and properly developed, new tools made and some very nice computations done to unpack the consequences of the idea... and that's what resulted in this paper! It is a very natural companion to the cluster of papers I wrote last year, particularly the ones in May and June.

What's the story? It’s all about Jackiw-Teitelboim (JT) gravity, a kind of 2D gravity theory that shows up rather generically as controlling the low temperature physics of a wide class of black holes, including 4D ones in our universe. Understanding the quantum gravity of JT is a very nice step in understanding quantum properties of black holes. This is exciting stuff!

Ok, now I'll get a bit more technical. Some background on all this (JT gravity, matrix models, etc), can be found in an earlier pair of posts. You might recall that in May last year I put out a paper where I showed how to define, fully non-perturbatively, a class of Jackiw-Teitelboim (JT) supergravity theories that had been defined in 2019 in a massive paper by Stanford and Witten (SW). In effect, I showed how to build them as a particular combination of an infinite number of special "minimal string" models called type 0A strings. Those in turn are made using a special class of random matrix model based on [...] Click to continue reading this post

The post Full Circle appeared first on Asymptotia.

December 04, 2021

Jordan EllenbergBaseball and suffering

When I was younger, baseball made me suffer. I believed what Bart Giamatti said about the game: “It breaks your heart. It is designed to break your heart.” When the Orioles lost a big game I was stuck in a foul cloud for hours or days afterwards. When Tanya first encountered me in this state she literally could not believe it had to do with baseball, and really probed to figure out what had really happened. But it was baseball. That’s what happened. Baseball.

I’m different now. I can watch the Orioles lose while wishing they would win and not feel the same kind of angry, bitter suffering I used to. I don’t know what made it change. It might just be the psychic arc of middle age. It’s not that I care less. When they win — whether it’s the good 2014 Orioles getting the ALCS or the awful contemporary version of the team having a rare good night — I thrill to it, just like I have since I was a kid. When they lose, I move on.

It would be good to bring this change to all areas of life. Not to stop caring, but to stop sinking into anger and suffering when things don’t go the way I want. I don’t know how I did it for baseball, so I don’t know how to do it for anything else. Maybe I should just pretend everything is baseball.

Tommaso DorigoThe Beauty Of Grossular

Old timers of this blog will recall that I am an avid stone collector. In fact, of all experimental sciences I am fond of (Physics, Astronomy, Geology above others) Geology is the one that fascinated me first, as a six or seven year old child. We are talking about almost fifty years ago, when newspaper stands in Italy used to sell small packets containing pictures of soccer players (they were not even adhesive back then: you had to use your own glue to attach them in the proper place within collection albums which were sold separately). Kids collected those "figurine", and exchanged them with their peers after school hours (or even during school hours). Other collections offered were ones of minerals, fossils, stickers, etcetera.


December 03, 2021

John PreskillBalancing the tradeoff

So much to do, so little time. Tending to one task is inevitably at the cost of another, so how does one decide how to spend their time? In the first few years of my PhD, I balanced problem sets, literature reviews, and group meetings, but to the detriment of my hobbies. I have played drums my entire life, but I largely fell out of practice in graduate school. Recently, I made time to play with a group of musicians, even landing a couple gigs in downtown Austin, Texas, “live music capital of the world.” I have found attending to my non-physics interests makes my research hours more productive and less taxing. Finding the right balance of on- versus off-time has been key to my success as my PhD enters its final year.

Of course, life within physics is also full of tradeoffs. My day job is as an experimentalist. I use tightly focused laser beams, known as optical tweezers, to levitate micrometer-sized glass spheres. I monitor a single microsphere’s motion as it undergoes collisions with air molecules, and I study the system as an environmental sensor of temperature, fluid flow, and acoustic waves; however, by night I am a computational physicist. I code simulations of interacting qubits subject to kinetic constraints, so-called quantum cellular automata (QCA). My QCA work started a few years ago for my Master’s degree, but my interest in the subject persists. I recently co-authored one paper summarizing the work so far and another detailing an experimental implementation.

The author doing his part to “keep Austin weird” by playing the drums dressed as a grackle (note the beak), the central-Texas bird notorious for overrunning grocery store parking lots.
Balancing research interests: Trapping a glass microsphere with optical tweezers.
Balancing research interests: Visualizing the time evolution of four different QCA rules.

QCA, the subject of this post, are themselves tradeoff-aware systems. To see what I mean, first consider their classical counterparts, cellular automata. In their simplest construction, the system is a one-dimensional string of bits. Each bit takes a value of 0 or 1 (white or black). The bitstring changes in discrete time steps based on a simultaneously-applied local update rule: Each bit, along with its two nearest neighbors, determines the next state of the central bit. Put another way, a bit either flips, i.e., changes 0 to 1 or 1 to 0, or remains unchanged over a timestep depending on the state of that bit’s local neighborhood. Thus, by choosing a particular rule, one encodes a tradeoff between activity (bit flips) and inactivity (bit remains unchanged). Despite their simple construction, cellular automata dynamics are diverse; they can produce fractals and encryption-quality random numbers. One rule even has the ability to run arbitrary computer algorithms, a property known as universal computation.

Classical cellular automata. Left: rule 90 producing the fractal Sierpiński’s triangle. Middle: rule 30 can be used to generate random numbers. Right: rule 110 is capable of universal computation.
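
For readers who want to try this at home, here is a minimal sketch of a one-dimensional cellular automaton using the standard Wolfram rule numbering (the rule number's binary digits list the next state for each of the eight possible neighborhoods). The function names and the periodic boundary conditions are assumptions made for this illustration, not taken from the post.

```python
def step(cells, rule):
    """Apply one simultaneous update to a list of 0/1 cells.

    rule is a Wolfram rule number (0..255). Boundaries are periodic,
    which is an assumption of this sketch.
    """
    n = len(cells)
    new = []
    for i in range(n):
        left, center, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        # Read the neighborhood as a 3-bit number, then look up that bit of the rule.
        index = 4 * left + 2 * center + right
        new.append((rule >> index) & 1)
    return new


def run(rule, width=31, steps=15):
    """Evolve a single black cell in the middle of a white row."""
    cells = [0] * width
    cells[width // 2] = 1
    history = [cells]
    for _ in range(steps):
        cells = step(cells, rule)
        history.append(cells)
    return history


if __name__ == "__main__":
    # Rule 90 draws a Sierpinski-triangle pattern; try 30 or 110 as well.
    for row in run(90):
        print("".join("#" if c else "." for c in row))
```

Changing the argument of `run` from 90 to 30 or 110 switches between the three behaviors in the figure without touching anything else, which is the sense in which the rule number alone sets the balance between activity and inactivity.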

In QCA, bits are promoted to qubits. Instead of being just 0 or 1 like a bit, a qubit can be a continuous mixture of both 0 and 1, a property called superposition. In QCA, a qubit’s two neighbors being 0 or 1 determine whether or not it changes. For example, when in an active neighborhood configuration, a qubit can be coded to change from 0 to “0 plus 1” or from 1 to “0 minus 1”. This is already a head-scratcher, but things get even weirder. If a qubit’s neighbors are in a superposition, then the center qubit can become entangled with those neighbors. Entanglement correlates qubits in a way that is not possible with classical bits.

Do QCA support the emergent complexity observed in their classical cousins? What are the effects of a continuous state space, superposition, and entanglement? My colleagues and I attacked these questions by re-examining many-body physics tools through the lens of complexity science. Singing the lead, we have a workhorse of quantum and solid-state physics: two-point correlations. Singing harmony we have the bread-and-butter of network analysis: complex-network measures. The duet between the two tells the story of structured correlations in QCA dynamics.

In a bit more detail, at each QCA timestep we calculate the mutual information between all qubits i and all other qubits j. Doing so reveals how much there is to learn about one qubit by measuring another, including effects of quantum entanglement. Visualizing each qubit as a node, the mutual information can be depicted as weighted links between nodes: the more correlated two qubits are, the more strongly they are linked. The collection of nodes and links makes a network. Some QCA form unstructured, randomly-linked networks while others are highly structured. 
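
As a rough illustration of the quantity involved, the sketch below computes the mutual information between two qubits of a pure many-qubit state as S(i) + S(j) - S(ij). It assumes the von Neumann entropy measured in bits; the entropy and normalization used in the actual papers may differ, and the function names are made up for this example.

```python
import numpy as np


def reduced_density_matrix(psi, keep, n):
    """Trace out every qubit except those listed in `keep` (pure state psi on n qubits)."""
    psi = np.asarray(psi, dtype=complex).reshape([2] * n)
    keep = list(keep)
    rest = [q for q in range(n) if q not in keep]
    # Bring the kept qubits to the front and flatten to a (kept, rest) matrix.
    m = np.transpose(psi, keep + rest).reshape(2 ** len(keep), -1)
    return m @ m.conj().T


def von_neumann_entropy(rho):
    """Entropy in bits; numerically zero eigenvalues are dropped."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))


def mutual_information(psi, i, j, n):
    """I(i:j) = S(i) + S(j) - S(ij) for qubits i and j."""
    s_i = von_neumann_entropy(reduced_density_matrix(psi, [i], n))
    s_j = von_neumann_entropy(reduced_density_matrix(psi, [j], n))
    s_ij = von_neumann_entropy(reduced_density_matrix(psi, [i, j], n))
    return s_i + s_j - s_ij


if __name__ == "__main__":
    # Qubits 0 and 1 share a Bell pair; qubit 2 is left in state 0.
    psi = np.zeros(8)
    psi[0b000] = psi[0b110] = 1 / np.sqrt(2)
    print(mutual_information(psi, 0, 1, 3))  # entangled pair
    print(mutual_information(psi, 0, 2, 3))  # uncorrelated pair
```

Evaluating `mutual_information` for every pair (i, j) gives a symmetric matrix of link weights, which is the kind of object a network-analysis library can then treat as a weighted graph.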

Complex-network measures are designed to highlight certain structural patterns within a network. Historically, these measures have been used to study diverse networked-systems like friend groups on Facebook, biomolecule pathways in metabolism, and functional-connectivity in the brain. Remarkably, the most structured QCA networks we observed quantitatively resemble those of the complex systems just mentioned despite their simple construction and quantum unitary dynamics. 

Visualizing mutual information networks. Left: A Goldilocks-QCA generated network. Right: a random network.

What’s more, the particular QCA that generate the most complex networks are those that balance the activity-inactivity trade-off. From this observation, we formulate what we call the Goldilocks principle: QCA that generate the most complexity are those that change a qubit if and only if the qubit’s neighbors contain an equal number of 1’s and 0’s. The Goldilocks rules are neither too inactive nor too active, balancing the tradeoff to be “just right.”  We demonstrated the Goldilocks principle for QCA with nearest-neighbor constraints as well as QCA with nearest-and-next-nearest-neighbor constraints.
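
To make the nearest-neighbor Goldilocks rule concrete, here is a small sketch of a three-qubit gate that changes the middle qubit only when its two neighbors hold one 0 and one 1. Using the Hadamard gate as the "change" (0 goes to "0 plus 1", 1 goes to "0 minus 1") is an assumption for illustration, as are the function name and the choice to show only a single local gate rather than a full update schedule across the chain.

```python
import numpy as np

# Hadamard: sends 0 to (0 plus 1)/sqrt(2) and 1 to (0 minus 1)/sqrt(2).
H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
I2 = np.eye(2)
# Projectors onto a neighbor being 0 or being 1.
P = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]


def goldilocks_gate():
    """8x8 gate on (left, middle, right): apply H to the middle qubit
    only if the neighbors contain exactly one 1 and one 0."""
    U = np.zeros((8, 8))
    for a in (0, 1):          # state of the left neighbor
        for b in (0, 1):      # state of the right neighbor
            middle = H if a + b == 1 else I2
            U += np.kron(np.kron(P[a], middle), P[b])
    return U


if __name__ == "__main__":
    U = goldilocks_gate()
    print(np.allclose(U @ U.T, np.eye(8)))  # the gate is unitary
    state = np.zeros(8)
    state[0b001] = 1.0                       # left = 0, middle = 0, right = 1
    print(U @ state)                         # middle qubit is now in superposition
```

Because the neighbors enter only through projectors, a neighbor that is itself in a superposition makes the gate act differently on each branch, which is how the middle qubit ends up entangled with its surroundings.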

To my delight, the scientific conclusions of my QCA research resonate with broader lessons-learned from my time as a PhD student: Life is full of trade-offs, and finding the right balance is key to achieving that “just right” feeling.

November 25, 2021

Sean CarrollThanksgiving

This year we give thanks for something we’ve all heard of, but maybe don’t appreciate as much as we should: electromagnetism. (We’ve previously given thanks for the Standard Model Lagrangian, Hubble’s Law, the Spin-Statistics Theorem, conservation of momentum, effective field theory, the error bar, gauge symmetry, Landauer’s Principle, the Fourier Transform, Riemannian Geometry, the speed of light, the Jarzynski equality, the moons of Jupiter, space, and black hole entropy.)

Physicists like to say there are four forces of nature: gravitation, electromagnetism, the strong nuclear force, and the weak nuclear force. That’s a somewhat sloppy and old-fashioned way of talking. In the old days it made sense to distinguish between “matter,” in the form of particles or fluids or something like that, and “forces,” which pushed around the matter. These days we know it’s all just quantum fields, and both matter and forces arise from the behavior of quantum fields interacting with each other. There is an important distinction between fermions and bosons, which almost maps onto the old-fashioned matter/force distinction, but not quite. If it did, we’d have to include the Higgs force among the fundamental forces, but nobody is really inclined to do that.

The real reason we stick with the traditional four forces is that (unlike the Higgs) they are all mediated by a particular kind of bosonic quantum field, called gauge fields. There’s a lot of technical stuff that goes into explaining what that means, but the basic idea is that the gauge fields help us compare other fields at different points in space, when those fields are invariant under a certain kind of symmetry. For more details, check out this video from the Biggest Ideas in the Universe series (but you might need to go back to pick up some of the prerequisites).

YouTube Video

All of which is just throat-clearing to say: there are four forces, but they’re all different in important ways, and electromagnetism is special. All the forces play some kind of role in accounting for the world around us, but electromagnetism is responsible for almost all of the “interestingness” of the world of our experience. Let’s see why.

When you have a force carried by a gauge field, one of the first questions to ask is what phase the field is in (in whatever physical situation you care about). This is “phase” in the same sense as “phase of matter,” e.g. solid, liquid, gas, etc. In the case of gauge theories, we can think about the different phases in terms of what happens to lines of force — the imaginary paths through space that we would draw to be parallel to the direction of the force exerted at each point.

The simplest thing that lines of force can do is just to extend away from a source, traveling forever through space until they hit some other source. (For electromagnetism, a “source” is just a charged particle.) That corresponds to the field being in the Coulomb phase. Infinitely-stretching lines of force dilute in density as the area through which they are passing increases. In three dimensions of space, that corresponds to spheres we draw around the source, whose area goes up as the distance squared. The magnitude of the force therefore goes as the inverse of the square — the famous inverse square law. In the real world, both gravity and electromagnetism are in the Coulomb phase, and exhibit inverse-square laws.
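
The dilution argument fits in one line. As a sketch, suppose a fixed number N of lines of force leaves the source (N and r are symbols introduced here for illustration). Every one of them crosses a sphere of radius r, so their density, and with it the force, falls off like the inverse of the sphere's area:

```latex
\frac{\text{lines}}{\text{area}} \;=\; \frac{N}{4\pi r^{2}}
\qquad\Longrightarrow\qquad
F(r) \;\propto\; \frac{1}{r^{2}}
```

The same counting in d dimensions of space, where the sphere's area grows as r^(d-1), would give a force falling as 1/r^(d-1), so the exponent 2 is a statement about living in three spatial dimensions.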

But there are other phases. There is the confined phase, where lines of force get all tangled up with each other. There is also the Higgs phase, where the lines of force are gradually absorbed into some surrounding field (the Higgs field!). In the real world, the strong nuclear force is in the confined phase, and the weak nuclear force is in the Higgs phase. As a result, neither force extends farther than subatomic distances.

Phases of gauge fields.
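
One common way to summarize the three phases is by the potential energy between two static sources a distance r apart. The forms below are the standard textbook large-distance behaviors, given as a rough guide only; the symbols σ (a string tension) and m (the mass the force carrier picks up in the Higgs phase, in natural units) are introduced here for illustration.

```latex
V_{\text{Coulomb}}(r) \;\propto\; \frac{1}{r},
\qquad
V_{\text{Higgs}}(r) \;\propto\; \frac{e^{-m r}}{r},
\qquad
V_{\text{confined}}(r) \;\approx\; \sigma\, r
```

The exponential cuts the Higgs-phase force off beyond a distance of order 1/m, while the linearly rising potential in the confined phase means sources cannot be pulled far apart at finite energy cost.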

So there are four gauge forces that push around particles, but only two of them are “long-range” forces in the Coulomb phase. The short-range strong and weak forces are important for explaining the structure of protons and neutrons and nuclei, but once you understand what stable nuclei there are, their work is essentially done, as far as accounting for the everyday world is concerned. (You still need them to explain fusion inside stars, so here we’re just thinking of life here on Earth.) The way that those nuclei come together with electrons to make atoms and molecules and larger structures is all explained by the long-range forces, electromagnetism and gravity.

But electromagnetism and gravity aren’t quite equal here. Gravity is important, obviously, but it’s also pretty simple: everything attracts everything else. (We’re ignoring cosmology etc, focusing in on life here on Earth.) That’s nice — it’s good that we stay attached to the ground, rather than floating away — but it’s not a recipe for intricate complexity.

To get complexity, you need to be able to manipulate matter in delicate ways with your force. Gravity isn’t up to the task — it just attracts. Electromagnetism, on the other hand, is exactly what the doctor ordered. Unlike gravity, where the “charge” is just mass and all masses are positive, electromagnetism has both positive and negative charges. Like charges repel, and opposite charges attract. So by deftly arranging collections of positively and negatively charged particles, you can manipulate matter in whatever way you like.

That pinpoint control over pushing and pulling is crucial for the existence of complex structures in the universe, including you and me. Nuclei join with electrons to make atoms because of electromagnetism. Atoms come together to make molecules because of electromagnetism. Molecules interact with each other in different ways because of electromagnetism. All of the chemical processes in your body, not to mention in the world immediately around you, can ultimately be traced to electromagnetism at work.

Electromagnetism doesn’t get all the credit for the structure of matter. A crucial role is played by the Pauli exclusion principle, which prohibits two electrons from inhabiting exactly the same state. That’s ultimately what gives matter its size — why objects are solid, etc. But without the electromagnetic interplay between atoms of different sizes and numbers of electrons, matter would be solid but inert, just sitting still without doing anything interesting. It’s electromagnetism that allows energy to move from place to place between atoms, both via electricity (electrons in motion, pushed by electromagnetic fields) and radiation (vibrations in the electromagnetic fields themselves).

So we should count ourselves lucky that we live in a world where at least one fundamental force is both in the Coulomb phase and has opposite charges, and give appropriate thanks. It’s what makes the world interesting.

November 23, 2021