# Planet Musings

## March 11, 2014

### Doug Natelson — Coolest paper of 2014 so far, by a wide margin.

Sorry for the brief post, but I could not pass this up.

Check this out:  http://arxiv.org/abs/1403.1211

I bow down before the awesomeness of an origami-based microscope.

### David Hogg — quantum gravity, causality, planets per star

Two seminars today: Gia Dvali (NYU and LMU) gave the lunchtime brown-bag talk, about the possible interaction of quantum gravity with the axion and neutrino sectors. The idea is that the "strong CP problem" implies that some invariant of the gluon field is so small that the interaction between gluons and the graviton field could actually dominate or be dynamically significant. It was only moderately incomprehensible!

Late in the day, Jennifer Hill (NYU) spoke about causal inference in the social sciences. She talked about the Rubin formulation of causality, and the problems of having observational (rather than randomized) data. She argued the point that almost all important questions in science are causal questions, so we have to face this! Her methods involve fitting incredibly flexible and complex models to the parts of the problem she doesn't care about to reveal the residual correlations that she does care about.

Early in the day I spoke with Tim Morton (Princeton) about inferring planets-per-star rate statistics from data. The methods in the literature seem highly biased (and naive, as they involve "transforming" rather than modeling the data). We talked about next steps.

## March 10, 2014

### Doug Natelson — March Meeting wrap-up

I've been slow about writing a day 3/4/wrapup of the APS meeting because of general busy-ness.  I saw fewer general interest talks over that last day and a half in part because my own group's talks were clustered in that timeframe.  Still, I did see a couple of interesting bits.
• There was a great talk by Zhenchao Dong about this paper, where they are able to use the plasmonic properties of a scanning tunneling microscope tip to perform surface-enhanced Raman spectroscopy on single molecules (in ultrahigh vacuum and cryogenic conditions) with sub-nm lateral resolution.  The data are gorgeous, though how the lateral resolution can possibly be that good is very mysterious.  Usually the lateral extent of the enhanced optical fields is something like the geometric mean of the tip radius of curvature and the tip-sample distance.  It's very hard to see how that ever gets to the sub-nm level, so something funky must be going on.  (A back-of-the-envelope version of this estimate appears just after this list.)
• I saw a talk by Yoshihiro Iwasa all about MoS2, including work on optics and ionic liquid gating.
• I went to a session on the presentation of physics to the public.  The talks that I managed to see were quite good, and Dennis Overbye's insights into the NY Times' science reporting were particularly interesting.  He pointed out that it's a very challenging marketplace when so much good (or at least interesting) science writing is given away for free (as in here or here or here).  He did give a shout-out to Peter Woit, particularly mentioning how good Peter's sources are.
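
As a back-of-the-envelope version of the resolution estimate mentioned in the first bullet above (the specific numbers here are my own illustrative assumptions, not figures from the talk): taking a tip radius of curvature R ≈ 10 nm and a tip-sample gap d ≈ 0.5 nm, the geometric-mean rule of thumb gives

$$\sqrt{R\,d} \approx \sqrt{(10\ \mathrm{nm}) \times (0.5\ \mathrm{nm})} \approx 2\ \mathrm{nm},$$

still a couple of nanometres, which is why lateral resolution at the sub-nm level looks so puzzling.
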
I probably should wade into focus topic organizing again;  it seemed like this year there were more issues with parallel sessions about similar topics and some lack of cohesiveness within some of the contributed sessions.  (This isn't meant as a criticism of those who invested their time to do this, which is absolutely appreciated!  I am well aware how hard it is, particularly as the meeting keeps growing.)

### Geraint F. Lewis — Non-linear Chaplygin Gas Cosmologies

Ultra-quick post today, but a new paper on The Arxiv. The title is "Non-linear Chaplygin Gas Cosmologies", catchy eh! But what does it mean?

Essentially, we have two big dark mysteries in the universe, dark matter and dark energy, and what we want to do is try and reduce the number of mysteries by a factor of two. We can do this with a Chaplygin gas, modifying its usual properties so that on small scales it looks like dark matter, while on large scales it acts as dark energy.
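
For background, here is the standard textbook form of a generalized Chaplygin gas equation of state (general context only, not a detail taken from the new paper):

$$p = -\frac{A}{\rho^{\alpha}}, \qquad A > 0,\quad 0 < \alpha \leq 1.$$

At high densities the pressure is negligible, so the fluid clusters like pressureless dark matter; at low densities the density tends to a constant and the pressure tends to minus the energy density, driving accelerated expansion like dark energy.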

In this first paper, we basically lay out our new model, led by Pedro Avelino, who was visiting the Sydney Institute for Astronomy from Portugal. It's a bit mathematical, but in coming papers we will detail a number of observational tests against which we compare the model.

That is yet to come, and I will write a lot more detail then, but I have to run. Before I go, I'll note that this is my first paper with Krzysztof Bolejko, and so my Erdos Number has significantly shrunk. More on that too!

But for now, well done Pedro!

# Non-linear Chaplygin Gas Cosmologies

We study the non-linear regime of Unified Dark Energy models, using Generalized Chaplygin Gas cosmologies as a representative example, and introduce a new parameter characterizing the level of small scale clustering in these scenarios. We show that viable Generalized Chaplygin Gas cosmologies, consistent with the most recent observational constraints, may be constructed for any value of the Generalized Chaplygin Gas parameter by considering models with a sufficiently high level of non-linear clustering.

### Andrew Jaffe — Around Asia in search of a meal

I’m recently back from my mammoth trip through Asia (though in fact I’m up in Edinburgh as I write this, visiting as a fellow of the Higgs Centre For Theoretical Physics).

I’ve already written a little about the middle week of my voyage, observing at the James Clerk Maxwell Telescope, and I hope to get back to that soon — at least to post some pictures of and from Mauna Kea. But even more than telescopes, or mountains, or spectacular vistas, I seemed to have spent much of the trip thinking about and eating food. (Even at the telescope, food was important — and the chefs at Halu Pohaku do some amazing things for us sleep-deprived astronomers, though I was too tired to record it except as a vague memory.) But down at sea level, I ate some amazing meals.

When I first arrived in Taipei, my old colleague Proty Wu picked me up at the airport, and took me to meet my fellow speakers and other Taiwanese astronomers at the amazing Din Tai Fung, a world-famous chain of dumpling restaurants. (There are branches in North America but alas none in the UK.) As a scientist, I particularly appreciated the clean room they use to prepare the dumplings to their exacting standards:

Later in the week, a few of us went to a branch of another famous Taipei-based chain, Shin Yeh, for a somewhat traditional Taiwanese restaurant meal. It was amazing, and I wish I could remember some of the specifics. Alas, I’ve only recorded the aftermath:

From Taipei, I was off to Hawaii. Before and after my observing trip, I spent a few days in Honolulu, where I managed to find a nice plate of sushi at Doraku — good, but not too much better than I’ve had in London or New York, despite the proximity to Japan.

From Hawaii, I had to fly back for a transfer in Taipei, where I was happy to find plenty more dumplings (as well as pleasantly sweet Taiwanese pineapple cake). Certainly some of the best airport food I’ve had (for the record, my other favourites are sausages in Munich, and sushi at the Ebisu counter at San Francisco):

From there, my last stop was 40 hours in Beijing. Much more to say about that visit, but the culinary part of the trip had a couple of highlights. After a morning spent wandering around the Forbidden City (aka the Palace Museum), I was getting tired and hungry. I tried to find Tian Di Yi Jia, supposedly “An Incredible Imperial-Style Restaurant”. Alas, some combination of not having a website, not having Roman-lettered signs, and the likelihood that it had closed down meant an hour’s wandering Beijing’s streets was in vain. Instead, I ended up at this hole in the wall: And was very happy indeed, in particular with the amazing slithery, tangy eggplant: That night, I ended up at The Grandma’s, an outpost of yet another chain, seemingly a different chain than Grandma’s Kitchen, which apparently serves American food. Definitely not American food:

It was a very tasty trip. I think there was science, too.

### Scott Aaronson — The Scientific Case for P≠NP

Out there in the wider world—OK, OK, among Luboš Motl, and a few others who comment on this blog—there appears to be a widespread opinion that P≠NP is just “a fashionable dogma of the so-called experts,” something that’s no more likely to be true than false.  The doubters can even point to at least one accomplished complexity theorist, Dick Lipton, who publicly advocates agnosticism about whether P=NP.

Of course, not all the doubters reach their doubts the same way.  For Lipton, the thinking is probably something like: as scientists, we should be rigorously open-minded, and constantly question even the most fundamental hypotheses of our field.  For the outsiders, the thinking is more like: computer scientists are just not very smart—certainly not as smart as real scientists—so the fact that they consider something a “fundamental hypothesis” provides no information of value.

Consider, for example, this comment of Ignacio Mosqueira:

If there is no proof that means that there is no reason a-priori to prefer your arguments over those [of] Lubos. Expertise is not enough.  And the fact that Lubos is difficult to deal with doesn’t change that.

In my response, I wondered how broadly Ignacio would apply the principle “if there’s no proof, then there’s no reason to prefer any argument over any other one.”  For example, would he agree with the guy interviewed on Jon Stewart who earnestly explained that, since there’s no proof that turning on the LHC will destroy the world, but also no proof that it won’t destroy the world, the only rational inference is that there’s a 50% chance it will destroy the world?  (John Oliver’s deadpan response was classic: “I’m … not sure that’s how probability works…”)

In a lengthy reply, Luboš bites this bullet with relish and mustard.  In physics, he agrees, or even in “continuous mathematics that is more physics-wise,” it’s possible to have justified beliefs even without proof.  For example, he admits to a 99.9% probability that the Riemann hypothesis is true.  But, he goes on, “partial evidence in discrete mathematics just cannot exist.”  Discrete math and computer science, you see, are so arbitrary, manmade, and haphazard that every question is independent of every other; no amount of experience can give anyone any idea which way the next question will go.

No, I’m not kidding.  That’s his argument.

I couldn’t help wondering: what about number theory?  Aren’t the positive integers a “discrete” structure?  And isn’t the Riemann Hypothesis fundamentally about the distribution of primes?  Or does the Riemann Hypothesis get counted as an “honorary physics-wise continuous problem” because it can also be stated analytically?  But then what about Goldbach’s Conjecture?  Is Luboš 50/50 on that one too?  Better yet, what about continuous, analytic problems that are closely related to P vs. NP?  For example, Valiant’s Conjecture says you can’t linearly embed the permanent of an n×n matrix as the determinant of an m×m matrix, unless m≥exp(n).  Mulmuley and others have connected this “continuous cousin” of P≠NP to issues in algebraic geometry, representation theory, and even quantum groups and Langlands duality.  So, does that make it kosher?  The more I thought about the proposed distinction, the less sense it made to me.

But enough of this.  In the rest of this post, I want to explain why the odds that you should assign to P≠NP are more like 99% than they are like 50%.  This post supersedes my 2006 post on the same topic, which I hereby retire.  While that post was mostly OK as far as it went, I now feel like I can do a much better job articulating the central point.  (And also, I made the serious mistake in 2006 of striving for literary eloquence and tongue-in-cheek humor.  That works great for readers who already know the issues inside-and-out, and just want to be amused.  Alas, it doesn’t work so well for readers who don’t know the issues, are extremely literal-minded, and just want ammunition to prove their starting assumption that I’m a doofus who doesn’t understand the basics of his own field.)

So, OK, why should you believe P≠NP?  Here’s why:

Because, like any other successful scientific hypothesis, the P≠NP hypothesis has passed severe tests that it had no good reason to pass were it false.

What kind of tests am I talking about?

By now, tens of thousands of problems have been proved to be NP-complete.  They range in character from theorem proving to graph coloring to airline scheduling to bin packing to protein folding to auction pricing to VLSI design to minimizing soap films to winning at Super Mario Bros.  Meanwhile, another cluster of tens of thousands of problems has been proved to lie in P (or BPP).  Those range from primality to matching to linear and semidefinite programming to edit distance to polynomial factoring to hundreds of approximation tasks.  Like the NP-complete problems, many of the P and BPP problems are also related to each other by a rich network of reductions.  (For example, countless other problems are in P “because” linear and semidefinite programming are.)

So, if we were to draw a map of the complexity class NP  according to current knowledge, what would it look like?  There’d be a huge, growing component of NP-complete problems, all connected to each other by an intricate network of reductions.  There’d be a second huge component of P problems, many of them again connected by reductions.  Then, much like with the map of the continental US, there’d be a sparser population in the middle: stuff like factoring, graph isomorphism, and Unique Games that for various reasons has thus far resisted assimilation onto either of the coasts.

Of course, to prove P=NP, it would suffice to find a single link—that is, a single polynomial-time equivalence—between any of the tens of thousands of problems on the P coast, and any of the tens of thousands on the NP-complete one.  In half a century, this hasn’t happened: even as they’ve both ballooned exponentially, the two giant regions have remained defiantly separate from each other.  But that’s not even the main point.  The main point is that, as people explore these two regions, again and again there are “close calls”: places where, if a single parameter had worked out differently, the two regions would have come together in a cataclysmic collision.  Yet every single time, it’s just a fake-out.  Again and again the two regions “touch,” and their border even traces out weird and jagged shapes.  But even in those border zones, not a single problem ever crosses from one region to the other.  It’s as if they’re kept on their respective sides by an invisible electric fence.

As an example, consider the Set Cover problem: i.e., the problem, given a collection of subsets S_1,…,S_m ⊆ {1,…,n}, of finding as few subsets as possible whose union equals the whole set.  Chvatal showed in 1979 that a greedy algorithm can produce, in polynomial time, a collection of sets whose size is at most ln(n) times larger than the optimum size.  This raises an obvious question: can you do better?  What about 0.9ln(n)?  Alas, building on a long sequence of prior works in PCP theory, it was recently shown that, if you could find a covering set at most (1-ε)ln(n) times larger than the optimum one, then you’d be solving an NP-complete problem, and P would equal NP.  Notice that, conversely, if the hardness result worked for ln(n) or anything above, then we’d also get P=NP.  So, why do the algorithm and the hardness result “happen to meet” at exactly ln(n), with neither one venturing the tiniest bit beyond?  Well, we might say, ln(n) is where the invisible electric fence is for this problem.
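
To make the algorithmic side of this concrete, here is a minimal sketch of the greedy Set Cover heuristic in Python (my own illustrative code and example instance, not taken from any of the papers mentioned): it repeatedly grabs the set covering the most still-uncovered elements, which is the procedure behind the ln(n) guarantee above.

```python
def greedy_set_cover(universe, subsets):
    """Greedy Set Cover: repeatedly pick the subset covering the most
    still-uncovered elements.  The resulting cover is at most ~ln(n)
    times larger than the optimum (Chvatal's analysis)."""
    uncovered = set(universe)
    cover = []
    while uncovered:
        # Choose the subset with the largest overlap with what's left.
        best = max(subsets, key=lambda s: len(uncovered & set(s)))
        if not uncovered & set(best):
            raise ValueError("the given subsets do not cover the universe")
        cover.append(set(best))
        uncovered -= set(best)
    return cover

# Tiny made-up instance: cover {1,...,10} with as few subsets as possible.
universe = range(1, 11)
subsets = [{1, 2, 3, 8}, {2, 4, 6, 8, 10}, {1, 3, 5, 7, 9}, {4, 5, 7}, {6, 9, 10}]
print(greedy_set_cover(universe, subsets))  # two sets suffice here
```

The hardness result quoted above is the other side of the fence: improving the approximation factor to (1-ε)ln(n) is already NP-hard.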

Want another example?  OK then, consider the “Boolean Max-k-CSP” problem: that is, the problem of setting n bits so as to satisfy the maximum number of constraints, where each constraint can involve an arbitrary Boolean function on any k of the bits.  The best known approximation algorithm, based on semidefinite programming, is guaranteed to satisfy at least a 2k/2^k fraction of the constraints.  Can you guess where this is going?  Recently, Siu On Chan showed that it’s NP-hard to satisfy even slightly more than a 2k/2^k fraction of constraints: if you can, then P=NP.  In this case the invisible electric fence sends off its shocks at 2k/2^k.

I could multiply such examples endlessly—or at least, Dana (my source for such matters) could do so.  But there are also dozens of “weird coincidences” that involve running times rather than approximation ratios; and that strongly suggest, not only that P≠NP, but that problems like 3SAT should require c^n time for some constant c.  For a recent example—not even a particularly important one, but one that’s fresh in my memory—consider this paper by myself, Dana, and Russell Impagliazzo.  A first thing we do in that paper is to give an approximation algorithm for a family of two-prover games called “free games.”  Our algorithm runs in quasipolynomial time:  specifically, n^{O(log n)}.  A second thing we do is show how to reduce the NP-complete 3SAT problem to free games of size ~2^{O(√n)}.

Composing those two results, you get an algorithm for 3SAT whose overall running time is roughly

$$2^{O( \sqrt{n} \log 2^{\sqrt{n}}) } = 2^{O(n)}.$$
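
To spell out the exponent arithmetic (this is just the same calculation, written with N for the size of the free game produced by the reduction): the reduction gives N = 2^{O(√n)}, so log N = O(√n), and running the approximation algorithm on the resulting game takes time

$$N^{O(\log N)} = 2^{O\left((\log N)^2\right)} = 2^{O\left((\sqrt{n})^2\right)} = 2^{O(n)}.$$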

Of course, this doesn’t improve on the trivial “try all possible solutions” algorithm.  But notice that, if our approximation algorithm for free games had been slightly faster—say, n^{O(log log n)}—then we could’ve used it to solve 3SAT in $$2^{O(\sqrt{n} \log n)}$$ time.  Conversely, if our reduction from 3SAT had produced free games of size (say) $$2^{O(n^{1/3})}$$ rather than 2^{O(√n)}, then we could’ve used that to solve 3SAT in $$2^{O(n^{2/3})}$$ time.

I should stress that these two results have completely different proofs: the approximation algorithm for free games “doesn’t know or care” about the existence of the reduction, nor does the reduction know or care about the algorithm.  Yet somehow, their respective parameters “conspire” so that 3SAT still needs c^n time.  And you see the same sort of thing over and over, no matter which problem domain you’re interested in.  These ubiquitous “coincidences” would be immediately explained if 3SAT actually did require c^n time—i.e., if it had a “hard core” for which brute-force search was unavoidable, no matter which way you sliced things up.  If that’s not true—i.e., if 3SAT has a subexponential algorithm—then we’re left with unexplained “spooky action at a distance.”  How do the algorithms and the reductions manage to coordinate with each other, every single time, to avoid spilling the subexponential secret?

Notice that, contrary to Luboš’s loud claims, there’s no “symmetry” between P=NP and P≠NP in these arguments.  Lower bound proofs are much harder to come across than either algorithms or reductions, and there’s not really a mystery about why: it’s hard to prove a negative!  (Especially when you’re up against known mathematical barriers, including relativization, algebrization, and natural proofs.)  In other words, even under the assumption that lower bound proofs exist, we now understand a lot about why the existing mathematical tools can’t deliver them, or can only do so for much easier problems.  Nor can I think of any example of a “spooky numerical coincidence” between two unrelated-seeming results, which would’ve yielded a proof of P≠NP had some parameters worked out differently.  P=NP and P≠NP can look like “symmetric” possibilities only if your symmetry is unbroken by knowledge.

Imagine a pond with small yellow frogs on one end, and large green frogs on the other.  After observing the frogs for decades, herpetologists conjecture that the populations represent two distinct species with different evolutionary histories, and are not interfertile.  Everyone realizes that to disprove this hypothesis, all it would take would be a single example of a green/yellow hybrid.  Since (for some reason) the herpetologists really care about this question, they undertake a huge program of breeding experiments, putting thousands of yellow female frogs next to green male frogs (and vice versa) during mating season, with candlelight, soft music, etc.  Nothing.

As this green vs. yellow frog conundrum grows in fame, other communities start investigating it as well: geneticists, ecologists, amateur nature-lovers, commercial animal breeders, ambitious teenagers on the science-fair circuit, and even some extralusionary physicists hoping to show up their dimwitted friends in biology.  These other communities try out hundreds of exotic breeding strategies that the herpetologists hadn’t considered, and contribute many useful insights.  They also manage to breed a larger, greener, but still yellow frog—something that, while it’s not a “true” hybrid, does have important practical applications for the frog-leg industry.  But in the end, no one has any success getting green and yellow frogs to mate.

Then one day, someone exclaims: “aha!  I just found a huge, previously-unexplored part of the pond where green and yellow frogs live together!  And what’s more, in this part, the small yellow frogs are bigger and greener than normal, and the large green frogs are smaller and yellower!”

This is exciting: the previously-sharp boundary separating green from yellow has been blurred!  Maybe the chasm can be crossed after all!

Alas, further investigation reveals that, even in the new part of the pond, the two frog populations still stay completely separate.  The smaller, yellower frogs there will mate with other small yellow frogs (even from faraway parts of the pond that they’d never ordinarily visit), but never, ever with the larger, greener frogs even from their own part.  And vice versa.  The result?  A discovery that could have falsified the original hypothesis has instead strengthened it—and precisely because it could’ve falsified it but didn’t.

Now imagine the above story repeated a few dozen more times—with more parts of the pond, a neighboring pond, sexually-precocious tadpoles, etc.  Oh, and I forgot to say this before, but imagine that doing a DNA analysis, to prove once and for all that the green and yellow frogs had separate lineages, is extraordinarily difficult.  But the geneticists know why it’s so difficult, and the reasons have more to do with the limits of their sequencing machines and with certain peculiarities of frog DNA, than with anything about these specific frogs.  In fact, the geneticists did get the sequencing machines to work for the easier cases of turtles and snakes—and in those cases, their results usually dovetailed well with earlier guesses based on behavior.  So for example, where reddish turtles and bluish turtles had never been observed interbreeding, the reason really did turn out to be that they came from separate species.  There were some surprises, of course, but nothing even remotely as shocking as seeing the green and yellow frogs suddenly getting it on.

Now, even after all this, someone could saunter over to the pond and say: “ha, what a bunch of morons!  I’ve never even seen a frog or heard one croak, but I know that you haven’t proved anything!  For all you know, the green and yellow frogs will start going at it tomorrow.  And don’t even tell me about ‘the weight of evidence,’ blah blah blah.  Biology is a scummy mud-discipline.  It has no ideas or principles; it’s just a random assortment of unrelated facts.  If the frogs started mating tomorrow, that would just be another brute, arbitrary fact, no more surprising or unsurprising than if they didn’t start mating tomorrow.  You jokers promote the ideology that green and yellow frogs are separate species, not because the evidence warrants it, but just because it’s a convenient way to cover up your own embarrassing failure to get them to mate.  I could probably breed them myself in ten minutes, but I have better things to do.”

At this, a few onlookers might nod appreciatively and say: “y’know, that guy might be an asshole, but let’s give him credit: he’s unafraid to speak truth to competence.”

Even among the herpetologists, a few might beat their breasts and announce: “Who’s to say he isn’t right?  I mean, what do we really know?  How do we know there even is a pond, or that these so-called ‘frogs’ aren’t secretly giraffes?  I, at least, have some small measure of wisdom, in that I know that I know nothing.”

What I want you to notice is how scientifically worthless all of these comments are.  If you wanted to do actual research on the frogs, then regardless of which sympathies you started with, you’d have no choice but to ignore the naysayers, and proceed as if the yellow and green frogs were different species.  Sure, you’d have in the back of your mind that they might be the same; you’d be ready to adjust your views if new evidence came in.  But for now, the theory that there’s just one species, divided into two subgroups that happen never to mate despite living in the same habitat, fails miserably at making contact with any of the facts that have been learned.  It leaves too much unexplained; in fact it explains nothing.

For all that, you might ask, don’t the naysayers occasionally turn out to be right?  Of course they do!  But if they were right more than occasionally, then science wouldn’t be possible.  We would still be in caves, beating our breasts and asking how we can know that frogs aren’t secretly giraffes.

So, that’s what I think about P and NP.  Do I expect this post to convince everyone?  No—but to tell you the truth, I don’t want it to.  I want it to convince most people, but I also want a few to continue speculating that P=NP.

Why, despite everything I’ve said, do I want maybe-P=NP-ism not to die out entirely?  Because alongside the P=NP carpers, I also often hear from a second group of carpers.  This second group says that P and NP are so obviously, self-evidently unequal that the quest to separate them with mathematical rigor is quixotic and absurd.  Theoretical computer scientists should quit wasting their time struggling to understand truths that don’t need to be understood, but only accepted, and do something useful for the world.  (A natural generalization of this view, I guess, is that all basic science should end.)  So, what I really want is for the two opposing groups of naysayers to keep each other in check, so that those who feel impelled to do so can get on with the fascinating quest to understand the ultimate limits of computation.

Update (March 8): At least eight readers have by now emailed me, or left comments, asking why I’m wasting so much time and energy arguing with Luboš Motl.  Isn’t it obvious that, ever since he stopped doing research around 2006 (if not earlier), this guy has completely lost his marbles?  That he’ll never, ever change his mind about anything?

Yes.  In fact, I’ve noticed repeatedly that, even when Luboš is wrong about a straightforward factual matter, he never really admits error: he just switches, without skipping a beat, to some other way to attack his interlocutor.  (To give a small example: watch how he reacts to being told that graph isomorphism is neither known nor believed to be NP-complete.  Caught making a freshman-level error about the field he’s attacking, he simply rants about how graph isomorphism is just as “representative” and “important” as NP-complete problems anyway, since no discrete math question is ever more or less “important” than any other; they’re all equally contrived and arbitrary.  At the Luboš casino, you lose even when you win!  The only thing you can do is stop playing and walk away.)

Anyway, my goal here was never to convince Luboš.  I was writing, not for him, but for my other readers: especially for those genuinely unfamiliar with these interesting issues, or intimidated by Luboš’s air of certainty.  I felt like I owed it to them to set out, clearly and forcefully, certain facts that all complexity theorists have encountered in their research, but that we hardly ever bother to articulate.  If you’ve never studied physics, then yes, it sounds crazy that there would be quadrillions of invisible neutrinos coursing through your body.  And if you’ve never studied computer science, it sounds crazy that there would be an “invisible electric fence,” again and again just barely separating what the state-of-the-art approximation algorithms can handle from what the state-of-the-art PCP tools can prove is NP-complete.  But there it is, and I wanted everyone else at least to see what the experts see, so that their personal judgments about the likelihood of P=NP could be informed by seeing it.

Luboš’s response to my post disappointed me (yes, really!).  I expected it to be nasty and unhinged, and so it was.  What I didn’t expect was that it would be so intellectually lightweight.  Confronted with the total untenability of his foot-stomping distinction between “continuous math” (where you can have justified beliefs without proof) and “discrete math” (where you can’t), and with exactly the sorts of “detailed, confirmed predictions” of the P≠NP hypothesis that he’d declared impossible, Luboš’s response was simply to repeat his original misconceptions, but louder.

And that brings me, I confess, to a second reason for my engagement with Luboš.  Several times, I’ve heard people express sentiments like:

Yes, of course Luboš is a raging jerk and a social retard.  But if you can just get past that, he’s so sharp and intellectually honest!  No matter how many people he needlessly offends, he always tells it like it is.

I want the nerd world to see—in as stark a situation as possible—that the above is not correct.  Luboš is wrong much of the time, and he’s intellectually dishonest.

At one point in his post, Luboš actually compares computer scientists who find P≠NP a plausible working hypothesis to his even greater nemesis: the “climate cataclysmic crackpots.”  (Strangely, he forgot to compare us to feminists, Communists, Muslim terrorists, or loop quantum gravity theorists.)  Even though the P versus NP and global warming issues might not seem closely linked, part of me is thrilled that Luboš has connected them as he has.  If, after seeing this ex-physicist’s “thought process” laid bare on the P versus NP problem—how his arrogance and incuriosity lead him to stake out a laughably-absurd position; how his vanity then causes him to double down after his errors are exposed—if, after seeing this, a single person is led to question Lubošian epistemology more generally, then my efforts will not have been in vain.

Anyway, now that I’ve finally unmasked Luboš—certainly to my own satisfaction, and I hope to that of most scientifically-literate readers—I’m done with this.  The physicist John Baez is rumored to have said: “It’s not easy to ignore Luboš, but it’s ALWAYS worth the effort.”  It took me eight years, but I finally see the multiple layers of profundity hidden in that snark.

And thus I make the following announcement:

For the next three years, I, Scott Aaronson, will not respond to anything Luboš says, nor will I allow him to comment on this blog.

In March 2017, I’ll reassess my Luboš policy.  Whether I relent will depend on a variety of factors—including whether Luboš has gotten the professional help he needs (from a winged pig, perhaps?) and changed his behavior; but also, how much my own quality of life has improved in the meantime.

### Quantum Diaries — My Week as a Real Scientist

For a week at the end of January, I was a real scientist. Actually, I’m always a real scientist, but only for that week was I tweeting from the @realscientists Twitter account, which has a new scientist each week typing about his or her life and work. I tweeted a lot. I tweeted about the conference I was at. I tweeted about the philosophy of science and religion. I tweeted about how my wife, @CuratorPolly, wasn’t a big fan of me being called the “curator” of the account for the week. I tweeted about airplanes and very possibly bagels. But most of all I tweeted the answers to questions about particle physics and the LHC.

Real Scientists wrote posts for the start and end of my week, and all my tweets for the week are at this Storify page. My regular twitter account, by the way, is @sethzenz.

I was surprised by how many questions people had when they were told that a real physicist at a relatively high-profile Twitter account was open for questions. A lot of the questions had answers that can already be found, often right here on Quantum Diaries! It got me thinking a bit about different ways to communicate to the public about physics. People really seem to value personal interaction, rather than just looking things up, and they interact a lot with an account that they know is tweeting in “real time.” (I almost never do a tweet per minute with my regular account, because I assume it will annoy people, but it’s what people expect stylistically from the @realscientists account.) So maybe we should do special tweet sessions from one of the CERN-related accounts, like @CMSexperiment, where we get four physicists around one computer for an hour and answer questions. (A lot of museums did a similar thing with #AskACurator day last September.) We’ve also discussed the possibility of doing an AMA on Reddit. And the Hangout with CERN series will be starting again soon!

But while you’re waiting for all that, let me tell you a secret: there are lots of physicists on Twitter. (Lists here and here and here, four-part Symmetry Magazine series here and here and here and here.) And I can’t speak for everyone, but an awful lot of us would answer questions if you had any. Anytime. No special events. Just because we like talking about our work. So leave us comments. Tweet at us. Your odds of getting an answer are pretty good.

In other news, Real Scientists is a finalist for the Shorty Award for social media’s best science. We’ll have to wait and see how they — we? — do in a head-to-head matchup with giants like NASA and Neil deGrasse Tyson. But I think it’s clear that people value hearing directly from researchers, and social media seems to give us more and more ways to communicate every year.

### Chad Orzel — Obligatory Cosmos Commentary

It says here in the fine print that my blogging license could be revoked if I fail to offer a public opinion on the Cosmos reboot, which premiered last night. I missed the first couple of minutes– I had The Pip for bedtime, and he didn’t start snoring until 8:58– but saw most of it in real time. I posted a bit of commentary on Twitter, but will offer something marginally less ephemeral here.

The show opened and closed with tributes to Carl Sagan, and Neil deGrasse Tyson standing on the same cliff where Sagan opened the original series back in 1980. That was good and fitting, and Tyson’s story about visiting Sagan at Cornell in 1975 was very touching.

Between those bookends, the first episode was more or less an introduction and outline of the series to come, and as such served mostly to showcase some really amazing visuals. The tour of our “long address” from Earth through the Solar System to the Milky Way and out to the observable universe was spectacular. The “cosmic calendar” sequence at the end was likewise extremely well done. There wasn’t anything all that new or amazing about the content– I think SteelyKid’s after-school program covered all the information about the Solar System– but the vast improvement in effects technology over the three and a bit decades since Sagan’s original series really showed up in these bits.

I was less taken with the animated “Giordano Bruno, martyr of Science” section, which my limited understanding of the history suggests was playing a little loose with the facts. Bruno’s one of those historical figures where if you tilt your head and squint like a confused dog, you can make parts of what he was saying seem to match up with modern science. Provided you’re willing to ignore some other bits that just look nutty. And this played up the modern-ish bits in a big way, while completely ignoring the nutty parts.

I understand the attraction of Bruno as a metaphor, and it’s probably not any less valid to cite his infinite cosmology as a historical antecedent of modern cosmology than it is to cite Democritus as the precursor of the modern notion of atoms. In both cases, there’s an element of similarity, and a gap of centuries in which everybody basically ignored the idea, before it was picked up again to give a veneer of historical depth to a new development. At the same time, though, I’m uneasy about the way the useful bits of Bruno’s cosmology get picked out and highlighted, while ignoring the rest of the package– this doesn’t strike me as all that much different than the games the 2012 cranks played with the Mayans, grabbing hold of the accurate parts of their astronomy while ignoring the goofier bits, something Tyson has directly and specifically mocked.

Then again, while I remember being blown away by the original Cosmos — contrary to this post, you don’t need to be over 50 to have seen the original, thankyouverymuch– I don’t remember much about the history segments. I remember they were there, of course, but mostly what I was interested in was the “spaceship of the imagination” sequences, zipping off to look at other planets. The couple of times in recent years that I’ve watched bits of the original, I’ve had a lot of “Oh, right, historical recreations…” moments. As such, I can’t really vouch for the accuracy of the history of science presented by the original. So this might not represent a departure from the form, just a change in my state of knowledge.

And that was kind of the other theme of the evening– my reaction to this reboot was largely colored by the fact that I know so much more than I did in 1980. So, while there were a lot of gosh-wow moments, I also found myself mentally quibbling about little details, like the vast overabundance of asteroids and Kuiper Belt objects. (And Giordano Bruno…) I totally understand the practical reasons for this– I’m not arguing that such minor inaccuracies for the sake of great visuals are a cardinal sin, or anything– but it was striking how much knowing stuff changed my reaction. I thought several times that this would’ve been way more awesome if I could go back to being 9 years old to watch it.

Sadly, SteelyKid is probably a little too young for this– it’s certainly airing way past her bedtime (particularly on a Sunday night– Monday mornings are rough even when she goes to bed on time)– and by the time she’s eight or nine, I suspect there will be chunks of it that are out of date. Of course, that very fact is kind of awesome in its own right…

But, you know, I have it on the DVR, so maybe I’ll try it out and see what she thinks, sometime next weekend when The Pip is napping. I suspect the talky bits are a little too slow for her (and Bruno’s trial might be too scary), but I bet she’d be appropriately awed by the visuals of the Solar System tour.

Anyway, as a series premiere, I thought this hit the right notes: paying respects to the original, showing off the spectacular new visuals, and setting the stage for the rest of the series to come. I expect future episodes will go into specific areas of science in a bit more depth, and I look forward to seeing what they come up with.

(Also, I look forward to seeing this on DVD, because, damn, the commercials are irritating…)

### Resonaances — Weekend Plot: all of dark matter

To put my recent posts into a bigger perspective, here's a graph summarizing all of the dark matter particles discovered so far via direct or indirect detection:

The graph shows the number of years the signal has survived vs. the inferred mass of the dark matter particle. The particle names follow the usual Particle Data Group conventions. Each label's size is related to the statistical significance of the signal. The colors correspond to the Bayesian likelihood that the signal originates from dark matter, from uncertain (red) to very unlikely (blue). The masses of the discovered particles span an impressive 11 orders of magnitude, although the largest concentration is near the weak scale (this is called the WIMP miracle). If I forgot any particle for which compelling evidence exists, let me know, and I will add it to the graph.

Here are the original references for the Bulbulon, Boehmot, Collaron, CDMeson, Daemon, Cresston, Hooperon, Wenigon, Pamelon, and the mother of Bert and Ernie.

## March 09, 2014

### Tim Gowers — How do the power-series definitions of sin and cos relate to their geometrical interpretations?

I hope that most of you have either asked yourselves this question explicitly, or at least felt a vague sense of unease about how the definitions I gave in lectures, namely

$\displaystyle \cos x = 1 - \frac{x^2}{2!}+\frac{x^4}{4!}-\dots$

and

$\displaystyle \sin x = x - \frac{x^3}{3!}+\frac{x^5}{5!}-\dots,$

relate to things like the opposite, adjacent and hypotenuse. Using the power-series definitions, we proved several facts about trigonometric functions, such as the addition formulae, their derivatives, and the fact that they are periodic. But we didn’t quite get to the stage of proving that if $x^2+y^2=1$ and $\theta$ is the angle that the line from $(0,0)$ to $(x,y)$ makes with the line from $(0,0)$ to $(1,0)$, then $x=\cos\theta$ and $y=\sin\theta$. So how does one establish that? How does one even define the angle? In this post, I will give one possible answer to these questions.
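
(As a purely numerical warm-up, before any of the rigorous discussion below: one can at least check on a computer that the truncated power series reproduce the familiar geometric values. The little script below is only an illustration, with the function names and the sample angle chosen by me.)

```python
import math

def cos_series(x, terms=25):
    """Partial sum of 1 - x^2/2! + x^4/4! - ..."""
    return sum((-1) ** k * x ** (2 * k) / math.factorial(2 * k) for k in range(terms))

def sin_series(x, terms=25):
    """Partial sum of x - x^3/3! + x^5/5! - ..."""
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1) for k in range(terms))

# The point on the unit circle at angle pi/3 should be (1/2, sqrt(3)/2).
theta = math.pi / 3
print(cos_series(theta), 0.5)                # ~0.5
print(sin_series(theta), math.sqrt(3) / 2)   # ~0.8660254
```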

### A couple of possible approaches that I won’t attempt to use

A cheating and not wholly satisfactory method would be to define the angle $\theta$ to be $\cos^{-1}(x)$. Then it would be trivial that $\cos\theta=x$ and we could use facts we know to prove that $\sin\theta=y$. (Or could we? Wouldn’t we just get that it was $\pm y$? The fact that many angles have the same $\cos$ and $\sin$ creates annoying difficulties for this approach, though ones that could in principle be circumvented.) But if we did this, how could we be confident that the notion of angle we had just defined coincided with what we think angle should be? The problem has not been fully solved.

Another approach might be to define trigonometric functions geometrically, prove that they have the basic properties that we established using the power series definitions, and prove that these properties characterize the trigonometric functions (meaning that any two functions $C$ and $S$ that have the properties must be $\cos$ and $\sin$). However, this still requires us to make sense of the notion of angle somehow, and we might also feel slightly worried about whether the geometric arguments we used to justify the addition formulae and the like were truly rigorous. (I’m not saying it can’t be done satisfactorily — just that I don’t immediately see a good way of doing it, and I have a different approach to present.)

### Defining angle

How are radians defined? You take a line L starting at the origin, and it hits the unit circle at some point P. Then the angle that line makes with the horizontal (or rather, the horizontal heading out to the right) is defined to be the length of the circular arc that goes anticlockwise round the unit circle from $(1,0)$ to P. (This defines a number between 0 and $2\pi$, but we can worry about numbers outside this range later.)

### Calculating the length of a circular arc

There is nothing wrong with this definition, except that it requires us to make rigorous sense of the length of a circular arc. How are we to do this?

For simplicity, let’s assume that our point P is $(x,y)$ and that both $x$ and $y$ are positive. So P is in the top right quadrant of the unit circle. How can we define and then calculate the length of the arc from $(1,0)$ to $(x,y)$, or equivalently from $(x,y)$ to $(1,0)$?

One non-rigorous but informative way of thinking about this is that for each $t$ between $x$ and $1$, we should take an interval $[t,t+dt]$, work out the length of the bit of the circle vertically above this interval, and sum up all those lengths. The bit of the circle in question is a straight line (since $dt$ is infinitesimally small) and by similar triangles its length is $\frac{dt}{\sqrt{1-t^2}}$.

How did I write that down? Well, the big triangle I was thinking of was one with vertices $(0,0)$, $(t,0)$ and the point on the circle directly above $(t,0)$, which is $(t,\sqrt{1-t^2})$, by Pythagoras’s theorem. The little triangle has one side of length $dt$, which corresponds to the side in the big triangle of length $\sqrt{1-t^2}$. So the hypotenuse of the little triangle is $\frac{dt}{\sqrt{1-t^2}}$, as I claimed.

Adding all these little lengths up, we get $\int_x^1\frac{dt}{\sqrt{1-t^2}}$, so it remains to evaluate this integral.

This is of course a very standard integral, usually solved by substituting $\cos\theta$ or $\sin\theta$ for $t$. If you do that, you find that the length works out as $\cos^{-1}(x)$, which is just what we hoped. However, we haven’t discussed integration by substitution in this course, so let us see it in a more elementary way (not that proving an appropriate form of the integration-by-substitution rule is especially hard).

Using the rules for differentiating inverses, we find that

$\displaystyle \frac{d}{dt}\cos^{-1}(t)=\frac 1{-\sin(\cos^{-1}(t))},$

and since $\sin\theta=\sqrt{1-\cos^2\theta}$, this gives us $\frac {-1}{\sqrt{1-t^2}}$. So the integrand has $-\cos^{-1}(t)$ as an antiderivative, and therefore, by the fundamental theorem of calculus,

$\int_x^1\frac{dt}{\sqrt{1-t^2}}=[-\cos^{-1}(t)]_x^1=\cos^{-1}(x).$

So the angle $\theta$ between the horizontal and the line joining the origin to $(x,y)$ is (by definition) the length of the arc from $(1,0)$ to $(x,y)$, which we have calculated to be $\cos^{-1}(x)$. Therefore, $\cos\theta=x$.

### How close was that to being rigorous?

The process I just went through, of saying “Let’s add up a whole lot of infinitesimal lengths; that says we should write down the following integral; calculating the integral gives us L, so the length is L,” is a process that one often goes through when calculating similar quantities. Why are we so confident that it is OK?

I sometimes realize with mathematical questions like this that I have been a mathematician for many years and never bothered to worry about them. It’s just sort of obvious that if a function is reasonably nice, then writing something down that’s approximately true with $\delta x$ and turning $\delta x$ into $dx$ and writing a nice $\int$ sign in front gives you a correct expression for the quantity in question. But let’s try to think a bit about how we might define length rigorously.

#### Curves

First, we should say what a curve is. There are various definitions, according to how much niceness one wants to assume, but let me take a basic definition: a curve is a continuous function $f$ from an interval $[a,b]$ to $\mathbb{R}^2$. (I haven’t defined continuous functions to $\mathbb{R}^2$, but it simply means that if $f(t)=(g(t),h(t))$, then $g$ and $h$ are both continuous functions from $[a,b]$ to $\mathbb{R}$.)

This is an example of a curious habit of mathematicians of defining objects as things that they clearly aren’t. Surely a curve is not a function — it’s a special sort of subset of the plane. In fact, shouldn’t a curve be defined as the image of a continuous function from $[a,b]$ to $\mathbb{R}^2$? It’s true that that corresponds more closely to what we are thinking of when we use the word “curve”, but the definition I’ve just given turns out to be more convenient, though it’s important to add that two curves (as I’ve defined them) $f_1:[a,b]\to\mathbb{R}^2$ and $f_2:[c,d]\to\mathbb{R}^2$ are equivalent if there is a strictly increasing continuous bijection $\phi:[a,b]\to[c,d]$ such that $f_1(x)=f_2(\phi(x))$ for every $x\in [a,b]$. In this situation, we think of $f_1$ and $f_2$ as different ways of representing the same curve.

Incidentally, if you want a reason not to identify curves with their images, then one quite good reason is the existence of objects called space-filling curves. These are continuous functions from intervals of reals to $\mathbb{R}^2$ that fill up entire two-dimensional sets. Here’s a picture of one, lifted from Wikipedia.

It shows the first few iterations of a process that gives you a sequence of functions that converge to a continuous limit that fills up an entire square.

#### Lengths of curves

Going back to lengths, let’s think about how one might define them. The one thing we know how to define is the length of a line segment. (Strictly speaking, I’m not allowed to say that, since a line segment isn’t a function, but let’s understand it as a particularly simple function from an interval $[a,b]$ to a line segment in the plane.) Given that, a reasonable definition of length would seem to be to approximate a given curve by a whole lot of little line segments. That leads to the following idea for at least approximating the length of a curve $f:[a,b]\to\mathbb{R}^2$. We take a dissection $a=x_0<x_1<\dots<x_n=b$ and add up all the little distances $d(f(x_{i-1}),f(x_i))$. Here I am defining the distance between two points in $\mathbb{R}^2$ in the normal way by Pythagoras’s theorem. This gives us the expression

$\sum_{i=1}^n d(f(x_{i-1}),f(x_i))$

for the approximate length given by the dissection. We then hope that as the differences $x_i-x_{i-1}$ get smaller and smaller, these estimates will tend to a limit. It isn’t hard to see that if you refine a dissection, then the estimate increases (you are replacing the length of a line segment that joins two points by the length of a path that consists of line segments and joins the same two points).

Actually, that hope is not always fulfilled: sometimes the estimates tend to infinity. Indeed, for space-filling curves, or fractal-like curves such as the Koch snowflake, the estimates do tend to infinity. In this case, we say that they have infinite length. But if the estimates tend to a limit as the maximum of the differences $x_i-x_{i-1}$ tends to zero, we call that limit the length of the curve. A curve that has a finite length defined this way is called rectifiable.

Suppose now that we have a curve $f:[a,b]\to\mathbb{R}^2$ given by $f(t)=(g(t),h(t))$ and that the two functions $g$ and $h$ are continuously differentiable. Then both $g'$ and $h'$ are bounded on $[a,b]$, so let’s suppose that $M$ is an upper bound for $|g'(t)|$ and $|h'(t)|$. Then by the mean value theorem,

$d(f(x_{i-1}),f(x_i))=((g(x_i)-g(x_{i-1}))^2+(h(x_i)-h(x_{i-1}))^2)^{1/2}$
$\leq(M^2(x_i-x_{i-1})^2+M^2(x_i-x_{i-1})^2)^{1/2}=\sqrt{2}M(x_i-x_{i-1}).$

Therefore, $\sum_{i=1}^n d(f(x_{i-1}),f(x_i))\leq\sqrt{2}M(b-a)$ for every dissection, which implies that the curve is rectifiable. (Remark: I didn’t really use the continuity of the derivatives there — just their boundedness.)

We can say slightly more than this, however. The differentiability of $g$ tells us that $g(x_i)-g(x_{i-1})=(x_i-x_{i-1})g'(c_i)$ for some $c_i\in(x_{i-1},x_i)$. And similarly for $h$ with some $d_i$. Therefore, the estimate for the length can be written

$\sum_{i=1}^n(x_i-x_{i-1})(g'(c_i)^2+h'(d_i)^2)^{1/2}$

This looks very similar to the kind of thing we write down when doing Riemann integration, so let’s see whether we can find a precise connection. We are concerned with the function $k(t)=(g'(t)^2+h'(t)^2)^{1/2}$. If we now do use the continuity of $g'$ and $h'$, then $k$ is continuous too, so it can be integrated. Now since $c_i$ and $d_i$ belong to the interval $(x_{i-1},x_i)$, $\sum_{i=1}^n(x_i-x_{i-1})k(c_i)$ and $\sum_{i=1}^n(x_i-x_{i-1})k(d_i)$ both lie between the lower and upper sums given by the dissection. That implies the same for

$\sum_{i=1}^n(x_i-x_{i-1})(g'(c_i)^2+h'(d_i)^2)^{1/2}$

Since $k$ is integrable, the limit of $\sum_{i=1}^n(x_i-x_{i-1})(g'(c_i)^2+h'(d_i)^2)^{1/2}$ as the largest $x_i-x_{i-1}$ (which is often called the mesh of the dissection) tends to zero is $\int_a^bk(t)dt$.

We have shown that the length of the curve is given by the formula

$\int_a^b(g'(t)^2+h'(t)^2)^{1/2}dt$

Now, finally, let’s see whether we can justify our calculation of the length of the arc of the unit circle between $(1,0)$ and $(x,y)$. It would be nice to parametrize the circle as $(\cos\theta,\sin\theta)$, but we can’t do that, since we are defining $\theta$ using length, so we would end up with a circular definition (in more than one sense). [Actually, we can do something very close to this. See the final section of the post for details.] So let’s parametrize it as follows. We’ll define $f$ on the interval $[x,1]$ and we’ll send $t$ to $(t,\sqrt{1-t^2})$. Then $g'(t)=1$ and $h'(t)=-t(1-t^2)^{-1/2}$, so

$\displaystyle (g'(t)^2+h'(t)^2)^{1/2}=(1+\frac{t^2}{1-t^2})^{1/2}=\frac 1{\sqrt{1-t^2}}$

So the length is $\int_x^1\frac{dt}{\sqrt{1-t^2}}$, which is exactly the expression we wrote down earlier.

Let me make two quick remarks about that. First, you might argue that although I have shown that the final expression is indeed correct, I haven’t shown that the informal argument is (essentially) correct. But I more or less have, since what I have effectively done is calculate the lengths of the hypotenuses of the little triangles in a slightly different way. Before, I used the fact that one side was $dt$ and used similar triangles. Here I’ve used the fact that one side is $dt$ and another side is $(\frac d{dt}\sqrt{1-t^2})dt$ and used Pythagoras.

A slightly more serious objection is that for this calculation I used a general result that depended on the assumption that both $g$ and $h$ are continuously differentiable, but didn’t check that the appropriate conditions held, which they don’t. The problem is that $h(t)=\sqrt{1-t^2}$, so $h'(t)=-t/\sqrt{1-t^2}$, which tends to infinity as $t\to 1$ and is undefined at $t=1$.

However, it is easy to get round this problem. What we do is integrate from $x$ to $1-\epsilon$, in which case the argument is valid, and then let $\epsilon$ tend to zero. The integral between $x$ and $1-\epsilon$ is $\cos^{-1}(x)-\cos^{-1}(1-\epsilon)$, and that tends to $\cos^{-1}(x)$.

One final remark is that this length calculation explains why the usual substitution of $\cos\theta$ for $t$ in an integral of the form $\int_a^b\frac{dt}{\sqrt{1-t^2}}$ is not a piece of unmotivated magic. It is just a way of switching from one parametrization of a circular arc (using the x-coordinate) to another (using the angle, or equivalently the distance along the circular arc) that one expects to be simpler.
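
Before moving on, here is a quick numerical sanity check of the whole discussion (an illustration only, not part of the argument; the sample value of $x$ and all names are mine): the polygonal-dissection estimate of the arc length, computed with the parametrization $t\mapsto (t,\sqrt{1-t^2})$, should agree with $\cos^{-1}(x)$.

```python
import math

def arc_length_dissection(x, n=100000):
    """Polygonal approximation to the length of the unit-circle arc from
    (x, sqrt(1-x^2)) to (1, 0), using the parametrization t -> (t, sqrt(1-t^2))
    and a dissection of [x, 1] into n equal pieces."""
    def f(t):
        return (t, math.sqrt(max(0.0, 1.0 - t * t)))
    total = 0.0
    prev = f(x)
    for i in range(1, n + 1):
        cur = f(x + (1.0 - x) * i / n)
        total += math.hypot(cur[0] - prev[0], cur[1] - prev[1])
        prev = cur
    return total

x = 0.3
print(arc_length_dissection(x))  # ~1.26610
print(math.acos(x))              # cos^{-1}(0.3) = 1.26610...
```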

### An easier argument

Thanks to a comment of Jason Fordham below, I now realize that we can after all parametrize the circle as $(\cos\theta,\sin\theta)$. However, this $\theta$ is not the $\theta$ I’m trying to calculate, so let’s call it $\phi$. I’m just taking $\phi$ to be an ordinary real number, and I’m defining $\cos$ and $\sin$ using the power-series definition. Then the arc of the unit circle that goes from $(1,0)$ to $(x,\sqrt{1-x^2})$ can be defined as the curve defined on the interval $[0,\cos^{-1}(x)]$ by the formula $\phi\mapsto (\cos\phi,\sin\phi)$. The general formula for the length of a curve then gives us

$\int_0^{\cos^{-1}(x)}((-\sin\phi)^2+(\cos\phi)^2)^{1/2}d\phi=\int_0^{\cos^{-1}(x)}1\,d\phi=\cos^{-1}(x)$

So the length $\theta$ of the arc satisfies $\cos\theta=x$.

### Jordan Ellenberg — Scientists aren’t experts on what makes jokes funny

This week I finally realized what bugged me about the talk I was hearing about the science of science communication: Nothing. The issue is what I wasn’t hearing.

This was catalyzed by a short news item Lauren Rugani linked on twitter. A scientist had run a study where they discovered that sometimes a punchline is funnier if words from the punchline had been mentioned several minutes earlier. From the abstract:

“These findings also show that pre-exposing a punchline, which in common knowledge should spoil a joke, can actually increase funniness under certain conditions.”

This is shocking. Not the conclusion, which is clearly correct. The problem is that the conclusion has been known to comedians for at least the last several thousand years. When I trained in improv comedy the third class was on callbacks, the jargon term for that technique. The entire structure of an improv comedy set is based around variations on the idea that things are funnier if they’re repeated. And yet to the authors it was “common knowledge” that this will spoil a joke. There is a long tradition of people who know, from experience, how this works, and yet the idea of asking them is not evident anywhere in the paper. This is the problem — the sense that the only valid answers come from inside science and the research world.

Yes!  Everybody knows by now that when mathematicians try to do mathematical biology alone, without people with domain knowledge of biology in the room, they do crappy mathematical biology.  “Digital humanities” or “neuroaesthetics” or “culturomics” &c are just the same.  New techniques drawn from science and mathematics are fantastic research tools now, and they’re only getting better, but it seems like a terrible idea to study cultural objects from scratch, without domain experts in the room.

### n-Category Café — Review of the Elements of 2-Categories

Guest post by Dimitri Zaganidis

First of all, I would like to thank Emily for organizing the Kan extension seminar. It is a pleasure to be part of it. I want also to thank my advisor Kathryn Hess and my office mate Martina Rovelli for their revisions.

In the fifth installment of the Kan Extension Seminar we read the paper “Review of the Elements of 2-categories” by G. M. Kelly and Ross Street. This article was published in the Proceedings of the Sydney Category Theory Seminar, and its purpose is to “serve as a common introduction to the authors’ paper in this volume”.

The article has three main parts, the first of them being definitions in elementary terms of double categories and 2-categories, together with the notion of pasting. In the second part, the authors review adjunctions in 2-categories, with a nice expression of the naturality of the bijection given by mates using double categories. The last part of the article introduces monads in 2-categories, specializing to 2-monads towards the end.

### Double categories and 2-categories

The article starts with the definition of a double category as a category object in the (not locally small) category of categories $\mathbf{CAT}$. (I think that there might be some set theoretic issues with such a category, but you can add small everywhere if you want to stay safe.)

The authors then switch to a description of such an object in terms of objects, horizontal arrows, vertical arrows, and squares, with various compositions and units. I will explain a bit how to go from one description to the other.

A category object consists of a category of objects, a category of morphisms, source and target functors, an identity functor, and a composition.

The category of objects is the category whose objects are “the objects” and whose morphisms are the vertical arrows. The category of morphisms is the category whose objects are the horizontal morphisms and whose morphisms are the squares, with vertical composition.

Since the functors $\mathrm{Obj}, \mathrm{Mor}: \mathbf{CAT} \longrightarrow \mathbf{SET}$ preserve pullbacks, by applying them to a double category seen as a category object, we get actual categories. Applying $\mathrm{Obj}$ to the double category, we get the category whose objects are “the objects” and whose morphisms are the horizontal arrows. Applying $\mathrm{Mor}$, we get the category whose objects are the vertical morphisms and whose morphisms are the squares, but this time with horizontal composition.
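In pictures (my own schematic, in the same style as the diagram further down in this post): a square $\alpha$ has horizontal arrows $h: A \longrightarrow B$ on top and $h': A' \longrightarrow B'$ on the bottom, and vertical arrows $v: A \longrightarrow A'$ and $w: B \longrightarrow B'$ on the sides, $\begin{matrix} A & \overset{h}{\longrightarrow} & B \\ {\scriptstyle v}\downarrow & \alpha & \downarrow {\scriptstyle w} \\ A' & \overset{h'}{\longrightarrow} & B' \end{matrix}$ Vertical composition of squares (stacking two such pictures on top of each other) is the composition in the category of morphisms, while horizontal composition (placing them side by side) is the composition obtained by applying $\mathrm{Mor}$, as described above.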

An interesting thing to notice is that the symmetry of the explicit description of a double category is much more apparent than the symmetry of its description as a category object.

One can define a $2$-category as a double category with a discrete category of objects, or as a $\mathbf{CAT}$-enriched category, exactly as one can define a simplicially enriched small category as either a category enriched over $\mathbf{sSet}$ or as a category object in $\mathbf{sSet}$ with a discrete simplicial set of objects.

The second viewpoint on 2-categories leads to definitions of 2-functors and 2-natural transformations and also to modifications, once one makes clear what enrichment a category of 2-functors inherits.

It is also worth mentioning that the pasting operation makes computations easier to carry out, because they become more visual. The proof of Proposition 2.1 of this paper is a good illustration of this.

The basic example of a 2-category is $\mathbf{CAT}$ itself, with natural transformations as 2-cells (squares).

As category theory describes set-like constructions, 2-category theory describes category-like constructions. You can usually build up a category whose objects are sets with extra structure; in the same way, small V-categories, V-functors, and V-natural transformations form a 2-category.

My first motivation to learn about 2-categories was the 2-category of quasi-categories defined by Joyal, which has been studied by Emily Riehl and Dominic Verity in the article The 2-category theory of quasi-categories, and in particular the category-like constructions one can make with quasi-categories, such as adjunctions and limits.

### Adjunctions and mates in 2-categories

It is not a surprise that 2-categories are the right framework in which to define adjunctions. To build the general definition from the usual one, you just need to replace categories by objects in a 2-category, functors by 1-cells of the 2-category, and natural transformations by its 2-cells.

Adjunctions in a 2-category $\mathcal{C}$ compose (as in $\mathbf{CAT}$), and one can form two, a priori distinct, double categories of adjunctions. Both of them have the objects of $\mathcal{C}$ as objects and the morphisms of $\mathcal{C}$ as horizontal morphisms, while their vertical morphisms are the adjunctions (going in the same direction as the right adjoint, by convention). The two double categories differ on the squares. Given adjunctions $f \dashv u$ and $f' \dashv u'$ together with 1-cells $a:A \longrightarrow A'$ (between the domains of $u$ and $u'$) and $b:B \longrightarrow B'$ (between the codomains of $u$ and $u'$), the squares of the first double category are 2-cells $b u \Rightarrow u'a$ while the squares of the second are 2-cells $f'b \Rightarrow a f$.
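Concretely, writing $\eta, \epsilon$ and $\eta', \epsilon'$ for the units and counits of $f \dashv u$ and $f' \dashv u'$ (this notation is mine; the post leaves it implicit), the mate correspondence used in the next paragraph sends a 2-cell $\alpha: b u \Rightarrow u'a$ to $(\epsilon' a f) \circ (f' \alpha f) \circ (f' b \eta): f'b \Rightarrow a f$, obtained by whiskering $\alpha$ with the adjunction data; the inverse construction is built in the same way from $\eta'$ and $\epsilon$.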

Now, the bijective correspondence between these kinds of 2-cells given by mates induces an isomorphism of double categories. This means in particular that the horizontal (or vertical) composite of mates is equal to the mate of the corresponding composite.

This is a very beautiful way to express the naturality of the mate correspondence, and it provides a one-line proof of the fact that two 1-cells that are left adjoints to the same 1-cell are naturally isomorphic.

2-categories are also the right framework to define monads. A monad in a 2-category $\mathcal{C}$ and on an object $B$ is a 1-cell $t:B \longrightarrow B$ together with 2-cells $\mu: t^2 \Rightarrow t$ and $\eta: 1_B \Rightarrow t$, verifying the usual equations $\mu \circ (t\mu)= \mu \circ (\mu t)$ and $\mu \circ(t\eta) = 1_t = \mu \circ(\eta t)$. Since 2-functors preserve both horizontal and vertical compositions, for all objects $X$ of $\mathcal{C}$, $t$ induces a monad on $\mathcal{C}(X,B)$, given by post-composition: $(t_{\ast},\mu_{\ast},\eta_{\ast})$. The authors call *an action of $t$ on $s:X \longrightarrow B$* a $t_\ast$-algebra structure on $s$.
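Unpacking this a bit (my own rewriting of the definitions just given): for a 1-cell $s: X \longrightarrow B$ the induced monad sends $s$ to $t_\ast(s) = t s$, with multiplication $\mu s: t t s \Rightarrow t s$ and unit $\eta s: s \Rightarrow t s$, so an action of $t$ on $s$ is a 2-cell $\nu: t s \Rightarrow s$ satisfying $\nu \circ (t \nu) = \nu \circ (\mu s)$ and $\nu \circ (\eta s) = 1_s$.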

In Ross Street’s original paper, a monad morphism $(B,t,\mu, \eta) \longrightarrow (B',t',\mu', \eta')$ is a 1-cell $f: B \longrightarrow B'$ together with a $2$-cell $\phi: t'f \Rightarrow f t$ verifying certain conditions.

In this paper, morphisms of monads are defined only for monads on the same object, letting the $1$-cell part of a monad transformation of the previous article be the identity. This leads the authors to reverse the direction of the morphism, since the $2$-cell seems to go in the reverse direction of the $1$-cell!

One might think that fixing $f=1$ is needed for the result which says that there is a bijection between monad morphisms $t \Rightarrow t'$ and actions of $t$ on $t'$ making $t'$ a “$(t,t')$-bimodule”. In fact, in the case where $f$ is not necessarily the identity, there is a bijection between 2-cells $\phi:t f \Rightarrow f t'$ such that $(f,\phi)$ is a monad functor and actions of $t$ on $ft'$ making $ft'$ a “$(t,t')$-bimodule”. A statement of the same kind can also be made for monad functor transformations (in the sense of the formal theory of monads). A 2-cell $\sigma : f \Rightarrow f'$ is a monad functor transformation $(f,\phi) \longrightarrow (f', \phi')$ if and only if $\sigma t': f t' \Rightarrow f' t'$ is a morphism of “$(t,t')$-bimodules”.

A 2-category admits the construction of algebras if for every monad $(B,t,\mu, \eta)$, the 2-functor $X \mapsto \mathcal {C}(X,B)^{(t_\ast, \mu_\ast, \eta_\ast)}$ is representable. The representing object is called the object of $t$-algebras. By Yoneda, the free-forgetful adjunction can be made internal in this case.

The terminology is justified, because in the $2$-category $\mathbf{CAT}$, it specializes to the usual notions of the category of $t$-algebras and the corresponding free-forgetful adjunction.

A monad in $\mathcal{C}$ is the same as a 2-functor $\mathbf{Mnd} \longrightarrow \mathcal{C}$, where $\mathbf{Mnd}$ is the 2-category with one object and $\Delta_+$, the algebraist’s simplicial category (with ordinal sum), as monoidal hom-category. Since, moreover, $\mathcal {C}(X,B)^{(t_\ast, \mu_\ast, \eta_\ast)} \cong [\mathbf{Mnd}, \mathbf{CAT}]( \Delta_{+\infty}, \mathcal{C}(X,-))$ (where $\Delta_{+\infty}$ is the subcategory of maps of $\Delta$ preserving maxima, which is acted on by $\Delta_+$ via ordinal sum), one can see that the object of $t$-algebras can be expressed as a weighted limit.

As a consequence, it is not surprising that a 2-category admits the construction of algebras under some completeness assumptions.

### Doctrines

In the last part of the article, the authors review the notion of a doctrine, which is a 2-monad in 2-$\mathbf{CAT}$, i.e., a 2-functor $D: \mathcal {C} \longrightarrow \mathcal{C}$, where $\mathcal{C}$ is a 2-category, and 2-natural transformations $m$ and $j$, which are respectively the multiplication and the unit, verifying the usual identities. The fact that it is both a monad on a 2-category and in another one can be a bit disturbing at first.

If $(D,m,j)$ is a doctrine over a 2-category $\mathcal{C}$, then its algebras will be objects $X$ of $\mathcal{C}$ together with an action $DX \longrightarrow X$, exactly as in the case of algebras over a usual monad.

Already with morphisms, we can take advantage of the fact that a 2-category $\mathcal{C}$ has 2-cells, and define $D$-morphisms to be lax in the sense that the diagram $\begin{matrix} DX & \longrightarrow & DY \\ \downarrow & & \downarrow \\ X & \longrightarrow & Y \end{matrix}$ is not supposed to be commutative, but is rather filled by a 2-cell with some coherence properties.
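To fix ideas (my own spelling-out, using one common convention for the direction of the 2-cell, which the post does not commit to): for algebras with actions $x: DX \longrightarrow X$ and $y: DY \longrightarrow Y$, a lax $D$-morphism consists of a 1-cell $f: X \longrightarrow Y$ together with a 2-cell $\bar{f}: y \circ Df \Rightarrow f \circ x$ filling the square above, subject to coherence conditions involving the multiplication $m$ and the unit $j$ of the doctrine.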

As one might expect, we can actually form a 2-category of such $D$-algebras by adding 2-cells, using again the $2$-cells existing in $\mathcal{C}$.

If we keep only the $D$-morphisms that are strict, we obtain the object of algebras (which should be a $2$-category) that we discussed before.

One example of a doctrine is $\Delta_+ \times - : \mathbf{CAT} \longrightarrow \mathbf{CAT}$ together with the multiplication induced by the ordinal sum, and unit given on $\mathcal {D}$ by the functor $\mathcal{D} \longrightarrow \Delta_+ \times \mathcal{D}$ that sends $d$ to $(\emptyset,d)$.

The algebras for this doctrine will be categories equipped with a monad acting on them, while the $D$-morphisms are transformations of monads, and the $D$-2-cells are exactly the monad functor transformations of Street’s article.

Here, since we have two different 2-categories of algebras (with strict $D$-morphisms or with all of them), one can wonder if monad morphisms $D \longrightarrow D'$ will induce $2$-functors $D'$-$\mathbf{Alg} \longrightarrow D$-$\mathbf{Alg}$ on the level of these $2$-categories.

This is indeed the case, and one can actually go even one step further and define monad modifications, using the fact that 2-$\mathbf{CAT}$ is in fact a 3-category! These modifications between two given monad morphisms are in fact in bijective correspondence with the 2-natural transformations between the $2$-functors induced by these monad morphisms on the level of algebras (with lax $D$-morphisms). Note that they are not the same as the monad functor transformations of Street’s article.

This bijection is nice because it implies that you can compare 2-categories of algebras by only looking at the doctrines: if they are equivalent, so are the 2-categories of algebras.

The fact that this bijection does not hold when we restrict to strict morphisms only was really surprising to me, but I guess this is the price to pay for using the 3-category structure.

During the last days of April, the Kan extension seminar will be reading the article “Two-dimensional monad theory”, by Blackwell, Kelly and Power. We will then have more to say about these 2-monads!

### Resonaances — One More Try

My blogging juices have been drying up for some time now, and at this point Résonaances is close to withering. This could be expected. The glorious year 2012 with all the excitement of the Higgs boson discovery was inevitably followed by post-coital depression, only amplified  by the shutdown of the LHC for repairs.

One problem with blogging these days is that, in the short run, things are expected to get worse rather than better. The year 2013 was depressing, but at least we could not complain about a lack of action. The LHC was flooding us with new results based on the data collected in the first run. On the Higgs front, the 125 GeV particle discovered the year before was established, beyond reasonable doubt, as a Higgs boson related to electroweak symmetry breaking. The CMB results from the Planck experiment were a sweeping victory for the Lambda-CDM description of the universe at large scales. The LUX experiment provided the best limits so far on the WIMP-nucleon cross section and slashed the hope that we may be on the verge of detecting dark matter. Plus a cherry on top: ACME limits on the electron's electric dipole moment increased the strain on any extension of the Standard Model with new particles at the TeV scale. Yes, a lot to remember, not much to cherish...

And what about 2014? Are there any results to be released this year that could be at least marginally exciting for particle physicists? I don't see much, and the opinion polls that I have conducted are not optimistic either. Basically, we just expect more of the same: the LHC, Planck, ICECUBE, AMS-02, Fermi... Of course, there is always a non-zero probability that some new results from these experiments will turn out to be a smoking gun for new physics, but the later in the game the dimmer the chances are. The only qualitatively new piece of data among those will be the Planck polarization data, but even that is unlikely to be a game-changer. One may also keep an eye on lightweight contenders: small precision experiments that pursue indirect limits on new physics. Recently there have been new limits of this kind on non-standard interactions between electrons and quarks from JLab's PVDIS Collaboration, who study low-energy scattering of electrons on nuclei. A similar experiment at JLab called Q-weak promises new results and improved limits this year. If there's anything else like that in the queue I'll be glad if you let me know in the comments section.

So how to live? How to blog? How to make it till 2015 when the sky is supposed to get brighter? I have no idea but I'll try to go on for a little longer. Back soon.

## March 08, 2014

One of the endlessly recurring topics around here is the use of PowerPoint and comparable presentation software. Usually because of some ill-informed rant against the use of PowerPoint.

It’s come around yet again in a particularly ironic fashion, via an online slideshow at Slate, the only medium more consistently exasperating than a bad PowerPoint presentation. In keeping with modern tastes, this has been re-shared so many times that I finally went to look at it, but it’s more of the usual, in a more annoying medium. This is a bit of a shame, as there’s actually some good presentation advice buried in there, but between the format and the usual raft of misconceptions, it’s kind of hard to find.

The biggest problem I have with this is the blanket declaration that PowerPoint– or, more precisely, a particular style of PowerPoint– is “destroying higher education.” Which is an ill-formed claim on a lot of levels, but mostly in the claim that “higher education” is a single coherent thing. Which it’s not. Standards for what counts as good and effective teaching vary dramatically across disciplines and even within them. What’s well suited to one area is slow death in another, but that doesn’t mean that either of those groups is wrong.

So, yes, it’s absolutely true that text-heavy PowerPoint presentations that are made available online are deadly for classes where the discussion in class is the whole point of the course. But that’s not all of higher education. And if you’re working in a subject that involves more direct transfer of knowledge than mutual construction thereof, text-heavy PowerPoints made available online are the right tool for the job.

For the umpteenth time, then: PowerPoint is a tool, nothing more and nothing less. It can be used in many different ways, some of them good, some of them bad. The key to using it effectively is knowing your audience. Or, more precisely, your audiences, plural.

If I’m giving a public lecture or a TED-type talk, I’ll use slides with big splashy images and very few words. If I’m putting together a colloquium talk, I’ll use a bit more text, but still try to emphasize images. If I’m teaching an intro physics class, where I expect students will need to go back over the material multiple times while doing homework, I’ll use slides with a lot more text and equations, because I’m going to make those slides available on the course website after class, so students can go back and check stuff. If I’m teaching a seminar-type class, I’ll use a mix of those– on days when discussion is the main point, it’ll be mostly images, but on days when I need to convey factual information, there will be more text.

Even the better bits of that slideshow’s advice aren’t universal. I’m generally against giving an outline of your talk up front, because it mostly just wastes time. But there are topics where the path to be followed is sufficiently twisty that an outline is genuinely helpful, as a road map to the talk. (And that’s without even getting into fields where a clear outline at the beginning is an absolute cultural norm.) There’s no rule that applies to absolutely every presentation in absolutely every circumstance.

PowerPoint is a tool, but it’s not a single blunt instrument like a hammer. It’s very versatile, and can be used in lots of different ways. The key is knowing which of those ways to use in different sets of circumstances, and adjusting your style to match the audiences. Many of the alleged misuses of it are really “This isn’t how it ought to be used in my field,” which manages to be both perfectly true and perfectly useless.

### Quantum Diaries — Nobody understands quantum mechanics? Nonsense!

Despite the old canard about nobody understanding quantum mechanics, physicists do understand it.  With all of the interpretations ever conceived for quantum mechanics[1], this claim may seem a bit of a stretch, but like the proverbial ostrich with its head in the sand, many physicists prefer to claim they do not understand quantum mechanics, rather than just admit that it is what it is and move on.

What is it about quantum mechanics that generates so much controversy and even had Albert Einstein (1879 – 1955) refusing to accept it? There are three points about quantum mechanics that generate controversy. It is probabilistic, eschews realism, and is local. Let us look at these three points in more detail.

1. Quantum mechanics is probabilistic, not deterministic. Consider a radioactive atom. It is impossible, within the confines of quantum mechanics, to predict when an individual atom will decay. There is no measurement or series of measurements that can be made on a given atom to allow me to predict when it will decay. I can calculate the probability of when it will decay, or the time it takes half of a sample to decay, but not the exact time a given atom will decay (a standard formula illustrating this appears just after this list). This inability to predict exact outcomes, only probabilities, permeates all of quantum mechanics. No possible set of measurements on the initial state of a system allows one to predict precisely the result of all possible experiments on that state.
2. Quantum mechanics eschews realism[2]. This is a corollary of the first point. A quantum mechanical system does not have well defined values for properties that have not been directly measured. This has been compared to the moon only existing when someone is looking at it. For deterministic systems one can always safely infer back from a measurement what the system was like before the measurement. Hence if I measure a particle’s position and motion I can infer not only where it will go but where it has come from. The probabilistic nature of quantum mechanics prevents this backward looking inference. If I measure the spin of an atom, there is no certainty that it had only that value before the measurement. It is this aspect of quantum mechanics that most disturbs people, but quantum mechanics is what it is.
3. Quantum mechanics is local. To be precise, no action at point A will have an observable effect at point B that is instantaneous, or non-causal.  Note the word observable. Locality is often denied in an attempt to circumvent Point 2, but when restricted to what is observable, locality holds. Despite the Pentagon’s best efforts, no messages have been sent using quantum non-locality.
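To make the radioactive-atom example in point 1 concrete (the formula below is just the standard exponential-decay law, not something specific to this post): for an atom with decay constant $\lambda$, quantum mechanics supplies only the probability $P(\text{decayed by time } t) = 1 - e^{-\lambda t}$ and the half-life $t_{1/2} = \ln 2/\lambda$ of a large sample; no measurement on the individual atom picks out the moment at which it will actually decay.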

Realism, at least, is a common aspect of the macroscopic world. Even a baby quickly learns that the ball is behind the box even when he cannot see it. But much about the macroscopic world is not obviously deterministic: the weather in Vancouver, for example (it is snowing as I write this). Nevertheless, we cling to determinism and realism like a child to his security blanket. It seems to me that determinism or realism, if they exist, would be at least as hard to understand as their lack. There is no theorem that states the universe should be deterministic and not probabilistic or vice versa. Perhaps god, contrary to Einstein’s assertion, does indeed like a good game of craps[3].

So quantum mechanics, at least at the surface level, has features many do not like. What has the response been? They have followed the example set by Philip Gosse (1810 – 1888) with the Omphalos hypothesis[4]. Gosse, being a literal Christian, had trouble with the geological evidence that the world was older than 6,000 years, so he came up with an interpretation of history in which the world was created only 6,000 years ago but in such a manner that it appeared much older. This can be called an interpretation of history because it leaves all predictions for observations intact but changes the internal aspects of the model so that they match his preconceived ideas. To some extent, Tycho Brahe (1546 – 1601) used the same technique to keep the earth at the center of the universe. He had the earth fixed, the sun circling the earth, and the other planets circling the sun. With the information available at the time, this was consistent with all observations.

The general technique is to adjust those aspects of the model that are not constrained by observation to make it conform to one’s ideas of how the universe should behave. In quantum mechanics these efforts are called interpretations. Hugh Everett (1930 – 1982) proposed many worlds in an attempt to make quantum mechanics deterministic and realistic. But it was only in the unobservable parts of the interpretation that this was achieved, and the results of experiments in this world are still unpredictable. Louis de Broglie (1892 – 1987) and later David Bohm (1917 – 1992) introduced pilot waves in an effort to restore realism and determinism. In doing so they gave up locality. Like Gosse’s work, theirs was a nice proof of principle that, with sufficient ingenuity, the universe could be made to conform to almost any preconceived ideas, or at least appear to do so. Reassuring I guess, but like Gosse it was done by introducing non-observable aspects to the model: not just unobserved but in principle unobservable. The observable aspects of the universe, at least as far as quantum mechanics is correct, are as stated in the three points above: probabilistic, nonrealistic and local.

Me, I am not convinced that there is anything to understand about quantum mechanics beyond the rules for its use given in standard quantum mechanics text books. However, interpretations of quantum mechanics might, possibly might, suggest different ways to tackle unsolved problems like quantum gravity and they do give one something to discuss after one has had a few beers (or is that a few too many beers).

[1] See my February 2014 post “Reality and the Interpretations of Quantum Mechanics.”

[2] Realism as defined in the paper by Einstein, Podolsky and Rosen, Physical Review 47 (10): 777–780 (1935).

[3] Or dice.

### Tommaso Dorigo — Top Asymmetry: The Latest From DZERO

It is nice to see that the Tevatron experiments are continuing to produce excellent scientific measurements well after the demise of the detectors. Of course the CDF and DZERO collaborations have shrunk in size and in available man-years for data analysis since the end of data taking, as most researchers have gradually shifted their participation to other experiments - typically the ones at the Large Hadron Collider; but a hard core of dedicated physicists remains actively involved in the analysis of the 10 inverse femtobarns of proton-antiproton collisions acquired in Run 2, in the conviction that the Tevatron data still provides a basis for scientific results that cannot be obtained elsewhere.

### n-Category Café — Network Theory Talks at Oxford

One of my dreams these days is to get people to apply modern math to ecology and biology, to help us design technologies that work with nature instead of against it. I call this dream ‘green mathematics’. But this will take some time to reach, since living systems are subtle, and most mathematicians are more familiar with physics.

So, I’ve been warming up by studying the mathematics of chemistry, evolutionary game theory, electrical engineering, control theory and information theory. There are a lot of ideas in common to all these fields, but making them clear requires some category theory. I call this project ‘network theory’. I’m giving some talks about it at Oxford.

(This diagram is written in Systems Biology Graphical Notation.)

Here’s the plan:

#### Network Theory

Nature and the world of human technology are full of networks. People like to draw diagrams of networks: flow charts, electrical circuit diagrams, signal-flow graphs, Bayesian networks, Feynman diagrams and the like. Mathematically minded people know that in principle these diagrams fit into a common framework: category theory. But we are still far from a unified theory of networks. After an overview, we will look at three portions of the jigsaw puzzle in three separate talks:

I. Electrical circuits and signal-flow graphs.

II. Stochastic Petri nets, chemical reaction networks and Feynman diagrams.

III. Bayesian networks, information and entropy.

All these talks will be in Lecture Theatre B of the Computer Science Department—you can see a map here, but the entrance is on Keble Road. Here are the times:

• Friday 21 February 2014, 2 pm: Network Theory: overview. Also available on YouTube.

• Tuesday 25 February, 3:30 pm: Network Theory I: electrical circuits and signal-flow graphs. Also available on YouTube.

• Tuesday 4 March, 3:30 pm: Network Theory II: stochastic Petri nets, chemical reaction networks and Feynman diagrams. Also available on YouTube.

• Tuesday 11 March, 3:30 pm: Network Theory III: Bayesian networks, information and entropy.

I thank Samson Abramsky, Bob Coecke and Jamie Vicary of the Computer Science Department for inviting me, and Ulrike Tillmann and Minhyong Kim of the Mathematical Institute for helping me get set up. I also thank all the people who helped do the work I’ll be talking about, most notably Jacob Biamonte, Jason Erbele, Brendan Fong, Tobias Fritz, Tom Leinster, Tu Pham, and Franciscus Rebro.

Ulrike Tillmann has also kindly invited me to give a topology seminar:

#### Operads and the Tree of Life

Trees are not just combinatorial structures: they are also biological structures, both in the obvious way but also in the study of evolution. Starting from DNA samples from living species, biologists use increasingly sophisticated mathematical techniques to reconstruct the most likely “phylogenetic tree” describing how these species evolved from earlier ones. In their work on this subject, they have encountered an interesting example of an operad, which is obtained by applying a variant of the Boardmann–Vogt “W construction” to the operad for commutative monoids. The operations in this operad are labelled trees of a certain sort, and it plays a universal role in the study of stochastic processes that involve branching. It also shows up in tropical algebra. This talk is based on work in progress with Nina Otter.

I’m not sure exactly where this will take place, but probably somewhere in the Mathematical Institute, shown on this map. Here’s the time:

• Monday 24 February, 3:30 pm, Operads and the Tree of Life.

If you’re nearby, I hope you can come to some of these talks — and say hi!

(This diagram was drawn by Darwin.)

### John Baez — Network Theory Talks at Oxford

I’m giving some talks at Oxford:

### Network Theory

Nature and the world of human technology are full of networks. People like to draw diagrams of networks: flow charts, electrical circuit diagrams, signal-flow graphs, Bayesian networks, Feynman diagrams and the like. Mathematically minded people know that in principle these diagrams fit into a common framework: category theory. But we are still far from a unified theory of networks. After an overview, we will look at three portions of the jigsaw puzzle in three separate talks:

I. Electrical circuits and signal-flow graphs.

II. Stochastic Petri nets, chemical reaction networks and Feynman diagrams.

III. Bayesian networks, information and entropy.

If you’re nearby I hope you can come! All these talks will take place in Lecture Theatre B in the Computer Science Department—see the map below. Here are the times:

• Friday 21 February 2014, 2 pm: Network Theory: overview. See the slides or watch a video.

• Tuesday 25 February, 3:30 pm: Network Theory I: electrical circuits and signal-flow graphs. See the slides or watch a video.

• Tuesday 4 March, 3:30 pm: Network Theory II: stochastic Petri nets, chemical reaction networks and Feynman diagrams. See the slides or watch a video.

• Tuesday 11 March, 3:30 pm: Network Theory III: Bayesian networks, information and entropy. See the slides.

The first talk will be part of the OASIS series, meaning the “Oxford Advanced Seminar on Informatic Structures”.

I thank Samson Abramsky, Bob Coecke and Jamie Vicary of the Computer Science Department for inviting me, and Ulrike Tillmann and Minhyong Kim of the Mathematical Institute for helping me get set up. I also thank all the people who helped do the work I’ll be talking about, most notably Jacob Biamonte, Jason Erbele, Brendan Fong, Tobias Fritz, Tom Leinster, Tu Pham, and Franciscus Rebro.

Ulrike Tillmann has also kindly invited me to give a topology seminar:

### Operads and the Tree of Life

Trees are not just combinatorial structures: they are also biological structures, both in the obvious way but also in the study of evolution. Starting from DNA samples from living species, biologists use increasingly sophisticated mathematical techniques to reconstruct the most likely “phylogenetic tree” describing how these species evolved from earlier ones. In their work on this subject, they have encountered an interesting example of an operad, which is obtained by applying a variant of the Boardmann–Vogt “W construction” to the operad for commutative monoids. The operations in this operad are labelled trees of a certain sort, and it plays a universal role in the study of stochastic processes that involve branching. It also shows up in tropical algebra. This talk is based on work in progress with Nina Otter.

I’m not sure exactly where this will take place, but surely somewhere in the Mathematical Institute building:

• Monday 24 February, 3:30 pm, Operads and the Tree of Life. See the slides.

The Computer Science Department is shown in the map here:

The Mathematical Institute is a bit to the west:

### David Hogg — continuity of exoplanets

In a day of talks—I had to leave a talk by Ruth Angus (Oxford) on stellar ages from Kepler early to see a talk in Computer Science by Alekh Agarwal (Microsoft) on distributed and clever machine-learning algorithms and engineering—Bekki Dawson (Berkeley) showed us results on the statistics of exoplanet populations and tests of planetary migration scenarios. She showed that the continuity of tidal circularization models (conservative exoplanet flow, in some sense) makes a prediction for the distribution of planets in the period–eccentricity plane, and that the prediction is falsified strongly by Kepler. There is not yet any good model for the formation and migration of exoplanets that explains the main features of the data, but there are many possible effects, and it is possible that all of them are acting at some level. Her talk suggested scores of other projects that could and should be done. On a side note, she showed convincingly that you can measure eccentricities just with Kepler data alone, and that there are strong asymmetries that make it much more likely that you will see a faster-than-circular transit than a slower-than-circular transit when the transiting-planet orbit is eccentric. She also showed some transit timing work by our own Foreman-Mackey.

### Sean Carroll — Guest Post: Katherine Freese on Dark Matter Developments

The hunt for dark matter has been heating up once again, driven (as usual) by tantalizing experimental hints. This time the hints are coming mainly from outer space rather than underground laboratories, which makes them harder to check independently, but there’s a chance something real is going on. We need more data to be sure, as scientists have been saying since the time Eratosthenes measured the circumference of the Earth.

As I mentioned briefly last week, Katherine Freese of the University of Michigan has a new book coming out, The Cosmic Cocktail, that deals precisely with the mysteries of dark matter. Katie was also recently at the UCLA Dark Matter Meeting, and has agreed to share some of her impressions with us. (She also insisted on using the photo on the right, as a way of reminding us that this is supposed to be fun.)

Dark Matter Everywhere (at the biennial UCLA Dark Matter Meeting)

The UCLA Dark Matter Meeting is my favorite meeting, period. It takes place every other year, usually at the Marriott Marina del Rey right near Venice Beach, but this year on UCLA campus. Last week almost two hundred people congregated, both theorists and experimentalists, to discuss our latest attempts to solve the dark matter problem. Most of the mass in galaxies, including our Milky Way, is not comprised of ordinary atomic material, but instead of as yet unidentified dark matter. The goal of dark matter hunters is to resolve this puzzle. Experimentalist Dave Cline of the UCLA Physics Department runs the dark matter meeting, with talks often running from dawn till midnight. Every session goes way over, but somehow the disorganization leads everybody to have lots of discussion, interaction between theorists and experimentalists, and even more cocktails. It is, quite simply, the best meeting. I am usually on the organizing committee, and cannot resist sending in lots of names of people who will give great talks and add to the fun.

Last week at the meeting we were treated to multiple hints of potential dark matter signals. To me the most interesting were the talks by Dan Hooper and Tim Linden on the observations of excess high-energy photons — gamma-rays — coming from the Central Milky Way, possibly produced by annihilating WIMP dark matter particles. (See this arxiv paper.) Weakly Interacting Massive Particles (WIMPs) are to my mind the best dark matter candidates. Since they are their own antiparticles, they annihilate among themselves whenever they encounter one another. The Center of the Milky Way has a large concentration of dark matter, so that a lot of this annihilation could be going on. The end products of the annihilation would include exactly the gamma-rays found by Hooper and his collaborators. They searched the data from the FERMI satellite, the premier gamma-ray mission (funded by NASA and DoE as well as various European agencies), for hints of excess gamma-rays. They found a clear excess extending to about 10 angular degrees from the Galactic Center. This excess could be caused by WIMPs weighing about 30 GeV, or 30 proton masses. Their paper called these results “a compelling case for annihilating dark matter.” After the talk, Dave Cline decided to put out a press release from the meeting, and asked the opinion of us organizers. Most significantly, Elliott Bloom, a leader of the FERMI satellite that obtained the data, had no objection, though the FERMI team itself has as yet issued no statement.

Many putative dark matter signals have come and gone, and we will have to see if this one holds up. Two years ago the 130 GeV line was all the rage — gamma-rays of 130 GeV energy that were tentatively observed in the FERMI data towards the Galactic Center. (Slides from Andrea Albert’s talk.) This line, originally proposed by Stockholm’s Lars Bergstrom, would have been the expectation if two WIMPs annihilated directly to photons. People puzzled over some anomalies of the data, but with improved statistics there isn’t much evidence left for the line. The question is, will the 30 GeV WIMP suffer the same fate? As further data come in from the FERMI satellite we will find out.

What about direct detection of WIMPs? Laboratory experiments deep underground, in abandoned mines or underneath mountains, have been searching for direct signals of astrophysical WIMPs striking nuclei in the detectors. At the meeting the SuperCDMS experiment hammered on light WIMP dark matter with negative results. The possibility of light dark matter, which was so popular recently, remains puzzling. 10 GeV dark matter seemed to be detected in many underground laboratory experiments: DAMA, CoGeNT, CRESST, and in April 2013 even CDMS in their silicon detectors. Yet other experiments, XENON and LUX, saw no events, in drastic tension with the positive signals. (I told Rick Gaitskell, a leader of the LUX experiment, that I was very unhappy with him for these results, but as he pointed out, we can’t argue with nature.) Last week at the conference, SuperCDMS, the most recent incarnation of the CDMS experiment, looked to much lower energies and again saw nothing. (Slides from Lauren Hsu’s talk.) The question remains: are we comparing apples and oranges? These detectors are made of a wide variety of types of nuclei and we don’t know how to relate the results. Wick Haxton’s talk surprised me with its discussion of nuclear physics uncertainties I hadn’t been aware of, which in principle could reconcile all the disagreements between experiments, even DAMA and LUX. Most people think that the experimental claims of 10 GeV dark matter are wrong, but I am taking a wait and see attitude.

We also heard about the hints of detection of a completely different dark matter candidate: sterile neutrinos. (Slides from George Fuller’s talk.) In addition to the three known neutrinos of the Standard Model of Particle Physics, there could be another one that doesn’t interact with the standard model. Yet its decay could lead to x-ray lines. Two separate groups found indications of lines in data from the Chandra and XMM-Newton space satellites that would be consistent with a 7 keV neutrino (7 millionths of a proton mass). Could it be that there is more than one type of dark matter particle? Sure, why not?

### Scheme theoretic size

We now consider counting size in a less naive way. Again, for simplicity, suppose that ${X}$ is affine, with corresponding ring ${S}$. Let ${(b_1, b_2, \ldots, b_n)}$ be a point of ${B}$, so there is a map of rings ${R \rightarrow k}$ by ${x_i \mapsto b_i}$. Consider the ring ${S \otimes_R k}$, where ${R}$ acts on ${k}$ by the above map. The maps from this ring to ${k}$ are the points in ${\pi^{-1}(b)}$. Thus, ${\dim_k S \otimes_R k}$ is an upper bound for the number of points of ${X}$ above ${b}$. We will call this dimension the scheme theoretic size of the fiber. Once again, it can be defined when ${X}$ is not affine as well.
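A quick example (mine, not from the post) to see the definition in action: let ${X = \{ (x,y) : y^2 = x \}}$ map to the ${x}$-coordinate, so ${S = k[x,y]/(y^2-x)}$ and ${R = k[x]}$. Over a point ${b}$ of ${B}$ we get ${S \otimes_R k \cong k[y]/(y^2-b)}$, which has dimension ${2}$ over ${k}$ for every ${b}$. If ${k}$ is algebraically closed of characteristic not ${2}$, then for ${b \neq 0}$ the fiber genuinely has two points, while for ${b=0}$ it has the single point ${y=0}$ counted with multiplicity two, so the scheme theoretic size records what the naive count forgets.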

We have the following cautionary example: Let ${X = \{ (x,y) : xy=1 \}}$ mapping onto the ${x}$ coordinate. Then the degree is ${1}$, but the fiber above ${0}$ has size ${0}$, either scheme theoretically or naively. To rule this out, we impose that ${X}$ is finite over ${B}$. By definition, this means that ${X}$ is affine, and ${S}$ is a finitely generated ${R}$ module.

You might worry about how we could ever prove that ${X}$ is affine if it is not given to us as a closed subset of ${k^m}$. Fortunately, we have:

Theorem (Hartshorne, Exercise III.11.2) If ${X \rightarrow B}$ is projective with finite fibers, then it is a finite map. Here projective means that ${X}$ is a closed subset of ${\mathbb{P}^n \times B}$, projecting onto ${B}$. (This is not the morally right definition of a projective map, but if you are ready for the right definition, then you should be working with “proper” rather than “projective” anyway.)

We then have

Theorem (Hartshorne, Exercise II.5.8) If ${X}$ is finite over ${B}$, and no irreducible component of ${X}$ maps to a proper subvariety of ${B}$, then every fiber of ${\pi}$ has scheme theoretic size ${\geq d}$.

### Flatness

Theorem Let ${X \rightarrow B}$ be a finite map. Then all fibers have scheme theoretic size ${d}$ if and only if ${X}$ is flat over ${B}$.

Unfortunately, flat is a rather technical condition. The first thing to understand is that some nice looking maps can fail to be flat:

Warning Let ${X}$ be ${\{ (w,x,y,z) : wy=wz=xy=xz=0 \}}$, let ${B = k^2}$ and let the map ${X \rightarrow B}$ be ${(w,x,y,z) \mapsto (w-y, x-z)}$. This is a finite map. (We can alternately describe ${X}$ as ${\{ (w,x,0,0) \} \cup \{ (0,0,y,z) \}}$.) This map is degree ${2}$, but the fiber over ${(0,0)}$ has scheme theoretic size ${3}$ (and naive size ${1}$).
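To see where the ${3}$ comes from (my own unwinding of the definition above, with coordinates ${u}$ and ${v}$ on ${B}$ that the post leaves unnamed): here ${S = k[w,x,y,z]/(wy, wz, xy, xz)}$, and the map to ${B}$ corresponds to the ring map ${R = k[u,v] \rightarrow S}$ sending ${u \mapsto w-y}$ and ${v \mapsto x-z}$. The fiber over ${(0,0)}$ imposes ${y=w}$ and ${z=x}$, so ${S \otimes_R k \cong k[w,x]/(w^2, wx, x^2)}$, which has ${k}$-basis ${\{1, w, x\}}$ and hence scheme theoretic size ${3}$, even though the only point of the fiber is the origin.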

If your eye is well enough trained that this doesn’t look nice to you, try the examples here.

There are two good conditions that imply flatness:

Theorem (Hartshorne III.9.7) If ${B}$ is normal and one dimensional, and no irreducible component of ${X}$ maps to a proper subvariety of ${B}$, then ${X}$ is flat over ${B}$.

Theorem (The miracle flatness theorem) If ${X}$ is Cohen-Macaulay, ${B}$ is smooth of the same dimension as ${X}$, and ${X \rightarrow B}$ is finite, then ${X \rightarrow B}$ is flat.

### Quantum Diaries — Particle Beam Cancer Therapy: The Promise and Challenges

Advances in accelerators built for fundamental physics research have inspired improved cancer treatment facilities. But will one of the most promising—a carbon ion treatment facility—be built in the U.S.? Participants at a symposium organized by Brookhaven Lab for the 2014 AAAS meeting explored the science and surrounding issues.

by Karen McNulty Walsh

Accelerator physicists are natural-born problem solvers, finding ever more powerful ways to generate and steer particle beams for research into the mysteries of physics, materials, and matter. And from the very beginning, this field born at the dawn of the atomic age has actively sought ways to apply advanced technologies to tackle more practical problems. At the top of the list—even in those early days— was taking aim at cancer, the second leading cause of death in the U.S. today, affecting one in two men and one in three women.

Using beams of accelerated protons or heavier ions such as carbon, oncologists can deliver cell-killing energy to precisely targeted tumors—and do so without causing extensive damage to surrounding healthy tissue, eliminating the major drawback of conventional radiation therapy using x-rays.

“This is cancer care aimed at curing cancer, not just treating it,” said Ken Peach, a physicist and professor at the Particle Therapy Cancer Research Institute at Oxford University.

Peach was one of six participants in a symposium exploring the latest advances and challenges in this field—and a related press briefing attended by more than 30 science journalists—at the 2014 meeting of the American Association for the Advancement of Science in Chicago on February 16. The session, “Targeting Tumors: Ion Beam Accelerators Take Aim at Cancer,” was organized by the U.S. Department of Energy’s (DOE’s) Brookhaven National Laboratory, an active partner in an effort to build a prototype carbon-ion accelerator for medical research and therapy. Brookhaven Lab is also currently the only place in the U.S. where scientists can conduct fundamental radiobiological studies of how beams of ions heavier than protons, such as carbon ions, affect cells and DNA.

Participants in a symposium and press briefing exploring the latest advances and challenges in particle therapy for cancer at the 2014 AAAS meeting: Eric Colby (U.S. Department of Energy), Jim Deye (National Cancer Institute), Hak Choy (University of Texas Southwestern Medical Center), Kathryn Held (Harvard Medical School and Massachusetts General Hospital), Stephen Peggs (Brookhaven National Laboratory and Stony Brook University), and Ken Peach (Oxford University). (Credit: AAAS)

“We could cure a very high percentage of tumors if we could give sufficiently high doses of radiation, but we can’t because of the damage to healthy tissue,” said radiation biologist Kathryn Held of Harvard Medical School and Massachusetts General Hospital during her presentation. “That’s the advantage of particles. We can tailor the dose to the tumor and limit the amount of damage in the critical surrounding normal tissues.”

Yet despite the promise of this approach and the emergence of encouraging clinical results from carbon treatment facilities in Asia and Europe, there are currently no carbon therapy centers operating in the U.S.

Participants in the Brookhaven-organized session agreed: That situation has to change—especially since the very idea of particle therapy was born in the U.S.

Physicists as pioneers

“When Harvard physicist Robert Wilson, who later became the first director of Fermilab, was asked to explore the potential dangers of proton particle radiation [just after World War II], he flipped the problem on its head and described how proton beams might be extremely useful—as effective killers of cancer cells,” said Stephen Peggs, an accelerator physicist at Brookhaven Lab and adjunct professor at Stony Brook University.

As Peggs explained, the reason is simple: Unlike conventional x-rays, which deposit energy—and cause damage—all along their path as they travel through healthy tissue en route to a tumor (and beyond it), protons and other ions deposit most of their energy where the beam stops. Using magnets, accelerators can steer these charged particles left, right, up, and down and vary the energy of the beam to precisely place the cell-killing energy right where it’s needed: in the tumor.

The first implementation of particle therapy used helium and other ions generated by the Bevatron at Berkeley Lab. Those spin-off studies “established a foundation for all subsequent ion therapy,” Peggs said. And as accelerators for physics research grew in size, pioneering experiments in particle therapy continued, operating “parasitically” until the very first accelerator built for hospital-based proton therapy was completed with the help of DOE scientists at Fermilab in 1990.

But even before that machine left Illinois for Loma Linda University Medical Center in California, physicists were thinking about how it could be made better. The mantra of making machines smaller, faster, cheaper—and capable of accelerating more kinds of ions—has driven the field since then.

Advances in magnet technology, including compact superconducting magnets and beam-delivery systems developed at Brookhaven Lab, hold great promise for new machines. Peggs is working to incorporate these technologies in a prototype ‘ion Rapid Cycling Medical Synchrotron’ (iRCMS) capable of delivering protons and/or carbon ions for radiobiology research and for treating patients.

Brookhaven Lab accelerator physicist Stephen Peggs with magnet technology that could reduce the size of particle accelerators needed to steer heavy ion beams and deliver cell-killing energy to precisely targeted tumors while sparing surrounding healthy tissue.

Small machine, big particle impact

The benefits of using charged particles heavier than protons (e.g., carbon ions) stem not only from their physical properties—they stop and deposit their energy over an even smaller and better targeted tumor volume than protons—but also a range of biological advantages they have over x-rays.

As Kathryn Held elaborated in her talk, compared with x-ray photons, “carbon ions are much more effective at killing tumor cells. They put a huge hole through DNA compared to the small pinprick caused by x-rays, which causes clustered or complex DNA damage that is less accurately repaired between treatments—less repaired, period—and thus more lethal [to the tumor].” Carbon ions also appear to be more effective than x-rays at killing oxygen-deprived tumor cells, and might be most effective in fewer higher doses, “but we need more basic biological studies to really understand these effects,” Held said.

Different types of radiation treatment cause different kinds of damage to the DNA in a tumor cell. X-ray photons (top arrow) cause fairly simple damage (purple area) that cancer cells can sometimes repair between treatments. Charged particles—particularly ions heavier than protons (bottom arrow)—cause more and more complex forms of damage, resulting in less repair and a more lethal effect on the tumor. (Credit: NASA)

Held conducts research at the NASA Space Radiation Laboratory (NSRL) at Brookhaven Lab, an accelerator-based facility designed to fully understand risks and design protections for future astronauts exposed to radiation. Much of that research is relevant to understanding the mechanisms and basic radiobiological responses that can apply to the treatment of cancer. However, additional facilities and funding are needed for research specifically aimed at understanding the radiobiological effects of heavier ions for potential cancer therapies, Held emphasized.

Hak Choy, a radiation oncologist and chair in the Department of Radiation Oncology at the University of Texas Southwestern Medical Center, presented compelling clinical data on the benefits of proton particle therapy, including improved outcomes and reduced side effects when compared with conventional radiation, particularly for treating tumors in sensitive areas such as the brain and spine and in children. “When you can target the tumor and spare critical tissue you get fewer side effects,” he said.

Data from Japan and Europe suggest that carbon ions could be three or four times more biologically potent than protons, Choy said, backing that claim with impressive survival statistics for certain types of cancers where carbon therapy surpassed protons, and was even better than surgery for one type of salivary gland cancer. “And carbon therapy is noninvasive,” he emphasized.

To learn more about this promising technology and the challenges of building a carbon ion treatment/research facility in the U.S., including perspectives from the National Cancer Institute, DOE and a discussion about economics, read the full summary of the AAAS symposium here: http://www.bnl.gov/newsroom/news.php?a=24672.

Karen McNulty Walsh is a science writer in the Media & Communications Office at Brookhaven National Laboratory.

### Sean Carroll — Particle Physicists and Cosmologists on Twitter

Katie Freese, a well-known particle cosmologist who has a new book coming out, was asking if I had any tips about publicity. Short answer: not really, no. I haven’t really figured that one out. But one of the most obvious things to do, in terms of possible benefit per unit effort, is to join Twitter and start talking about science with the other denizens there.

I thought I would suggest to her a dozen or so good scientists to follow — after all, there aren’t that many working physicists in this field who are active on Twitter. I went to send her a few recommendations, at which point I realized there are actually quite a few! Some of whom deserve a lot more recognition.

So here is a list I compiled, consisting of people who are (1) active researchers in particle physics or cosmology; (2) on Twitter; (3) known to me. (More specifically, “recalled or noticed by me while making this list.”) Obviously one could compile a much longer list if we expanded it to include science communicators of all stripes, or even active scientists (or even just physicists) of all stripes. But I’m only one guy here. If I’m missing anyone who certainly qualifies, leave a comment; I’ll be happy to add them if I feel like it. I’m sure this isn’t more than half of the people who might be included in such a list. Obvious systematic error in favor of English-speakers, sorry. Entries listed in no particular order.

### Backreaction — 10 Misconceptions about Creativity

Lara, painting. She says it's a snake and a trash can.

The American psyche is deeply traumatized by the finding that creativity scores of children and adults have been constantly declining since 1990. The consequence is a flood of advice on how to be more creative, books and seminars and websites. There’s no escaping the message: Get creative, now!

Science needs a creative element, and so every once in a while I read these pieces that come by my newsfeed. But they’re like one of these mildly pleasant songs that stop making sense when you listen to the lyrics. Clap your hands if you’re feeling like a room without a ceiling.

It’s not like I know a terrible lot about research on creativity. I’m sure there must be some research on it, right? But most of what I read isn’t even logically coherent.
1. Creativity means solving problems.

The NYT recently wrote in an article titled “Creativity Becomes an Academic Discipline”:
“Once considered the product of genius or divine inspiration, creativity — the ability to spot problems and devise smart solutions — is being recast as a prized and teachable skill.”
Yes, creativity is an essential ingredient to solving problems, but equating creativity with problem solving is like saying curiosity is a device to kill cats. It’s one possible use, but it’s not the only use and there are other ways to kill cats.

Creativity is in the first place about creation, the creation of something new and interesting. The human brain has two different thought processes to solve problems. One is to make use of learned knowledge and proceed systematically, step by step. This is often referred to as ‘convergent thinking’ and dominantly makes use of the left side of the brain. The other is pattern-finding and free association, often referred to as ‘divergent thinking’, which employs more brain regions on the right side. It normally kicks in only if the straightforward left-brain attempt has failed, because it is energetically more costly. Exactly what constitutes creative thinking is not well known, but most agree it is a combination of both of these thought processes.

Creative thinking is a way to arrive at solutions to problems, yes. Or you might create a solution looking for a problem. Creativity is also an essential ingredient to art and knowledge discovery, which might or might not solve any problem.

2. Creativity means solving problems better.

It takes my daughter about half an hour to get dressed. First she doesn’t know how to open the buttons, then she doesn’t know how to close them. She’ll try to wear her pants as a cap and pull her socks over the jeans just to then notice the boots won’t fit.

It takes me 3 minutes to dress her – if she lets me – not because I’m not creative but because it’s not a problem which calls for a creative solution. Problems that can be solved with little effort by a known algorithm are in most cases best solved by convergent thinking.

Xkcd nails it:

But Newsweek bemoans:
“Preschool children, on average, ask their parents about 100 questions a day. Why, why, why—sometimes parents just wish it’d stop. Tragically, it does stop. By middle school they’ve pretty much stopped asking.”
There’s much to be said about schools not teaching children creative thinking – I agree it’s a real problem. But the main reason children stop asking questions is that they learn. And somewhere down the line they learn how to find answers themselves. The more we learn, the more problems we can address with known procedures.

There’s a priori nothing wrong with solving problems non-creatively. In most cases creative thinking just wastes time and brain-power. You don’t have to reinvent the wheel every day. It’s only when problems do not give in to standard solutions that a creative approach becomes useful.

3. Happiness makes you creative.

For many people the problem with creative thought is the lack of divergent thinking. If you look at the advice you find online, it’s almost all guides to divergent thinking, not to creativity: “Don’t think. Let your thoughts unconsciously bubble away.” “Surround yourself with inspiration.” “Be open and aware. Play and pretend. List unusual uses for common household objects.” And so on. Happiness then plays a role for creativity because there is some evidence that happiness makes divergent thinking easier:
“Recent studies have shown […] that everyday creativity is more closely linked with happiness than depression. In 2006, researchers at the University of Toronto found that sadness creates a kind of tunnel vision that closes people off from the world, but happiness makes people more open to information of all kinds.”
Writes Bambi Turner, who has a business degree and writes stuff. Note the vague term “closely linked” and look at the research.

It is a study showing that people who listened to Bach’s (“happy”) Brandenburg Concerto No. 3 were better at solving a word puzzle that required divergent thinking. In science speak the result reads “positive affect enhanced access to remote associates, suggesting an increase in the scope of semantic access.” Let us not even ask about the statistical significance of a study with 24 students of the University of Toronto in their lunch break, or its relevance for real life. The happy people participating in this study were basically forced to think divergently. In real life happiness might instead divert you from hacking on a problem.

In summary, the alleged “close link” should read: There is tentative evidence that happiness increases your chances of being creative in a laboratory setting, if you are among those who lack divergent thinking and are a student at the University of Toronto.

4. Creativity makes you happy.

There’s very little evidence that creativity for the sake of creativity improves happiness. Typically one finds plausibility arguments like this one, that solving a problem might improve your life in general:
“creativity allows [people] to come up with new ways to solve problems or simply achieve their goals.”
That is plausible indeed, but it doesn’t take into account that being creative has downsides that counteract the benefits.

This blog is testimony to my divergent thinking. You might find this interesting in your news feed, but ask my husband what fun it is to have a conversation with somebody who changes topic every 30 seconds because it’s all connected! I’m the nightmare of your organizing committee, of your faculty meeting, and of your carefully assembled administration workflow. Because I know just how to do everything better and have ten solutions to every problem, none of which anybody wants to hear. It also has the downside that I can only focus on reading when I’m tired, because otherwise I’ll never get through a page. Good thing all my physics lectures were early in the morning.

Thus, I am very skeptical of the plausibility argument that creativity makes you happy. If you look at the literature, there is in fact very little that has been shown to lastingly increase people’s happiness at all. Two known procedures that have proved some effect in studies are showing gratitude and getting to know one’s individual strengths.

For more evidence that speaks against the idea that creativity increases happiness, see 7 and 8. There is some evidence that happiness and creativity are correlated, because both tend to be correlated with other character traits, like openness and cognitive flexibility. However, there is also evidence to the contrary, that creative people have a tendency to depression: “Although little evidence exists to link artistic creativity and happiness, the myth of the depressed artist has some scientific basis.” I’d call this inconclusive. Either way, correlations are only of so much use if you want to actively change something.

5. Creativity will solve all our problems.

“All around us are matters of national and international importance that are crying out for creative solutions, from saving the Gulf of Mexico to bringing peace to Afghanistan to delivering health care. Such solutions emerge from a healthy marketplace of ideas, sustained by a populace constantly contributing original ideas and receptive to the ideas of others.”
[From Newsweek again.] I don’t buy this at all. It’s not that we lack creative solutions, just look around, look at TED if you must. We’re basically drowning in creativity, my inbox certainly is. But they’re solutions to the wrong problems.

(One of the reasons is that we simply do not know what a “healthy marketplace of ideas” is even supposed to mean, but that’s a different story and shall be told another time.)

6. You can learn to be creative if you follow these simple rules.

You don’t have to learn creative thinking, it comes with your brain. You can however train it if you want to improve, and that’s what most of the books and seminars want to sell. It’s much like running. You don’t have to learn to run. Everybody who is reasonably healthy can run. How far and how fast you can run depends on your genes and on your training. There is some evidence that creativity has a genetic component and you can’t do much about this. However, you can work on the non-genetic part of it.

7. “To live creatively is a choice.”

This is a quote from the WSJ essay “Think Inside the Box.” I don’t know if anybody ever looked into this in a scientific way, it seems a thorny question. But anecdotally it’s easier to increase creativity than to decrease it and thus it seems highly questionable that this is correct, especially if you take into account the evidence that it’s partially genetic. Many biographies of great writers and artists speak against this, let me just quote one:
“We do not write because we want to; we write because we have to.”
W. Somerset Maugham, English dramatist and novelist (1874 - 1965).

8. Creativity will make you more popular.

People welcome novelty only in small doses and incremental steps. The wilder your divergent leaps of imagination, the more likely you are to simply leave people behind. Creativity might be a source of popularity in that at least you have something interesting to offer, but too much of it won’t do any good. You’ll end up being the misunderstood, unappreciated genius whose obituary says “ahead of his time”.

9. Creativity will make you more successful.

Last week, The Washington Post published this opinion piece, which informs the reader that:
“Not for centuries has physics been so open to metaphysics, or more amenable to an ancient attitude: a sense of wonder about things above and within.”
This comes from a person named Michael Gerson, who recently opened Max Tegmark’s book and whose occupation seems to be, well, to write opinion pieces. I’ll refrain from commenting on the amenability of professions I know nothing about, so let me just say that he has clearly never written a grant proposal. I warmly recommend you put the word “metaphysics” into your next proposal to see what I mean. I think you should all do that because I clearly won’t, so maybe I stand a chance in the next round.

Most funding agencies have used the 2008 financial crisis as an excuse to focus on conservative and applied research to the disadvantage of high risk and basic research. They really don’t want you to be creative – the “expected impact” is far too remote, the uncertainty too high. They want to hear you’ll use this hammer on that nail and when you’ve been hitting at it for 25 months and two weeks, out will pop 3 papers and two plenary talks. Open to metaphysics? Maybe Gerson should have a chat with Tegmark.

There is indeed evidence showing that people are biased against creativity in favor of practicality, even if they state they welcome creativity. This study relied on 140 American undergraduate students. (Physics envy, anybody?) The punchline is that creative solutions by their very nature have a higher risk of failure than those relying on known methods, and this uncertainty is unappealing. It is particularly unappealing when you are coming up with solutions to problems that nobody wanted you to solve.

So maybe being creative will make you successful. Or maybe your ideas will just make everybody roll their eyes.

10. The internet kills creativity.

The internet has made life difficult for many artists, writers, and self-employed entrepreneurs, and I see a real risk that this degrades the value of creativity. However, it isn’t true that the mere availability of information kills creativity. It just moves it elsewhere. The internet has turned many tasks that previously required creative approaches into step-by-step procedures. Need an idea for a birthday cake? Don’t know how to fold a fitted sheet? Want to know how to be more creative? Google will tell you. This frees your mind to get creative on tasks that Google will not do for you. In my eyes, that’s a good thing.

So should you be more creative?

My summary of reading all these articles is that if you feel like your life lacks something, you should take stock of your strengths and weaknesses and note what most contributes to your well-being. If you think that you are missing creative outlets, by all means, try some of these advice pages and get going. But do it for yourself and not for others, because creativity is not remotely as welcome as they want you to believe.

On that note, here’s the most recent of my awesomely popular musical experiments:

### Doug Natelson — March Meeting, Day 1

Observations from the first day of the APS March Meeting:
• There has been a lot of progress and excitement in looking at layered materials "beyond graphene".  It's interesting to see a resurgence of interest in transition metal (Ti, but more frequently W and Mo) dichalcogenides (S, Se, Te), a topic of great activity in bulk materials growth in the 1970s and early 80s.  There are clearly a lot of bright people working on ways to grow these materials layer-by-layer, with the long-term idea of making structures somewhat like semiconductor heterostructures (e.g., GaAs/AlGaAs), but with the richer palette provided by these materials (exhibiting charge density waves, strong spin-orbit effects, complex band structure, etc.).  Molecular beam epitaxy of these materials with high quality is generally very hard.  For example, Mo and W are extremely refractory, requiring electron beam evaporation at temperatures exceeding 2500 C, and sticking at the sample surface without much diffusion.  Whoever really gets layer-by-layer, large-area growth working with diverse materials is going to make a big impact.
• I saw Heinrich Jaeger give a great talk about granular materials by design.  These are entirely classical systems, but they are extremely challenging.  If you think about it, they are not crystalline (no long-range symmetries to exploit in modeling), they are non-ergodic (the constituent grains are kinetically limited, and can't explore all possible configurations), and nonlinear (the interactions between particles are short-ranged and very strong).  Very interesting.
• I caught two talks in the session looking at silicon-based quantum information processing.  It's possible to create and manipulate dangling bonds on the Si surface (localized states that can trap electrons) and look at how those bonds interact with each other.  Very neat.  Looking at particular individual impurities, with the right system (erbium in Si), you can couple a single impurity to a single-electron transistor charge sensor.  Then, you can manipulate that impurity with optical techniques and use the charge detection to determine its state.  Very impressive.
• The session on secrecy in science was very good.  The ability to manufacture viruses by design is genuinely frightening (though it loses some menace when the words "Pandemic - millions of deaths?" are projected in Comic Sans).  The discussion of intellectual property was great and the role of universities merits its own blog post.  Lastly, I was unaware of the WATCHMAN project, which is a very interesting neutrino physics experiment that as an added bonus should allow the international community to detect rogue nuclear reactors meant for weapons development.

## March 03, 2014

### Quantum Diaries — CDMS result covers new ground in search for dark matter

The Cryogenic Dark Matter Search has set more stringent limits on light dark matter.

Scientists looking for dark matter face a serious challenge: No one knows what dark matter particles look like. So their search covers a wide range of possible traits—different masses, different probabilities of interacting with regular matter.

Today, scientists on the Cryogenic Dark Matter Search experiment, or CDMS, announced they have shifted the border of this search down to a dark-matter particle mass and rate of interaction that has never been probed.

“We’re pushing CDMS to as low mass as we can,” says Fermilab physicist Dan Bauer, the project manager for CDMS. “We’re proving the particle detector technology here.”

Their result, which does not claim any hints of dark matter particles, contradicts a result announced in January by another dark matter experiment, CoGeNT, which uses particle detectors made of germanium, the same material as used by CDMS.

To search for dark matter, CDMS scientists cool their detectors to very low temperatures in order to detect the very small energies deposited by the collisions of dark matter particles with the germanium. They operate their detectors half of a mile underground in a former iron ore mine in northern Minnesota. The mine provides shielding from cosmic rays that could clutter the detector as it waits for passing dark matter particles.

Today’s result carves out interesting new dark matter territory for masses below 6 billion electronvolts. The dark matter experiment Large Underground Xenon, or LUX, recently ruled out a wide range of masses and interaction rates above that with the announcement of its first result in October 2013.

Scientists have expressed an increasing amount of interest of late in the search for low-mass dark matter particles, with CDMS and three other experiments—DAMA, CoGeNT and CRESST—all finding their data compatible with the existence of dark matter particles between 5 billion and 20 billion electronvolts. But such light dark-matter particles are hard to pin down. The lower the mass of the dark-matter particles, the less energy they leave in detectors, and the more likely it is that background noise will drown out any signals.
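
A standard elastic-scattering estimate (my own back-of-envelope sketch, not from the article) makes the low-mass difficulty concrete: the largest recoil energy a dark matter particle of mass m and typical galactic speed v can deposit on a nucleus of mass M is about 2μ²v²/M, with μ the reduced mass, and this shrinks rapidly once the particle is much lighter than the nucleus.

```python
# Back-of-envelope sketch (mine, not from the article): maximum recoil energy
# deposited by a dark matter particle of mass m_dm scattering elastically off
# a nucleus of mass m_nuc at speed v (in units of c) is 2 * mu^2 * v^2 / m_nuc.
GE_NUCLEUS_GEV = 67.6    # germanium nucleus, roughly 72.6 atomic mass units
V_OVER_C = 230e3 / 3e8   # a typical galactic dark-matter speed, ~230 km/s

def max_recoil_kev(m_dm_gev, m_nuc_gev=GE_NUCLEUS_GEV, beta=V_OVER_C):
    mu = m_dm_gev * m_nuc_gev / (m_dm_gev + m_nuc_gev)   # reduced mass
    return 2 * mu**2 * beta**2 / m_nuc_gev * 1e6         # GeV -> keV

print(round(max_recoil_kev(100), 1))   # a 100 GeV particle: tens of keV
print(round(max_recoil_kev(6), 2))     # a 6 GeV particle: well under 1 keV
```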

Even more confounding is the fact that scientists don’t know whether dark matter particles interact in the same way in detectors built with different materials. In addition to germanium, scientists use argon, xenon, silicon and other materials to search for dark matter in more than a dozen experiments around the world.

“It’s important to look in as many materials as possible to try to understand whether dark matter interacts in this more complicated way,” says Adam Anderson, a graduate student at MIT who worked on the latest CDMS analysis as part of his thesis. “Some materials might have very weak interactions. If you only picked one, you might miss it.”

Scientists around the world seem to be taking that advice, building different types of detectors and constantly improving their methods.

“Progress is extremely fast,” Anderson says. “The sensitivity of these experiments is increasing by an order of magnitude every few years.”

Kathryn Jepsen

### Tommaso Dorigo — The dyslectic guy with an erection problem...

Did you know about that dyslectic guy with an impotence problem who once came to Fermilab ? He said he'd been advised to go there as he wanted to get a hadron.

### Matt Strassler — A 100 TeV Proton-Proton Collider?

During the gap between the first run of the Large Hadron Collider [LHC], which ended in 2012 and included the discovery of the Higgs particle (and the exclusion of quite a few other things), and its second run, which starts a year from now, there’s been a lot of talk about the future direction for particle physics. By far the most prominent option, both in China and in Europe, involves the long-term possibility of a (roughly) 100 TeV proton-proton collider — that is, a particle accelerator like the LHC, but with 5 to 15 times more energy per collision.

Do we need such a machine?

The answer is “Yes, Definitely”. Definitely, if human beings are to continue to explore the inner world of the elementary laws of nature with the same level of commitment with which they explore the outer world of our neighboring planets, the nearby stars and their own planets, and distant galaxies far-flung across the universe. If we can send the Curiosity rover to roam around the surface of the Red Planet and beam back pictures and scientific information — if we can send telescopes like Kepler into space whose sole purpose is to look for signs of planets around distant stars — then surely we can build a machine on Earth whose sole purpose is to help us understand the fundamental principles and elementary objects that underlie the natural world. That’s why we built the LHC, and machines before it; and the justification for a 100 TeV machine remains the same.

Definitely, also, if the exploration of the laws of nature is to continue as a healthy research field. We have a large number of experts who know how to build a big particle accelerator. If we were to postpone building such a machine for a generation, we would suffer some of the same problems suffered by the U.S. space program. All sorts of crucial knowledge of the craft of rocket building was lost when the U.S. failed to follow up on its several trips to the Moon. If we have a hiatus of a generation between the current machine and the next, we will find it much more difficult and expensive to build the next one when we finally decide to do it. So it makes sense to maintain continuity, especially if it can be done at reasonable cost.

One thing that’s interesting to keep in mind is that a roughly 100 TeV machine is hardly a stretch for modern technology; it’s not going to be a machine with a significant risk of failure. The Superconducting SuperCollider (SSC), which was to be the U.S. flagship machine and was due to start running in the year 2000 (in which case it would definitely have discovered the Higgs particle many years ago — sadly, the U.S. congress canceled it, after it was well underway, in 1993), would have been a 40 TeV machine. The technological step from 40 TeV to 80 or 120 is not a big one. Moreover, the SSC would have been an easier machine to run than is the LHC, which has to strain with very high collision rates to make up for the fact that its energy per collision is a third of what the SSC would have been capable of. The main challenge for such an accelerator is that it has to be very large — which requires a very long tunnel (over 50 miles/80 km) and a very large number of powerful magnets.
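
To get a rough feel for why the tunnel has to be so long, here is a back-of-envelope estimate (my own numbers, not Strassler's): the beam energy of a circular proton machine is fixed by the dipole field and the bending radius, roughly p ≈ 0.3·B·ρ in GeV, tesla and metres, and only part of the ring can be filled with bending magnets.

```python
import math

# Back-of-envelope sketch (my numbers, not from the post): a proton of momentum
# p [GeV/c] in a dipole field B [T] bends with radius rho [m] ~ p / (0.3 * B).
# Assuming the dipoles fill roughly 65% of the ring (about the LHC's fill
# factor), the tunnel length needed for a given collision energy is:

def tunnel_length_km(collision_energy_tev, dipole_field_tesla, fill_factor=0.65):
    beam_energy_gev = collision_energy_tev * 1e3 / 2   # energy per beam
    rho_m = beam_energy_gev / (0.3 * dipole_field_tesla)
    return 2 * math.pi * rho_m / fill_factor / 1e3

print(round(tunnel_length_km(14, 8.3)))    # ~27 km: roughly reproduces the LHC
print(round(tunnel_length_km(100, 16)))    # ~100 km with 16 T magnets
print(round(tunnel_length_km(100, 20)))    # ~80 km if 20 T magnets become available
```

Either way one lands in the 80 km or longer range quoted above.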

It’s no wonder the Chinese are interested in potentially building this machine. With an economy growing rapidly enough to catch up with the other great nations of the world in the next decade or two, and with scientific prowess rapidly increasing (see here and here), some in China rightly see a 100 TeV proton-proton collider both as an opportunity to gain all sorts of technical and technological knowledge that they have previously lacked, and to establish themselves among the few nations that can be viewed as scientific superpowers. Yet it will not require them to go far out on a limb with technology that no one has ever attempted at all, and invent whole new methods that don’t currently exist. Moreover, some of the things that would be expensive or politically complex in the U.S. or Europe will be easier in China. They may be able to pay for and construct this machine themselves, with technical advice and personnel from other countries, but without being dependent on other nations’ political and financial challenges.

In fact, there’s another huge potential benefit along the way, even before the 100 TeV machine is built: a “Higgs factory”. One can potentially use this same tunnel to first build an accelerator that smashes electrons and positrons [i.e. anti-electrons] together, at an energy which isn’t that high, but is sufficient to make Higgs particles at a high rate — not as many Higgs particles as the LHC will produce, but in an environment where precise measurements are much easier to make. [Protons are messy, and all measurements in proton-proton collisions are very difficult due to huge collision rates and large backgrounds; electrons and positrons are simple objects and measurements tend to be much more straightforward.  This comes at a cost: it is harder to get collisions at the highest energies physicists would ideally want.]
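
To put a rough number on "an energy which isn't that high" (my own arithmetic, not the post's): in electron-positron collisions the workhorse Higgs production process is e+ e- → Z H, so a Higgs factory only needs a centre-of-mass energy somewhat above the Z plus Higgs mass threshold.

```python
# Rough numbers (mine, not from the post): the e+ e- -> Z H threshold sets the
# scale of a "Higgs factory", far below the 100 TeV proton machine that might
# later share the same tunnel.
M_Z_GEV = 91.2     # Z boson mass
M_H_GEV = 125.5    # Higgs mass as measured at the LHC (roughly)

threshold_gev = M_Z_GEV + M_H_GEV
print(f"e+e- -> ZH threshold: ~{threshold_gev:.0f} GeV")
print(f"typical running point: ~240-250 GeV, about {100e3 / 240:.0f}x "
      f"below the 100 TeV goal")
```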

The value of a Higgs factory is obvious: a no-brainer. The Higgs particle is our main way of gaining insight into the nature of the all-important Higgs field, and moreover the Higgs particle might also, through its possible rare decays, illuminate a currently veiled world of unknown particles and forces. It’s a research effort whose importance no one can deny, and it serves as a technical stepping stone to a 100 TeV collider, complete with the realistic possibility of Nobel Prize-worthy discoveries in the near term. For China, it’s perfect.

Of course, the Chinese aren’t the only ones interested.  My European colleagues, recognizing a good thing when they see it, and with the advantage that they built and ran the LHC, are also considering building such a machine. [Neither the U.S., which is expertly squandering its scientific leadership in many scientific fields (and pushing many of its best scientists toward the Chinese effort), nor Russia, which is busy starting a disastrous invasion of its neighbor, seem able to make any intelligent decisions at the moment, and surely aren't going to be the leaders in such an effort.] For the moment, the scientists involved are all working together.  Over recent years, any particle physicist worth his or her salt (including me) would spend some time at Europe’s CERN laboratory, which hosts the LHC. And now, many young U.S. experts in theoretical particle physics are planning to spend extended time at China’s “Center for the Future of High Energy Physics”. There was a time when young Chinese geniuses like T.D. Lee, C.N. Yang and C.S. Wu did Nobel Prize-winning (or -deserving) work in the United States. Soon, perhaps, it will be the other way around.

But what, scientifically, is the justification for this machine?

Why build a 100 TeV collider?

It’s important to distinguish two types of scientific enterprises: exploratory and targeted. Exploratory refers to when you’re doing a search, in a plausible place, for anything unexpected — perhaps for something whose existence you might suspect, but perhaps more broadly. Targeted refers to doing a search or study where you know roughly, or even exactly, what you’re looking for.

Often a targeted enterprise is also exploratory; while looking for one thing, you can always stumble on something else. Many scientific discoveries, such as X-rays, have been made while doing or preparing experiments with a completely different purpose. On the other hand, an exploratory enterprise may not have any targets at all, or at best, only a very vague target. Sometimes we go searching just because we can. When Galileo pointed his first telescopes at the moon and the planets and the stars, he had no idea what he would find; he just knew he had a great opportunity to discover something.

The LHC was built as a clearly targeted machine: its main goal was to find the Higgs particle (or particles) if it (or they) existed, or whatever replaced them if they did not. Well, now we know that one Higgs particle exists, and it resembles the simplest possible type of Higgs particle, which is termed a “Standard Model Higgs”.   But much remains to learn.  Is this Higgs particle really Standard Model-like, not just at the 30% level but at the 3% level and better? Are there other Higgs particles?  Are there other as-yet unknown particles being produced at the LHC? Are there new forces beyond the ones we’re aware of? Other than the detailed study of the new Higgs particle, these questions are mostly exploratory. In short, though the LHC was built as a targeted machine with a near-guarantee of success, its mission has now shifted toward exploration of the unknown, with no guarantee of further discoveries. But it’s also important to understand that a lack of discoveries will be just as important to our understanding of nature as discoveries would be, for reasons I’ll return to in my next post.

Now what about the 100 TeV machine? Will it be a targeted experimental facility, or an exploratory one?

For the moment, the answer is: we don’t know. Currently, there is no clear target; more precisely, there are lots of possible targets, but none that we know could emerge to be a major, central one. But this machine won’t be built and completed for a couple of decades, and things could change dramatically by then. If the LHC discovers something not predicted by the Standard Model (the equations used to describe the known elementary particles and forces), then clarifying this new discovery will become a major target, and possibly the main target, of the 100 TeV machine.

This highlights one of the challenges with large experimental projects. One has to start thinking about them far in advance, long before it’s entirely clear what their precise use will be. When the SSC and the LHC were first proposed, they did have a proposed target — finding the Higgs particle or particles. But if the recently discovered Higgs particle’s mass had been, say, half of what it actually is, it would have been discovered some years before the SSC or LHC were completed… in which case, the target of the SSC and LHC would have significantly shifted. So we have to start considering, proposing, and perhaps even building the 100 TeV machine before it’s completely clear whether it will have a prominent and definite target, or whether it will be mainly exploratory. That ambiguity is something we just have to live with.

In contrast to the 100 TeV machine, which currently has to be viewed as exploratory, the Higgs factory that would precede it in the same tunnel is much more sharply targeted… targeted at detailed study of the Higgs particle. There are some other targeted and exploratory activities that it can be involved in, including more detailed investigation of the Z particle, W particle and top quark, but its main focus is the Higgs.

However, even if no prominent target for the 100 TeV collider shows up before it is built, its justification as an exploratory machine is clear. In quantum field theory, collisions at higher energy and momentum allow you to probe physics at shorter times and distances — for “particles” are really quanta, i.e., ripples in quantum fields, and a higher-energy quantum has a shorter wavelength and a faster frequency. And we’ve learned time and time again that one way (though not the only one) to discover new things about the world is to examine its behavior on shorter times and shorter distances than we’ve previously been capable of. This enterprise has been going on for generations; first microscopes discovered bacteria and other cells; then these were found, with more powerful experiments, to be made of molecules, in turn made from atoms; yet more powerful experiments showed first that the atoms contain electrons and atomic nuclei, then that the nuclei are made from protons and neutrons, and then that these in turn are made from quarks and gluons. All of this has been discovered by probing the world with ever more powerful particle collisions of one form or another. So building a higher energy accelerator is to take another step along a well-trodden path.
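
To put rough numbers on this energy-distance relation (my own arithmetic, not the post's): the shortest length scale a collision of energy E can resolve is of order ħc/E, and in practice only a fraction of the proton-proton energy enters any single parton collision, so these are optimistic bounds.

```python
# Crude illustration (my own arithmetic, not from the post): the shortest
# distance scale a collision of energy E can resolve is of order hbar*c / E.
HBAR_C_GEV_FM = 0.1973  # hbar * c in GeV * femtometre

def probed_distance_m(energy_tev):
    return HBAR_C_GEV_FM / (energy_tev * 1e3) * 1e-15  # femtometres -> metres

print(probed_distance_m(14))    # LHC design energy: ~1.4e-20 m
print(probed_distance_m(100))   # 100 TeV machine:   ~2e-21 m
```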

However, it’s not the only path, nor has it ever been.

Is this the most promising path to explore?

The LHC is still in its adolescence, and we can’t predict its future discoveries. At this point the LHC experiments have collected a few percent of the data they’ll collect over the next decade, and they have done so with proton-proton collisions whose energy is only about 60% of what we expect to see in the next few years. Moreover, even the existing data set, collected in 2011-2012, hasn’t been fully analyzed; this data could still yield discoveries (but only if the experimenters choose to make the relevant measurements.) So we certainly can’t know yet whether the LHC will produce a new target for the 100 TeV machine. If it does, then it will be much clearer what to do next and how to use the 100 TeV machine. If it doesn’t… well, that’s something that deserves a bit more discussion.

Suppose that, after the LHC’s last run, nothing other than the Higgs particle’s been found, with properties that are consistent, to a few percent, with a Standard Model Higgs. While this sounds dull at first glance, it’s actually among the most radical possible outcomes of the LHC. That’s because of the “naturalness puzzle”, which I discussed in some detail in this article. Never before in nature, in any generic context, have we come across a low-mass spin-zero particle (i.e. something like the Higgs particle) without other particles associated with it.  In this sense, the Standard Model is an extraordinarily non-generic theory, at least from our current point of view and understanding.  It will be quite shocking if it completely describes all LHC data.

But maybe it does.  If it does, what does this potentially imply about nature?  And what would be the implications for our future explorations of nature at its most elementary level? I’ll address this issue in my next post.


### Quantum Diaries — Dear Google: Hire us!

In case you haven’t figured it out already from reading the US LHC blog or any of the others at Quantum Diaries, people who do research in particle physics feel passionate about their work. There is so much to be passionate about! There are challenging intellectual issues, tricky technical problems, and cutting-edge instrumentation to work with — all in pursuit of understanding the nature of the universe at its most fundamental level. Your work can lead to global attention and support Nobel Prizes. It’s a lot of effort put in over long days and nights, but there is also a lot of satisfaction to be gained from our accomplishments.

That being said, a fundamental truth about our field is that not everyone doing particle-physics research will be doing that for their entire career. There are fewer permanent jobs in the field than there are people who are qualified to hold them. It is certainly easy to do the math about university jobs in particular — each professor may supervise a large number of PhD students in his or her career, but only one could possibly inherit that job position in the end. Most of our researchers will end up working in other fields, quite likely in the for-profit sector, and as a field we do need to make sure that they are well-prepared for jobs in that part of the world.

I’ve always believed that we do a good job of this, but my belief was reinforced by a recent column by Tom Friedman in The New York Times. It was based around an interview with the Google staff member who oversees hiring for the company. The essay describes the attributes that Google looks for in new employees, and I couldn’t help but think that people who work in the large experimental particle physics projects such as those at the LHC have all of those attributes. Google is not just looking for technical skills — it goes without saying that they are, and that particle physicists have those skills and great experience with digesting large amounts of computerized data. Google is also looking for social and personality traits that are likewise important for success in particle physics.

(Side note: I don’t support all of what Friedman writes in his essay; he is somewhat dismissive of the utility of a college education, and as a university professor I think that we are doing better than he suggests. But I will focus on some of his other points here. I also recognize that it is perhaps too easy for me to write about careers outside the field when I personally hold a permanent job in particle physics, but believe me that it just as easily could have wound up differently for me.)

For example, just reading from the Friedman column, one thing Google looks for is what is referred to as “emergent leadership”. This is not leadership in the form of holding a position with a particular title, but seeing when a group needs you to step forward to lead on something when the time is right, but also to step back and let someone else lead when needed. While the big particle-physics collaborations appear to be massive organizations, much of the day to day work, such as the development of a physics measurement, is done in smaller groups that function very organically. When they function well, people do step up to take on the most critical tasks, especially when they see that they are particularly positioned to do them. Everyone figures out how to interact in such a way that the job gets done. Another facet of this is ownership: everyone who is working together on a project feels personally responsible for it and will do what is right for the group, if not the entire experiment — even if it means putting aside your own ideas and efforts when someone else clearly has the better thing.

And related to that in turn is what is referred to in the column as “intellectual humility.” We are all very aggressive in making our arguments based on the facts that we have in hand. We look at the data and we draw conclusions, and we develop and promote research techniques that appear to be effective. But when presented with new information that demonstrates that the previous arguments are invalid, we happily drop what we had been pursuing and move on to the next thing. That’s how all of science works, really; all of your theories are only as good as the evidence that supports them, and are worthless in the face of contradictory evidence. Google wants people who take this kind of approach to their work.

I don’t think you have to be Google to be looking for the same qualities in your co-workers. If you are an employer who wants to have staff members who are smart, technically skilled, passionate about what they do, able to incorporate disparate pieces of information and generate new ideas, ready to take charge when they need to, feel responsible for the entire enterprise, and able to say they are wrong when they are wrong — you should be hiring particle physicists.

### Tommaso Dorigo — The Plot Of The Week - Dark Matter Candidates In Super-CDMS

Two days ago the Super-CDMS dark-matter search released the results of the analysis of nine months of data taking. The experiment has excellent sensitivity to weakly interacting massive particles producing inelastic scattering off the germanium in the detector.

The detector is composed of fifteen cylindrical 0.6 kg crystals stacked in groups of three, equipped with ionization and phonon detectors that are capable of measuring the energy of the signals. From that the recoil energy can be derived, along with a rough estimate of the mass of WIMP candidates. The towers are kept close to absolute zero temperature in the Soudan mine, where backgrounds from cosmic rays and other sources are very small.

### n-Category Café — Should Mathematicians Cooperate with GCHQ?

I’ve just submitted a piece for the new Opinions section of the monthly LMS Newsletter: Should mathematicians cooperate with GCHQ? The LMS is the London Mathematical Society, which is the UK’s national mathematical society. My piece should appear in the April edition of the newsletter, and you can read it below.

Here’s the story. Since November, I’ve been corresponding with people at the LMS, trying to find out what connections there are between it and GCHQ. Getting the answer took nearly three months and a fair bit of pushing. In the process, I made some criticisms of the LMS’s total silence over the GCHQ/NSA scandal:

GCHQ is a major employer of mathematicians in the UK. The NSA is said to be the largest employer of mathematicians in the world. If there had been a major scandal at the heart of the largest publishing houses in the world, unfolding constantly over the last eight months, wouldn’t you expect it to feature prominently in every issue of the Society of Publishers’ newsletter?

To its credit, the LMS responded by inviting me to write an inaugural piece for a new Opinions section of the newsletter. Here it is.

Should mathematicians cooperate with GCHQ?

Tom Leinster

One of the UK’s largest employers of mathematicians has been embroiled in a major international scandal for the last nine months, stands accused of law-breaking on an industrial scale, and is now the object of widespread outrage. How has the mathematical community responded? Largely by ignoring it.

GCHQ and its partners have been systematically monitoring as much of our lives as they possibly can, including our emails, phone calls, text messages, bank transactions, web browsing, Skype calls, and physical location. The goal: “collect it all”. They tap internet trunk cables, bug charities and political leaders, disrupt lawful activist groups, and conduct economic espionage, all under the banner of national security.

Perhaps most pertinently to mathematicians, the NSA (GCHQ’s major partner and partial funder) has deliberately undermined internet encryption, inserting a secret back door into a standard elliptic curve algorithm. This can be exploited by anyone sufficiently skilled and malicious — not only the NSA/GCHQ. (See Thomas Hales’s piece in February’s Notices of the AMS.) We may never know what else mathematicians have been complicit in; GCHQ’s policy is not to comment on intelligence matters, which is to say, anything it does.

Indifference to mass surveillance rests partly on misconceptions such as “it’s only metadata”. This is certainly false; for instance, GCHQ has used webcams to collect images, many sexually intimate, of millions of ordinary citizens. It is also misguided, even according to the NSA’s former legal counsel: “metadata absolutely tells you everything about somebody’s life”.

Some claim to be unbothered by the recording of their daily activities, confident that no one will examine their records. They may be right. But even if you feel that way, do you want the secret services to possess such a powerful tool for chilling dissent, activism, and even journalism? Do you trust an organization operating in secret, and subject to only “light oversight” (a GCHQ lawyer’s words), never to abuse that power?

Mathematicians seldom have to face ethical questions. But now we must decide: cooperate with GCHQ or not? It has been suggested that mathematicians today are in the same position as nuclear physicists in the 1940s. However, the physicists knew they were building a bomb, whereas mathematicians working for GCHQ may have little idea how their work will be used. Colleagues who have helped GCHQ in the past, trusting that they were contributing to legitimate national security, may justifiably feel betrayed.

At a bare minimum, we as a community should talk about it. Sasha Beilinson has proposed that working for the NSA/GCHQ should be made “socially unacceptable”. Not everyone will agree, but it reminds us that we have both individual choice and collective power. Individuals can withdraw their labour from GCHQ. Heads of department can refuse staff leave to work for GCHQ. The LMS can refuse GCHQ’s money.

At a bare minimum, let us acknowledge that the choices are ours to make. We are human beings before we are mathematicians, and if we do not like what GCHQ is doing, we do not have to cooperate.

I had a 500-word limit, so I omitted a lot. Here are the facts on the LMS’s links with GCHQ, as stated to me by the LMS President Terry Lyons:

The Society has an indirect relationship with GCHQ via a funding agreement with the Heilbronn Institute, in which the Institute will give up to £20,000 per year to the Society. This is approximately 0.7% of our total income. This is a recently made agreement and the funding will contribute directly to the LMS-CMI Research Schools, providing valuable intensive training for early career mathematicians. GCHQ is not involved in the choice of topics covered by the Research Schools.

So, GCHQ’s financial support for the LMS is small enough that declining it would not make a major financial impact.

I hope the LMS will make a public statement clarifying its relationship with GCHQ. I see no argument against transparency.

Another significant factor (which Lyons alludes to above and is already a matter of public record) is that GCHQ is a funder of the Heilbronn Institute, which is a collaboration between GCHQ and the University of Bristol. I don’t know that the LMS is involved with Heilbronn beyond what’s mentioned above, but Heilbronn does seem to provide an important channel through which (some!) British mathematicians support the secret services.

Finally, I want to make clear that although I think there are some problems with the LMS as an institution, I don’t blame the people running it, many of whom are taking time out of extremely busy schedules for the most altruistic reasons. As I wrote to one of them:

I’m genuinely in awe of the amount that you […] give to the mathematical community, both in terms of your selflessness and your energy. I don’t know how you do it. Anything critical I have to say is said with that admiration as the backdrop, and I hope I’d never say anything of the form “do more!”, because to ask that would be ridiculous.

Rules for commenting here: I’ve now written several posts on this and related subjects (1, 2, 3, 4). Every time, I’ve deleted some off-topic comments — including some I’ve enjoyed and agreed with heartily. Please keep comments on-topic. In case there’s any doubt, the topic is the relationship between mathematicians and the secret services. Comments that stray too far from this will be deleted.

### Terence Tao — Conserved quantities for the Euler equations

The Euler equations for incompressible inviscid fluids may be written as

$\displaystyle \partial_t u + (u \cdot \nabla) u = -\nabla p$

$\displaystyle \nabla \cdot u = 0$

where ${u: [0,T] \times {\bf R}^n \rightarrow {\bf R}^n}$ is the velocity field, and ${p: [0,T] \times {\bf R}^n \rightarrow {\bf R}}$ is the pressure field. To avoid technicalities we will assume that both fields are smooth, and that ${u}$ is bounded. We will take the dimension ${n}$ to be at least two, with the three-dimensional case ${n=3}$ being of course especially interesting.

The Euler equations are the inviscid limit of the Navier-Stokes equations; as discussed in my previous post, one potential route to establishing finite time blowup for the latter equations when ${n=3}$ is to be able to construct “computers” solving the Euler equations, which generate smaller replicas of themselves in a noise-tolerant manner (as the viscosity term in the Navier-Stokes equation is to be viewed as perturbative noise).

Perhaps the most prominent obstacles to this route are the conservation laws for the Euler equations, which limit the types of final states that a putative computer could reach from a given initial state. Most famously, we have the conservation of energy

$\displaystyle \int_{{\bf R}^n} |u|^2\ dx \ \ \ \ \ (1)$

(assuming sufficient decay of the velocity field at infinity); thus for instance it would not be possible for a computer to generate a replica of itself which had greater total energy than the initial computer. This by itself is not a fatal obstruction (in this paper of mine, I constructed such a “computer” for an averaged Euler equation that still obeyed energy conservation). However, there are other conservation laws also, for instance in three dimensions one also has conservation of helicity

$\displaystyle \int_{{\bf R}^3} u \cdot (\nabla \times u)\ dx \ \ \ \ \ (2)$

and (formally, at least) one has conservation of momentum

$\displaystyle \int_{{\bf R}^3} u\ dx$

and angular momentum

$\displaystyle \int_{{\bf R}^3} x \times u\ dx$

(although, as we shall discuss below, due to the slow decay of ${u}$ at infinity, these integrals have to either be interpreted in a principal value sense, or else replaced with their vorticity-based formulations, namely impulse and moment of impulse). Total vorticity

$\displaystyle \int_{{\bf R}^3} \nabla \times u\ dx$

is also conserved, although it turns out in three dimensions that this quantity vanishes when one assumes sufficient decay at infinity. Then there are the pointwise conservation laws: the vorticity and the volume form are both transported by the fluid flow, while the velocity field (when viewed as a covector) is transported up to a gradient; among other things, this gives the transport of vortex lines as well as Kelvin’s circulation theorem, and can also be used to deduce the helicity conservation law mentioned above. In my opinion, none of these laws actually prohibits a self-replicating computer from existing within the laws of ideal fluid flow, but they do significantly complicate the task of actually designing such a computer, or of the basic “gates” that such a computer would consist of.

Below the fold I would like to record and derive all the conservation laws mentioned above, which to my knowledge essentially form the complete set of known conserved quantities for the Euler equations. The material here (although not the notation) is drawn from this text of Majda and Bertozzi.

For reasons which may become clearer later, I will rewrite the Euler equations in the language of Riemannian geometry (in particular, using the abstract index notation of Penrose), and using the Euclidean metric ${\eta_{ij}}$ on ${{\bf R}^n}$ to raise and lower indices, and to define the covariant derivative ${\nabla_i}$ through the Levi-Civita connection (which, in Cartesian coordinates, is just the usual partial derivative ${\partial_i}$ evaluated componentwise). The velocity field ${u}$ is now written as ${u^i}$; contracting against the metric ${\eta}$ gives a ${1}$-form ${u_i := \eta_{ij} u^j}$, which I will call the covelocity, and also write as ${u^*}$. The Euler equations then become

$\displaystyle \partial_t u^i + u^j \nabla_j u^i = - \nabla^i p$

$\displaystyle \nabla_i u^i = 0.$

In particular we have

$\displaystyle \partial_t |u|^2 = 2 u_i \partial_t u^i$

$\displaystyle = - 2 u_i u^j \nabla_j u^i - 2 u_i \nabla^i p$

$\displaystyle = - \nabla_j (u^j u_i u^i) - 2 \nabla^i (u_i p)$

which leads to the conservation of energy (1) upon integrating in space.
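
As a quick symbolic sanity check of the algebra above (a sketch of mine, not part of the original post), one can verify with a computer algebra system that for a divergence-free field the quadratic terms really do combine into total derivatives; taking ${u = \nabla \times A}$ for an arbitrary potential ${A}$ makes ${u}$ divergence-free automatically.

```python
# Symbolic check (my sketch, not from the post) of the pointwise identity used
# above: for divergence-free u and any p,
#   u_i u^j d_j u^i + u_i d^i p  =  (1/2) d_j(u^j |u|^2) + d^i(u_i p),
# which is what turns d_t |u|^2 into a total derivative.
import sympy as sp

x = sp.symbols('x1 x2 x3')
A = [sp.Function(f'A{k}')(*x) for k in range(3)]
p = sp.Function('p')(*x)
d = lambda f, i: sp.diff(f, x[i])

# u = curl(A), hence automatically divergence-free
u = [d(A[2], 1) - d(A[1], 2),
     d(A[0], 2) - d(A[2], 0),
     d(A[1], 0) - d(A[0], 1)]
u2 = sum(ui**2 for ui in u)

lhs = sum(u[i] * u[j] * d(u[i], j) for i in range(3) for j in range(3)) \
      + sum(u[i] * d(p, i) for i in range(3))
rhs = sum(d(u[j] * u2, j) for j in range(3)) / 2 \
      + sum(d(u[i] * p, i) for i in range(3))
print(sp.simplify(lhs - rhs))   # 0, confirming the identity
```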

In the usual treatment of the Euler equations, it is common to introduce the material derivative

$\displaystyle D_t = \partial_t + u \cdot \nabla.$

Here, we shall adopt the subtly different (but closely related) approach of using the material Lie derivative

$\displaystyle {\cal D}_t := \partial_t + {\cal L}_u$

where ${{\cal L}_u}$ is the Lie derivative along the vector field ${u}$. For scalar fields ${f}$, the material Lie derivative is the same as the material derivative:

$\displaystyle {\cal D}_t f = D_t f = \partial_t f + u^i \nabla_i f.$

However, the two notions differ when applied to vector fields or forms, with the material Lie derivative having better covariance properties than the material derivative. When applied to vector fields ${X^i}$, we have

$\displaystyle {\cal L}_u X^i = u^j \nabla_j X^i - X^j \nabla_j u^i$

and so

$\displaystyle {\cal D}_t X^i = D_t X^i - X^j \nabla_j u^i.$

Similarly, for ${1}$-forms ${\lambda_i}$, we have

$\displaystyle {\cal L}_u \lambda_i = u^j \nabla_j \lambda_i + \lambda_j \nabla_i u^j,$

and similarly for ${2}$-forms ${\omega_{ij}}$ we have

$\displaystyle {\cal L}_u \omega_{ij} = u^k \nabla_k \omega_{ij} + \omega_{kj} \nabla_i u^k + \omega_{ik} \nabla_j u^k \ \ \ \ \ (3)$

leading to similar formulae comparing ${{\cal D}_t}$ and ${D_t}$ for forms.

Since ${{\cal L}_u u = 0}$, the material Lie derivative of the velocity field ${u^i}$ is just the time derivative:

$\displaystyle {\cal D}_t u^i = \partial_t u^i = - u^j \nabla_j u^i - \nabla^i p.$

The material Lie derivative of the covelocity field ${u_i}$ is however more interesting:

$\displaystyle {\cal D}_t u_i = \partial_t u_i + u^j \nabla_j u_i + u_j \nabla_i u^j$

$\displaystyle = - \nabla_i p + \nabla_i( \frac{1}{2} |u|^2 )$

$\displaystyle = \nabla_i( - p + \frac{1}{2} |u|^2 ).$

Expressed in terms of the covelocity ${1}$-form ${u^*}$, this becomes

$\displaystyle {\cal D}_t u^* = d( -p + \frac{1}{2} |u|^2 ). \ \ \ \ \ (4)$

Since the integral of a gradient along any closed loop is zero, we obtain

Theorem 1 (Kelvin’s circulation theorem) Let ${\gamma: [0,T] \times S^1 \rightarrow {\bf R}^n}$ be a time-dependent loop in ${{\bf R}^n}$ which is transported by the flow (thus ${\partial_t F(t,\gamma(t,s)) = {\cal D}_t F( t, \gamma(t,s))}$ for any scalar function ${F: [0,T] \times {\bf R}^n \rightarrow {\bf R}}$). Then

$\displaystyle \partial_t \int_\gamma u^* = 0.$

Now we take the exterior derivative of the covelocity ${u^*}$ to obtain the vorticity

$\displaystyle \omega := d u^*.$

In abstract index notation, ${\omega}$ is the ${2}$-form

$\displaystyle \omega_{ij} = \nabla_i u_j - \nabla_j u_i.$

As exterior derivatives commute with diffeomorphisms, they also commute with Lie derivatives, so in particular

$\displaystyle {\cal D}_t \omega = d( {\cal D}_t u^* ).$

Since the right-hand side of (4) is exact and ${d^2 = 0}$, we conclude that the vorticity is transported by the flow:

$\displaystyle {\cal D}_t \omega = 0. \ \ \ \ \ (5)$

(This fact was also interpreted as conservation of exterior momentum in this previous blog post.) This fact also follows from Kelvin’s circulation theorem, after first applying Stokes’ theorem to rewrite ${\int_\gamma u^*}$ as ${\int_S \omega}$ for a spanning surface ${S}$ that is transported by the flow.

If we let ${\hbox{vol}}$ be the usual volume ${n}$-form on ${{\bf R}^n}$, then the divergence-free nature of ${u}$ (and the time-independent nature of ${\hbox{vol}}$) implies that ${\hbox{vol}}$ is also transported by the flow:

$\displaystyle {\cal D}_t \hbox{vol} = 0. \ \ \ \ \ (6)$

If we thus define the polar vorticity ${*\omega}$ to be the ${n-2}$-vector that is the Hodge star of ${\omega}$ with respect to this volume form, thus

$\displaystyle *\omega( v ) \hbox{vol} = \omega \wedge v$

for all ${n-2}$-forms ${v}$, then we see from (5), (6) that the polar vorticity is also transported by the flow:

$\displaystyle {\cal D}_t *\omega = 0. \ \ \ \ \ (7)$

In two dimensions ${n=2}$, the polar vorticity ${*\omega}$ is a scalar field (the usual scalar vorticity, which by abuse of notation we also write as ${\omega}$), and since the material Lie derivative coincides with the material derivative on scalars, (7) becomes the transport equation

$\displaystyle D_t \omega = 0.$

In three dimensions ${n=3}$, ${*\omega = \omega^i}$ is a vector field which by abuse of notation is also denoted ${\omega}$ (in coordinates, ${\omega = \nabla \times u}$), and (7) becomes the well-known vorticity equation:

$\displaystyle D_t \omega^i = \omega^j \nabla_j u^i.$

From (7) we also see that the vortex lines are transported by the flow; in fact we have the stronger statement that if ${\gamma: [0,T] \times {\bf R} \rightarrow {\bf R}^3}$ is transported by the flow and obeys

$\displaystyle \partial_s \gamma^i(t,s) = \omega^i( \gamma(t,s) )$

at the initial time ${t=0}$, then it continues to do so at all later times ${t}$.

In three dimensions, we may contract the polar vorticity ${\omega^i}$ against the covelocity ${u_i}$ to obtain a scalar ${u \cdot \omega}$. We may then combine (7) and (4) to obtain

$\displaystyle {\cal D}_t (u \cdot \omega) = \omega^i \partial_i( -p + \frac{1}{2} |u|^2 ) = {\cal L}_{*\omega}( -p + \frac{1}{2} |u|^2 ).$

Now the exterior derivative of ${\omega = du^*}$ vanishes, so that ${*\omega}$ is divergence-free, and so ${{\cal L}_{*\omega}}$ annihilates ${\hbox{vol}}$. We therefore conclude conservation of helicity (2). In fact we conclude the stronger statement that if ${\Omega}$ is any time-dependent region in ${{\bf R}^3}$ which is preserved by ${\omega}$ (i.e. it is the union of vortex lines) and is transported by the flow, then ${\int_\Omega (u \cdot \omega)\ \hbox{vol}}$ is conserved in time. This is consistent with Kelvin’s circulation theorem, since one can use Fubini’s theorem to compute the integral ${\int_\Omega (u \cdot \omega)\ \hbox{vol}}$ by first computing the integral of ${u^*}$ on each of the vortex lines in ${\Omega}$, and then integrating against ${\omega}$ on the space of vortex lines in ${\Omega}$ (which is a two-dimensional space on which ${\omega}$ naturally descends to become an area form). All of these quantities are transported by the flow.

Finally, we consider the conservation of various moments of the velocity and vorticity. Here it is best to return to material derivatives ${D_t}$ instead of material Lie derivatives ${{\cal D}_t}$, basically because the flow along ${u}$ does not preserve the Euclidean metric ${\eta_{ij}}$ or the flat connection ${\nabla}$, making the interchange of Lie derivatives with the integration of vector-valued quantities a little tricky.

Because we will be considering linear integrals of ${u}$ or ${\omega}$ rather than quadratic integrals, there can be some difficulty in ensuring absolute integrability of the integrals used; for instance, in three dimensions the Biot-Savart law ${u = \nabla \times \Delta^{-1} \omega}$ suggests that ${u}$ could decay as slowly as ${1/|x|^2}$, even if the vorticity is compactly supported. However, the vorticity transport equation (7) tells us (in any dimension) that if the vorticity is compactly supported at time zero, then it remains compactly supported at later times (with the support being transported by the flow). In practice, this means that we will be able to justify operations such as integration by parts if there is at least one factor of the vorticity present.

We begin with the total vorticity

$\displaystyle \int_{{\bf R}^n} \omega_{ij}\ d\hbox{vol},$

which is well-defined as a ${2}$-form thanks to the flat connection. Formally, if we write ${\omega = du^*}$ and integrate by parts, this total vorticity should vanish; however if ${u}$ has slow decay then this is not necessarily the case. For instance, if ${u}$ is a smooth mollification of the 2D Biot-Savart kernel ${\frac{1}{2\pi} (-\frac{x_2}{|x|^2},\frac{x_1}{|x|^2})}$ then the total vorticity is one (times the standard ${2}$-form). In three dimensions, though, there is a trick that allows one to establish vanishing of the total polar vorticity

$\displaystyle \int_{{\bf R}^3} \omega^i\ d\hbox{vol}$

and hence also the total vorticity. Namely, if ${x^i}$ is the scaling vector field ${x \cdot \nabla}$, then

$\displaystyle \omega^i = \omega^j \nabla_j x^i$

$\displaystyle = \nabla_j (\omega^j x^i )$

and integration by parts (now involving the compactly supported vorticity ${\omega^j}$) gives the required vanishing.
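
To illustrate the two-dimensional example mentioned above (a small check of mine, not part of the post): the 2D Biot-Savart kernel has zero curl away from the origin, yet unit circulation around the unit circle, which is why any smooth mollification of it carries total vorticity one.

```python
# Quick symbolic check (mine, not from the post) of the 2D example above:
# u = (-x2, x1) / (2*pi*|x|^2) is curl-free away from the origin, but its
# circulation around the unit circle equals one.
import sympy as sp

x1, x2, t = sp.symbols('x1 x2 t', real=True)
r2 = x1**2 + x2**2
u = sp.Matrix([-x2 / (2 * sp.pi * r2), x1 / (2 * sp.pi * r2)])

# curl vanishes away from the origin...
print(sp.simplify(sp.diff(u[1], x1) - sp.diff(u[0], x2)))        # 0

# ...yet the circulation around the unit circle is one
gamma = sp.Matrix([sp.cos(t), sp.sin(t)])
integrand = u.subs({x1: gamma[0], x2: gamma[1]}).dot(sp.diff(gamma, t))
print(sp.integrate(sp.simplify(integrand), (t, 0, 2 * sp.pi)))   # 1
```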

In any dimension, though, the total vorticity (and hence also total polar vorticity) is conserved. Indeed, from (5) and (3) we have

$\displaystyle D_t \omega_{ij} = - \omega_{kj} \nabla_i u^k - \omega_{ik} \nabla_j u^k$

$\displaystyle = -\nabla_i(\omega_{kj} u^k) - \nabla_j( \omega_{ik} u^k ) + u^k (\nabla_i \omega_{kj} + \nabla_j \omega_{ik})$

$\displaystyle = -\nabla_i(\omega_{kj} u^k) - \nabla_j( \omega_{ik} u^k ) - u^k \nabla_k \omega_{ji}$

$\displaystyle = -\nabla_i(\omega_{kj} u^k) - \nabla_j( \omega_{ik} u^k ) - \nabla_k (u^k \omega_{ji})$

where we have used the vanishing ${d\omega=0}$ of the exterior derivative of vorticity, as well as the divergence-free nature of ${u}$. This expresses ${D_t \omega}$ as a total derivative

$\displaystyle D_t \omega_{ij} = -\nabla_i(\omega_{kj} u^k) - \nabla_j( \omega_{ik} u^k ) - \nabla_k (u^k \omega_{ji}), \ \ \ \ \ (8)$

giving conservation of total vorticity.

Now we look at total velocity

$\displaystyle \int_{{\bf R}^n} u^i\ d\hbox{vol},$

which (up to a scaling factor representing the density of the incompressible fluid) has the physical interpretation as the total momentum of the fluid. We have

$\displaystyle D_t u^i = - \nabla^i p$

which formally suggests that total velocity is conserved. However, in practice ${u}$ usually decays too slowly to justify this calculation, unless one works in a suitable principal value sense. We shall take a different tack, noting that

$\displaystyle x^i \omega_{ij} = x^i \partial_i u_j - x^i \partial_j u_i$

$\displaystyle = -(n-1) u_j + \partial_i( x^i u_j ) - \partial_j( x^i u_i ).$

Thus, when ${u}$ has enough decay, one has

$\displaystyle \int_{{\bf R}^n} u_j\ d\hbox{vol} = \frac{-1}{n-1} \int_{{\bf R}^n} x^i \omega_{ij}\ d\hbox{vol}.$

The right-hand side makes sense whenever the vorticity is compactly supported (or has sufficient decay), and (up to the normalising constant) is known as the impulse; in three dimensions, this would be ${\frac{1}{2} \int_{{\bf R}^3} x \times \omega}$. The above considerations suggest that the impulse should be another conserved quantity, and indeed it is. To see this, we first compute using (8):

$\displaystyle D_t( x^i \omega_{ij} ) = (D_t x^i) \omega_{ij} + x^i D_t \omega_{ij}$

$\displaystyle = u^i \omega_{ij} - x^i \nabla_i( \omega_{kj} u^k ) - x^i \nabla_j( \omega_{ik} u^k ) - x^i \nabla_k (u^k \omega_{ji})$

$\displaystyle = u^i \omega_{ij} -\nabla_i( x^i \omega_{kj} u^k ) - \nabla_j ( x^i \omega_{ik} u^k ) - \nabla_k ( x^i u^k \omega_{ji})$

$\displaystyle + n \omega_{kj} u^k + \omega_{jk} u^k + u^i \omega_{ji},$

and so it will suffice to show that ${u^i \omega_{ij}}$ is also a total derivative. But it is:

$\displaystyle u^i \omega_{ij} = u^i (\nabla_i u_j - \nabla_j u_i)$

$\displaystyle = \nabla_i (u^i u_j) - \nabla_j(\frac{1}{2} u^i u_i ). \ \ \ \ \ (9)$
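Identities like the one relating ${u_j}$ to ${x^i \omega_{ij}}$, and the total-derivative identity (9), are elementary calculus exercises; the following minimal symbolic sketch (not part of the original argument) checks both in two dimensions for a divergence-free field built from an arbitrary stream function:

```python
import sympy as sp

# Check, for u = (-d_2 psi, d_1 psi) with psi arbitrary (so div u = 0):
#   x^i omega_{ij} = -(n-1) u_j + d_i(x^i u_j) - d_j(x^i u_i)
#   u^i omega_{ij} = d_i(u^i u_j) - d_j(u^i u_i / 2)          (identity (9))
x1, x2 = sp.symbols('x1 x2')
psi = sp.Function('psi')(x1, x2)
u = [-sp.diff(psi, x2), sp.diff(psi, x1)]      # divergence-free by construction
x = [x1, x2]
n = 2
omega = lambda i, j: sp.diff(u[j], x[i]) - sp.diff(u[i], x[j])   # omega_{ij}

for j in range(n):
    lhs1 = sum(x[i] * omega(i, j) for i in range(n))
    rhs1 = (-(n - 1) * u[j]
            + sum(sp.diff(x[i] * u[j], x[i]) for i in range(n))
            - sp.diff(sum(x[i] * u[i] for i in range(n)), x[j]))
    lhs2 = sum(u[i] * omega(i, j) for i in range(n))
    rhs2 = (sum(sp.diff(u[i] * u[j], x[i]) for i in range(n))
            - sp.diff(sum(u[i] * u[i] for i in range(n)) / 2, x[j]))
    print(sp.simplify(lhs1 - rhs1), sp.simplify(lhs2 - rhs2))   # 0 0 for each j
```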

Finally, we look at the total angular momentum

$\displaystyle \int_{{\bf R}^n} u_j x_k - u_k x_j\ d\hbox{vol}.$

Again, we have

$\displaystyle D_t( u_j x_k - u_k x_j ) = (D_t u_j) x_k - (D_t u_k) x_j + u_j D_t x_k - u_k D_t x_j$

$\displaystyle = - (\nabla_j p) x_k + (\nabla_k p) x_j + u_j u_k - u_k u_j$

$\displaystyle = - \nabla_j(px_k) + \nabla_k(px_j)$

which as before formally suggests that total angular momentum should be conserved. As with total momentum, in practice the velocity field ${u}$ decays too slowly to justify this calculation, unless one works carefully with principal value integrals (and uses quite precise asymptotics on the decay of ${u}$ at infinity). Once again, one can avoid these technicalities by recasting this quantity in terms of vorticity. Using ${(a_{jk})_{[j,k]} := a_{jk}-a_{kj}}$ to denote antisymmetrisation in the ${jk}$ indices, we observe that

$\displaystyle x^i \omega_{ij} x_k - x^i \omega_{ik} x_j = (x^i \omega_{ij} x_k)_{[j,k]}$

$\displaystyle = ( x^i x_k \partial_i u_j - x^i x_k \partial_j u_i )_{[j,k]}$

$\displaystyle = ( \partial_i(x^i x_k u_j) - \partial_j (x^i x_k u_i) - n x_k u_j + x^i u_i \eta_{jk} )_{[j,k]}$

$\displaystyle = ( \partial_i(x^i x_k u_j) - \partial_j (x^i x_k u_i) )_{[j,k]} - n(u_j x_k - u_k x_j)$

and so we have

$\displaystyle \int_{{\bf R}^n} u_j x_k - u_k x_j\ d\hbox{vol} = \frac{-1}{n} \int_{{\bf R}^n} x^i \omega_{ij} x_k - x^i \omega_{ik} x_j\ d\hbox{vol}$

when there is sufficient decay of the velocity field. Again, the right-hand side makes sense whenever the vorticity is compactly supported. If we then define the moment of impulse

$\displaystyle \frac{-1}{n} \int_{{\bf R}^n} x^i \omega_{ij} x_k - x^i \omega_{ik} x_j\ d\hbox{vol},$

then, arguing as with the impulse (expressing the material derivative of the integrand as a total derivative), one can show that this quantity is also conserved in time.
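The antisymmetrised identity used above can be spot-checked symbolically as well; the following is a minimal sketch (not part of the original argument), again in two dimensions with a velocity field built from an arbitrary stream function:

```python
import sympy as sp

# Symbolic check of the antisymmetrised identity
#   (x^i omega_{ij} x_k)_{[j,k]}
#     = (d_i(x^i x_k u_j) - d_j(x^i x_k u_i))_{[j,k]} - n (u_j x_k - u_k x_j)
# in two dimensions, for u = (-d_2 psi, d_1 psi) with psi arbitrary.
x1, x2 = sp.symbols('x1 x2')
psi = sp.Function('psi')(x1, x2)
u = [-sp.diff(psi, x2), sp.diff(psi, x1)]
x = [x1, x2]
n = 2
omega = lambda i, j: sp.diff(u[j], x[i]) - sp.diff(u[i], x[j])   # omega_{ij}

def pre_antisym(j, k):
    # d_i(x^i x_k u_j) - d_j(x^i x_k u_i), before antisymmetrising in [j, k]
    return (sum(sp.diff(x[i] * x[k] * u[j], x[i]) for i in range(n))
            - sp.diff(sum(x[i] * x[k] * u[i] for i in range(n)), x[j]))

j, k = 0, 1
lhs = sum(x[i] * (omega(i, j) * x[k] - omega(i, k) * x[j]) for i in range(n))
rhs = pre_antisym(j, k) - pre_antisym(k, j) - n * (u[j] * x[k] - u[k] * x[j])
print(sp.simplify(lhs - rhs))   # prints 0
```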


### Scott Aaronson — Recent papers by Susskind and Tao illustrate the long reach of computation

Most of the time, I’m a crabby, cantankerous ogre, whose only real passion in life is using this blog to shoot down the wrong ideas of others.  But alas, try as I might to maintain my reputation as a pure bundle of seething negativity, sometimes events transpire that pierce my crusty exterior.  Maybe it’s because I’m in Berkeley now, visiting the new Simons Institute for Theory of Computing during its special semester on Hamiltonian complexity.  And it’s tough to keep up my acerbic East Coast skepticism of everything new in the face of all this friggin’ sunshine.  (Speaking of which, if you’re in the Bay Area and wanted to meet me, this week’s the week!  Email me.)  Or maybe it’s watching Lily running around, her face wide with wonder.  If she’s so excited by her discovery of (say) a toilet plunger or some lint on the floor, what right do I have not to be excited by actual scientific progress?

Which brings me to the third reason for my relatively-sunny disposition: two long and fascinating recent papers on the arXiv.  What these papers have in common is that they use concepts from theoretical computer science in unexpected ways, while trying to address open problems at the heart of “traditional, continuous” physics and math.  One paper uses quantum circuit complexity to help understand black holes; the other uses fault-tolerant universal computation to help understand the Navier-Stokes equations.

Recently, our always-pleasant string-theorist friend Luboš Motl described computational complexity theorists as “extraordinarily naïve” (earlier, he also called us “deluded” and “bigoted”).  Why?  Because we’re obsessed with “arbitrary, manmade” concepts like the set of problems solvable in polynomial time, and especially because we assume things we haven’t yet proved such as P≠NP.  (Jokes about throwing stones from a glass house—or a stringy house—are left as exercises for the reader.)  The two papers that I want to discuss today reflect a different perspective: one that regards computation as no more “arbitrary” than other central concepts of mathematics, and indeed, as something that shows up even in contexts that seem incredibly remote from it, from the AdS/CFT correspondence to turbulent fluid flow.

Our first paper is Computational Complexity and Black Hole Horizons, by Lenny Susskind.  As readers of this blog might recall, last year Daniel Harlow and Patrick Hayden made a striking connection between computational complexity and the black-hole “firewall” question, by giving complexity-theoretic evidence that performing the measurement of Hawking radiation required for the AMPS experiment would require an exponentially-long quantum computation.  In his new work, Susskind makes a different, and in some ways even stranger, connection between complexity and firewalls.  Specifically, given an n-qubit pure state |ψ〉, recall that the quantum circuit complexity of |ψ〉 is the minimum number of 2-qubit gates needed to prepare |ψ〉 starting from the all-|0〉 state.  Then for reasons related to black holes and firewalls, Susskind wants to use the quantum circuit complexity of |ψ〉 as an intrinsic clock, to measure how long |ψ〉 has been evolving for.  Last week, I had the pleasure of visiting Stanford, where Lenny spent several hours explaining this stuff to me.  I still don’t fully understand it, but since it’s arguable that no one (including Lenny himself) does, let me give it a shot.

My approach will be to divide into two questions.  The first question is: why, in general (i.e., forgetting about black holes), might one want to use quantum circuit complexity as a clock?  Here the answer is: because unlike most other clocks, this one should continue to tick for an exponentially long time!

Consider some standard, classical thermodynamic system, like a box filled with gas, with the gas all initially concentrated in one corner.  Over time, the gas will diffuse across the box, in accord with the Second Law, until it completely equilibrates.  Furthermore, if we know the laws of physics, then we can calculate exactly how fast this diffusion will happen.  But this implies that we can use the box as a clock!  To do so, we’d simply have to measure how diffused the gas was, then work backwards to determine how much time had elapsed since the gas started diffusing.

But notice that this “clock” only works until the gas reaches equilibrium—i.e., is equally spread across the box.  Once the gas gets to equilibrium, which it does in a reasonably short time, it just stays there (at least until the next Poincaré recurrence time).  So, if you see the box in equilibrium, there’s no measurement you could make—or certainly no “practical” measurement—that would tell you how long it’s been there.  Indeed, if we model the collisions between gas particles (and between gas particles and the walls of the box) as random events, then something even stronger is true.  Namely, the probability distribution over all possible configurations of the gas particles will quickly converge to an equilibrium distribution.  And if all you knew was that the particles were in the equilibrium distribution, then there’s no property of their distribution that you could point to—not even an abstract, unmeasurable property—such that knowing that property would tell you how long the gas had been in equilibrium.
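To make the toy picture concrete, here is a minimal numerical sketch of the “gas as a clock” idea (an illustration only, not from either paper): particles random-walk from a corner of a box, and inverting the diffusion law recovers the elapsed time well until the gas equilibrates, after which the estimate saturates and the clock stops ticking.

```python
import numpy as np

# "Gas in a box as a clock": start all particles in a corner, let them diffuse,
# and estimate the elapsed time from the mean-squared displacement via
# t_est = MSD / (2 * d * D). The estimate tracks the true time only until the
# spread saturates at its equilibrium value.
rng = np.random.default_rng(1)
N, D, dt, box, d = 4000, 1.0, 0.01, 10.0, 2
pos = np.zeros((N, d))                      # all particles start at the corner
sigma = np.sqrt(2 * D * dt)

for step in range(1, 20001):
    pos += sigma * rng.standard_normal(pos.shape)
    pos = np.abs(pos)                       # reflecting wall at 0
    pos = box - np.abs(box - pos)           # reflecting wall at box
    if step in (100, 1000, 5000, 20000):
        t_true = step * dt
        msd = np.mean(np.sum(pos**2, axis=1))
        print(f"true t = {t_true:6.1f}   clock reads {msd / (2 * d * D):6.1f}")
```

The clock reads roughly 1 and 10 correctly, but by t = 50 it is already stuck near its equilibrium reading of about 17 and stays there forever.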

Interestingly, something very different happens if we consider a quantum pure state, in complete isolation from its environment.  If you have some quantum particles in a perfectly-isolating box, and you start them out in a “simple” state (say, with all particles unentangled and in a corner), then they too will appear to diffuse, with their wavefunctions spreading out and getting entangled with each other, until the system reaches “equilibrium.”  After that, there will once again be no “simple” measurement you can make—say, of the density of particles in some particular location—that will give you any idea of how long the box has been in equilibrium.  On the other hand, the laws of unitary evolution assure us that the quantum state is still evolving, rotating serenely through Hilbert space, just like it was before equilibration!  Indeed, in principle you could even measure that the evolution was still happening, but to do so, you’d need to perform an absurdly precise and complicated measurement—one that basically inverted the entire unitary transformation that had been applied since the particles started diffusing.

Lenny now asks the question: if the quantum state of the particles continues to evolve even after “equilibration,” then what physical quantity can we point to as continuing to increase?  By the argument above, it can’t be anything simple that physicists are used to talking about, like coarse-grained entropy.  Indeed, the most obvious candidate that springs to mind, for a quantity that should keep increasing even after equilibration, is just the quantum circuit complexity of the state!  If there’s no “magic shortcut” to simulating this system—that is, if the fastest way to learn the quantum state at time T is just to run the evolution equations forward for T time steps—then the quantum circuit complexity will continue to increase linearly with T, long after equilibration.  Eventually, the complexity will “max out” at ~c^n, where n is the number of particles, simply because (neglecting small multiplicative terms) the dimension of the Hilbert space is always an upper bound on the circuit complexity.  After even longer amounts of time—like ~c^(c^n)—the circuit complexity will dip back down (sometimes even to 0), as the quantum state undergoes recurrences.  But both of those effects only occur on timescales ridiculously longer than anything normally relevant to physics or everyday life.

Admittedly, given the current status of complexity theory, there’s little hope of proving unconditionally that the quantum circuit complexity continues to rise until it becomes exponential, when some time-independent Hamiltonian is continuously applied to the all-|0〉 state.  (If we could prove such a statement, then presumably we could also prove PSPACE⊄BQP/poly.)  But maybe we could prove such a statement modulo a reasonable conjecture.  And we do have suggestive weaker results.  In particular (and as I just learned this Friday), in 2012 Brandão, Harrow, and Horodecki, building on earlier work due to Low, showed that, if you apply S>>n random two-qubit gates to n qubits initially in the all-|0〉 state, then with high probability, not only do you get a state with large circuit complexity, you get a state that can’t even be distinguished from the maximally mixed state by any quantum circuit with at most ~S^(1/6) gates.
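To get some feel for the flavor of such statements, here is a minimal toy sketch (an illustration only, nothing like the actual Brandão-Harrow-Horodecki argument): apply Haar-random two-qubit gates to the all-|0〉 state of a handful of qubits, and watch a simple diagnostic, the purity of half the qubits, forget the initial state and settle at the value typical of a Haar-random pure state.

```python
import numpy as np

# Toy scrambling demo: random two-qubit gates on n qubits, applied to |0...0>.
# The purity Tr(rho_A^2) of the first n/2 qubits drops from 1 toward the value
# expected for a Haar-random pure state (about 2 / 2^(n/2)).
rng = np.random.default_rng(0)
n, S = 8, 200
psi = np.zeros(2 ** n, dtype=complex)
psi[0] = 1.0

def haar_2q(rng):
    # Haar-random 4x4 unitary via QR of a complex Gaussian matrix
    z = (rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    return q * (np.diagonal(r) / np.abs(np.diagonal(r)))

def apply_2q(psi, gate, i, j, n):
    # act with a 4x4 gate on qubits i and j of an n-qubit state vector
    psi = np.moveaxis(psi.reshape([2] * n), [i, j], [0, 1]).reshape(4, -1)
    psi = gate @ psi
    return np.moveaxis(psi.reshape([2, 2] + [2] * (n - 2)), [0, 1], [i, j]).reshape(-1)

for step in range(1, S + 1):
    i, j = rng.choice(n, size=2, replace=False)
    psi = apply_2q(psi, haar_2q(rng), i, j, n)
    if step % 50 == 0:
        m = psi.reshape(2 ** (n // 2), -1)             # first n/2 qubits vs rest
        rho_A = m @ m.conj().T
        print(step, np.real(np.trace(rho_A @ rho_A)))  # purity -> ~0.12 for n=8
```

Of course this only shows that one cheap observable equilibrates quickly; the BHH theorem is about indistinguishability from the maximally mixed state by any small circuit, which is a far stronger statement.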

OK, now on to the second question: what does any of this have to do with black holes?  The connection Lenny wants to make involves the AdS/CFT correspondence, the “duality” between two completely different-looking theories that’s been the rage in string theory since the late 1990s.  On one side of the ring is AdS (Anti de Sitter), a quantum-gravitational theory in D spacetime dimensions—one where black holes can form and evaporate, etc., but on the other hand, the entire universe is surrounded by a reflecting boundary a finite distance away, to help keep everything nice and unitary.  On the other side is CFT (Conformal Field Theory): an “ordinary” quantum field theory, with no gravity, that lives only on the (D-1)-dimensional “boundary” of the AdS space, and not in its interior “bulk.”  The claim of AdS/CFT is that despite how different they look, these two theories are “equivalent,” in the sense that any calculation in one theory can be transformed to a calculation in the other theory that yields the same answer.  Moreover, we get mileage this way, since a calculation that’s hard on the AdS side is often easy on the CFT side and vice versa.

As an example, suppose we’re interested in what happens inside a black hole—say, because we want to investigate the AMPS firewall paradox.  Now, figuring out what happens inside a black hole (or even on or near the event horizon) is a notoriously hard problem in quantum gravity; that’s why people have been arguing about firewalls for the past two years, and about the black hole information problem for the past forty!  But what if we could put our black hole in an AdS box?  Then using AdS/CFT, couldn’t we translate questions about the black-hole interior to questions about the CFT on the boundary, which don’t involve gravity and which would therefore hopefully be easier to answer?

In fact people have tried to do that—but frustratingly, they haven’t been able to use the CFT calculations to answer even the grossest, most basic questions about what someone falling into the black hole would actually experience.  (For example, would that person hit a “firewall” and die immediately at the horizon, or would she continue smoothly through, only dying close to the singularity?)  Lenny’s paper explores a possible reason for this failure.  It turns out that the way AdS/CFT works, the closer to the black hole’s event horizon you want to know what happens, the longer you need to time-evolve the quantum state of the CFT to find out.  In particular, if you want to know what’s going on at distance ε from the event horizon, then you need to run the CFT for an amount of time that grows like log(1/ε).  And what if you want to know what’s going on inside the black hole?  In line with the holographic principle, it turns out that you can express an observable inside the horizon by an integral over the entire AdS space outside the horizon.  Now, that integral will include a part where the distance ε from the event horizon goes to 0—so log(1/ε), and hence the complexity of the CFT calculation that you have to do, diverges to infinity.  For some kinds of calculations, the ε→0 part of the integral isn’t very important, and can be neglected at the cost of only a small error.  For other kinds of calculations, unfortunately—and in particular, for the kind that would tell you whether or not there’s a firewall—the ε→0 part is extremely important, and it makes the CFT calculation hopelessly intractable.

Note that yes, we even need to continue the integration for ε much smaller than the Planck length—i.e., for so-called “transplanckian” distances!  As Lenny puts it, however:

For most of this vast sub-planckian range of scales we don’t expect that the operational meaning has anything to do with meter sticks … It has more to do with large times than small distances.

One could give this transplanckian blowup in computational complexity a pessimistic spin: darn, so it’s probably hopeless to use AdS/CFT to prove once and for all that there are no firewalls!  But there’s also a more positive interpretation: the interior of a black hole is “protected from meddling” by a thick armor of computational complexity.  To explain this requires a digression about firewalls.

The original firewall paradox of AMPS could be phrased as follows: if you performed a certain weird, complicated measurement on the Hawking radiation emitted from a “sufficiently old” black hole, then the expected results of that measurement would be incompatible with also seeing a smooth, Einsteinian spacetime if you later jumped into the black hole to see what was there.  (Technically, because you’d violate the monogamy of entanglement.)  If what awaited you behind the event horizon wasn’t a “classical” black hole interior with a singularity in the middle, but an immediate breakdown of spacetime, then one says you would’ve “hit a firewall.”

Yes, it seems preposterous that “firewalls” would exist: at the least, it would fly in the face of everything people thought they understood for decades about general relativity and quantum field theory.  But crucially—and here I have to disagree with Stephen Hawking—one can’t “solve” this problem by simply reiterating the physical absurdity of firewalls, or by constructing scenarios where firewalls “self-evidently” don’t arise.  Instead, as I see it, solving the problem means giving an account of what actually happens when you do the AMPS experiment, or of what goes wrong when you try to do it.

On this last question, it seems to me that Susskind and Juan Maldacena achieved a major advance in their much-discussed ER=EPR paper last year.  Namely, they presented a picture where, sure, a firewall arises (at least temporarily) if you do the AMPS experiment—but no firewall arises if you don’t do the experiment!  In other words, doing the experiment sends a nonlocal signal to the interior of the black hole (though you do have to jump into the black hole to receive the signal, so causality outside the black hole is still preserved).  Now, how is it possible for your measurement of the Hawking radiation to send an instantaneous signal to the black hole interior, which might be light-years away from you when you measure?  On Susskind and Maldacena’s account, it’s possible because the entanglement between the Hawking radiation and the degrees of freedom still in the black hole, can be interpreted as creating wormholes between the two.  Under ordinary conditions, these wormholes (like most wormholes in general relativity) are “non-traversable”: they “pinch off” if you try to send signals through them, so they can’t be used for faster-than-light communication.  However, if you did the AMPS experiment, then the wormholes would become traversable, and could carry a firewall (or an innocuous happy-birthday message, or whatever) from the Hawking radiation to the black hole interior.  (Incidentally, ER stands for Einstein and Rosen, who wrote a famous paper on wormholes, while EPR stands for Einstein, Podolsky, and Rosen, who wrote a famous paper on entanglement.  “ER=EPR” is Susskind and Maldacena’s shorthand for their proposed connection between wormholes and entanglement.)

Anyway, these heady ideas raise an obvious question: how hard would it be to do the AMPS experiment?  Is sending a nonlocal signal to the interior of a black hole via that experiment actually a realistic possibility?  In their work a year ago on computational complexity and firewalls, Harlow and Hayden already addressed that question, though from a different perspective than Susskind.  In particular, Harlow and Hayden gave strong evidence that carrying out the AMPS experiment would require solving a problem believed to be exponentially hard even for a quantum computer: specifically, a complete problem for QSZK (Quantum Statistical Zero Knowledge).  In followup work (not yet written up, though see my talk at KITP and my PowerPoint slides), I showed that the Harlow-Hayden problem is actually at least as hard as inverting one-way functions, which is even stronger evidence for hardness.

All of this suggests that, even supposing we could surround an astrophysical black hole with a giant array of perfect photodetectors, wait ~10^69 years for the black hole to (mostly) evaporate, then route the Hawking photons into a super-powerful, fault-tolerant quantum computer, doing the AMPS experiment (and hence, creating traversable wormholes to the black hole interior) still wouldn’t be a realistic prospect, even if the equations formally allow it!  There’s no way to sugarcoat this: computational complexity limitations seem to be the only thing protecting the geometry of spacetime from nefarious experimenters.

Anyway, Susskind takes that amazing observation of Harlow and Hayden as a starting point, but then goes off on a different tack.  For one thing, he isn’t focused on the AMPS experiment (the one involving monogamy of entanglement) specifically: he just wants to know how hard it is to do any experiment (possibly a different one) that would send nonlocal signals to the black hole interior.  For another, unlike Harlow and Hayden, Susskind isn’t trying to show that such an experiment would be exponentially hard.  Instead, he’s content if the experiment is “merely” polynomially hard—but in the same sense that (say) unscrambling an egg, or recovering a burned book from the smoke and ash, are polynomially hard.  In other words, Susskind only wants to argue that creating a traversable wormhole would be “thermodynamics-complete.”  A third, related, difference is that Susskind considers an extremely special model scenario: namely, the AdS/CFT description of something called the “thermofield double state.”  (This state involves two entangled black holes in otherwise-separated spacetimes; according to ER=EPR, we can think of those black holes as being connected by a wormhole.)  While I don’t yet understand this point, apparently the thermofield double state is much more favorable for firewall production than a “realistic” spacetime—and in particular, the Harlow-Hayden argument doesn’t apply to it.  Susskind wants to show that even so (i.e., despite how “easy” we’ve made it), sending a signal through the wormhole connecting the two black holes of the thermofield double state would still require solving a thermodynamics-complete problem.

So that’s the setup.  What new insights does Lenny get?  This, finally, is where we circle back to the view of quantum circuit complexity as a clock.  Briefly, Lenny finds that the quantum state getting more and more complicated in the CFT description—i.e., its quantum circuit complexity going up and up—directly corresponds to the wormhole getting longer and longer in the AdS description.  (Indeed, the length of the wormhole increases linearly with time, growing like the circuit complexity divided by the total number of qubits.)  And the wormhole getting longer and longer is what makes it non-traversable—i.e., what makes it impossible to send a signal through.

Why has quantum circuit complexity made a sudden appearance here?  Because in the CFT description, the circuit complexity continuing to increase is the only thing that’s obviously “happening”!  From a conventional physics standpoint, the quantum state of the CFT very quickly reaches equilibrium and then just stays there.  If you measured some “conventional” physical observable—say, the energy density at a particular point—then it wouldn’t look like the CFT state was continuing to evolve at all.  And yet we know that the CFT state is evolving, for two extremely different reasons.  Firstly, because (as we discussed early on in this post) unitary evolution is still happening, so presumably the state’s quantum circuit complexity is continuing to increase.  And secondly, because in the dual AdS description, the wormhole is continuing to get longer!

From this connection, at least three striking conclusions follow:

1. That even when nothing else seems to be happening in a physical system (i.e., it seems to have equilibrated), the fact that the system’s quantum circuit complexity keeps increasing can be “physically relevant” all by itself.  We know that it’s physically relevant, because in the AdS dual description, it corresponds to the wormhole getting longer!
2. That even in the special case of the thermofield double state, the geometry of spacetime continues to be protected by an “armor” of computational complexity.  Suppose that Alice, in one half of the thermofield double state, wants to send a message to Bob in the other half (which Bob can retrieve by jumping into his black hole).  In order to get her message through, Alice needs to prevent the wormhole connecting her black hole to Bob’s from stretching uncontrollably—since as long as it stretches, the wormhole remains non-traversable.  But in the CFT picture, stopping the wormhole from stretching corresponds to stopping the quantum circuit complexity from increasing!  And that, in turn, suggests that Alice would need to act on the radiation outside her black hole in an incredibly complicated and finely-tuned way.  For “generically,” the circuit complexity of an n-qubit state should just continue to increase, the longer you run unitary evolution for, until it hits its exp(n) maximum.  To prevent that from happening would essentially require “freezing” or “inverting” the unitary evolution applied by nature—but that’s the sort of thing that we expect to be thermodynamics-complete.  (How exactly do Alice’s actions in the “bulk” affect the evolution of the CFT state?  That’s an excellent question that I don’t understand AdS/CFT well enough to answer.  All I know is that the answer involves something that Lenny calls “precursor operators.”)
3. The third and final conclusion is that there can be a physically-relevant difference between pseudorandom n-qubit pure states and “truly” random states—even though, by the definition of pseudorandom, such a difference can’t be detected by any small quantum circuit!  Once again, the way to see the difference is using AdS/CFT.  It’s easy to show, by a counting argument, that almost all n-qubit pure states have nearly-maximal quantum circuit complexity.  But if the circuit complexity is already maximal, that means in particular that it’s not increasing!  Lenny argues that this corresponds to the wormhole between the two black holes no longer stretching.  But if the wormhole is no longer stretching, then it’s “vulnerable to firewalls” (i.e., to messages going through!).  It had previously been argued that random CFT states almost always correspond to black holes with firewalls—and since the CFT states formed by realistic physical processes will look “indistinguishable from random states,” black holes that form under realistic conditions should generically have firewalls as well.  But Lenny rejects this argument, on the ground that the CFT states that arise in realistic situations are not random pure states.  And what distinguishes them from random states?  Simply that they have non-maximal (and increasing) quantum circuit complexity!

I’ll leave you with a question of my own about this complexity / black hole connection: one that I’m unsure how to think about, but that perhaps interests me more than any other here.  My question is: could you ever learn the answer to an otherwise-intractable computational problem by jumping into a black hole?  Of course, you’d have to really want the answer—so much so that you wouldn’t mind dying moments after learning it, or not being able to share it with anyone else!  But never mind that.  What I have in mind is first applying some polynomial-size quantum circuit to the Hawking radiation, then jumping into the black hole to see what nonlocal effect (if any) the circuit had on the interior.  The fact that the mapping between interior and exterior states is so complicated suggests that there might be complexity-theoretic mileage to be had this way, but I don’t know what.  (It’s also possible that you can get a computational speedup in special cases like the thermofield double state, but that a Harlow-Hayden-like obstruction prevents you from getting one with real astrophysical black holes.  I.e., that for real black holes, you’ll just see a smooth, boring, Einsteinian black hole interior no matter what polynomial-size quantum circuit you applied to the Hawking radiation.)

If you’re still here, the second paper I want to discuss today is Finite-time blowup for an averaged three-dimensional Navier-Stokes equation by Terry Tao.  (See also the excellent Quanta article by Erica Klarreich.)  I’ll have much, much less to say about this paper than I did about Susskind’s, but that’s not because it’s less interesting: it’s only because I understand the issues even less well.

Navier-Stokes existence and smoothness is one of the seven Clay Millennium Problems (alongside P vs. NP, the Riemann Hypothesis, etc).  The problem asks whether the standard, classical differential equations for three-dimensional fluid flow are well-behaved, in the sense of not “blowing up” (e.g., concentrating infinite energy on a single point) after a finite amount of time.

Expanding on ideas from his earlier blog posts and papers about Navier-Stokes (see here for the gentlest of them), Tao argues that the Navier-Stokes problem is closely related to the question of whether or not it’s possible to “build a fault-tolerant universal computer out of water.”  Why?  Well, it’s not the computational universality per se that matters, but if you could use fluid flow to construct general enough computing elements—resistors, capacitors, transistors, etc.—then you could use those elements to recursively shift the energy in a given region into a region half the size, and from there to a region a quarter the size, and so on, faster and faster, until you got infinite energy density after a finite amount of time.
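The arithmetic behind “faster and faster, until you get infinite energy density after a finite amount of time” is just a geometric series; here is the back-of-the-envelope version (a sketch of the bookkeeping only, not Tao’s actual construction):

```python
# Energy cascade bookkeeping: push the energy into a region half the linear
# size, taking half as long each time. The total elapsed time converges while
# the energy density blows up.
E, radius, t, dt = 1.0, 1.0, 0.0, 1.0
for k in range(10):
    t += dt                   # time spent on this transfer
    radius /= 2               # energy now lives in a region half the size
    dt /= 2                   # the next transfer is twice as fast
    print(f"step {k}: elapsed time = {t:.4f}, energy density ~ {E / radius**3:.3g}")
# elapsed time -> 2.0 (a finite limit), while the density grows without bound
```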

Strikingly, building on an earlier construction by Katz and Pavlovic, Tao shows that this is actually possible for an “averaged” version of the Navier-Stokes equations!  So at the least, any proof of existence and smoothness for the real Navier-Stokes equations will need to “notice” the difference between the real and averaged versions.  In his paper, though, Tao hints at the possibility (or dare one say likelihood?) that the truth might go the other way.  That is, maybe the “universal computer” construction can be ported from the averaged Navier-Stokes equations to the real ones.  In that case, we’d have blowup in finite time for the real equations, and a negative solution to the Navier-Stokes existence and smoothness problem.  Of course, such a result wouldn’t imply that real, physical water was in any danger of “blowing up”!  It would simply mean that the discrete nature of water (i.e., the fact that it’s made of H2O molecules, rather than being infinitely divisible) was essential to understanding its stability given arbitrary initial conditions.

So, what are the prospects for such a blowup result?  Let me quote from Tao’s paper:

Once enough logic gates of ideal fluid are constructed, it seems that the main difficulties in executing the above program [to prove a blowup result for the "real" Navier-Stokes equations] are of a “software engineering” nature, and would be in principle achievable, even if the details could be extremely complicated in practice.  The main mathematical difficulty in executing this “fluid computing” program would thus be to arrive at (and rigorously certify) a design for logical gates of inviscid fluid that has some good noise tolerance properties.  In this regard, ideas from quantum computing (which faces a unitarity constraint somewhat analogous to the energy conservation constraint for ideal fluids, albeit with the key difference of having a linear evolution rather than a nonlinear one) may prove to be useful.

One minor point that I’d love to understand is, what happens in two dimensions?  Existence and smoothness are known to hold for the 2-dimensional analogues of the Navier-Stokes equations.  If they also held for the averaged 2-dimensional equations, then it would follow that Tao’s “universal computer” must be making essential use of the third dimension. How?  If I knew the answer to that, then I’d feel for the first time like I had some visual crutch for understanding why 3-dimensional fluid flow is so complicated, even though 2-dimensional fluid flow isn’t.

I see that, in blog comments here and here, Tao says that the crucial difference between the 2- and 3-dimensional Navier-Stokes equations arises from the different scaling behavior of the dissipation term: basically, you can ignore it in 3 or more dimensions, but you can’t ignore it in 2.  But maybe there’s a more doofus-friendly explanation, which would start with some 3-dimensional fluid logic gate, and then explain why the gate has no natural 2-dimensional analogue, or why dissipation causes its analogue to fail.

Obviously, there’s much more to say about both papers (especially the second…) than I said in this post, and many people more knowledgeable than I am to say those things.  But that’s what the comments section is for.  Right now I’m going outside to enjoy the California sunshine.

## March 02, 2014

### Andrew Jaffe — Academic Blogging Still Dangerous?

Nearly a decade ago, blogging was young, and its place in the academic world wasn’t clear. Back in 2005, I wrote about an anonymous article in the Chronicle of Higher Education, a so-called “advice” column admonishing academic job seekers to avoid blogging, mostly because it let the hiring committee find out things that had nothing whatever to do with their academic job, and reject them on those (inappropriate) grounds.

I thought things had changed. Many academics have blogs, and indeed many institutions encourage it (here at Imperial, there’s a College-wide list of blogs written by people at all levels, and I’ve helped teach a course on blogging for young academics). More generally, outreach has become an important component of academic life (that is, it’s at least necessary to pay it lip service when applying for funding or promotions) and blogging is usually seen as a useful way to reach a wide audience outside of one’s field.

So I was distressed to see the lament — from an academic blogger — “Want an academic job? Hold your tongue”. Things haven’t changed as much as I thought:

… [A senior academic said that] the blog, while it was to be commended for its forthright tone, was so informal and laced with profanity that the professor could not help but hold the blog against the potential faculty member…. It was the consensus that aspiring young scientists should steer clear of such activities.

Depending on the content of the blog in question, this seems somewhere between a disregard for academic freedom and a judgment of the candidate on completely irrelevant grounds. Of course, it is natural to want the personalities of our colleagues to mesh well with our own, and almost impossible to completely ignore supposedly extraneous information. But we are hiring for academic jobs, and what should matter are research and teaching ability.

Of course, I’ve been lucky: I already had a permanent job when I started blogging, and I work in the UK system which doesn’t have a tenure review process. And I admit this blog has steered clear of truly controversial topics (depending on what you think of Bayesian probability, at least).

### John Preskill — Oh, the Places You’ll Do Theoretical Physics!

I won’t run lab tests in a box.
I won’t run lab tests with a fox.
But I’ll prove theorems here or there.
Yes, I’ll prove theorems anywhere…

Physicists occupy two camps. Some—theorists—model the world using math. We try to predict experiments’ outcomes and to explain natural phenomena. Others—experimentalists—gather data using supermagnets, superconductors, the world’s coldest atoms, and other instruments deserving of superlatives. Experimentalists confirm that our theories deserve trashing or—for this we pray—might not model the world inaccurately.

Theorists, people say, can work anywhere. We need no million-dollar freezers. We need no multi-pound magnets.* We need paper, pencils, computers, and coffee. Though I would add “quiet,” colleagues would add “iPods.”

Theorists’ mobility reminds me of the book Green Eggs and Ham. Sam-I-am, the antagonist, drags the protagonist to spots as outlandish as our workplaces. Today marks the author’s birthday. Since Theodor Geisel stimulated imaginations, and since imagination drives physics, Quantum Frontiers is paying its respects. In honor of Oh, the Places You’ll Go!, I’m spotlighting places you can do theoretical physics. You judge whose appetite for exotica exceeds whose: Dr. Seuss’s or theorists’.

I’ve most looked out-of-place doing physics by a dirt road between sheep-populated meadows outside Lancaster, UK. Lancaster, the War of the Roses victor, is a city in northern England. The year after graduating from college, I worked at Lancaster University as a research assistant. I studied a crystal that resembles graphene, a material whose superlatives include “superstrong,” “supercapacitor,” and “superconductor.” From morning to evening, I’d submerse in math till it poured out my ears. Then I’d trek from “uni,” as Brits say, to the “city centre,” as they write.

The trek wound between trees; fields; and, because I was in England, puddles. Many evenings, a rose or a sunset would arrest me. Other evenings, physics would. I’d realize how to solve an equation, or that I should quit banging my head against one. Stepping off the road, I’d fish out a notebook and write. Amidst the puddles and lambs. Cyclists must have thought me the queerest sight since a cloudless sky.

A colleague loves doing theory in the sky. On planes, he explained, hardly anyone interrupts his calculations. And who minds interruptions by pretzels and coffee?

“A mathematician is a device for turning coffee into theorems,” some have said, and theoretical physicists live down the block from mathematicians in the neighborhood of science. Turn a Pasadena café upside-down and shake it, and out will fall theorists. Since Hemingway’s day, the romanticism has faded from the penning of novels in cafés. But many a theorist trumpets about an equation derived on a napkin.

Trumpeting filled my workplace in Oxford. One of Clarendon Lab’s few theorists, I neighbored lasers, circuits, and signs that read “DANGER! RADIATION.” Though radiation didn’t leak through our walls (I hope), what did leak through contributed more to that office’s eccentricity than radiation would have. As early as 9:10 AM, the experimentalists next door blasted “Born to Be Wild” and Animal House tunes. If you can concentrate over there, you can concentrate anywhere.

One paper I concentrated on had a Crumple-Horn Web-Footed Green-Bearded Schlottz of an acknowledgements section. In a physics paper’s last paragraph, one thanks funding agencies and colleagues for support and advice. “The authors would like to thank So-and-So for insightful comments,” papers read. This paper referenced a workplace: “[One coauthor] is grateful to the Half Moon Pub.” Colleagues of the coauthor confirmed the acknowledgement’s aptness.

Though I’ve dwelled on theorists’ physical locations, our minds roost elsewhere. Some loiter in atoms; others, in black holes; some, on four-dimensional surfaces; others, in hypothetical universes. I hobnob with particles in boxes. As Dr. Seuss whisks us to a Bazzim populated by Nazzim, theorists tell of function spaces populated by Rényi entropies.

The next time you see someone standing in a puddle, or in a ditch, or outside Buckingham Palace, scribbling equations, feel free to laugh. You might be seeing a theoretical physicist. You might be seeing me. To me, physics has relevance everywhere. Scribbling there and here should raise eyebrows no more than any setting in a Dr. Seuss book.

The author would like to thank this emporium of Seussoria. And Java & Co.

*We need for them to confirm that our theories deserve trashing, but we don’t need them with us. Just as, when considering quitting school to break into the movie business, you need for your mother to ask, “Are you sure that’s a good idea, dear?” but you don’t need for her to hang on your elbow. Except experimentalists don’t say “dear” when crushing theorists’ dreams.

### John Baez — Network Theory Overview

Here’s a video of a talk I gave yesterday, made by Brendan Fong. You can see the slides here—and then click the items in blue, and the pictures, for more information!

The idea: nature and the world of human technology are full of networks! People like to draw diagrams of networks. Mathematical physicists know that in principle these diagrams can be understood using category theory. But why should physicists have all the fun? This is the century of understanding living systems and adapting to life on a finite planet. Math isn’t the main thing we need for this, but it’s got to be part of the solution… so one thing we should do is develop a unified and powerful theory of networks.

We are still far from doing this. In this overview, I briefly described three parts of the jigsaw puzzle, and invited everyone to help fit them together:

• electrical circuits and signal-flow graphs.

• stochastic Petri nets, chemical reaction networks and Feynman diagrams.

• Bayesian networks, information and entropy.

In my talks coming up, I’ll go into more detail on each of these. With luck, you’ll be able to see videos here.

But if you’re near Oxford, you might as well actually attend! You can see dates, times, locations, my slides, and the talks themselves as they show up by going here.

### Sean Carroll — Decennial

Almost forgot again — the leap-year thing always gets me. But I’ve now officially been blogging for ten years. Over 2,000 posts, generating over 57,000 comments. I don’t have accurate stats because I’ve moved around a bit, but on the order of ten million visits. Thanks for coming!

Nostalgia buffs are free to check out the archives (by category or month) via buttons on the sidebar, or see the greatest hits page. Here are some of my personal favorites from each of the past ten years: