Planet Musings

September 17, 2014

Tommaso Dorigo: John Ellis On The Ascent Of The Standard Model

Being at CERN for a couple of weeks, I could not refrain from following yesterday's talks in the Main Auditorium, which celebrated the 90th birthday of Herwig Schopper, who directed CERN in the crucial years of the LEP construction.

A talk I found most enjoyable was John Ellis'. He gave an overview of the historical context preceding the decision to build LEP, and then a summary of the incredible bounty of knowledge that the machine produced in the 1990s.


Clifford Johnson: Baby Mothra!!!

So I discovered a terrifying (but also kind of fascinating and beautiful at the same time) new element to the garden this morning. We're having a heat wave here, and so this morning before leaving for work I thought I'd give the tomato plants a spot of moisture. I passed one of the tomato clusters and noticed that one of the (still green) tomatoes had a large bite taken out of it. I assumed it was an experimental bite from a squirrel (my nemesis - or one of them), and muttered dark things under my breath and then prepared to move away the strange coiled leaf that seemed to be on top of it. Then I noticed. It wasn't a leaf. It was a HUGE caterpillar! Enormous! Giant and green with spots and even a red horn at one end! There's a moment when you're unexpectedly close to a creature like that where your skin crawls for a bit. Well, mine did for a while [...]

John Preskill: The experimentalist next door

At 9:10 AM, the lab next door was blasting “Born to Be Wild.”

I was at Oxford, moonlighting as a visiting researcher during fall 2013. My hosts included quantum theorists in Townsend Laboratory, a craggy great-uncle of a building. Poke your head out of the theory office, and Experiment would flood your vision. Our neighbors included laser wielders, ion trappers, atom freezers, and yellow signs that warned, “DANGER OF DEATH.”

[Photo: Down the corridor in Townsend Laboratory.]

Hardly the neighborhood Mr. Rogers had in mind.

The lab that shared a wall with our office blasted music. To clear my head of calculations and of Steppenwolf, I would roam the halls. Some of the halls, that is. Other halls had hazmat warnings instead of welcome mats. I ran into “RADIATION,” “FIRE HAZARD,” “STRONG MAGNETIC FIELDS,” “HIGH VOLTAGE,” and “KEEP THIS TOILET NEAT AND TIDY.” Repelled from half a dozen doors, I would retreat to the office. Kelly Clarkson would be cooing through the wall.

“We can hear them,” a theorist observed about the experimentalists, “but they can’t hear us.”

[Photo: Dangers lurked even in the bathroom.]

Experiment should test, disprove, and motivate theories; and theory should galvanize and (according to some thinkers) explain experiments. But some theorists stray from experiment like North America from Pangaea.

The theoretical physics I’ve enjoyed is abstract. I rarely address platforms, particular physical systems in which theory might incarnate. Quantum-information platforms include electrons in magnetic fields, photons (particles of light), ion traps, quantum dots, and nuclei such as the ones that image internal organs in MRI machines.

Instead of addressing electrons and photons, I address mathematics and abstract physical concepts. Each of these concepts can incarnate in different forms in different platforms. Examples of such concepts include preparation procedures, evolutions, measurements, and memories. One preparation procedure defined by one piece of math can result from a constant magnetic field in one platform and from a laser in another. Abstractness has power, enabling one idea to describe diverse systems.

I’ve enjoyed wandering the hills and sampling the vistas of Theory Land. Yet the experimentalist next door cranked up the radio of reality in my mind. “We can hear them,” a theorist said. In Townsend Laboratory, I began listening.

My Oxford collaborators and I interwove two theoretical frameworks that describe heat transferred and work performed on small scales. One framework, one-shot statistical mechanics, has guest-starred on this blog. The other framework consists of fluctuation relations. Fluctuation relations describe the loss of information stored in a system tarnished by its environment. Imagine storing information in a particle’s spin (basically, the particle’s orientation). If the spin points in one direction, it encodes one number; if the spin points in another direction, it encodes another number. If air molecules jostle your particle, or if a shot of energy causes your particle to bounce around, the memory’s reliability degrades. Fluctuation relations quantify the degradation.
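
As a standalone illustration of the kind of relation involved, here is a minimal numerical check of one canonical fluctuation relation, the Jarzynski equality, which says that the average of exp(-W/kT) over many repetitions of a driving protocol equals exp(-ΔF/kT). The toy protocol below (a sudden stiffening of a harmonic trap) is chosen purely for simplicity and is not the framework of the paper discussed here.

```python
import numpy as np

# Minimal check of the Jarzynski equality <exp(-W/kT)> = exp(-dF/kT) for a
# toy protocol: a particle equilibrated in a harmonic trap of stiffness k0
# whose stiffness is suddenly quenched to k1.  Illustration only.
rng = np.random.default_rng(0)
kT = 1.0            # temperature in energy units
k0, k1 = 1.0, 4.0   # trap stiffness before and after the quench

# Equilibrium positions in the initial trap: Gaussian with variance kT/k0.
x = rng.normal(scale=np.sqrt(kT / k0), size=1_000_000)

# Work done by the instantaneous quench: W = (k1 - k0) * x^2 / 2.
W = 0.5 * (k1 - k0) * x**2

# Jarzynski average versus the exact answer: dF = (kT/2) ln(k1/k0),
# so exp(-dF/kT) = sqrt(k0/k1) = 0.5 for these parameters.
print(np.mean(np.exp(-W / kT)), np.sqrt(k0 / k1))
```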

My colleagues and I addressed “information,” “systems,” and “interactions.” We deployed abstract ideas, referencing platforms only when motivating our work. Then a collaborator challenged me to listen through the wall.

Experimentalists have tested fluctuation relations. Why not check whether their data supports our theory? At my friend’s urging, I contacted experimentalists who’d shown that DNA obeys a fluctuation relation. The experimentalists had unzipped and re-zipped single DNA molecules using optical tweezers, which resemble ordinary tweezers but involve lasers. Whenever the experimentalists pulled the DNA, they measured the force they applied. They concluded that their platform obeyed an abstract fluctuation theorem. The experimentalists generously shared their data, which supported our results.

[Figure: Experimentalists unzipped and rezipped DNA, using the set-up depicted at http://www.europhysicsnews.org/articles/epn/abs/2010/02/epn20102p27/epn20102p27.html, to test fluctuation relations.]

My colleagues and I didn’t propose experiments. We didn’t explain why platforms had behaved in unexpected ways. We checked calculations with recycled data. But we ventured outside Theory Land. We learned that one-shot theory models systems that are also modeled by fluctuation relations, which govern experiments. This link from one-shot theory to experiment, like the forbidden corridors in Townsend Laboratory, invites exploration.

In Townsend, I didn’t suffer the electric shocks or the explosions advertised on the doors (though the hot water in the bathroom nearly burned me). I turned out not to need those shocks. Blasting rock music at 9:10 AM can wake even a theorist up to reality.


September 16, 2014

Quantum Diaries: Summer intern studies physics for self, family

This article appeared in Fermilab Today on Sept. 16, 2014.

Summer intern Sheri Lopez, here with son Dominic, pursues her love of physics as a student at the University of New Mexico-Los Alamos. She spent this summer at Fermilab as a summer intern. Photo courtesy of Sheri Lopez

Dominic is two. He is obsessed with “Despicable Me” and choo-choos. His mom Sheri Lopez is 29, obsessed with physics, and always wanted to be an astronaut.

But while Dominic’s future is full of possibilities, his mom’s options are narrower. Lopez is a single mother and a sophomore at the University of New Mexico-Los Alamos, where she is double majoring in physics and mechanical engineering. Her future is focused on providing for her son, and that plan recently included 10 weeks spent at Fermilab for a Summer Undergraduate Laboratories Internship (SULI).

“Being at Fermilab was beautiful, and it really made me realize how much I love physics,” Lopez said. “On the other end of the spectrum, it made me realize that I have to think of my future in a tangible way.”

Instead of being an astronaut, now she plans on building the next generation of particle detectors. Lopez is reaching that goal by coupling her love of physics with practical trade skills such as coding, which she picked up at Fermilab as part of her research developing new ways to visualize data for the MINERvA neutrino experiment.

“The main goal of it was to try to make the data that the MINERvA project was getting a lot easier to read and more presentable for a web-based format,” Lopez said. Interactive, user-friendly data may be one way to generate interest in particle physics from a more diverse audience. Lopez had no previous coding experience but quickly realized at Fermilab that it would allow her to make a bigger difference in the field.

Dominic, meanwhile, spent the summer with his grandparents in New Mexico. That was hard, Lopez said, but she received a lot of support from Internship Program Administrator Tanja Waltrip.

“I was determined to not let her miss this opportunity, which she worked so hard to acquire,” Waltrip said. Waltrip coordinates support services for interns like Lopez in 11 different programs hosted by Fermilab.

Less than 10 percent of applicants were accepted into Fermilab’s summer program. SULI is funded by the U.S. Department of Energy, so many national labs host these internships, and applicants choose which labs to apply to.

“There was never a moment when anyone doubted or said I couldn’t do it,” Lopez said. Dominic doesn’t understand why his mom was gone this summer, but he made sure to give her the longest hug of her life when she came back. For her part, Lopez was happy to bring back a brighter future for her son.

Troy Rummler

Chad Orzel: TED-Ed Lesson: What Is the Heisenberg Uncertainty Principle?

The second one of the TED-Ed lessons I wrote about quantum physics has now been published: What Is the Heisenberg Uncertainty Principle. This is, again, very similar to stuff I’ve written before, specifically this old blog post and the relevant chapter of How to Teach [Quantum] Physics to Your Dog.

As usual, I tried but probably failed to do justice to other interpretations in the “Dig Deeper” references I sent; outraged Bohmians should feel free to comment either here or there with better explanations.

Again, it’s really fun to see the images the animators found to put to my words. I love the mustache-hat-guy in this one.

In other notes, over 18,000 people have watched yesterday’s video lesson on particles and waves. As I said on Twitter, that’s more views than there have been students enrolled at Union during the 14 years I’ve been teaching here. (We bring in something on the short side of 600 new students every year, and there were three classes here when I arrived, so I’ve been a faculty member through at least part of the careers of around 10,000 Union students.) Which is an interesting bit of perspective…

Tim Gowers: ICM2014 — Bhargava, Gentry, Sanders

On my last day at the ICM I ended up going to fewer talks. As on the previous two days the first plenary lecture was not to be missed — it was Maryam Mirzakhani — so despite my mounting tiredness I set my alarm appropriately. I was a little surprised when I got there by just how empty it was, until eventually I saw that on the screens at the front it said that the lecture was cancelled because of her Fields medallist’s lecture the following Tuesday. I belonged to the small minority that had not noticed this, partly because I have had a lot of trouble with my supposedly-smart phone so was there with a temporary and very primitive replacement which was not the kind of phone on to which one could download a special ICM app that kept one up to date with things like this. I had planned to skip the second lecture of the morning, so I slightly rued my lost couple of hours of potential sleep, while also looking forward to being able to use those hours to work, or perhaps make progress with writing these posts — I can’t remember which of the two I ended up doing.

As a result, the first talk I went to was Manjul Bhargava’s plenary lecture, which was another superb example of what a plenary lecture should be like. Like Jim Arthur, he began by telling us an absolutely central general problem in number theory, but interestingly it wasn’t the same problem — though it is related.

Bhargava’s central problem was this: given a function on the integers/rationals that takes integer/rational values, when does it take square values? In order to persuade us that this problem had been a central preoccupation of number theorists for a very long time, he took as his first example the function f(x,y)=x^2+y^2. Asking for this to take square values is asking for a Pythagorean triple, and people have been interested in those for thousands of years. To demonstrate this, he showed us a cuneiform tablet, which was probably the Babylonian tablet Plimpton 322, which contains a list of Pythagorean triples, some of which involve what in decimal notation are 5-digit numbers, and therefore not the kind of example one stumbles on without some kind of systematic procedure for generating it.

If one takes one’s function to be a cubic in one variable, then one obtains an elliptic curve, and rational points on elliptic curves are of course a huge topic in modern number theory, one to which Bhargava has made a major contribution. I won’t say much more about that, since I have already said a reasonable amount about it when discussing his laudatio. But there were a few extra details that are worth reporting.

He told us that Goldfeld and Katz and Sarnak had conjectured that 50% of elliptic curves have rank 0 and 50% have rank 1 (so the density of elliptic curves with higher rank is zero). He then told us about some work of Brumer and McGuinness in 1990 that seems to cast doubt on this (later) conjecture: they found that rank 2 curves occur quite often and their frequency increases as the coefficients get larger. More recent computational work has very strongly suggested that the conjecture is false: if you draw a graph of the average rank of elliptic curves as the size goes from 10^5 to 10^8, it increases quickly from 0.7 before tailing off and appearing to tend to about 0.87. Apparently the reaction of Katz and Sarnak was a cheerful, “Well, it will go down eventually.”

Bhargava was pretty sceptical about this, but became properly interested in the problem when he learnt about work of Brumer, who showed assuming the generalized Riemann hypothesis and the Birch–Swinnerton-Dyer conjecture that the average rank was bounded above by 2.3. As Bhargava put it, this was a result that depends on two million dollars worth of conjectures. But that meant that if one could prove that the average rank of elliptic curves was greater than 2.3, then one would have shown that at least one of the generalized Riemann hypothesis and the Birch–Swinnerton-Dyer conjecture was false.

Still using the two million dollars worth of conjecture, Heath-Brown got the bound down to 2 in 2004, and Young got it to 1.79 in 2009. Bhargava and Shankar managed to improve that by 0.9 and two million dollars: that is, they obtained an unconditional bound of 0.89, amusingly close to the apparent asymptote of the graph that comes from the computations. As Bhargava pointed out, if one could extend those computations and find that the density eventually surpassed 0.89, this would, paradoxically, be very good news for the conjecture of Katz and Sarnak, because it would prove that the graph did eventually have to start coming down.

More recently, with Chris Skinner, Bhargava got an unconditional lower bound of 0.2.

One thing I understood a bit better by the end of Bhargava’s lecture was the result that the Birch–Swinnerton-Dyer conjecture holds for a positive proportion of elliptic curves. Although this is a remarkable result, there is a sense in which it is a slight cheat. What I mean by that is that Bhargava and his collaborators have a clever way of proving that a positive proportion of elliptic curves have rank 1. Then of those curves, they have a clever way of showing that for a positive proportion of those curves the order of the L-function at s=1 is also 1. What this argument doesn’t do, if my understanding is correct, is show something like this (except perhaps in some trivial sense):

  1. Every elliptic curve that satisfies a certain criterion also satisfies the Birch–Swinnerton-Dyer conjecture.
  2. A positive proportion of elliptic curves satisfy that criterion.

So in some sense, it doesn’t really get us any closer to establishing a connection between the rank of an elliptic curve and the order of the associated L-function at s=1. Perhaps in that respect it is a bit like the various results that say that a positive proportion of the zeros of the zeta function lie on the critical line, though I’m not sure whether that is a good analogy. Nevertheless, it is a remarkable result, in the sense that it proves something that looked out of reach.

Perhaps my favourite moment in Bhargava’s talk came when he gave us a hint about how he proved things. By this time he was talking about hyperelliptic curves (that is, curves y^2=P(x) where P is a polynomial of degree at least 5), where his main result is that most of them don’t have any rational solutions. How does he show that? The following slide, which I photographed, gives us a huge clue.

[Photograph of the slide.]

He looked at polynomials of degree 6. If the hyperelliptic curve y^2=a_0x^6+a_1x^5+\dots+a_5x+a_6 has a rational solution x=t, then by applying the change of variable x'=x-t, we can assume without loss of generality that the rational solution occurs at x=0, which tells us that a_6=c^2 for some rational c. But then you get the remarkable identity shown in the slide: a pair of explicit matrices A and B such that det(Ax-B)=a_0x^6+a_1x^5+\dots+a_5x+a_6. Note that to get these matrices, it was necessary to split a_6 up as a product c\times c, so we really are using the fact that there is a rational point on the curve. And apparently one can show that for most polynomials of degree 6 such a pair of matrices does not exist, so most polynomials of degree 6 do not take square values.

Just as the Babylonians didn’t find huge Pythagorean triples without some method of producing them, so Bhargava and his collaborators clearly didn’t find those matrices A and B without some method of producing them. He didn’t tell us what that method was, but my impression was that it belonged to the same circle of ideas as his work on generalizing Gauss’s composition law.

The lecture was rapturously received, especially by non-mathematicians in the audience (that could be interpreted as a subtly negative remark, but it isn’t meant that way), who came away from it amazed to feel that they had understood quite a bit of it. Afterwards, he was mobbed in a way that film stars might be used to, but mathematicians rather less so. I photographed that too.
If you give the photo coordinates in [0,1]^2, then Bhargava’s head is at around (1/4,1/2) and he is wearing a dark red shirt.

At 2pm there was the Gauss Prize lecture. I thought about skipping it, but then thought that that would be hypocritical of me after my views about people who left the laudationes just before the one for the Nevanlinna Prize. I shouldn’t be prejudiced against applied mathematics, and in any case Stanley Osher’s work, or at least part of it, is about image processing, something that I find very interesting.

I went to the talk thinking it would be given by Osher himself, but in fact it was given by someone else about his work. The slides were fairly dense, and there was a surprising amount of emphasis on what people call metrics — numbers of papers, H-factors and so on. The fact that the speaker said, “I realize there is more to academic output than these metrics,” somehow didn’t help. I found myself gradually zoning out of this talk and as a result, despite my initial good intentions, do not have anything more to say about Osher’s work, clearly interesting though it is.

I then did skip the first of the afternoon’s parallel sessions. I wondered about going to hear Mohammed Abouzaid, because I have heard that he is a rising star (or rather, an already risen star who probably has even further to rise), but I found his abstract too intimidating.

So the first talk I actually did go to was in the second session, when I went to hear Craig Gentry, a theoretical computer scientist famous for something called homomorphic encryption, which I had heard about without quite understanding what it was. My target for the 45 minutes was to remedy this situation.

In the end two things happened, one good and one bad. The good one was that early on in the talk Gentry explained what homomorphic encryption was in a way that was easy to understand. The bad one was that I was attacked by one of my periodic waves of tiredness, so after the early success I took in very little else — I was too absorbed in the struggle to keep my eyes open (or rather, to ensure that the brief moments when I shut them didn’t accidentally turn into stretches of several minutes).

The basic idea of homomorphic encryption is this. Suppose you have some function f that encrypts data, and let’s suppose that the items one encrypts are integers. Now suppose that you are given the encryptions f(m) and f(n) of m and n and want to work out the encryption of m+n. For an arbitrary encryption system f there’s not much you can do other than decrypt f(m) and f(n), add up the results, and then encrypt again. In other words, you can’t do it unless you know how to decrypt. But what if you want people to be able to do things to encrypted data (such as, say, carrying out transactions on someone’s bank account) without having access to the original data? You’d like some weird operation \oplus with the property that f(m+n)=f(m)\oplus f(n). I think now it is clear what the word “homomorphic” is doing here: we want f to be a homomorphism from (integers, +) to (encrypted integers, \oplus).
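
To make the additive case concrete, here is a toy implementation of a classic additively homomorphic scheme, the Paillier cryptosystem. This is not Gentry's construction (his is lattice-based and, as the next paragraph notes, handles multiplication as well), and the tiny primes below make it wildly insecure; it only illustrates the property f(m+n)=f(m)\oplus f(n), where \oplus turns out to be multiplication of ciphertexts.

```python
import math
import random

def keygen(p=10007, q=10009):
    # Two small primes (utterly insecure; real keys use ~1000-bit primes).
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)   # lcm(p-1, q-1)
    mu = pow(lam, -1, n)                                # valid because g = n + 1
    return (n,), (n, lam, mu)                           # public key, private key

def encrypt(pub, m):
    (n,) = pub
    r = random.randrange(2, n)
    while math.gcd(r, n) != 1:          # randomizer must be coprime to n
        r = random.randrange(2, n)
    # c = (n + 1)^m * r^n  mod n^2, i.e. we take g = n + 1
    return (pow(n + 1, m, n * n) * pow(r, n, n * n)) % (n * n)

def decrypt(priv, c):
    n, lam, mu = priv
    L = (pow(c, lam, n * n) - 1) // n
    return (L * mu) % n

def add_encrypted(pub, c1, c2):
    (n,) = pub
    return (c1 * c2) % (n * n)          # ciphertext product = plaintext sum

pub, priv = keygen()
c1, c2 = encrypt(pub, 1234), encrypt(pub, 5678)
print(decrypt(priv, add_encrypted(pub, c1, c2)))   # 6912, without decrypting c1 or c2
```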

Having said that, I think Gentry told us (but can’t remember for sure) that just doing this for addition was already known, and his achievement has been to find a system that allows you to add and multiply. So I think his encryption may be a ring homomorphism. Something I haven’t stressed enough here is that it isn’t enough for the “funny” operations \oplus and \otimes to exist: you need to be able to compute them efficiently without being able to decrypt efficiently. The little I took in about how he actually did this made it sound as though it was very clever: it wasn’t just some little trick that makes things easy once you’ve observed it.

If you want to know more, the talk is here.

The last talk I went to, of the entire congress, was that of Tom Sanders, who was talking about the context surrounding his remarkable work on Roth’s theorem on arithmetic progressions. Sanders was the first to show that a subset of \{1,2,\dots,n\} of density 1/(\log n)^{1-o(1)} must contain an arithmetic progression of length 3. This is tantalizingly close to the density of the primes in that interval, and also tantalizingly close to the density needed to prove the first non-trivial case of Erdős’s famous conjecture that a subset X of \mathbb{N} such that \sum_{n\in X}n^{-1}=\infty contains arithmetic progressions of all lengths.

Sanders discussed the general question of which configurations can be found in the primes, but also the question of why they can be found. For instance, quadruples a,b,c,d such that a+b=c+d can be found in the primes, but the proof has nothing to do with the primes other than their density: the number of pairs (a,b) with a,b prime and less than n is about (n/\log n)^2, and the number of possible sums is at most 2n, so some sum can be achieved in several ways. By contrast, while there are many solutions of the equation x+y=4z in the primes (an example is 13+31=4\times 11), one can easily find dense sets of integers with no solutions: for instance, the set of integers congruent to 1 mod 3 or the set of integers strictly between n/3 and 2n/3.
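
As a quick numerical sanity check of that counting argument (my own script, not part of the talk), one can count the prime pairs below some N and the distinct sums they produce:

```python
from collections import Counter

# Pairs of primes below N vastly outnumber the possible sums, so by the
# pigeonhole principle many sums are hit repeatedly, i.e. there are many
# prime solutions of a + b = c + d.
N = 1000
sieve = [True] * N
sieve[0] = sieve[1] = False
for i in range(2, int(N ** 0.5) + 1):
    if sieve[i]:
        sieve[i * i::i] = [False] * len(sieve[i * i::i])
primes = [p for p in range(N) if sieve[p]]

sums = Counter(p + q for p in primes for q in primes if p <= q)
print(len(primes), "primes below", N)
print(sum(sums.values()), "pairs, but only", len(sums), "distinct sums")
print(sum(1 for k in sums.values() if k >= 2), "sums are hit more than once")
```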

Roth’s theorem concerns the equation x+y=2z, and while it has been known for many decades that there are many solutions to this equation in the primes, there is no proof known that uses only the density of the primes, and also no counterexample known that shows that that density is insufficient.

I had a conversation with Sanders after the talk, in which I asked him what he thought the smallest possible size was that guaranteed a progression of length 3. The two natural candidates, given what we know so far, are somewhere around n/\log n, and somewhere around n\exp(-c\sqrt{\log n}). (The latter is the size of the largest known set with no progression of length 3.) Recent work of Schoen and Shkredov, building on Sanders’s ideas, has shown that the equation x_1+x_2+\dots+x_5=5y has non-trivial solutions in any set of density at least \exp(-(\log n)^c). I put it to him that the fact that Schoen and Shkredov needed the extra “smoothness” that comes from taking a fivefold sumset on the left-hand side rather than just a twofold one paradoxically casts doubt on the hope that this type of bound is correct for Roth’s theorem. Rather, it suggests that perhaps the smoothness is actually needed. Sanders replied that this was not necessarily the case: while a convolution of two characteristic functions of dense sets can have “gaps”, in the sense of points x where the value is significantly less than expected, it is difficult for that value to go all the way down to zero.

That will be a bit too vague to be comprehensible if you are not an additive combinatorialist, so let me try to give a little bit more explanation. Let A be a subset of \mathbb{Z}_p (the integers mod p) of density \alpha. We say that A is c-quasirandom if the sizes of the intersections A\cap(A+x), which have mean \alpha^2 p, have standard deviation at most cp. Now one way for the standard deviation to be small is for most of the intersections to have roughly the same size, but for a few of them to be empty. That is the kind of situation that needs to happen if you want an unexpectedly dense set with no arithmetic progression of length 3. (This exact situation doesn’t have to happen, but I’m trying to convey the general feel of what does.) But in many situations, it seems to be hard to get these empty intersections, rather than merely intersections that are quite a bit smaller than average.
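
For concreteness, here is a tiny numerical illustration of that statistic (a toy of my own, not taken from the talk): for a random subset A of \mathbb{Z}_p of density \alpha, the intersection sizes cluster around \alpha^2 p and, in particular, none of them comes close to being empty.

```python
import numpy as np

# |A ∩ (A + x)| for a random subset A of Z_p of density alpha.
rng = np.random.default_rng(1)
p, alpha = 10007, 0.2
A = rng.random(p) < alpha                  # boolean indicator vector of A

sizes = np.array([np.count_nonzero(A & np.roll(A, x)) for x in range(1, p)])
print("mean:", sizes.mean(), " (alpha^2 p =", alpha**2 * p, ")")
print("standard deviation:", sizes.std(), " minimum over x:", sizes.min())
```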

After Sanders’s talk (which is here), I went back to my room. By this time, the stomach bug that I mentioned a few posts ago had struck, which wasn’t very good timing given that the conference banquet was coming up. Before that, I went up to the top of the hotel, where there was a stunning view over much of Seoul, to have a drink with Günter Ziegler and one other person whose name I have forgotten (if you’re reading this, I enjoyed meeting you and apologize for this memory lapse). Günter too had a stomach bug, but like me he had had a similar one shortly before coming to Korea, so neither of us could be sure that Korean food had anything to do with it.

The banquet was notable for an extraordinary Kung Fu performance that was put on for our entertainment. It included things like performers forming a human pyramid that other performers would run up in order to do a backwards somersault, in the middle of which they would demolish a piece of wood with a sharp blow from the foot. It was quite repetitive, but the tricks were sufficiently amazing to bear quite a bit of repetition.

My last memory of ICM2014 was of meeting Artur Avila in the lobby of the hotel at about 5:25am. I was waiting for the bus that would take me to the airport. “Are you leaving too?” I naively asked him. No, he was just getting back from a night on the town.


Tommaso Dorigo: ATLAS Higgs Challenge Results

After four months of frenzy by over 1500 teams, the very successful Higgs Challenge launched by the ATLAS collaboration ended yesterday, and the "private leaderboard" with the final standings has been revealed. You can see the top 20 scorers below.



Doug Natelson: What is a "bad metal"? What is a "strange metal"?

Way back in the mists of time, I wrote about what physicists mean when they say that some material is a metal. In brief, a metal is a material that has an electrical resistivity that decreases with decreasing temperature, and in bulk has low energy excitations of the electron system down to arbitrarily low energies (no energy gap in the spectrum). In a conventional or good metal, it makes sense to think about the electrons in terms of a classical picture often called the Drude model or a semiclassical (more quantum mechanical) picture called the Sommerfeld model. In the former, you can think of the electrons as a gas, with the idea that the electrons travel some typical distance scale, \(\ell\), the mean free path, between scattering events that randomize the direction of the electron motion. In the latter, you can think of a typical electronic state as a plane-wave-like object with some characteristic wavelength (of the highest occupied state) \(\lambda_{\mathrm{F}}\) that propagates effortlessly through the lattice, until it comes to a defect (break in the lattice symmetry) causing it to scatter. In a good metal, \(\ell \gg \lambda_{\mathrm{F}}\), or equivalently \((2\pi/\lambda_{\mathrm{F}})\ell \gg 1\). Electrons propagate many wavelengths between scattering events. Moreover, it also follows (given how many valence electrons come from each atom in the lattice) that \(\ell \gg a\), where \(a\) is the lattice constant, the atomic-scale distance between adjacent atoms.
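
To put rough numbers on those inequalities, here is a back-of-the-envelope Drude estimate for a good metal, copper, using standard textbook values (the inputs below are order-of-magnitude numbers chosen for illustration):

```python
# Drude estimate of the mean free path in copper at room temperature.
e = 1.602e-19        # electron charge, C
m_e = 9.109e-31      # electron mass, kg
n = 8.5e28           # conduction-electron density of Cu, m^-3
sigma = 5.9e7        # conductivity of Cu at room temperature, S/m
v_F = 1.57e6         # Fermi velocity of Cu, m/s
lambda_F = 4.6e-10   # Fermi wavelength of Cu, m
a = 3.6e-10          # lattice constant of Cu, m

tau = m_e * sigma / (n * e**2)   # Drude scattering time, ~2.5e-14 s
ell = v_F * tau                  # mean free path, ~40 nm

print(f"mean free path ~ {ell * 1e9:.0f} nm")
print(f"ell / lambda_F ~ {ell / lambda_F:.0f},  ell / a ~ {ell / a:.0f}")
```

Even at room temperature the ratios come out near a hundred, which is what "many wavelengths between scattering events" means in practice.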

Another property of a conventional metal: At low temperatures, the temperature-dependent part of the resistivity is dominated by electron-electron scattering, which in turn is limited by the number of empty electronic states that are accessible (i.e., not already filled and thus forbidden as final states due to the Pauli principle). The number of excited electrons (that in a conventional metal called a Fermi liquid act roughly like ordinary electrons, with charge \(-e\) and spin 1/2) is proportional to \(T\), and therefore the number of empty states available at low energies as "targets" for scattering is also proportional to \(T\), leading to a temperature-varying contribution to the resistivity proportional to \(T^{2}\).
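
Schematically, that phase-space counting is the textbook Fermi-liquid estimate (a generic statement, not specific to any particular material):

```latex
\frac{1}{\tau_{\mathrm{ee}}} \;\propto\; \frac{(k_{B} T)^{2}}{\hbar E_{\mathrm{F}}}
\qquad \Longrightarrow \qquad
\rho(T) \approx \rho_{0} + A\,T^{2},
```

where \(\rho_0\) is the residual (defect-dominated) resistivity and \(A\) is a material-dependent coefficient.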

A bad metal is one in which some or all of these assumptions fail, empirically. That is, a bad metal has gapless excitations, but if you analyze its electrical properties and try to model them conventionally, you might find that the \(\ell\) you infer from the data is small compared to a lattice spacing. This is called violating the Ioffe-Mott-Regel limit, and it can happen in metals like rutile VO2 or LaSrCuO4 at high temperatures.

A strange metal is a more specific term. In a variety of systems, instead of having the resistivity scale like \(T^{2}\) at low temperatures, the resistivity scales like \(T\). This happens in the copper oxide superconductors near optimal doping. This happens in the related ruthenium oxides. This happens in some heavy fermion metals right in the "quantum critical" regime. This happens in some of the iron pnictide superconductors. In some of these materials, when some technique like photoemission is applied, instead of finding ordinary electron-like quasiparticles, a big, smeared out "incoherent" signal is detected. The idea is that in these systems there are not well-defined (in the sense of long-lived) electron-like quasiparticles, and these systems are not Fermi liquids.

There are many open questions remaining - what is the best way to think about such systems?  If an electron is injected from a boring metal into one of these, does it "fractionalize", in the sense of producing a huge number of complicated many-body excitations of the strange metal?  Are all strange metals the same deep down?  Can one really connect these systems with quantum gravity?  Fun stuff.

September 15, 2014

David Hogg: GRB beaming, classifying stars

Andy Fruchter (STScI) gave the astrophysics seminar, on gamma-ray bursts and their host galaxies. He showed Modjaz's (and others') results on the metallicities of "broad-line type Ic" supernovae, which show that the ones associated with gamma-ray bursts are in much lower-metallicity environments than those not associated. I always react to this result by pointing out that this ought to put a very strong constraint on GRB beaming, because (if there is beaming) there ought to be "off-axis" bursts that we don't see as GRBs, but that we do see as a broad-line Ic. Both Fruchter and Modjaz claimed that the numbers make the constraint uninteresting, but I am surprised: The result is incredibly strong.

In group meeting, Fadely showed evidence that he can make a generative model of the colors and morphologies (think: angular sizes, or compactnesses) of faint, compact sources in the SDSS imaging data. That is, he can build a flexible model (using the "extreme deconvolution" method) that permits him to predict the compactness of a source given a noisy measurement of its five-band spectral energy distribution. This shows great promise to evolve into a non-parametric, model-free (that is: free of stellar or galaxy models) method for separating stars from galaxies in multi-band imaging. The cool thing is he might be able to create a data-driven star–galaxy classification system without training on any actual star or galaxy labels.
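
A toy sketch of the flavor of that approach, using scikit-learn's GaussianMixture on synthetic data as a stand-in for extreme deconvolution (the real method additionally deconvolves per-object measurement errors, which this ignores, and the numbers below are invented):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Fit a mixture model to joint (color, log angular size) data with NO
# star/galaxy labels, then classify new sources by mixture component.
rng = np.random.default_rng(0)
stars = np.column_stack([rng.normal(0.3, 0.1, 2000),    # a synthetic "color"
                         rng.normal(-2.0, 0.1, 2000)])  # compact sources
gals = np.column_stack([rng.normal(0.8, 0.2, 2000),
                        rng.normal(-1.0, 0.3, 2000)])   # extended sources
data = np.vstack([stars, gals])

gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
gmm.fit(data)                          # unsupervised: no labels used anywhere

test = np.array([[0.31, -1.95],        # compact, star-like
                 [0.85, -0.90]])       # extended, galaxy-like
print(gmm.predict(test))               # the two sources land in different components
```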

Chad Orzel: TED-Ed Lesson: The Central Mystery of Quantum Physics

My TED@NYC adventure last fall didn’t turn into an invite to the big TED meeting, but it did lead to a cool opportunity that is another of the very cool developments I’ve been teasing for a while now: I’ve written some scripts for lessons to be posted with TED-Ed. The first of these, on particle-wave duality, just went live today.

The content here is very similar to my talk last fall, which is, in turn, very similar to Chapter 8 of Eureka: a historical survey of the development of quantum physics. I did the script for this, which was then turned over to professional animators, who did a great job of finding visuals to go with my words. It was a neat process to see in action, because they did a great job of capturing the basic feel I was after.

I also wrote the supplemental material that’s on the TED-Ed video page, for the benefit of anyone who would like to use this in a class. It’s pretty challenging to come up with decent multiple-choice questions to go with this stuff…

Anyway, that’s the latest of the cool things I’ve been working on and not able to share. There are more of these in the pipeline; I’m not able to say when they’ll go live, but you can expect an announcement here when they do.

Chad Orzel: Intelligence vs. Priorities

Steven Pinker has a piece at the New Republic arguing that Ivy League schools ought to weight standardized test scores more heavily in admissions. This has prompted a bunch of tongue-clucking about the failures of the Ivy League from the usual suspects, and a rather heated concurrence from Scott Aaronson. That last finally got me to read the piece, because I had figured I would be happier not reading it, but I wanted to see what got Scott so worked up.

Sadly, my first instinct was correct. It starts off well enough, taking down an earlier anti-Ivy League piece by William Deresiewicz for being full of Proof by Blatant Assertion and empty verbiage. Then it takes off into the land of Proof by Blatant Assertion and meaningless anecdotes. With some bonus impossible statistics, which Ephblog analyzes in some detail (also: Ephblog is apparently running again; I had dropped them from my feeds a couple of years ago). Something about this whole issue just degrades every argument.

There are a bunch of things to not like about Pinker’s article, but to my mind the biggest is that there are a whole lot of unexamined assumptions underlying the assertion that there is a problem here that demands fixing. This is based largely on anecdotes about Kids These Days, and the interpretation of these strikes me as kind of dubious. There’s also a weird conflation of arguments about intelligence with arguments about priorities.

The chief evidence that Pinker offers for there being a problem is that students don’t go to his classes:

Knowing how our students are selected, I should not have been surprised when I discovered how they treat their educational windfall once they get here. A few weeks into every semester, I face a lecture hall that is half-empty, despite the fact that I am repeatedly voted a Harvard Yearbook Favorite Professor, that the lectures are not video-recorded, and that they are the only source of certain material that will be on the exam. I don’t take it personally; it’s common knowledge that Harvard students stay away from lectures in droves, burning a fifty-dollar bill from their parents’ wallets every time they do. Obviously they’re not slackers; the reason is that they are crazy-busy. Since they’re not punching a clock at Safeway or picking up kids at day-care, what could they be doing that is more important than learning in class? The answer is that they are consumed by the same kinds of extracurricular activities that got them here in the first place.

Some of these activities, like writing for the campus newspaper, are clearly educational, but most would be classified in any other setting as recreation: sports, dance, improv comedy, and music, music, music (many students perform in more than one ensemble). The commitments can be draconian: a member of the crew might pull an oar four hours a day, seven days a week, and musical ensembles can be just as demanding. Many students have told me that the camaraderie, teamwork, and sense of accomplishment made these activities their most important experiences at Harvard. But it’s not clear why they could not have had the same experiences at Tailgate State, or, for that matter, the local YMCA, opening up places for less “well-rounded” students who could take better advantage of the libraries, labs, and lectures.

This is, at bottom, nothing but a complaint that a bunch of 18-to-22-year-olds don’t share the priorities of a prominent professional academic. Which is something that I, for one, am utterly shocked to learn for the first time ever when reading this today.

But the really problematic part of this is the jump where he starts attributing this mismatch in priorities to the personal deficiencies of the students admitted to Harvard, and their families. For one thing, as the Ephblog post shows, it’s not all that mathematically plausible to claim that huge numbers of substandard students are getting into elite schools on the basis of extracurriculars alone. To be fair, this may just be a matter of poor word choice on Pinker’s part, or his editor’s– while the claim he appears to be making in the article is pretty clearly nonsense, it might be plausible to claim that less than 10% of Harvard students were admitted on the basis of academics and nothing else. That’s a conveniently unanswerable question, without access to the inner workings of the Harvard admissions office.

But there are a bunch of assumptions here that don’t sit all that well with me. One is, ironically, a problem that Pinker attributes to the “admissocrats”: “perpetuating the destructive stereotype that smart people are one-dimensional dweebs.” Three paragraphs after rhetorically hanging that on people who favor “holistic” admissions, he does exactly the same thing, by assuming that admitting more one-dimensional dweebs would provide a student body who “could take better advantage of the libraries, labs, and lectures.”

There’s also the assumption that students are only doing these extracurricular activities to get into college; that if students didn’t need to play sports, or music, or do community service, or be active in student government in order to get into college, they wouldn’t do those things, and would instead focus on academics. That’s kind of difficult to square with the class attendance anecdote, though– if students were actually engaging in these activities cynically as an admissions strategy, they would drop them immediately on getting to college. And, indeed, a large fraction of them do. But the ones who continue on in their extracurriculars are presumably doing so because they actually enjoy doing these things, and evidently more than they enjoy Pinker’s award-winning lectures.

But the most problematic of the assumptions is that admitting a different category of students selected more on the basis of test scores would produce a better match between the priorities of young adults and middle-aged professors. I’ll give Pinker credit for being relatively forthright about the sort of shifts you would get from a more-test-centered admissions process, which is to produce a student body that is even more drawn from the socioeconomic upper classes. His contention that this indicates a correlation with inherited intelligence, not a causal relationship between money and high scores is, let’s say “not uncontroversial,” but I think a more fundamental problem with this is the assumption that this will produce students more to the liking of university faculty.

I mean, let’s imagine that Pinker’s contention is correct, and the relationship between economic status and test scores is mostly caused by innate abilities– that is, smart people tend to get rich, and smart people have smart kids, who score well on tests. (Yes, this is lightly brushing aside hundreds of years of institutional racism, etc.– it’s a hypothetical to show that even in the best imaginable case where the problematic assertion of meritocracy is true, his argument is dubious.) Even if that’s true, it’s not clear to me that this necessarily gets you a set of students who will be more interested in going to Pinker’s class, or even the sort of liberal education he lauds earlier in the article.

In the end, most of what he talks up is stuff that matters very deeply to people who go on to become university professors. And while I can’t speak to the contents of Steven Pinker’s bank account, in general, one does not become extremely wealthy by becoming a university professor. Most of the career tracks that lead into the upper socioeconomic strata benefit far more from the “soft skills” and relationships that students build through extracurricular activities than from formal classroom activities. If students inherit personality traits as well as raw intelligence from their parents, it’s entirely possible that admitting a higher-scoring cohort of children of intelligent parents who got rich because of their intelligence will produce a class whose priorities align even less well with those of the faculty. Children of corporate lawyers and venture capitalists may very well choose to pursue careers in those same fields for themselves, and those paths may not require regular attendance at classes on linguistic theory.

This isn’t a sure thing, of course– you could also imagine that, freed of mundane concerns about needing to make a living, the children of the rich would be better able to pursue the life of the mind, and thus end up more like their faculty. But that’s the point: the supposed problem is a matter of student priorities, and the connection between those priorities and student intelligence is an assumption that’s being made without being stated outright.

So, while in general I find Pinker less of a pompous ass than Deresiewicz, ultimately, I don’t find him any more convincing. His piece is the same sort of hopeless muddle of unstated and unquestioned assumptions about Kids These Days, just with a thin veneer of scientific objectivity.

—–

Since there appears to be an absolute obligation to include anecdotes in writing about this stuff, I will tack on a note about my own experience. Like most of the people writing about this stuff, I am not without a personal-historical stake in the matter.

On the one side, I would almost certainly have gotten into college under the more test-based and less “holistic” scheme Pinker and others are pushing. I had excellent (though not perfect) SAT scores back in the day, excellent high school grades, got high scores on all the AP tests I was able to take, etc. And I’m doing my bit to uphold the correlation between high test scores and success in life– I graduated college, got a Ph.D., hold a tenured faculty position at an elite school, and have published multiple books. I don’t think anyone at Williams would have cause to regret having admitted me back in the day, or having given me an extremely generous scholarship.

In between, though, I was the sort of student Professor Pinker grumbles about. I spent a good chunk of my college career more interested in partying and playing rugby than studying and going to class. Had you asked my professors about me during my sophomore year, say, I’m sure a number of them would’ve shaken their heads and questioned the standards being employed by the admissions office. I wasn’t in any particular danger of being kicked out, or anything, but I’m not sure many of my professors in those years would say I was making a positive contribution to the intellectual life of the campus that would justify the institution’s investment in me.

That wasn’t because I lacked ability or intelligence, but because I was kind of an asshole when I was nineteen, and my priorities then were not my priorities now. This is not an uncommon state of affairs, and like a lot of nineteen-year-old assholes, I got better.

But I suspect that in that interim period, I had professors who assumed, incorrectly, that I must’ve gotten in on some basis other than innate ability. Because, of course, high test scores are not an obvious and visible trait.

(Dumb story: once, my sophomore year, I was playing a drinking game with a couple of other guys who were noted as hard-drinking idiots, and a couple of freshmen. One of whom was getting trounced in the game, and busted out “Oh, yeah, well who has higher SAT scores?” Turns out he had the lowest scores of anybody in the game, which didn’t go well for him after that…)

So, I have personal reasons to doubt the link between innate ability as measured by standardized test scores and student priorities that align well with those of the faculty. And I’ve known a fair number of bright students in my years as a faculty member who were the same kind of asshole I was at nineteen, and got better by the time they graduated. Even very good students by whatever objective measure you favor aren’t always going to value the things faculty think they ought to, because students aren’t faculty.

I also bristle a bit at the assumption that people do extracurricular activities solely in a cynical sort of way, to game the admissions process. In addition to my very good grades, I played three sports and was in the band. Not because I felt I needed to do those things to get into a good college, but because I enjoyed doing them (and probably would’ve been bored silly had I not been doing them, but that’s another issue). I still block out time in my schedule to play basketball as often as I can manage it, because I really love the game.

So, you know, there’s my personal story about why I don’t give a lot of credence to Pinker’s arguments. Which has exactly the same evidentiary value (a small number) as any of the other anecdotes trotted out in these various discussions, but fulfills my contractual obligation as a person writing about college admissions.

Matt Strassler: Will the Higgs Boson Destroy the Universe???

No.

The Higgs boson is not dangerous and will not destroy the universe.

The Higgs boson is a type of particle, a little ripple in the Higgs field. [See here for the Higgs FAQ.] This lowly particle, if you’re lucky enough to make one (and at the world’s largest particle accelerator, the Large Hadron Collider, only one in a trillion proton-proton collisions actually does so), has a brief life, disintegrating to other particles in less than the time that it takes light to cross from one side of an atom to another. (Recall that light can travel from the Earth to the Moon in under two seconds.) Such a fragile creature is hardly more dangerous than a mayfly.
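
For scale, here is the back-of-the-envelope arithmetic behind those statements, taking the Standard Model prediction of a total width of roughly 4 MeV for a 125 GeV Higgs as an assumed input:

```python
# Lifetime of the Higgs versus light-crossing times, order of magnitude only.
hbar = 6.582e-22    # reduced Planck constant, MeV * s
Gamma = 4.1         # predicted total Higgs width, MeV (Standard Model value)
c = 3.0e8           # speed of light, m/s

tau = hbar / Gamma          # Higgs lifetime, ~1.6e-22 s
t_atom = 1.0e-10 / c        # light crossing an atom (~1 angstrom), ~3e-19 s
t_moon = 3.84e8 / c         # light from the Earth to the Moon, ~1.3 s

print(f"Higgs lifetime       ~ {tau:.1e} s")
print(f"light across an atom ~ {t_atom:.1e} s")
print(f"light Earth to Moon  ~ {t_moon:.2f} s")
```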

Anyone who says otherwise probably read Hawking’s book (or read about it in the press) but didn’t understand what he or she was reading, perhaps because he or she had not read the Higgs FAQ.

If you want to worry about something Higgs-related, you can try to worry about the Higgs field, which is “ON” in our universe, though not nearly as “on” as it could be. If someone were to turn the Higgs field OFF, let’s say as a practical joke, that would be a disaster: all ordinary matter across the universe would explode, because the electrons on the outskirts of atoms would lose their mass and fly off into space. This is not something to worry about, however. We know it would require an input of energy and can’t happen spontaneously.  Moreover, the amount of energy required to artificially turn the Higgs field off is immense; to do so even in a small room would require energy comparable to that of a typical supernova, an explosion of a star that can outshine an entire galaxy and releases the vast majority of its energy in unseen neutrinos. No one, fortunately, has a supernova in his or her back pocket. And if someone did, we’d have more immediate problems than worrying about someone wasting a supernova trying to turn off the Higgs field in a basement somewhere.

Now it would also be a disaster if someone could turn the Higgs field WAY UP… more than when your older brother turned up the volume on your stereo or MP3 player and blew out your speakers. In this case atoms would violently collapse, or worse, and things would be just as nasty as if the Higgs field were turned OFF. Should you worry about this? Well, it’s possible this could happen spontaneously, so it’s slightly more plausible. But I do mean slightly. Very slightly.

Recently, physicists have been writing about this possibility because if (a) you ASSUME that the types of particles that we’ve discovered so far are the only ones that affect the Higgs field, and (b) you ASSUME that there are no other important forces that affect the Higgs field other than the ones we know, then you can calculate, with some degree of reliability (though there is a debate about that degree) that (1) the Higgs field could lower the energy of the universe by suddenly jumping from ON to WAY WAY SUPER-DUPER ON, and (2) that the time we’d have to wait for it to do so spontaneously isn’t infinite.  It would do this in two steps: first a bubble of WAY WAY ON Higgs field would form (via the curious ability of quantum mechanics to make the improbable happen, rarely), and then that bubble would expand and sweep across the universe, destroying everything in its path.

An aside: In particle physics lingo explained here, we say that “the universe has two possible ‘vacua’, the vacuum we live in, in which the Higgs field is ON a bit, and a second vacuum in which the Higgs field is HUGELY ON.” If the second vacuum has lower energy than the first, then the first vacuum is said to be “metastable”: although it lasts a very long time, it has a very small but non-zero probability of turning into the second vacuum someday.  That’s because a bubble of the second vacuum that appears by chance inside the first vacuum will expand, and take over the whole universe.

Ok. First, should you buy the original assumptions? No. It’s just humans assuming that what we currently know is all there is to know; since when has that been true?  Second, even if you do buy them, should you worry about the conclusion? No. The universe has existed in its current form for about 13.7 billion years. The Higgs field may not perform this nasty jump for trillions of years, or trillions of trillions, or trillions of trillions of trillions, or more. Likely more. In any case, nobody knows, but really, nobody should care very much. The calculation is hard, the answer highly uncertain, and worse, the whole thing is profoundly dependent on the ASSUMPTIONS. In fact, if the assumptions are slightly wrong — if there are other particles and forces that affect the Higgs field, or if there is more than one Higgs-like field in nature — then the calculation could end up being way off from the truth. Also possible is that the calculational method, which is subtle, isn’t yet refined enough to give the right answer.  Altogether, this means that not only might the Higgs field’s nasty jump be much more or less likely than is currently believed, it might not even be possible at all. So we don’t actually know anything for sure, despite all the loose talk that suggests that we do.  But in any case, since the universe has lived 13.7 billion years already, the chance is ridiculously tiny that this Higgs field jump, even if it is possible at all, will occur in your ultra-short 100 year-ish lifetime, or even that of any of your descendants.

What about the possibility that human beings could artificially cause the Higgs field to turn WAY WAY ON? Again, the amount of energy involved in trying to do that is extremely large — not a supernova, now, but far, far beyond current human capability, and likely impossible.  (The technology required to build a particle accelerator with collisions at this energy, and the financial and environmental cost of running it, are more than a little difficult to imagine.)  At this point we can barely make Higgs bosons — little ripples in the Higgs field; now you want to imagine us making a bubble where the Higgs field is WAY WAY WAY MORE ON than usual? We’re scientists, not magicians. And we deal in science — i.e., reality.  Current and foreseeable technology cannot turn this imaginary possibility into reality.

Some dangers actually exist in reality. Asteroidlets do sometimes hit the earth; supernovas do explode, and a nearby one would be terrible; and you should not be wandering outdoors or in a shower during a lightning storm.  As for the possible spontaneous destruction of the universe? Well, if it happens some day, it may have nothing to do with the Higgs field; it may very well be due to some other field, about which we currently know nothing, making a jump of a sort that we haven’t even learned about yet. Humans tend to assume that the things they know about are much scarier than they actually are (e.g. Yellowstone, the “super”-volcano) and that the things they don’t know about are much less scary than they actually are (e.g. what people used to think about ozone-destroying chemicals before they knew they destroyed ozone.) This is worth keeping in mind.

So anyone who tells you that we know that the universe is only “meta-stable”, and that someday the Higgs field will destroy it by suddenly screaming at the top of its lungs, or that we might cause it to do so, is forgetting to tell you about all the assumptions that went into that conclusion, and about the incredible energies required which may far exceed what humans can ever manage, and about the incredible lengths of time that may be involved, by which point there may be no more stars left to keep life going anyway, and possibly not even any more protons to make atoms out of. There’s a word for this kind of wild talk: “scare-mongering”.  You can safely go back to sleep.

Or not. There’s plenty to keep you awake. But by comparison with the spread of the Ebola virus, the increasing carbon dioxide in the atmosphere and acidification of the oceans, or the accelerating loss of the world’s biodiversity, not to mention the greed and violence common in our species, worrying about Higgs bosons, or even the Higgs field in which a Higgs particle is a tiny ripple, seems to me a tempest in a top quark.



Matt Strassler: Auroras — Quantum Physics in the Sky — Tonight?

Maybe. If we collectively, and you personally, are lucky, then maybe you might see auroras — quantum physics in the sky — tonight.

Before I tell you about the science, I’m going to tell you where to get accurate information, and where not to get it; and then I’m going to give you a rough idea of what auroras are. It will be rough because it’s complicated and it would take more time than I have today, and it also will be rough because auroras are still only partly understood.

Bad Information

First though — as usual, do NOT get your information from the mainstream media, or even the media that ought to be scientifically literate but isn’t. I’ve seen a ton of misinformation already about timing, location, and where to look. For instance, here’s a map from AccuWeather, telling you who is likely to be able to see the auroras.


Don’t believe this map by AccuWeather. Oh, sure, they know something about clouds. But auroras, not much.

See that line below which it says “not visible”? This implies that there’s a nice sharp geographical line between those who can’t possibly see it and those who will definitely see it if the sky is clear. Nothing could be further from the truth. No one knows where that line will lie tonight, and besides, it won’t be a nice smooth curve. There could be auroras visible in New Mexico, and none in Maine… not because it’s cloudy, but because the start time of the aurora can’t be predicted, and because its strength and location will change over time. If you’re north of that line, you may see nothing, and if you’re south of it you still might see something.  (Accuweather also says that you’ll see it first in the northeast and then in the midwest.  Not necessarily.  It may become visible across the U.S. all at the same time.  Or it may be seen out west but not in the east, or vice versa.)

Auroras aren’t like solar or lunar eclipses, absolutely predictable as to when they’ll happen and who can see them. They aren’t even like comets, which behave unpredictably but at least have predictable orbits. (Remember Comet ISON? It arrived exactly when expected, but evaporated and disintegrated under the Sun’s intense stare.) Auroras are more like weather — and predictions of auroras are more like predictions of rain, only in some ways worse. An aurora is a dynamic, ever-changing phenomenon, and to predict where and when it can be seen is not much more than educated guesswork. No prediction of an aurora sighting is EVER a guarantee. Nor is the absence of an aurora prediction a guarantee one can’t be seen; occasionally they appear unexpectedly.  That said, the best chance of seeing one further away from the poles than usual is a couple of days after a major solar flare — and we had one a couple of days ago.

Good Information and How to Use it

If you want accurate information about auroras, get it from the Space Weather Prediction Center (swpc.noaa.gov). Look at the colorful graph on the lower left of their main webpage, the “Satellite Environment Plot”. Here’s an example of that plot taken from earlier today:

The "Satellite Environment Plot" from earlier today; focus your attention on the two lower charts, the one with the red and blue wiggly lines (GOES Hp) and on the one with the bars (Kp Index).  How to use them is explained in the text.

The “Satellite Environment Plot” from earlier today; focus your attention on the two lower charts, the one with the red and blue wiggly lines (GOES Hp) and on the one with the bars (Kp Index). How to use them is explained in the text.

There’s a LOT of data on that plot, but for lack of time let me cut to the chase. The most important information is on the bottom two charts.

The bottom row, the “Estimated Kp index”, tells you, roughly, how much “geomagnetic activity” there is (i.e., how disturbed the earth’s magnetic field is). If the most recent bars are red, then the activity index is 5 or above, and there’s a decent chance of auroras. The higher the index, the more likely auroras are, and the further from the earth’s poles they will be seen. That is, if you live in the northern hemisphere, the larger the Kp index, the further south the auroras are likely to be visible. [If it's more than 5, you've got a good shot well down into the bulk of the United States.]

The only problem with the Kp index is that it is a 3-hour average, so it may not go red until the auroras have already been going for a couple of hours! So that’s why the row above it, “GOES Hp”, is important and helpful. This plot gives you much more up-to-date information about what the magnetic field of the earth is up to. Notice, in the plot above, that the magnetic field goes crazy (i.e. the lines get all wiggly) just around the time that the Kp index starts to be yellow or starts to be red.

Therefore, keep an eye on the GOES Hp chart. If you see it start to go crazy sometime in the next 48 hours, that’s a strong indication that the blast of electrically-charged particles from the Sun, thrown out in that recent solar flare, has arrived at the Earth, and auroras are potentially imminent.  It won’t tell you how strong they are though.  Still, this is your signal, if skies near you are dark and sufficiently clear, to go out and look for auroras. If you don’t see them, try again later; they’re changeable. If you don’t see them over the coming hour or so, keep an eye on the Kp index chart. If you’re in the mid-to-northern part of the U.S. and you see that index jump higher than 5, there’s a significant geomagnetic storm going on, so keep trying. And if you see it reach 8 or so, definitely try even if you’re living quite far south.
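If you would rather not keep refreshing the page, the same Kp information can be polled programmatically. Here is a minimal sketch in Python; the JSON feed URL and its row layout (a header row followed by [time, Kp, ...] rows) are assumptions about SWPC's data products rather than anything from this post, so check their site for the current interface before relying on it.

    # Minimal sketch: poll an SWPC planetary Kp feed and flag a geomagnetic storm.
    # The feed URL and its row layout (header row, then [time, Kp, ...] rows) are
    # assumptions -- check swpc.noaa.gov for the current data products.
    import json
    import urllib.request

    KP_FEED = "https://services.swpc.noaa.gov/products/noaa-planetary-k-index.json"

    def latest_kp(url=KP_FEED):
        with urllib.request.urlopen(url) as response:
            rows = json.load(response)
        header, data = rows[0], rows[1:]          # first row assumed to be the header
        time_tag, kp = data[-1][0], float(data[-1][1])
        return time_tag, kp

    if __name__ == "__main__":
        when, kp = latest_kp()
        print(f"{when}  Kp = {kp:.1f}")
        if kp >= 5:
            print("Geomagnetic storm in progress -- worth going outside to look.")
        else:
            print("Quiet; keep an eye on the GOES Hp trace for a sudden jump.")

If the latest value jumps to 5 or more, that is the same signal as the red bars in the chart; the GOES Hp trace remains the quicker tell, since the Kp number is a 3-hour average.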

Of course, don’t forget Twitter and other real-time feeds.  These can tell you whether and where people are seeing auroras. Keeping an eye on Twitter and hashtags like #aurora, #auroras, #northernlights is probably a good idea.

One more thing before getting into the science. We call these things the “northern lights” in the northern hemisphere, but clearly, since they can be seen in different places, they’re not always or necessarily north of any particular place. Looking north is a good idea — most of us who can see these things tonight or tomorrow night will probably be south of them — but the auroras can be overhead or even south of you. So don’t immediately give up if your northern sky is blocked by clouds or trees. Look around the sky.

Auroras: Quantum Physics in the Sky

Now, what are you seeing if you are lucky enough to see an aurora? Most likely what you’ll see is green, though red, blue and purple are common (and sometimes combinations which give other colors, but these are the basic ones.)  Why?

The typical sequence of events preceding a bright aurora is this:

  1. A sunspot — an area of intense magnetic activity on the Sun, where the sun’s apparent surface looks dark — becomes unstable and suffers an explosion of sorts, a solar flare.
  2. Associated with the solar flare may be a “coronal mass ejection” — the expulsion of huge numbers of charged (and neutral) particles out into space. These charged particles include both electrons and ions (i.e. atoms which have lost one or more electrons). (Coronal mass ejections, which are not well understood, can occur in other ways, but the strongest are from big flares.)
  3. These charged particles travel at high speeds (much faster than any current human spaceship, but much slower than the speed of light) across space. If the sunspot that flared happens to be facing Earth, then some of those particles will arrive at Earth after as little as a day and as much as three days. Powerful flares typically make faster particles which therefore arrive sooner.
  4. When these charged particles arrive near Earth, it may happen (depending on what the Sun’s magnetic field and the Earth’s magnetic field and the magnetic fields near the particles are all doing) that many of the particles spiral down the Earth’s magnetic field, which draws them to the Earth’s north and south magnetic poles (which lie close to the Earth’s north and south geographic poles).
  5. When these high-energy particles (electrons and ions) rain down onto the Earth, they typically will hit atoms in the Earth’s upper atmosphere, 40 to 200 miles up. The ensuing collisions kick electrons in the struck atoms into “orbits” that they don’t normally occupy, as though they were suddenly moved from an inner ring road around a city to an outer one. We call these outer orbits “excited orbits”, and an atom of this type an “excited atom”.
  6. Eventually the electrons fall from these “excited orbits” back down to their usual orbits. This is often referred to as a “quantum transition” or, colloquially, a “quantum jump”, as the electron is never really found between the starting outer orbit and the final inner one; it almost instantaneously transfers from one to the other.
  7. In doing so, the jumping electron will emit a particle of electromagnetic radiation, called a “photon”. The energy of that photon, thanks to the wonderful properties of quantum mechanics, is always the same for any particular quantum transition.
  8. Visible light is a form of electromagnetic radiation, and photons of visible light are, by definition, ones that our eyes can see. The reason we can see auroras is that for particular quantum transitions of oxygen and nitrogen, the photons emitted are indeed those of visible light. Moreover, because the energy for each photon from a given transition is always the same, the color of the light that our eyes see, for that particular transition, is always the same. There is a transition in oxygen that always gives green light; that’s why auroras are often green. There is a more fragile transition that always gives red light; powerful auroras, which can excite oxygen atoms even higher in the atmosphere, where they are more diffuse and less likely to hit something before they emit light, can give red auroras. Similarly, nitrogen molecules have a transition that can give blue light. (Other transitions give light that our eyes can’t see.)  Combinations of these can give yellows, pinks, purples, whites, etc. But the basic colors are typically green and red, occasionally blue, etc. (A quick calculation of these photon energies appears just after this list.)
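To put numbers on the colors: the well-known auroral oxygen lines sit at 557.7 nm (green) and 630.0 nm (red), and the photon energy for a given line is just E = hc/λ. A small Python check (the wavelengths are standard values, not something taken from this post):

    # Photon energy E = h*c/lambda for the two best-known auroral oxygen lines.
    # Each line comes from one specific quantum transition, which is why its
    # color never changes.
    h = 6.626e-34      # Planck constant, J*s
    c = 2.998e8        # speed of light, m/s
    eV = 1.602e-19     # joules per electron-volt

    for name, wavelength_nm in [("oxygen green", 557.7), ("oxygen red", 630.0)]:
        E = h * c / (wavelength_nm * 1e-9)
        print(f"{name:13s} {wavelength_nm:6.1f} nm  ->  {E/eV:.2f} eV per photon")

The answer — about two electron-volts per photon — is exactly the energy scale of outer-electron transitions in atoms, which is why atmospheric atoms can produce visible light at all.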

So if you are lucky enough to see an aurora tonight or tomorrow night, consider what you are seeing.  Huge energies involving magnetic fields on the Sun have blown particles — the same particles that are of particular significance to this website — into space.  Particle physics and atomic physics at the top of the atmosphere lead to the emission of light many miles above the Earth.  And the remarkable surprises of quantum mechanics make that light not a bland grey, with all possible colors blended crudely together, but instead a magical display of specific and gorgeous hues, reminding us that the world is far more subtle than our daily lives would lead us to believe.



Matt StrasslerWhy did so few people see Auroras on Friday night?

Why did so few people see auroras on Friday night, after all the media hype? You can see one of two reasons in the data. As I explained in my last post, you can read what happened in the data shown in the Satellite Environment Plot from this website (warning — they’re going to make a new version of the website soon, so you might have to modify this info a bit.) Here’s what the plot looked like Sunday morning.

What the "Satellite Environment Plot" on swpc.noaa.gov looked like on Sunday.  Friday is at left; time shown is "Universal" time; New York time is 4 hours later. There were two storms, shown as the red bars in the Kp index plot; one occurred very early Friday morning and one later on Friday.  You can see the start of the second storm in the "GOES Hp" plot, where the magnetic field goes wild very suddenly.  The storm was subsiding by midnight universal time, so it was mostly over by midnight New York time.

What the “Satellite Environment Plot” on swpc.noaa.gov looked like on Sunday. Friday is at left.  Time shown is “Universal” time (UTC); New York time is 4 hours behind at this time of year. There were two storms, shown as the red bars in the Kp index chart (fourth line); one occurred very early Friday morning and one later on Friday. You can see the start of the second storm in the “GOES Hp” chart (third line), where the magnetic field goes wild very suddenly. The storm was subsiding by midnight Universal time, so it was mostly over by midnight New York time.

What the figure shows is that after a first geomagnetic storm very early Friday, a strong geomagnetic storm started (as shown by the sharp jump in the GOES Hp chart) later on Friday, a little after noon New York time [UTC is currently New York time + 4 hours; + 5 once daylight saving ends], and that it was short — mostly over before midnight. Those of you out west never had a chance; it was all over before the sun set. Only people in far western Europe had good timing. Whatever the media was saying about later Friday night and Saturday night was somewhere between uninformed and out of date.  Your best bet was to be looking at this chart, which would have shown you that (despite predictions, which for auroras are always quite uncertain) there was nothing going on after Friday midnight New York time.

But the second reason is something that the figure doesn’t show. Even though this was a strong geomagnetic storm (the Kp index reached 7, the strongest in quite some time), the auroras didn’t migrate particularly far south. They were seen in the northern skies of Maine, Vermont and New Hampshire, but not (as far as I know) in Massachusetts. Certainly I didn’t see them. That just goes to show you (AccuWeather, and other media, are you listening?) that predicting the precise timing and extent of auroras is educated guesswork, and will remain so until current knowledge, methods and information are enhanced. One simply can’t know for sure how far south the auroras will extend, even if the impact on the geomagnetic field is strong.

For those who did see the auroras on Friday night, it was quite a sight. And for the rest of us who didn’t see them this time, there’s no reason for us to give up. Solar maximum is not over, and even though this is a rather weak sunspot cycle, the chances for more auroras over the next year or so are still pretty good.

Finally, a lesson for those who went out and stared at the sky for hours after the storm was long over — get your scientific information from the source!  There’s no need, in the modern world, to rely on out-of-date media reports.



Secret Blogging SeminarEditorial board of “Journal of K-Theory” on strike, demanding that Tony Bak hand over the journal to the K-Theory Foundation.

Text of the announcement below:

Dear Colleagues,

We the undersigned announce that, as of today 15 September 2014, we’re starting an indefinite strike. We will decline all papers submitted to us at the Journal of K-Theory.

Our demand is that, as promised in 2007-08, Bak’s family company (ISOPP) hand over the ownership of the journal to the K-Theory Foundation (KTF). The handover must be unconditional, free of charge and cover all the back issues.

The remaining editors are cordially invited to join us.

 Yours Sincerely,
Paul Balmer, Spencer Bloch, Gunnar Carlsson, Guillermo Cortinas, Eric Friedlander, Max Karoubi, Gennadi Kasparov, Alexander Merkurjev, Amnon Neeman, Jonathan Rosenberg, Marco Schlichting, Andrei Suslin, Vladimir Voevodsky, Charles Weibel, Guoliang Yu
More details to follow!

September 14, 2014

Scott AaronsonSteven Pinker’s inflammatory proposal: universities should prioritize academics

If you haven’t yet, I urge you to read Steven Pinker’s brilliant piece in The New Republic about what’s broken with America’s “elite” colleges and how to fix it.  The piece starts out as an evisceration of an earlier New Republic article on the same subject by William Deresiewicz.  Pinker agrees with Deresiewicz that something is wrong, but finds Deresiewicz’s diagnosis of what’s wrong to be lacking.  The rest of Pinker’s article sets out his own vision, which involves America’s top universities taking the radical step of focusing on academics, and returning extracurricular activities like sports to their rightful place as extras: ways for students to unwind, rather than a university’s primary reason for existing, or a central criterion for undergraduate admissions.  Most controversially, this would mean that the admissions process at US universities would become more like that in virtually every other advanced country: a relatively-straightforward matter of academic performance, rather than an exercise in peering into the applicants’ souls to find out whether they have a special je ne sais quoi, and the students (and their parents) desperately gaming the intentionally-opaque system, by paying consultants tens of thousands of dollars to develop souls for them.

(Incidentally, readers who haven’t experienced it firsthand might not be able to understand, or believe, just how strange the undergraduate admissions process in the US has become, although Pinker’s anecdotes give some idea.  I imagine anthropologists centuries from now studying American elite university admissions, and the parenting practices that have grown up around them, alongside cannibalism, kamikaze piloting, and other historical extremes of the human condition.)

Pinker points out that a way to assess students’ ability to do college coursework—much more quickly and accurately than by relying on the soul-detecting skills of admissions officers—has existed for a century.  It’s called the standardized test.  But unlike in the rest of the world (even in ultraliberal Western Europe), standardized tests are politically toxic in the US, seen as instruments of racism, classism, and oppression.  Pinker reminds us of the immense irony here: standardized tests were invented as a radical democratizing tool, as a way to give kids from poor and immigrant families the chance to attend colleges that had previously only been open to the children of the elite.  They succeeded at that goal—too well for some people’s comfort.

We now know that the Ivies’ current emphasis on sports, “character,” “well-roundedness,” and geographic diversity in undergraduate admissions was consciously designed (read that again) in the 1920s, by the presidents of Harvard, Princeton, and Yale, as a tactic to limit the enrollment of Jews.  Nowadays, of course, the Ivies’ “holistic” admissions process no longer fulfills that original purpose, in part because American Jews learned to play the “well-roundedness” game as well as anyone, shuttling their teenage kids between sports, band practice, and faux charity work, while hiring professionals to ghostwrite application essays that speak searingly from the heart.  Today, a major effect of “holistic” admissions is instead to limit the enrollment of Asian-Americans (especially recent immigrants), who tend disproportionately to have superb SAT scores, but to be deficient in life’s more meaningful dimensions, such as lacrosse, student government, and marching band.  More generally—again, pause to wallow in the irony—our “progressive” admissions process works strongly in favor of the upper-middle-class families who know how to navigate it, and against the poor and working-class families who don’t.

Defenders of the status quo have missed this reality on the ground, it seems to me, because they’re obsessed with the notion that standardized tests are “reductive”: that is, that they reduce a human being to a number.  Aren’t there geniuses who bomb standardized tests, they ask, as well as unimaginative grinds who ace them?  And if you make test scores a major factor in admissions, then won’t students and teachers train for the tests, and won’t that pervert open-ended intellectual curiosity?  The answer to both questions, I think, is clearly “yes.”  But the status-quo-defenders never seem to take the next step, of examining the alternatives to standardized testing, to see whether they’re even worse.

I’d say the truth is this: spots at the top universities are so coveted, and so much rarer than the demand, that no matter what you use as your admissions criterion, that thing will instantly get fetishized and turned into a commodity by students, parents, and companies eager to profit from their anxiety.  If it’s grades, you’ll get a grades fetish; if sports, you’ll get a sports fetish; if community involvement, you’ll get soup kitchens sprouting up for the sole purpose of giving ambitious 17-year-olds something to write about in their application essays.  If Harvard and Princeton announced that from now on, they only wanted the most laid-back, unambitious kids, the ones who spent their summers lazily skipping stones in a lake, rather than organizing their whole lives around getting in to Harvard and Princeton, tens of thousands of parents in the New York metropolitan area would immediately enroll their kids in relaxation and stone-skipping prep courses.  So, given that reality, why not at least make the fetishized criterion one that’s uniform, explicit, predictively valid, relatively hard to game, and relevant to universities’ core intellectual mission?

(Here, I’m ignoring criticisms specific to the SAT: for example, that it fails to differentiate students at the extreme right end of the bell curve, thereby forcing the top schools to use other criteria.  Even if those criticisms are true, they could easily be fixed by switching to other tests.)

I admit that my views on this matter might be colored by my strange (though as I’ve learned, not at all unique) experience, of getting rejected from almost every “top” college in the United States, and then, ten years later, getting recruited for faculty jobs by the very same institutions that had rejected me as a teenager.  Once you understand how undergraduate admissions work, the rejections were unsurprising: I was a 15-year-old with perfect SATs and a published research paper, but not only was I young and immature, with spotty grades and a weird academic trajectory, I had no sports, no music, no diverse leadership experiences.  I was a narrow, linear, A-to-B thinker who lacked depth and emotional intelligence: the exact opposite of what Harvard and Princeton were looking for in every way.  The real miracle is that despite these massive strikes against me, two schools—Cornell and Carnegie Mellon—were nice enough to give me a chance.  (I ended up going to Cornell, where I got a great education.)

Some people would say: so then what’s the big deal?  If Harvard or MIT reject some students that maybe they should have admitted, those students will simply go elsewhere, where—if they’re really that good—they’ll do every bit as well as they would’ve done at the so-called “top” schools.  But to me, that’s uncomfortably close to saying: there are millions of people who go on to succeed in life despite childhoods of neglect and poverty.  Indeed, some of those people succeed partly because of their rough childhoods, which served as the crucibles of their character and resolve.  Ergo, let’s neglect our own children, so that they too can have the privilege of learning from the school of hard knocks just like we did.  The fact that many people turn out fine despite unfairness and adversity doesn’t mean that we should inflict unfairness if we can avoid it.

Let me end with an important clarification.  Am I saying that, if I had dictatorial control over a university (ha!), I would base undergraduate admissions solely on standardized test scores?  Actually, no.  Here’s what I would do: I would admit the majority of students mostly based on test scores.  A minority, I would admit because of something special about them that wasn’t captured by test scores, whether that something was musical or artistic talent, volunteer work in Africa, a bestselling smartphone app they’d written, a childhood as an orphaned war refugee, or membership in an underrepresented minority.  Crucially, though, the special something would need to be special.  What I wouldn’t do is what’s done today: namely, to turn “specialness” and “well-roundedness” into commodities that the great mass of applicants have to manufacture before they can even be considered.

Other than that, I would barely look at high-school grades, regarding them as too variable from one school to another.  And, while conceding it might be impossible, I would try hard to keep my university in good enough financial shape that it didn’t need any legacy or development admits at all.


Update (Sep. 14): For those who feel I’m exaggerating the situation, please read the story of commenter Jon, about a homeschooled 15-year-old doing graduate-level work in math who, three years ago, was refused undergraduate admission to both Berkeley and Caltech, with the math faculty powerless to influence the admissions officers. See also my response.

David Hoggsingle-example learning

I pitched projects to new graduate students in the Physics and Data Science programs today; hopefully some will stick. Late in the day, I took out new Data Science Fellow Brenden Lake (NYU) for a beer, along with Brian McFee (NYU) and Foreman-Mackey. We discussed many things, but we were blown away by Lake's experiments on single-instance learning: Can a machine learn to identify or generate a class of objects from seeing only a single example? Humans are great at this but machines are not. He showed us comparisons between his best machines and experimental subjects found on the Mechanical Turk. His machines don't do badly!

September 13, 2014

Doug NatelsonWhat is the Casimir effect?

This is another in an occasional series of posts where I try to explain some physical phenomena and concepts in a comparatively accessible way.  I'm going to try hard to lean toward a lay audience here, with the very real possibility that this will fail.

You may have heard of the Casimir effect, or the Casimir force - it's usually presented in language that refers to "quantum fluctuations of the electromagnetic field", and phrases like "zero point energy" waft around.  The traditional idea is that two electrically neutral, perfectly conducting plates, parallel to each other, will experience an attractive force per unit area given by \( \hbar c \pi^{2}/(240 a^{4})\), where \(a \) is the distance between the plates.  For realistic conductors (and even dielectrics) it is possible to derive analogous expressions.  For a recent, serious scientific review, see here (though I think it's behind a paywall).
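To get a feel for the size of that force, it helps to just plug numbers into the quoted formula. The short Python sketch below evaluates it for two separations; the separations themselves are arbitrary examples chosen for illustration.

    # Evaluate the ideal-conductor Casimir pressure  P = hbar*c*pi^2 / (240*a^4)
    # quoted above, for a couple of illustrative plate separations.
    import math

    hbar = 1.055e-34   # reduced Planck constant, J*s
    c = 2.998e8        # speed of light, m/s

    def casimir_pressure(a):
        """Attractive force per unit area (Pa) between ideal plates a meters apart."""
        return hbar * c * math.pi**2 / (240 * a**4)

    for a_nm in (100, 1000):
        print(f"a = {a_nm:4d} nm  ->  P = {casimir_pressure(a_nm * 1e-9):.1e} Pa")

The steep 1/a^4 dependence is the whole story experimentally: at a micron the pressure is about a millipascal, while at 100 nm it is already of order ten pascals, which is why measurements bring the two surfaces extremely close together.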

To get some sense of where these forces come from, we need to think about van der Waals forces.  It turns out that there is an attractive force between neutral atoms, say helium atoms for simplicity.  We are taught to think about the electrons in helium as "looking" like puffy, spherical clouds - that's one way to visualize the electron's quantum wave function, related to the probability of finding the electron in a given spot if you decided to look through some experimental means.  If you imagine using some scattering experiment to "take a snapshot" of the helium atom, you'd find the two electrons located at particular locations, probably away from the nucleus.  In that sense, the helium atom would have an "instantaneous electric dipole moment".  To use an analogy with magnetic dipoles, imagine that there are little bar magnets pointing from the nucleus to each electron.  The influence (electric field in the real atom; magnetic field from the bar magnet analogy) of those dipoles drops off in distance like \(1/r^{3}\).  Now, if there was a second nearby atom, its electrons would experience the fields from the first atom.  This would tend to influence its own dipole (in the magnet analogy, instead of the bar magnets pointing on average in all directions, they would tend to align with the field from the first atom, rather like how a compass needle is influenced by a nearby bar magnet).   The result would be an attractive force, proportional to \(1/r^{6}\).

In this description, we ignored that it takes time for the fields from the first atom to propagate to the second atom.  This is called retardation, and it’s one key difference between the van der Waals interaction (for which retardation is basically assumed to be unimportant) and so-called Casimir-Polder forces.

Now we can ask, what about having more than two atoms?  What happens to the forces then?  Is it enough just to think of them as a bunch of pairs and add up the contributions?  The short answer is, no, you can't just think about pair-wise interactions (interference effects and retardation make it necessary to treat extended objects carefully).

What about exotic quantum vacuum fluctuations, you might ask.  Well, in some sense, you can think about those fluctuations and interactions with them as helping to set the randomized flipping dipole orientations in the first place, though that's not necessary.  It has been shown that you can do full, relativistic, retarded calculations of these fluctuating dipole effects and you can reproduce the Casimir results (and with greater generality) without saying much of anything about zero point stuff.  That is why while it is fun to speculate about zero point energy and so forth (see here for an entertaining and informative article - again, sorry about the paywall), there really doesn't seem to be any way to get net energy "out of the vacuum".

Tommaso DorigoLife After The 125 GeV Higgs: What Is Left Of Two-Higgs Doublet Models

I just read with interest the new paper on the arxiv by my INFN-Padova colleague Massimo Passera and collaborators, titled "Limiting Two-Higgs Doublet Models", and I thought I would explain to you here why I consider it very interesting and what its conclusions are.

read more

Chad OrzelOn Academic Scandals

Two very brief notes about high-profile scandals in academia:

1) While it involves one of my faculty colleagues, I have no special insight to offer into the case of Valerie Barr’s firing by the NSF over long-ago political activity. I know and like Valerie as a colleague, and she did some really good stuff as chair of the CS department, but that’s all I know.

As reported by Science, the government’s actions in this case seem like that very special kind of stupid that you get in extremely large organizations, where this probably isn’t really about her at all. Either somebody in a position to make trouble has a bug up their ass about this particular organization from the early 1980′s, or somebody in a position to make trouble has a beef with the NSF and is taking it out through petty bureaucratic horseshit.

Regardless of the actual reason, this reflects very badly on the NSF, on the Office of Personnel Management, and on the Federal Investigative Services.

2) The other Great Big Scandal running through academic circles is this whole Steven Salaita business, where he was offered a tenured position at Illinois, then had the offer taken back at a very late stage because of “uncivil” comments he made. There’s been a lot of back and forth, and calls for boycotts, etc.

This one doesn’t reflect well on anyone involved. Salaita’s Twitter comments were intemperate and asinine, the flip-flopping by Illinois has been disgraceful, and I strongly suspect that had the asinine remarks been of a different political slant, a lot of the people writing outraged essays about the whole business would be writing similar volumes about how the tweeter in question richly deserved to be drummed out of academia (and a completely different subset of people would be outraged at length). Which is why I have no real enthusiasm for writing about this (part of why I’m posting it on a Saturday morning when nobody will read it…).

As long as I’m commenting on scandals, though, I might as well note that while Salaita’s comments were very dumb, they’re not half offensive enough to deserve firing. Especially by the standards of jackassery on Twitter. The actions of the Illinois administration in caving to pressure from donors were venal and cowardly, and reflect very poorly on the university. I’m basically in agreement with Sean Carroll’s comments on this and other bothersome speech issues: the bar for firing/disinviting/banning people for saying things we disagree with ought to be very high indeed.

And that’s it for academic scandals this week. On the bright side, um… at least academia is having a better week/month/year scandal-wise than the NFL? Yay?

BackreactionIs there a smallest length?

Good ideas start with a question. Great ideas start with a question that comes back to you. One such question that has haunted scientists and philosophers for thousands of years is whether there is a smallest unit of length, a shortest distance below which we cannot resolve structures. Can we look closer and always closer into space, time, and matter? Or is there a limit, and if so, what is the limit?

I picture our distant ancestors sitting in their cave watching the world in amazement, wondering what the stones, the trees and they themselves are made of – and starving to death. Luckily, those smart enough to hunt down the occasional bear eventually gave rise to human civilization sheltered enough from the harshness of life to let the survivors get back to watching and wondering what we are made of. Science and philosophy in earnest are only a few thousand years old, but the question whether there is a smallest unit has always been a driving force in our studies of the natural world.

The ancient Greeks invented atomism, the idea that there is an ultimate and smallest element of matter that everything is made of. Zeno’s famous paradoxes sought to shed light on the possibility of infinite divisibility. The question came back with the advent of quantum mechanics, with Heisenberg’s uncertainty principle that fundamentally limits the precision by which we can measure. It became only more pressing with the divergences in quantum field theory that are due to the inclusion of infinitely short distances.

It was in fact Heisenberg who first suggested that divergences in quantum field theory might be cured by the existence of a fundamentally minimal length, and he introduced it by making position operators non-commuting among themselves. Just as the non-commutativity of momentum and position operators leads to an uncertainty principle, the non-commutativity of position operators limits how well distances can be measured.
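In modern notation the core of the idea can be sketched like this (a schematic form, not Heisenberg’s original equations): if two coordinates fail to commute,
\[
[\hat{x},\hat{y}] = i\theta \quad\Longrightarrow\quad \Delta x\,\Delta y \;\geq\; \frac{|\theta|}{2},
\]
then the same Robertson argument that yields the ordinary uncertainty relation says the two coordinates cannot both be resolved arbitrarily well, and \(\sqrt{|\theta|}\) plays the role of a minimal length.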

Heisenberg’s main worry, which the minimal length was supposed to deal with, was the non-renormalizability of Fermi’s theory of beta-decay. This theory however turned out to be only an approximation to the renormalizable electro-weak interaction, so he had to worry no more. Heisenberg’s idea was forgotten for some decades, then picked up again and eventually grew into the area of non-commutative geometries. Meanwhile, the problem of quantizing gravity appeared on stage and with it, again, non-renormalizability.

In the mid 1960s Mead reinvestigated Heisenberg’s microscope, the argument that led to the uncertainty principle, with (unquantized) gravity taken into account. He showed that gravity amplifies the uncertainty so that it becomes impossible to measure distances below the Planck length, about 10⁻³³ cm. Mead’s argument was forgotten, then rediscovered in the 1990s by string theorists who had noticed that using strings to prevent divergences by avoiding point-interactions also implies a finite resolution, if in a technically somewhat different way than Mead’s.

Since then the idea that the Planck length may be a fundamental length beyond which there is nothing new to find, ever, appeared in other approaches towards quantum gravity, such as Loop Quantum Gravity or Asymptotically Safe Gravity. It has also been studied as an effective theory by modifying quantum field theory to include a minimal length from scratch, and often runs under the name “generalized uncertainty”.
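For concreteness, the simplest “generalized uncertainty” deformation that appears in this literature modifies the canonical commutator (a toy form; conventions for the parameter \(\beta\) vary from paper to paper):
\[
[\hat{x},\hat{p}] = i\hbar\left(1+\beta\,\hat{p}^{2}\right) \quad\Longrightarrow\quad \Delta x\,\Delta p \;\geq\; \frac{\hbar}{2}\left(1+\beta\,(\Delta p)^{2}\right) \quad\Longrightarrow\quad \Delta x \;\geq\; \hbar\sqrt{\beta}.
\]
Minimizing the middle expression over \(\Delta p\) gives the last inequality: no matter how large the momentum uncertainty, the position uncertainty never drops below \(\hbar\sqrt{\beta}\), which is of order the Planck length if \(\beta\) is of order one in Planck units.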

One of the main difficulties with these theories is that a minimal length, if interpreted as the length of a ruler, is not invariant under Lorentz-transformations due to length contraction. This problem is easy to overcome in momentum space, where it is a maximal energy that has to be made Lorentz-invariant, because momentum space is not translationally invariant. In position space one either has to break Lorentz-invariance or deform it and give up locality, which has observable consequences, and not always desired ones. Personally, I think it is a mistake to interpret the minimal length as the length of a ruler (a component of a Lorentz-vector), and it should instead be interpreted as a Lorentz-invariant scalar to begin with, but opinions on that matter differ.

The science and history of the minimal length have now been covered in a recent book by Amit Hagar, Discrete or Continuous? The Quest for Fundamental Length in Modern Physics (Cambridge University Press).

Amit is a philosopher but he certainly knows his math and physics. Indeed, I suspect the book would be quite hard to understand for a reader without at least some background knowledge in math and physics. Amit has made a considerable effort to address the topic of a fundamental length from as many perspectives as possible, and he covers a lot of scientific history and philosophical considerations that I had not previously been aware of. The book is also noteworthy for including a chapter on quantum gravity phenomenology.

My only complaint about the book is its title because the question of discrete vs continuous is not the same as the question of finite vs infinite resolution. One can have a continuous structure and yet be unable to resolve it beyond some limit (this is the case when the limit makes itself noticeable as a blur rather than a discretization). On the other hand, one can have a discrete structure that does not prevent arbitrarily sharp resolution (which can happen when localization on a single base-point of the discrete structure is possible).

(Amit’s book is admittedly quite pricey, so let me add that he said that, should sales reach 500 copies, Cambridge University Press will put a considerably less expensive paperback version on offer. So tell your library to get a copy and let’s hope we’ll make it to 500 so it becomes affordable for more of the interested readers.)

Every once in a while I think that maybe there is no fundamentally smallest unit of length, that all these arguments for its existence are wrong. I like to think that we can look infinitely close into structures and will never find a final theory, turtles upon turtles, or that structures are ultimately self-similar and repeat. Alas, it is hard to make sense of the romantic idea of universes in universes in universes mathematically, not that I didn’t try, and so the minimal length keeps coming back to me.

Many if not most endeavors to find observational evidence for quantum gravity today look for manifestations of a minimal length in one way or the other, such as modifications of the dispersion relation, modifications of the commutation-relations, or Bekenstein’s tabletop search for quantum gravity. The properties of these theories are today a very active research area. We’ve come a long way, but we’re still out to answer the same questions that people asked themselves thousands of years ago.


This post first appeared on Starts With a Bang with the title "The Smallest Possible Scale in the Universe" on August 12, 2014.

September 12, 2014

Chad OrzelImminent Death of the Paper Book Predicted, .GIF at 11

I got a royalty statement yesterday for How to Teach [Quantum] Physics to Your Dog (it continues to sell steadily, which is very gratifying), which includes a breakdown of the sales in terms of different formats. That reminded me of a particularly annoying quirk of many recent discussions of the state of modern publishing, which is the often unsupported assertion that everything is ebooks these days, and paper books (and book stores) are just a small residual element that publishers and authors cling to out of historical affection.

Since I happen to have my royalty statements in front of me, let me add a bit of, if not data, at least more than one anecdote. I’m not going to give absolute sales numbers, because that seems to be one of those things that’s Not Done, but for the last five royalty periods, here’s the fraction of ebooks out of the total sales of How to Teach Physics to Your Dog:

2014 Mar 0.20
2013 Sep 0.27
2013 Mar 0.15
2012 Sep 0.16
2012 Mar 0.25

That is, electronic versions account for between 15% and 27% of the total number of copies sold. The lower fractions in the royalty periods ending in September 2012 and March 2013 are probably due to discount sales of the hardcover as that was phased out– hardcover sales are comparable in number to the ebooks in those two periods, but drop to zero for the most recent periods. (Prior to March 2012, the royalty statements are in a different format, and I can’t be bothered to figure out where the ebook sales figures are.)

For How to Teach Relativity to Your Dog, I only have two periods of reasonably clean data– the third one back has a bunch of returns (sigh) that skew the totals really badly.

2013 Dec 0.16
2013 Jun 0.19

So, that’s basically consistent with the figures for the quantum book– in the neighborhood of 20%.

Note that these are percentages of copies sold, not revenue. The fraction of revenue derived from ebook sales is difficult to calculate, because the ebooks are sold at a bunch of different price points, which one set of royalty statements reports on a separate line for each. The ebook price is generally a bit lower than the paper price, but my royalty rates for ebooks are considerably higher than for paper books, so it’s probably a wash, more or less.
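As a purely illustrative sketch of why that can come out a wash (every number below is hypothetical, not taken from the statements): suppose a paper copy sells for more but carries a lower royalty rate, while an ebook sells for less at a higher rate.

    # Hypothetical numbers only, to illustrate the "probably a wash" point:
    # a cheaper ebook with a higher royalty rate can earn the author about as
    # much per copy as a pricier paper copy with a lower rate.
    paper_price, paper_rate = 16.00, 0.10   # hypothetical paper terms
    ebook_price, ebook_rate = 10.00, 0.25   # hypothetical ebook terms

    per_copy_paper = paper_price * paper_rate   # $1.60 to the author per paper copy
    per_copy_ebook = ebook_price * ebook_rate   # $2.50 to the author per ebook

    ebook_fraction = 0.20   # roughly the share of copies reported above
    ebook_revenue_share = (ebook_fraction * per_copy_ebook) / (
        ebook_fraction * per_copy_ebook + (1 - ebook_fraction) * per_copy_paper)
    print(f"ebooks: {ebook_fraction:.0%} of copies, "
          f"{ebook_revenue_share:.0%} of author royalties")

With these made-up terms, a 20% copy share translates into a royalty share in the high twenties of percent — close enough to the copy share that neither format dominates the revenue picture.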

Now, of course, there are a lot of caveats to be applied here. The books in question were not new releases in any of these periods (How to Teach Relativity to Your Dog was released in Feb. 2012, which temporarily boosted sales of the quantum book, but they’re from different publishers, so they weren’t extensively cross-marketed or anything like that). Some of the low ebook fraction may reflect the awful discoverability of ebooks relative to paper books; a new release that was getting more prominent mentions on the electronic sales sites might move better than one that’s off the front page, where you pretty much have to specifically search for the ebook to find it.

These are also non-fiction books, where most annoying arguments about ebooks focus on fiction sales. The two sides of the publishing business are very different in lots of ways, so I can’t swear that these numbers would be reproduced in the novel sector.

You can also argue that this is a matter of pricing– that if ebooks were just sold at the “right” price point of $cheaper, these numbers would reverse. Which, yeah, maybe. I don’t see a very clear pattern in all the different price point figures on the royalty statements, but then they don’t have the venue listed– a $4.99 ebook sold direct from the publisher’s web store is a very different thing than a $4.99 ebook sold via Amazon, and probably sells many fewer copies. Without the source data, there’s no way to say.

But the take-away here should be pretty clear: while ebooks have made great strides in the last several years, paper books are very much Still A Thing. If you’re talking about paper books as a basically negligible historical legacy, well, you really don’t know what you’re talking about. If you’re lambasting publishers and authors for making decisions right now that give more weight to paper book sales than ebooks, you’re ignoring current reality. I absolutely agree that in the medium-to-long-term ebooks will be increasingly important, but there’s a long way to go before they’re the most important factor to be considered.

(And, for the record, I have more or less stopped buying paper books in favor of ebooks, at least for myself. We still average something around two paper book purchases per week, as by standing agreement SteelyKid and The Pip are allowed to pick one book each when we go to the local indie bookstore on our Sunday morning market run. But this is a reminder that personal experience notwithstanding, ebook-only readers are not yet dominating the market.)

David Hoggcrazy diversity of stars; cosmological anomalies

At CampHogg group meeting (in the new NYU Center for Data Science space!), Sanderson (Columbia) talked about her work on finding structure through unsupervised clustering methods, and Price-Whelan talked about chaotic orbits and the effect of chaos on the streams in the Milky Way. Dun Wang blew us all away by showing us the amazing diversity of Kepler light-curves that go into his effective model of stellar and telescope variability. Even in a completely random set of a hundred light-curves you get eclipsing binaries, exoplanet transits, multiple-mode coherent pulsations, incoherent pulsations, and lots of other crazy variability. We marveled at the range of things used as "features" in his model.

At lunch (with surprise, secret visitor and Nobel Laureate Brian Schmidt), I had a long conversation with Matt Kleban (NYU), following my conversation from yesterday with D'Amico. We veered onto the question of anomalies: Just as there are anomalies in the CMB, there are probably also anomalies in the large-scale structure, but no-one really knows how to look for them. We should figure out and look! Also, each anomaly known in the CMB should make a prediction for an anomaly visible (or maybe not) in the large-scale structure. That would make for a valuable research program.

David Hoggsearching in the space of observables

In the early morning, Ness and I talked by phone about The Cannon (or maybe The Jump; guess the source of the name!), our method for providing stellar parameter labels to stars without using stellar models. We talked about possibly putting priors in the label space; this might regularize the results towards plausible values when the data are ambiguous. That's for paper 2, not paper 1. She has drafted an email to the APOGEE-2 collaboration about our current status, and we talked about next steps.

In the late morning, I spoke with Guido D'Amico (NYU) about future projects in cosmology that I am interested in thinking about. One class of projects involves searching for new kinds of observables (think: large-scale structure mixed-order statistics and the like) that are tuned to have maximum sensitivity to the cosmological parameters of interest. I feel like there is some kind of data-science-y approach to this, given the incredible simulations currently on the market.

September 11, 2014

Clifford JohnsonScreen Junkies Chat: Guardians of the Galaxy

You may recall that back in June I had a chat with Hal Rudnick over at Screen Junkies about science and time travel in various movies (including the recent "X-Men: Days of Future Past"). It was a lot of fun, and people seemed to like it a lot. Well, some good news: On Tuesday we recorded (along with my Biophysicist colleague Moh El-Naggar) another chat for Screen Junkies, this time talking a bit about the fun movie "Guardians of the Galaxy"! Again, a lot of fun was had... I wish you could hear all of the science (and more) that we went into, but rest assured that they* did a great job of capturing some of it in this eight-minute episode. Have a look. (Embed below the more-click): [...] Click to continue reading this post

Clifford JohnsonBut How…?

I get questions from time to time about where the drawings on the site come from, or how they are done. The drawing I had in one of last week's posts is a good example of one that can raise questions, partly because you don't get a sense of scale after I've done a scan and cropped off the notebook edges and so forth. Also, people are not expecting much in the way of colour from drawing on location. Anyway, the answer is, yes I drew it, and yes it was drawn on location. I was just sitting on a balcony, chose which part of the view I wanted to represent on the page, and went for it. I wanted to spread across two pages of my notebook and make something of a tall sketch. See above right (click for larger view.) A quick light pencil rough helped me place things, and then a black [...] Click to continue reading this post

Clifford JohnsonNo Pressure Then…

Saw this the other day: [screenshot] Eek! Better get around to writing my remarks before Saturday! In case you're wondering, find out more about the Bridging the STEM Divide [...] Click to continue reading this post

BackreactionExperimental Search for Quantum Gravity – What is new?

Last week I was at SISSA in Trieste for the 2014 conference on “Experimental Search for Quantum Gravity”. I missed the first two days because of child care problems (Kindergarten closed during holiday season, the babysitter ill, the husband has to work), but Stefano Liberati did a great job with the summary talk the last day, so here is a community update.

The briefest of brief summaries is that we still have no experimental evidence for quantum gravity, but then you already knew this. During the last decade, the search for experimental evidence for quantum gravity has focused mostly on deviations from Lorentz-invariance and strong quantum gravity in the early universe that might have left imprints on the cosmological observables we measure today. The focus on these two topics is still present, but we now have some more variety which I think is a good development.

There is still lots of talk about gamma ray bursts and the constraints on deformations of Lorentz-invariance that can be derived from this. One has to distinguish these constraints on deformations from constraints on violations of Lorentz-invariance. In the latter case one has a preferred frame, in the former case not. Violations of Lorentz-invariance are very strongly constrained already. But to derive these constraints one makes use of an effective field theory approach, that is one assumes that whatever quantum gravity at high energies (close to the Planck scale) looks like, at small energies it must be describable by the quantum field theories of the standard model plus some additional, small terms.

Deformations of Lorentz-symmetry are said to not have an effective field theory limit and thus these constraints cannot be applied. I cautiously say “are said not to have” such a limit because I have never heard a good argument why such a limit shouldn’t exist. For all I can tell it doesn’t exist just because nobody working on this wants it to exist. In any case, without this limit one cannot use the constraints on the additional interaction terms and has to look for other ways to test the model.

This is typically done by constraining the dispersion relation for free particles, which acquires small correction terms. These corrections to the dispersion relation affect the speed of massless particles, which now is energy-dependent. The effects of the deformation become larger with long travel times and large energies, which is why highly energetic gamma ray bursts are so interesting. The deformation would make itself noticeable by either speeding up or slowing down the highly energetic photons, depending on the sign of a parameter.
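To see why it takes both Planck-scale suppression and gigaparsec distances, here is a rough order-of-magnitude estimate; it ignores the O(1) redshift-integration factors a careful analysis would include, and the photon energy and distance are just representative numbers.

    # Rough size of the energy-dependent delay for a first-order-in-(E/E_QG)
    # modification of the dispersion relation:  dt ~ (E / E_QG) * (D / c).
    E_photon = 10.0      # GeV, a fairly energetic gamma-ray-burst photon
    E_QG     = 1.22e19   # GeV, the Planck energy
    D        = 3.086e25  # meters, roughly 1 gigaparsec
    c        = 2.998e8   # m/s

    dt = (E_photon / E_QG) * (D / c)
    print(f"accumulated delay ~ {dt*1e3:.0f} ms over ~1 Gpc")

A delay this small — well under a second, accumulated over a billion-year flight — is why one needs sources that are both very distant and intrinsically short and bursty, which is exactly what gamma ray bursts provide.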

Current constraints put the limits roughly at the Planck scale if the modification is either to slow down or to speed up the photons. Putting constraints on the case where the deformation is stochastic (sometimes speeding up, sometimes slowing down) is more difficult and so far there haven’t been any good constraints on this. Jonathan Granot briefly flashed by some constraints on the stochastic case, but said he can’t spill the details yet, some collaboration issue. He and collaborators do however have a paper coming out within the next months that I expect will push the stochastic case up to the Planck scale as well.

On the other hand we heard a talk by Giacomo Rosati who argues that to derive these bounds one uses the normal expansion of the Friedmann-Robertson-Walker metric, but that the propagation of particles in this background should be affected by the deformed theory as well, which weakens the constraints somewhat. Well, I can see the rationale behind the argument, but after 15 years the space-time picture that belongs to deformed Lorentz-invariance is still unclear, so this might or might not be the case. There were some other theory talks that try to get this space-time picture sorted out but they didn’t make a connection to phenomenology.

Jakub Mielczarek was at the meeting talking about the moment of silence in the early universe and how to connect this to phenomenology. In this model for the early universe space-time makes a phase-transition from a Euclidean regime to the present Lorentzian regime, and in principle one should be able to calculate the spectral index from this model, as well as other cosmological signatures. Alas, it’s not a simple calculation and progress is slow since there aren’t many people working on it.

Another possible observable from this phase-transition may be leftover defects in the space-time structure. Needless to say, I like that very much because I was talking about my model for space-time defects that basically is a parameterization of this possibility in general (slides here). It would be great if one could connect these parameters to some model about the underlying space-time structure.

The main message that I have in my talk is that if you want to preserve Lorentz-invariance, as my model does, then you shouldn’t look at high energies because that’s not a Lorentz-invariant statement to begin with. You should look instead at wave-functions sweeping over large world-volumes. This typically means low energies and large distances, which is not a regime that presently gets a lot of attention when it comes to quantum gravity phenomenology. I certainly hope this will change within the next years because it seems promising to me. Well, more promising than the gamma ray bursts anyway.

We also heard Joao Magueijo in his no-bullshit style explaining that modified dispersion relations in the early universe can reproduce most achievements of inflation, notably the spectral index including the tilt and solving the horizon problem. This becomes possible because an energy-dependence in the speed of light together with redshift during expansion turns the energy-dependence into a time-dependence. If you haven’t read his book “Faster Than the Speed of Light”, I assure you you won’t regret it.

The idea of dimensional reduction is still popular but experimental consequences, if any, come through derived concepts such as a modified dispersion relation or early universe dynamics, again.

There was of course some discussion of the BICEP claim that they’ve found evidence for relic gravitational waves. Everybody who cared to express an opinion seemed to agree with me that this isn’t the purported evidence for quantum gravity that the press made out of it, even if the measurement was uncontroversial and statistically significant.

As we discussed in this earlier post, to begin with this doesn’t test the quantum gravity at high energies but only the perturbative quantization of gravity, which for most of my colleagues isn’t really quantum gravity. It’s the high energy limit that we do not know how to deal with. And even to claim that it is evidence for perturbative quantization requires several additional assumptions that may just not be fulfilled, for example that there are no non-standard matter couplings and that space-time and the metric on it exist to begin with. This may just not be the case in a scenario with a phase-transition or with emergent gravity. I hope that next time the media picks up the topic they care to talk to somebody who actually works on quantum gravity phenomenology.

Then there was a member from the Planck collaboration whose name I forgot, who tried to say something about their analysis of the foreground effects from the galactic dust that BICEP might not have accurately accounted for. Unfortunately, their paper isn’t finished and he wasn’t really allowed to say all that much. So all I can tell you is that Planck is pretty much done with their analysis and the results are with the BICEP collaboration which I suppose is presently redoing their data fitting. Planck should have a paper out by the end of the month we’ve been told. I am guessing it will primarily say there’s lots of uncertainty and we can’t really tell whether the signal is there or isn’t, but look out for the paper.

There was also at the conference some discussion about the possibility to test quantum gravitational effects in massive quantum systems, as suggested for example by Igor Pikovski et al. This is a topic we previously discussed here, and I still think it is extremely implausible. The Pikovski et al paper is neither the first nor the last to have proposed this type of test, but it is arguably the one that got the most attention because they managed to get published in Nature Physics. These experiments are supposed to test basically the same deformation that the gamma ray bursts also test, just on the level of commutation relations in quantum mechanics rather than in the dispersion relation (the former leads to the latter, the opposite is not necessarily so).

The problem is that in this type of theory nobody really knows how to get from the one-particle case to the many-particle case, which is known as the ‘soccer-ball-problem’. If one naively just adds the energies of particles, one finds that the corrections blow up when one approaches the Planck mass, which is about 10^-5 grams. That doesn’t make a lot of sense - to begin with because we wouldn’t reproduce classical mechanics, but also because quantum gravitational effects shouldn’t scale with the energy but with the energy density. This means that the effects should get smaller for systems composed of many particles. In this case then, you cannot get good constraints on quantum gravitational effects in the proposed experiments. That doesn’t mean one shouldn’t do the experiment. This is new parameter space in quantum mechanics and one never knows what interesting things one might find there. I’m just saying don’t expect any quantum gravity there.
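To put rough numbers on this scaling argument, here is a minimal back-of-the-envelope sketch in Python. It assumes, purely for illustration, that the one-particle correction is linear in E/m_Planck (the actual form is model-dependent); the only point is the difference between letting the correction grow with the total energy and suppressing it by the number of constituents:

```python
# Toy comparison for the 'soccer-ball problem' (illustrative scaling only,
# not any particular model): a correction that grows with the total energy
# of a composite object versus one suppressed by the number N of constituents.

M_PLANCK_KG = 2.2e-8       # Planck mass ~ 2.2e-8 kg, i.e. about 10^-5 grams
NUCLEON_MASS_KG = 1.7e-27  # mass of a single nucleon

ball_mass_kg = 0.4                               # an ordinary soccer ball
n_constituents = ball_mass_kg / NUCLEON_MASS_KG  # ~ 2e26 nucleons

# Naive scaling: correction ~ (total energy) / (Planck mass)
naive = ball_mass_kg / M_PLANCK_KG               # ~ 1e7, i.e. nonsense

# Density-type scaling: the same total energy shared among N particles
suppressed = ball_mass_kg / (n_constituents * M_PLANCK_KG)  # ~ 1e-19

print(f"constituents:          {n_constituents:.1e}")
print(f"naive correction:      {naive:.1e}")
print(f"suppressed correction: {suppressed:.1e}")
```

Nothing hinges on the details; any correction that scales with the total energy rather than with the energy per particle runs into the same problem well below everyday masses.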

Also at the conference was Jonathan Miller, who I had been in contact with earlier about his paper in which he and his coauthor estimate whether the effect of gravitational bremsstrahlung on neutrino propagation is detectable (we discussed this here). It is an interesting proposal that I spent quite some time thinking about because they don’t make giant leaps of faith about the scaling of quantum gravitational effects. In this paper, it is plainly perturbatively quantized gravity.

However, after some thinking about this I came to the conclusion that while the cross-section that they estimate may be at the right order of magnitude for some cases (I am not too optimistic about the exact case that they discuss in the paper), the total probability for this to happen is still tiny. That is because unlike the case of cross-sections measured at the LHC, for neutrinos scattering off a black hole one doesn’t have a high luminosity to bring up the chance of ever observing this. When I estimated the flux, the probability turned out to be too small to be observable by at least 30 orders of magnitude, i.e. what you typically expect for quantum gravity. Anyway, I had some interesting exchange with Jonathan who, needless to say, isn’t entirely convinced by my argument. So it’s not a settled story, and I’ll let you know what comes out of this.

Finally, I should mention that Carlo Rovelli and Francesca Vidotto talked about their Planck stars and the possible phenomenology that these could lead to. We previously discussed their idea here. They argue, basically, that quantum gravitational effects can be such that a black hole (with an apparent horizon, not an event horizon) does not slowly evaporate down to the Planck mass, but suddenly explodes at a mass still much higher than the Planck mass, thereby releasing its information. If that was possible, it would sneak around all the issues with firewalls and remnants and so on. It might also have observable consequences, for these explosions might be detectable. However, this idea is still very much in its infancy, and several people in the audience raised concerns similar to mine about whether this can work without violating locality and/or causality in the semi-classical limit. In any case, I am sure that we will hear more about this in the near future.

Altogether, I am relieved that the obsession with gamma ray bursts seems to be fading, though much of this fading is probably due to both Giovanni Amelino-Camelia and Lee Smolin not being present at this meeting ;)

This was the first time I visited SISSA since they moved to their new building, which is no longer located directly at the coast. It is however very nicely situated on a steep hill, surrounded by hiking paths through the forest. The new SISSA building used to be a hospital, like the buildings that house Nordita in Stockholm. I’ve been told my office at Nordita is in what used to be the tuberculosis sector, and if I’m stuck with a computation I can’t help but wonder how many people died at the exact spot where my desk stands now. As to SISSA, I hope that the conference was held in what was formerly the pregnancy ward, and that the meeting, in spirit, may give birth to novel ideas for how to test quantum gravity.

September 10, 2014

n-Category Café Quasistrict Symmetric Monoidal 2-Categories via Wire Diagrams

Guest post by Bruce Bartlett

I recently put an article on the arXiv:

It’s about Chris Schommer-Pries’s recent strictification result from his updated thesis, that every symmetric monoidal bicategory is equivalent to a quasistrict one. Since symmetric monoidal bicategories can be viewed as the syntax for ‘stable 3-dimensional algebra’, one aim of the paper is to write this stuff out in a diagrammatic notation, like this:

[diagram]

The other aim is to try to strip down the definition of a ‘quasistrict symmetric monoidal bicategory’, emphasizing the central role played by the interchangor isomorphisms. Let me explain a bit more.

Motivation

Firstly, some motivation. For a long time now I’ve been finishing up a project together with Chris Douglas, Chris Schommer-Pries and Jamie Vicary about 1-2-3 topological quantum field theories. The starting point is a generators-and-relations presentation of the oriented 3-dimensional bordism bicategory (objects are closed 1-manifolds, morphisms are two-dimensional bordisms, and 2-morphisms are diffeomorphism classes of three-dimensional bordisms between those). So, you present a symmetric monoidal bicategory from a bunch of generating objects, 1-morphisms, and 2-morphisms, and a bunch of relations between the 2-morphisms. These relations are written diagrammatically. For instance, the ‘pentagon relation’ looks like this:

[diagram]

To make rigorous sense of these diagrams, we needed a theory of presenting symmetric monoidal bicategories via generators-and-relations in the above sense. So, Chris Schommer-Pries worked such a theory out, using computads, and proved the above strictification result. This implies that we could use the simple pictures above to perform calculations.

Strictifying symmetric monoidal bicategories

The full algebraic definition of a symmetric monoidal bicategory is quite intimidating, amounting to a large amount of data that must satisfy a host of coherence diagrams. A self-contained definition can be found in this paper of Mike Stay. So, it’s of interest to see how much of this data can be strictified, at the cost of passing to an equivalent symmetric monoidal bicategory.

Before Schommer-Pries’s result, the best strictification result was that of Gurski and Osorno.

Theorem (GO). Every symmetric monoidal bicategory is equivalent to a semistrict symmetric monoidal 2-category.

Very roughly, a semistrict symmetric monoidal 2-category consists of a strict 2-category equipped with a strict tensor product, plus the following coherence data (see e.g. HDA1 for a fuller account) satisfying a bunch of equations:

  • tensor naturators, i.e. 2-isomorphisms \Phi_{f,g} : (f' \otimes g') \circ (f \otimes g) \Rightarrow (f' \circ f) \otimes (g' \circ g)
  • braidings, i.e. 1-morphisms \beta_{A,B} : A \otimes B \rightarrow B \otimes A
  • braiding naturators, i.e. 2-isomorphisms \beta_{f,g} : \beta_{A,B} \circ (f \otimes g) \Rightarrow (g \otimes f) \circ \beta_{A,B}
  • braiding bilinearators, i.e. 2-isomorphisms R_{(A|B, C)} : (id \otimes R_{B,C}) \circ (R_{A,B} \otimes id) \Rightarrow R_{A, B \otimes C}
  • symmetrizors, i.e. 2-isomorphisms \nu_{A,B} : id_{A \otimes B} \Rightarrow R_{B,A} \circ R_{A,B}

So — Gurski and Osorno’s result represents a lot of progress. It says that the other coherence data in a symmetric monoidal bicategory (associators for the underlying bicategory, associators for the underlying monoidal bicategory, pentagonator, unitors, adjunction data, …) can be eliminated, or more precisely, strictified.

Schommer-Pries’s result goes further.

Theorem (S-P). Every symmetric monoidal bicategory is equivalent to a quasistrict symmetric monoidal 2-category.

A quasistrict symmetric monoidal 2-category is a semistrict symmetric monoidal 2-category where the braiding bilinearators and symmetrizors are equal to the identity. So - only the tensor naturators, braiding 1-morphisms, and braiding naturators remain!

The method of proof is to show that every symmetric monoidal bicategory admits a certain kind of presentation by generators-and-relations (a ‘quasistrict 3-computad’). And the gizmo built out of a quasistrict 3-computad is a quasistrict symmetric monoidal 2-category! Q.E.D.

Stringent symmetric monoidal 2-categories

In my article, I reformulate the definition of a quasistrict symmetric monoidal 2-category a bit, removing redundant data. Firstly, the tensor naturators \Phi_{(f',g'),(f,g)} are fully determined by their underlying interchangors \phi_{f,g},

(1)   \phi_{f,g} = \Phi_{(f, id), (id, g)} : (f \otimes id) \circ (id \otimes g) \Rightarrow (id \otimes g) \circ (f \otimes id)

This much is well-known. But also, the braiding naturators are fully determined by the interchangors. So, I define a stringent symmetric monoidal 2-category purely in terms of this coherence data: interchangors, and braiding 1-morphisms. I show that they’re equivalent to quasistrict symmetric monoidal bicategories.

Wire diagrams

The ‘stringent’ version of the definition is handy, because it admits a nice graphical calculus which I call ‘wire diagrams’. I needed a new name just to distinguish them from vanilla-flavoured string diagrams for 2-categories where the objects of the 2-category correspond to planar regions; now the objects of the 2-category correspond to lines. But it’s really just a rotated version of string diagrams in 3 dimensions. So, the basic setup is as follows:

[diagram]

But to keep things nice and planar, we’ll draw this as follows:

[diagram]

These diagrams are interpreted according to the prescription: tensor first, then compose! So, the interchangor isomorphisms look as follows:

[diagram]
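Since the pictures themselves do not survive in this text version, here is the same point in formulas (my paraphrase, using only the notation of equation (1)): given 1-morphisms f : A \to A' and g : B \to B', a wire diagram with f on one wire strictly below g on a neighbouring wire can be sliced horizontally in two ways, giving the composites

(f \otimes \mathrm{id}_{B'}) \circ (\mathrm{id}_{A} \otimes g) \qquad \text{and} \qquad (\mathrm{id}_{A'} \otimes g) \circ (f \otimes \mathrm{id}_{B}),

both 1-morphisms A \otimes B \to A' \otimes B'; the interchangor \phi_{f,g} of equation (1) is the invertible 2-morphism comparing the two readings.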

So, what I do is write out the definitions of quasistrict and stringent symmetric monoidal 2-categories in terms of wire diagrams, and use this graphical calculus to prove that they’re the same thing.

That’s good for us, because it turns out these ‘wire diagrams’ are precisely the diagrammatic notation we were using for the generators-and-relations presentation of the oriented 3-dimensional bordism bicategory. For instance, I hope you can see the interchangor ϕ\phi being used in the ‘pentagon relation’ I drew near the top of this post. So, that diagrammatic notation has been justified.

Sean CarrollCosmological Attractors

I want to tell you about a paper I recently wrote with grad student Grant Remmen, about how much inflation we should expect to have occurred in the early universe. But that paper leans heavily on an earlier one that Grant and I wrote, about phase space and cosmological attractor solutions — one that I never got around to blogging about. So you’re going to hear about that one first! It’s pretty awesome in its own right. (Sadly “cosmological attractors” has nothing at all to do with the hypothetical notion of attractive cosmologists.)

Attractor Solutions in Scalar-Field Cosmology
Grant N. Remmen, Sean M. Carroll

Models of cosmological scalar fields often feature “attractor solutions” to which the system evolves for a wide range of initial conditions. There is some tension between this well-known fact and another well-known fact: Liouville’s theorem forbids true attractor behavior in a Hamiltonian system. In universes with vanishing spatial curvature, the field variables (\phi, \dot\phi) specify the system completely, defining an effective phase space. We investigate whether one can define a unique conserved measure on this effective phase space, showing that it exists for m^2\phi^2 potentials and deriving conditions for its existence in more general theories. We show that apparent attractors are places where this conserved measure diverges in the (\phi, \dot\phi) variables and suggest a physical understanding of attractor behavior that is compatible with Liouville’s theorem.

This paper investigates a well-known phenomenon in inflationary cosmology: the existence of purported “attractor” solutions. There is a bit of lore that says that an inflationary scalar field might start off doing all sorts of things, but will quickly settle down to a preferred kind of evolution, known as the attractor. But that lore is nominally at odds with a mathematical theorem: in classical mechanics, closed systems never have attractor solutions! That’s because “attractor” means “many initial conditions are driven to the same condition,” while Liouville’s theorem says “a set of initial conditions maintains its volume as it evolves.” So what’s going on?

Let’s consider the simplest kind of model: you just have a single scalar field φ, and a potential energy function V(φ), in the context of an expanding universe with no other forms of matter or energy. That fully specifies the model, but then you have to specify the actual trajectory that the field takes as it evolves. Any trajectory is fixed by giving certain initial data in the form of the value of the field φ and its “velocity” \dot\phi. For a very simple potential like V(φ) ~ φ^2, the trajectories look like this:

[plot: scalar-field trajectories in the (\phi, \dot\phi) plane]

This is the “effective phase space” of the model — in a spatially flat universe (and only there), specifying φ and its velocity uniquely determines a trajectory, shown as the lines on the plot. See the dark lines that start horizontally, then spiral toward the origin? Those are the attractor solutions. Other trajectories (dashed lines) basically zoom right to the attractor, then stick nearby for the rest of their evolution. Physically, the expansion of the universe acts as a kind of friction; away from the attractor the friction is too small to matter, but once you get there friction begins to dominate and the field rolls very slowly. So the idea is that there aren’t really that many different kinds of possible evolution; a “generic” initial condition will just snap onto the attractor and go from there.
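If you want to reproduce a plot along these lines, here is a minimal sketch (mine, not taken from the paper) that integrates the standard flat-FRW equations for V(φ) = ½ m²φ², namely \ddot\phi + 3H\dot\phi + m^2\phi = 0 with H^2 = (\tfrac12\dot\phi^2 + \tfrac12 m^2\phi^2)/3 in reduced Planck units, for a handful of initial conditions:

```python
# Sketch: scalar-field trajectories in the (phi, phidot) plane for a flat
# FRW universe with V(phi) = (1/2) m^2 phi^2, in reduced Planck units
# (8*pi*G = 1). Different initial conditions pile up on the apparent attractor.
import numpy as np
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt

m = 1.0  # the mass only sets the overall scale here

def hubble(phi, phidot):
    # Flat-universe Friedmann equation: H^2 = (phidot^2/2 + V(phi)) / 3
    return np.sqrt((0.5 * phidot**2 + 0.5 * m**2 * phi**2) / 3.0)

def rhs(t, y):
    phi, phidot = y
    # Klein-Gordon equation with Hubble friction: phiddot + 3 H phidot + m^2 phi = 0
    return [phidot, -3.0 * hubble(phi, phidot) * phidot - m**2 * phi]

for phi0, phidot0 in [(8, 10), (8, -10), (-8, 10), (-8, -10), (12, 0), (-12, 0)]:
    sol = solve_ivp(rhs, (0.0, 100.0), [phi0, phidot0], max_step=0.01)
    plt.plot(sol.y[0], sol.y[1], lw=0.8)

plt.xlabel(r"$\phi$")
plt.ylabel(r"$\dot\phi$")
plt.title("Trajectories converging onto the apparent attractor")
plt.show()
```

Trajectories starting from very different (φ, \dot\phi) quickly land on the same slow-roll track and then spiral into the origin, which is exactly the behaviour described above.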

This story seems to be in blatant contradiction with Liouville’s Theorem, which roughly says that there cannot be true attractors, because volumes in phase space (the space of initial conditions, i.e. coordinates and momenta) remain constant under time-evolution. Whereas in the picture above, volumes get squeezed to zero because every trajectory flows to the 1-dimensional attractor, and then of course eventually converges to the origin. But we know that the above plot really does show what the trajectories do, and we also know that Liouville’s theorem is correct and does apply to this situation. Our goal for the paper was to show how everything actually fits together.

Obviously (when you think about it, and know a little bit about phase space), the problem is with the coordinates on the above graph. In particular, \dot\phi might be the “velocity” of the field, but it definitely isn’t its “momentum,” in the strict mathematical sense. The canonical momentum is actually a^3\dot\phi, where a is the scale factor that measures the size of the universe. And the scale factor changes with time, so there is no simple translation between the nice plot we saw above and the “true” phase space — which should, after all, also include the scale factor itself as well as its canonical momentum.
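To make the distinction concrete (standard minisuperspace bookkeeping, not anything specific to the paper): for a homogeneous field in a flat FRW background the scalar-field Lagrangian per unit comoving volume is

L = a^3\left(\tfrac{1}{2}\dot\phi^2 - V(\phi)\right), \qquad p_\phi = \frac{\partial L}{\partial \dot\phi} = a^3\dot\phi ,

so Liouville’s theorem protects volumes in the (\phi, p_\phi) variables (together with a and its conjugate momentum), not in the (\phi, \dot\phi) variables used in the plot.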

So there are good reasons of convenience to draw the plot above, but it doesn’t really correspond to phase space. As a result, it looks like there are attractors, although there really aren’t — at least not by the strict mathematical definition. It’s just a convenient, though possibly misleading, nomenclature used by cosmologists.

Still, there is something physically relevant about these cosmological attractors (which we will still call “attractors” even if they don’t match the technical definition). If it’s not “trajectories in phase space focus onto them,” what is it? To investigate this, Grant and I turned to a formalism for defining the measure on the space of trajectories (rather than just points in phase space), originally studied by Gibbons, Hawking, and Stewart and further investigated by Heywood Tam and me a couple of years ago.

The interesting thing about the “GHS measure” on the space of trajectories is that it diverges — becomes infinitely big — for cosmologies that are spatially flat. That is, almost all universes are spatially flat — if you were to pick a homogeneous and isotropic cosmology out of a hat, it would have zero spatial curvature with probability unity. (Which means that the flatness problem you were taught as a young cosmologist is just a sad misunderstanding — more about that later in another post.) That’s fine, but it makes it mathematically tricky to study those flat universes, since the measure is infinity there. Heywood and I proposed a way to regulate this infinity to get a finite answer, but that was a mistake on our part — upon further review, our regularization was not invariant under time-evolution, as it should have been.

That left an open problem — what is the correct measure on the space of flat universes? This is what Grant and I tackled, and basically solved. Long story short, we studied the necessary and sufficient conditions for there to be the right kind of measure on the effective phase space shown in the plot above, and argued that such a measure (1) exists, and (2) is apparently unique, at least in the simple case of a quadratic potential (and probably more generally). That is, we basically reverse-engineered the measure from the requirement that Liouville’s theorem be obeyed!

So there is such a measure, but it’s very different from the naïve “graph-paper measure” that one is tempted to use for the effective phase space plotted above. (A temptation to which almost everyone in the field gives in.) Unsurprisingly, the measure blows up on the attractor, and near the origin. That is, what looks like an attractor when you plot it in these coordinates is really a sign that the density of trajectories grows very large there — which is the least surprising thing in the world, really.

At the end of the day, despite the fact that we mildly scold fellow cosmologists for their sloppy use of the word “attractor,” the physical insights connected to this idea go through essentially unaltered. The field and its velocity are the variables that are most readily observable (or describable) by us, and in terms of these variables the apparent attractor behavior is definitely there. The real usefulness of our paper would come when we wanted to actually use the measure we constructed, for example to calculate the expected amount of inflation in a given model — which is what we did in our more recent paper, to be described later.

This paper, by the way, was one from which I took equations for the blackboards in an episode of Bones. It was fun to hear Richard Schiff, famous as Toby from The West Wing, play a physicist who explains his alibi by saying “I was constructing an invariant measure on the phase space of cosmological spacetimes.” 

[photo: Richard Schiff on Bones]

The episode itself is great, you should watch it if you can. But I warn you — you will cry.

Dave BaconSailing Stones: Mystery No More

My first research project, my first research paper, was on a perplexing phenomenon: the sliding rocks of Death Valley’s Racetrack playa. Racetrack playa is a large desolate dry lake bed that has one distinguishing feature above and beyond its amazing flatness. At the south end of the playa are a large number of large rocks (man-sized and smaller), and behind these rocks, if you visit in the summer, are long tracks caked into the dried earth of the playa. Apparently these rocks, during the winter months, move and leave these long tracks. I say apparently, because, for many many years, no one had ever seen these rocks move. Until now! The following video makes me extremely happy:

This is a shot of one of the playa stones actually moving! It is the end result of a large study that sought to understand the mechanism behind the sliding stones, published recently in PLoS ONE:

In 1993, fresh out of Yreka High School, I found myself surrounded by 200+ geniuses taking Caltech’s first year physics class, Physics 1 (med schools sometimes ask students at Caltech to verify that they know Calculus because the transcripts have just these low numerical course indicators on them, and of course Physics 1 couldn’t actually be physics with calculus, could it?) It would be a lie to say that this wasn’t intimidating: some of the TAs in the class were full physics professors! I remember a test where the average score was 0.5 out of 10 and perhaps it didn’t help that my roommate studied with a Nobel prize winner as a high school student. Or that another freshman in my class was just finishing a paper with his parents on black holes (or that his dad is one of the founders of the theory of inflation!) At times I considered transferring, because that is what all Caltech students do when they realize how hard Caltech is going to be, and also because it wasn’t clear to me what being a physics major got you.

One day in Physics 1 it was announced that there was a class that you could gain entrance to that was structured to teach you not physics, but how to do creative research. Creativity: now this was something I truly valued! It was called Physics 11 and it was run by one Professor Tom Tombrello (I’d later see his schedule on the whiteboard with the abbreviation T2). The only catch was that you had to get accepted into the class, and to do this you had to do your best at solving a toy research problem, what the class termed a “hurdle”. The students from the previous class then helped select the new Physics 11 students based upon their performance on the hurdles. The first hurdle also caught my eye: it was a problem based upon the old song Mairzy Doats, which my father sang weekly while showering in the morning. So I set about working on the problem. I don’t remember much of my solution, except that it was long and involved lots of differential equations of increasing complexity. Did I mention that it was long? Really long. I handed in the hurdle, then promptly ran out of time to work on the second hurdle.

Because I’d not handed in the second hurdle, I sort of expected that I’d not get selected into the class. Plus I wasn’t even in the advanced section of physics 1 (the one TAed by the professors, now those kids were well prepared and smart!) But one late night I went to my mailbox, opened it, and found…nothing. I closed it, and then, for some strange reason, thought: hey maybe there is something stuck in there. So I returned and opened the box, dug deep, and pulled out an invitation to join physics 11! This story doesn’t mean much to you, but I can still smell, feel, and hear Caltech when I think of this event. Also I’ve always been under the impression that being accepted to this class was a mistake and really the invitation I got was meant for another student in a mailbox next to mine. But that’s a story for another session on the couch.

So I enrolled in Physics 11. It’s not much of a stretch to say that it was the inspiration for me to go to graduate school, to do a postdoc, and to become a pseudo-professor. Creative research is an amazing drug, and also, I believe, one of the great endeavors of humanity. My small contribution to the racetrack playa story was published in the Journal of Geology:

The basic mystery was what caused these rocks to move. Was it the wind? It seemed hard to get enough force to move the rocks. Was it ice? When you placed stakes around the rocks, some of the rocks moved out of the stakes and some did not. In the above paper we pointed out that a moving layer of water would mean that there was more wind down low than one would normally get, because the boundary layer was moving. We also looked for the effect of said boundary layer on the rocks’ motion and found a small effect.

The answer, however, as to why the rocks moved, turned out to be even more wonderful: ice sheets dislodging and bashing the rocks forward. A sort of combination of the two competing previous hypotheses! This short documentary explains it nicely.

So, another mystery solved! We know more about how the world works, not on a level of fundamental physics, but on a level of “because it is interesting” and “because it is fun”, and isn’t that enough? Arthur C. Clarke, who famously gave airtime to these rocks, would, I think, have been very pleased with this turn of events.

September 09, 2014

Tommaso DorigoGeorge Zweig's Brilliant Intuition And Van Hove's Horrible Censorship

One year ago I had the pleasure of spending some time with George Zweig during a conference in Crete (ICNFP 2013). He is a wonderful storyteller and a great chap to hang around with, and I had great fun in the after-dinners on the terrace of the Orthodox Academy of Crete overlooking the Aegean sea, drinking raki and chatting about physics and other subjects.

read more

Sean CarrollNorms for Respectful Classroom/Seminar Discussion

David Chalmers is compiling a useful set of guidelines for respectful, constructive, and inclusive philosophical discussion. It makes sense to concentrate on a single field, like philosophy, since customs often vary wildly from one discipline to the other — but there’s really nothing specifically “philosophical” about the list, it could easily be adopted in just about any classroom or seminar environment I can think of. (Online, alas, is another story.)

What immediately strikes me are (1) how nominally unobjectionable all the suggestions are, and (2) how so many of them are routinely violated even in situations that would strike us as relatively civil and respectful. Here are some cherry-picked examples from the list:

  • Don’t interrupt.
  • Don’t present objections as flat dismissals (leave open the possibility that there’s a response).
  • Don’t dominate the discussion (partial exception for the speaker here!).
  • Unless you’re speaker, existing questioner, or chair, don’t speak without being called on (limited exceptions for occasional jokes and other very brief interjections, not to be abused).
  • The chair should attempt to balance the discussion among participants, prioritizing those who have not spoken before.
  • Prioritize junior people in calling on questions (modified version: don’t prioritize senior people).

Not that I’m saying we shouldn’t strive to be as respectful as David’s list suggests — just that we don’t even try, really. How many times have you been in a seminar in which a senior person in the audience interrupted and dominated discussion? I can imagine someone defending that kind of environment, on the grounds that it leads to more fun and feisty give-and-take, and perhaps even a more rapid convergence to the truth. I can testify that I was once at a small workshop — moderated by David Chalmers — where the “hand-and-finger system” was employed, according to which you raise a hand to ask a new question, and a finger to ask a follow-up to someone else’s question, and the chair keeps a list of who is next in the queue. Highly structured, but it actually worked quite well, carving out some space for everyone’s questions to be treated equally, regardless of their natural degree of assertiveness or their social status within the room. (If you’ve ever been at a “Russian” physics seminar, imagine exactly the opposite of that.) I will leave it as an exercise for the reader to judge whether enforcing these norms more actively would create a more hospitable academic environment overall.

David also suggests some related resources:

Any other suggestions? Comments are open, and you don’t have to wait to be called on.

David HoggCDS space

Does it count as research when I work on the NYU Center for Data Science space planning? Probably not, but I spent a good fraction of the day analyzing plans and then discussing with the architects working to create schematics for the new space. We want a great diversity of offices, shared spaces (meeting rooms, offices, and carrels), and open space (studio space and lounge and cafe space). We want our space plans to be robust to our uncertainties about how people will want to use the space.

September 08, 2014

Quantum DiariesNeutrinos permeate Fermilab’s past, present and future

This article appeared in Fermilab Today on Sept. 5, 2014.

This aerial view shows the Neutrino Area under construction in May 1971. The 15-foot bubble chamber, pictured on the left, would later be moved to the present-day location of Lab B. Photo: Fermilab

It was called Target Station C. One of three stations north of Wilson Hall at the end of beamlines extending from the Main Ring (later replaced by the Tevatron), Target Station C was assigned to experiments that would require high beam intensities for investigating neutrino interactions, according to a 1968 design report.

Within a few years, Target Station C was officially renamed the Neutrino Area. It was the first named fixed-target area and the first to be fully operational. Neutrinos and the Intensity Frontier had an early relationship with Fermilab. But why is it resurfacing now?

“The experimental program is driven by the current state of knowledge, and that’s always changing,” said Jeffrey Appel, a retired Fermilab physicist and assistant laboratory director who started research at the lab in 1972.

When Appel first arrived, there was intense interest in neutrinos because the weak force was poorly understood, and neutral currents were still a controversial idea. Fermilab joined forces with many institutions both in and outside the United States, and throughout the 1970s and early 1980s, neutrinos generated from protons in the Main Ring crashed through a 15-foot bubble chamber filled with super-heated liquid hydrogen. Other experiments running in parallel recorded neutrino interactions in iron and scintillator.

“The goal was to look for the W and Z produced in neutrino interactions,” said Appel. “So the priority for getting the beam up first and the priority for getting the detectors built and installed was on that program in those days.”

It turns out that the W and Z bosons are too massive to have been produced this way and had to wait to be discovered at colliding-beam experiments. As soon as the Tevatron was ready for colliding beams in 1985, the transition began at Fermilab from fixed-target physics to high-energy colliding-beam physics.

More recent revelations have shown that neutrinos have mass. These findings have raised new questions that need answers. In 1988, plans were laid to add the Main Injector to the Fermilab campus, partly to boost the capabilities of the Tevatron, but also, according to one report, because “intense beams of neutral kaons and neutrinos would provide a unique facility for CP violation and neutrino oscillation experiments.”

Although neutrino research was a smaller fraction of the lab’s program during Tevatron operations, it was far from dormant. Two great accomplishments in neutrino research occurred in this time period: One was the most precise neutrino measurement of the strength of the weak interaction by the NuTeV experiment. The other was when the DONUT experiment achieved its goal of making the first direct observation of the tau neutrino in 2000.

“In the ’90s most evidence of neutrinos changing flavors was coming from natural sources. But this inspired a whole new generation of accelerator-based neutrino experiments,” said Deborah Harris, co-spokesperson for the MINERvA neutrino experiment. “That’s when Fermilab changed gears to make lower-energy but very intense neutrino beams that were uniquely suited for oscillation physics.”

In partnership with institutions around the globe, Fermilab began planning and building a suite of neutrino experiments. MiniBooNE and MINOS started running in the early 2000s and MINERvA started in 2010. MicroBooNE and NOvA are starting their runs this year.

Now the lab is working with other institutions to establish a Long-Baseline Neutrino Facility at the laboratory and advance its short-baseline neutrino research program. As Fermilab strengthens its international partnerships in all its neutrino experiments, it is also working to position itself as the home of the world’s forefront neutrino research.

“The combination of the completion of the Tevatron program and the new questions about neutrinos means that it’s an opportune time to redefine the focus of Fermilab,” Appel explained.

“Everybody says: ‘It’s not like the old days,’ and it’s always true,” Appel said. “Experiments are bigger and more expensive, but people are just as excited about what they’re doing.”

He added, “It’s different now but just as exciting, if not more so.”

Troy Rummler

Special thanks go to Fermilab archivists Valerie Higgins and Adrienne Kolb for helping navigate Fermilab’s many resources on early neutrino research at the laboratory.

BackreactionScience changed my life – and yours too.

Can you name a book that made you rethink? A song that helped you through bad times? A movie that gave you a new perspective, new hope, an idea that changed your life or that of people around you? And was it worth the price of the book, the download fee, the movie ticket? If you think of the impact it has had, does it come as a number in your currency of choice?

Those of us working in basic research today are increasingly forced to justify our work by its social impact, its value for the society that we live in. It is a good question because scientists paid by tax money should keep in mind who they are working for. But the impact that the funding agencies are after is expected to come in the form of applications, something that your neighbor will eventually be able to spend money on, to keep the economic wheels turning and the gears running.

It might take centuries for today’s basic research to result in technological applications, and predicting them is more difficult than doing the research itself. The whole point of doing basic research is that its impact is unpredictable. And so this pressure to justify what we are doing is often addressed by fantastic extrapolations of today’s research, potential gadgets that might come out of it, new materials, new technologies, new services. These justifications that we come up with ourselves are normally focused on material value, something that seems tangible to your national funding agency and your member of parliament who wants to be reelected.

But basic research has a long tail, and a soft one, that despite its softness has considerable impact that is often neglected. At our recent workshop for science writers, Raymond Laflamme gave us two great lectures on quantum information technology, the theory and the applications. Normally if somebody starts talking about qubits and gates, my brain switches off instantly, but amazingly enough listening to Laflamme made it sound almost comprehensible.

Here is the slide that he used to motivate the relevance of basic research (full pdf here):


Note how the arrows in the circle gradually get smaller. A good illustration of the high-risk, high-impact argument. Most of what we work on in basic research will never lead anywhere, but that which does changes our societies, rewards and refuels our curiosity, then initiates a new round in the circle.

Missing in this figure though is a direct link from understanding to social impact.



New scientific insights have historically had a major impact on the vision the thinkers of the day had for the ideal society and how it was supposed to work, and they still have. Knowledge about the workings of the universe has eroded the rationale behind monarchy, strong hierarchies in general, and the influence of the church, and given rise to other forms of organization that we may call enlightened today, but that will seem archaic a thousand years from now.

The variational principle, made popular in Leibniz’s conclusion that we live in the “best of all possible worlds”, a world that must be “optimal” in some sense, has been hugely influential and eventually spun off the belief in self-organization, in the existence of an “invisible hand” that will lead societies to an optimal state, and that we had better not try to outsmart. This belief is still wide-spread among today’s liberals, even though it obviously begs the question of whether what an unthinking universe optimizes is what humans want.

The ongoing exploration of nature on large and small scales has fundamentally altered the way in which we perceive ourselves as special, now knowing that our solar system is but one among billions, many of which contain planets similar to our own. And the multiverse in all its multiple realizations is maybe the ultimate reduction of humanity to an accident, whereby it remains to be seen just how lucky this accident is.

That insights coming from fundamental research affect our societies long before applications come along, and in many ways besides, is documented vividly today by the Singularity believers, who talk about the coming of artificial intelligence surpassing our own intelligence like Christians talk about the rapture. Unless you live in Silicon Valley it's a fringe phenomenon, but it is vivid proof of just how much ideas affect us.

Other recent developments that have been influential way beyond the scientific niches where they originated are chaos, instability, tipping points, complexity and systemic risk. And it seems to me that the awareness that uncertainty is an integral part of scientific knowledge is slowly spreading.

The connection between understanding and social impact is one you are part of every time you read a popular science piece and update your views about the world, the planet we inhabit, our place on it, and its place in the vastness of the universe. It doesn’t seem to mean all that much, all these little people with their little blogs and their little discussions, but multiply it by some hundred millions. How we think about our being part of nature affects how we organize our living together and our societies.

Downloading a Wikipedia entry of 300 kb through your home wireless: 0.01 Euro. Knowing that the universe expands and will forever continue to expand: Priceless.

John PreskillWhere are you, Dr. Frank Baxter?

This year marks the 50th anniversary of my first publication. In 1964, when we were eleven-year-old fifth graders, my best friend Mace Rosenstein and I launched The Pres-stein Gazette, a not-for-profit monthly. Though the first issue sold well, the second issue never appeared.

Front page of the inaugural issue of the Pres-stein Gazette

Front page of the inaugural issue of the Pres-stein Gazette. Faded but still legible, it was produced using a mimeograph machine, a low-cost printing press which was popular in the pre-Xerox era.

One of my contributions to the inaugural  issue was a feature article on solar energy, which concluded that fossil fuel “isn’t of such terrific abundance and it cannot support the world for very long. We must come up with some other source of energy. Solar energy is that source …  when developed solar energy will be a cheap powerful “fuel” serving the entire world generously forever.”

This statement holds up reasonably well 50 years later. You might wonder how an eleven-year-old in 1964 would know something like that. I can explain …

In the 1950s and early 1960s, AT&T and the Bell Telephone System produced nine films about science, which were broadcast on prime-time network television and attracted a substantial audience. After broadcast, the films were distributed to schools as 16 mm prints and frequently shown to students for many years afterward. I don’t remember seeing any of the films on TV, but I eventually saw all nine in school. It was always a treat to watch one of the “Bell Telephone Movies” instead of hearing another boring lecture.

For educational films, the production values were uncommonly high. Remarkably, the first four were all written and directed by the legendary Frank Capra (a Caltech alum), in consultation with a scientific advisory board provided by Bell Labs.  Those four (Our Mr. Sun, Hemo the Magnificent, The Strange Case of the Cosmic Rays, and Unchained Goddess, originally broadcast in 1956-58) are the ones I remember most vividly. DVDs of these films exist, but I have not watched any of them since I was a kid.

The star of the first eight films was Dr. Frank Baxter, who played Dr. Research, the science expert. Baxter was actually an English professor at USC who had previous television experience as the popular host of a show about Shakespeare, but he made a convincing and pleasingly avuncular scientist. (The ninth film, Restless Sea, was produced by Disney, and Walt Disney himself served as host.) The other lead role was Mr. Writer, a skeptical and likeable Everyman who learned from Dr. Research’s clear explanations and sometimes translated them into vernacular.

The first film, Our Mr. Sun, debuted in 1956 (broadcast in color, a rarity at that time) and was seen by 24 million prime-time viewers. Mr. Writer was Eddie Albert, a well-known screen actor who later achieved greater fame as the lead on the 1960s TV situation comedy Green Acres. Lionel Barrymore appeared in a supporting role.

Dr. Frank Baxter and Eddie Albert in Our Mr. Sun.

Dr. Frank Baxter and Eddie Albert in Our Mr. Sun. (Source: Wikipedia)

Our Mr. Sun must have been the primary (unacknowledged) source for my article in the Pres-stein Gazette.  Though I learned from Wikipedia that Capra insisted (to the chagrin of some of his scientific advisers) on injecting some religious themes into the film, I don’t remember that aspect at all. The scientific content was remarkably sophisticated for a film that could be readily enjoyed by elementary school students, and I remember (or think I do) clever animations teaching me about the carbon cycle in stellar nuclear furnaces and photosynthesis as the ultimate source of all food sustaining life on earth. But I was especially struck by Dr. Baxter’s dire warning that, as the earth’s population grows, our planet will face shortages of food and fuel. On a more upbeat note he suggested that advanced technologies for harnessing the power of the sun would be the key to our survival, which inspired the optimistic conclusion of my article.

A lavishly produced prime-time show about science was a real novelty in 1956, many years before NOVA or the Discovery Channel. I wonder how many readers remember seeing the Dr. Frank Baxter movies when you were kids, either on TV or in school. Or was there another show that inspired you like Our Mr. Sun inspired me? I hope some of you will describe your experiences in the comments.

And I also wonder what resource could have a comparable impact on an eleven-year-old in today’s very different media environment. The obvious comparison is with Neil deGrasse Tyson’s revival of Cosmos, which aired on Fox in 2014. The premiere episode of Cosmos drew 8.5 million viewers on the night it was broadcast, but that is a poor measure of impact nowadays. Each episode has been rebroadcast many times, not just in the US and Canada but internationally as well, and the whole series is now available in DVD and Blu-ray. Will lots of kids in the coming years own it and watch it? Is Cosmos likely to be shown in classrooms as well?

Science is accessible to the curious through many other avenues today, particularly on YouTube. One can watch TED talks, or Minute Physics, or Veritasium, or Khan Academy, or Lenny Susskind’s lectures, not to mention our own IQIM videos on PHD Comics. And there are many other options. Maybe too many?

But do kids watch this stuff? If not, what online sources inspire them? Do they get as excited as I did when I watched Dr. Frank Baxter at age 11?

I don’t know. What do you think?


Tommaso DorigoThe Quote Of The Week - Richter On The Landscape

"If you have seen the movie Particle Fever about the discovery of the Higgs boson, you have heard the theorists saying that the only choices today are between Super-symmetry and the Landscape. Don’t believe them. Super-symmetry says that every fermion has a boson partner and vice versa. That potentially introduces a huge number of new arbitrary constants which does not seem like much progress to me. However, in its simpler variants the number of new constants is small and a problem at high energy is solved. But, experiments at the LHC already seem to have ruled out the simplest variants.

read more

Tim GowersICM2014 — Ian Agol plenary lecture

On the second day of the congress I hauled myself out of bed in time, I hoped, to have a shower and find some breakfast before the first plenary lecture of the congress started at 9am. The previous day in the evening I had chanced upon a large underground shopping mall directly underneath the conference centre, so I thought I’d see if I could find some kind of café there. However, at 8:30 in the morning it was more or less deserted, and I found myself wandering down very long empty passages, constantly looking at my watch and worrying that I wouldn’t have time to retrace my steps, find somewhere I could have breakfast, have breakfast, and walk the surprisingly long distance it would be to the main hall, all by 9am.

Eventually I just made it, by going back to a place that was semi-above ground (meaning that it was below ground but you entered it via a sunken area that was not covered by a roof) that I had earlier rejected on the grounds that it didn’t have a satisfactory food option, and just had an espresso. Thus fortified, I made my way to the talk and arrived just in time, which didn’t stop me getting a seat near the front. That was to be the case at all talks — if I marched to the front, I could get a seat. I think part of the reason was that there were “Reserved” stickers on several seats, which had been there for the opening ceremony and not been removed. But maybe it was also because some people like to sit some way back so that they can zone out of the talk if they want to, maybe even getting out their laptops. (However, although wireless was in theory available throughout the conference centre, in practice it was very hard to connect.)

The first talk was by Ian Agol. I was told before the talk that I would be unlikely to understand it — the comment was about Agol rather than about me — and the result of this lowering of my expectations was that I enjoyed the talk. In fact, I might even have enjoyed it without the lowering of expectations. Having said that, I did hear one criticism afterwards that I will try to explain, since it provides a good introduction to the content of the lecture.

When I first heard of Thurston’s famous geometrization conjecture, I thought of it as the ultimate aim of the study of 3-manifolds: what more could you want than a complete classification? However, this view was not correct. Although a proof of the geometrization conjecture would be (and later was) a massive step forward, it wouldn’t by itself answer all the questions that people really wanted to answer about 3-manifolds. But some very important work by Agol and others since Perelman’s breakthrough has, in some sense that I don’t understand, finished off some big programme in the subject. The criticism I heard was that Agol didn’t really explain what this programme was. I hadn’t really noticed that as a problem during the talk — I just took it on trust that the work Agol was describing was considered very important by the experts (and I was well aware of Agol’s reputation) — but perhaps he could have done a little more scene setting.

What he actually did by way of introduction was to mention two questions from a famous 1982 paper of Thurston (Three-dimensional manifolds, Kleinian groups and hyperbolic geometry) in which he asked 24 questions. The ones Agol mentioned were questions 16-18. I’ve just had a look at the Thurston paper, and it’s well worth a browse, as it’s a relatively gentle survey written for the Bulletin of the AMS. It also has lots of nice pictures. I didn’t get a sense from my skim through it that questions 16-18 were significantly more important than the others (apart from the geometrization conjecture), but perhaps the story is that when the dust had settled after Perelman’s work, it was those questions that were still hard. Maybe someone who knows what they’re talking about can give a better explanation in a comment.

One definition I learned from the lecture is this: a 3-manifold is said to have a property P virtually if it has a finite-sheeted cover with property P. I presume that a finite-sheeted cover is another 3-manifold and a suitable surjection to the first one such that each point in the first has k preimages for some finite k (that doesn’t depend on the point).

Thurston’s question 16 asks whether every aspherical 3-manifold (I presume that means its higher homotopy groups all vanish, so in particular it isn’t a 3-sphere) is virtually Haken.

A little later in the talk, Agol told us what “Haken” meant, other than being the name of a very well-known mathematician. Here’s the definition he gave, which left me with very little intuitive understanding of the concept. A compact 3-manifold with hyperbolic interior is Haken if it contains an embedded \pi_1-injective surface. An example, if my understanding of my rapidly scrawled notes is correct, is a knot complement, one of the standard ways of constructing interesting 3-manifolds. If you take the complement of a knot in \mathbb{R}^3 you get a 3-manifold, and if you take a tubular neighbourhood of that knot, then its boundary will be your \pi_1-injective surface. (I’m only pretending to know what \pi_1-injective means here.)

Thurston, in the paper mentioned earlier, describes Haken manifolds in a different, and for me more helpful, way. Let me approach the concept in top-down fashion: that is, I’ll define it in terms of other mysterious concepts, then work backwards through Thurston’s paper until everything is defined (to my satisfaction at least).

Thurston writes, “A 3-manifold M^3 is called a Haken manifold if it is prime and it contains a 2-sided incompressible surface (whose boundary, if any, is on \partial M) which is not a 2-sphere.”

Incidentally, one thing I picked up during Agol’s talk is that it seems to be conventional to refer to a 3-manifold as M^3 the first time you mention it and as M thereafter.

Now we need to know what “prime” and “incompressible” mean. The following paragraph of Thurston defines “prime” very nicely.

The decomposition referred to really has two stages. The first stage is the prime decomposition, obtained by repeatedly cutting a 3-manifold M^3 along 2-spheres embedded in M^3 so that they separate the manifold into two parts neither of which is a 3-ball, and then gluing 3-balls to the resulting boundary components, thus obtaining closed 3-manifolds which are “simpler”. Kneser proved that this process terminates after a finite number of steps. The resulting pieces, called the prime summands of M^3 , are uniquely determined by M^3 up to homeomorphism.

Hmm, perhaps the rule is more general: you refer to it as M^3 to start with and after that it’s sort of up to you whether you want to call it M^3 or M.

The equivalent process in two dimensions could be used to simplify a two-holed torus. You first identify a circle that cuts it into two pieces and doesn’t bound a disc: basically what you get if you chop the surface into two with one hole on each side. Then you have two surfaces with circles as boundaries. You fill in those circles with discs and then you have two tori. At this point you can’t chop the surface in two in a non-trivial way, so a torus is prime. Unless my intuition is all wrong, that’s more or less telling us that the prime decomposition of an arbitrary orientable surface (without boundary) is into tori, one for each hole, except that the sphere would be prime.
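In the usual connected-sum notation, and just to pin the analogy down (a standard fact, not something from the talk): for a closed orientable surface of genus g,

\Sigma_g \;\cong\; \underbrace{T^2 \,\#\, T^2 \,\#\, \cdots \,\#\, T^2}_{g \text{ copies}}, \qquad M \,\#\, S^2 \;\cong\; M ,

so the tori are the prime pieces and the sphere acts as the identity for connected sum.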

What about “incompressible”? Thurston offers us this.

A surface N^2 embedded in a 3-manifold M^3 is two-sided if N^2 cuts a regular neighborhood of N^2 into two pieces, i.e., the normal bundle to N^2 is oriented. Since we are assuming that M^3 is oriented, this is equivalent to the condition that N^2 is oriented. A two-sided surface is incompressible if every simple curve on N^2 which bounds a disk in M^3 with interior disjoint from N^2 also bounds a disk on N^2.

I think we can forget the first part there: just assume that everything in sight is oriented. Let’s try to think what it would mean for an embedded surface not to be incompressible. Consider for example a copy of the torus embedded in the 3-sphere. Then a loop that goes round the torus bounds a disc in the 3-sphere with no problem, but it doesn’t bound a disc in the torus. So that torus fails to be incompressible. But suppose we embedded the torus into a 3-dimensional torus in a natural way, by taking the 3D torus to be the quotient of \mathbb{R}^3 by \mathbb{Z}^3 and the 2D torus to be the set of all points with x-coordinate an (equivalence class of an) integer. Then the loops that don’t bound discs in the 2-torus don’t bound discs in the 3-torus either, so that surface is — again if what seems likely to be true actually is true — incompressible. It seems that an incompressible surface sort of spans the 3-manifold in an essential way rather than sitting inside a boring part of the 3-manifold and pretending that it isn’t boring.
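To connect this with the \pi_1-injective phrasing in Agol’s definition (by the loop theorem, incompressibility and \pi_1-injectivity agree for two-sided embedded surfaces other than spheres): the coordinate 2-torus inside the 3-torus includes injectively on fundamental groups, while no torus in the 3-sphere can, simply because \pi_1(S^3) is trivial:

\pi_1(T^2) \cong \mathbb{Z}^2 \;\hookrightarrow\; \mathbb{Z}^3 \cong \pi_1(T^3), \qquad \pi_1(T^2) \cong \mathbb{Z}^2 \;\to\; \pi_1(S^3) = 1 \quad \text{(not injective)}.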

OK, that’s what Haken manifolds are, but for the non-expert that’s not enough. We want to know why we should care about them. Thurston gives us an answer to this too. Here is a very useful paragraph about them.

It is hard to say how general the class of Haken manifolds is. There are many closed manifolds which are Haken and many which are not. Haken manifolds can be analyzed by inductive processes, because as Haken proved, a Haken manifold can be cut successively along incompressible surfaces until one is left with a collection of 3-balls. The condition that a 3-manifold has an incompressible surface is useful in proving that it has a hyperbolic structure (when it does), but intuitively it really seems to have little to do with the question of existence of a hyperbolic structure.

To put it more vaguely, Haken manifolds are good because they can be chopped into pieces in a way that makes them easy to understand. So I’d guess that the importance of showing that every aspherical 3-manifold is virtually Haken is that finite-sheeted coverings are sufficiently nice that even knowing that a manifold is virtually Haken means that in some sense you understand it.

One very nice thing Agol did was give us some basic examples of 3-manifolds, by which I mean not things like the 3-sphere, but examples of the kind that one wouldn’t immediately think of and that improve one’s intuition about what a typical 3-manifold looks like.

The first one was a (solid) dodecahedron with opposite faces identified — with a twist. I meant the word “twist” literally, but I suppose you could say that the twist is that there is a twist, meaning that given two opposite faces, you don’t identify each vertex with the one opposite it, but rather you first rotate one of the faces through 3\pi/5 and then identify opposite vertices. (Obviously you’ll have to do that in a consistent way somehow.)

There are some questions here that I can’t answer in my head. For example, if you take a vertex of the dodecahedron, then it belongs to three faces. Each of these faces is identified in a twisty way with the opposite face, so if we want to understand what’s going on near the vertex, then we should glue three more dodecahedra to our original one at those faces, keeping track of the various identifications. Now do the identifications mean that those dodecahedra all join up nicely so that the point is at the intersection of four copies of the dodecahedron? Or do we have to do some more gluing before everything starts to join together? One thing we don’t have to worry about is that there isn’t room for all those dodecahedra, which in a certain sense would be the case if the solid angles at a vertex added up to more than 4\pi. (I’m defining, I hope standardly, the solid angle of a cone to be the size of the intersection of that cone with a unit sphere centred at the apex, or whatever one calls it. Since a unit sphere has surface area 4\pi, the largest possible solid angle is 4\pi.)

Anyhow, as I said, this doesn’t matter. Indeed, far from mattering, it is to be positively welcomed, since if the solid angles of the dodecahedra that meet at a point add up to more than 4\pi, then it indicates that the geometry of the resulting manifold will be hyperbolic, which is exactly what we want. I presume that another way of defining the example is to start with a tiling of hyperbolic 3-space by regular dodecahedra and then identify neighbouring dodecahedra using little twists. I’m guessing here, but opposite faces of a dodecahedron are parallel, while not being translates of one another. So maybe as you come out of a face, you give it the smallest (anticlockwise, say) twist you can to make it a translate of the opposite face, which will be a rotation by an angle of \pi/5, and then re-enter the opposite face by the corresponding translated point. But it’s not clear to me that that is a consistent definition. (I haven’t said which dodecahedral tiling I’m even taking. Perhaps the one where all the pentagons have right angles at their vertices.)

The other example was actually a pair of examples. One was a figure-of-eight-knot complement, and the other was the complement of the Whitehead link. Agol showed us drawings of the knot and link: I’ll leave you to Google for them if you are interested.

How does a knot complement give you a 3-manifold? I’m not entirely sure. One thing that’s clear is that it gives you a 3-manifold with boundary, since you can take a tubular neighbourhood of the knot/link and take the complement of that, which will be a 3D region whose boundary is homeomorphic to a torus but sits in \mathbb{R}^3 in a knotted way. I also know (from Thurston, but I’ve seen it before) that you can produce lots of 3-manifolds by defining some non-trivial homeomorphism from a torus to itself, removing a tubular neighbourhood of a knot from \mathbb{R}^3 and gluing it back in again, but only after applying the homeomorphism to the boundary. That is, given your solid knot and your solid-knot-shaped hole, you identify the boundary of the knot with the boundary of the hole, but not in the obvious way. This process is called Dehn surgery, and in fact can be used to create all closed orientable 3-manifolds.

But I still find myself unable to explain how a knot complement is itself a 3-manifold, unless it is a 3-manifold with boundary, or one compactifies it somehow, or something. So I had the illusion of understanding during the talk but am found out now.

The twisted-dodecahedron example was discovered by Seifert and Weber, and is interesting because it is a non-Haken manifold (a discovery of Burton, Rubinstein and Tillmann) that is virtually Haken.

Going back to the question of why the geometrization conjecture didn’t just finish off the subject, my guess is that it is probably possible to construct lots of complicated 3-manifolds that obviously satisfy the geometrization conjecture because they are already hyperbolic, but that are not by virtue of that fact alone easy to understand. What Agol appeared to say is that the role of the geometrization conjecture is essentially to reduce the whole problem of understanding 3-manifolds to that of understanding hyperbolic 3-manifolds. He also said something that is more or less a compulsory remark in a general lecture on 3-manifolds, namely that although they are topological objects, they are studied by geometrical means. (The corresponding compulsory remark for 4-manifolds is that 4D is the odd dimension out, where lots of weird things happen.)

As I’ve said, Agol discussed two other problems. I think the virtual Haken conjecture was the big one (after all, that was the title of his lecture), but the other two were, as he put it, stronger statements that were easier to think about. Question 17 asks whether every aspherical 3-manifold virtually has positive first Betti number, and question 18 asks whether it virtually fibres over the circle. I’ll pass straight to the second of these questions.

A 3-manifold M^3 fibres over the circle if there is a (suitably nice) map \eta:M^3\to S^1 such that the preimage of every point in S^1 is a surface S (the fibre at that point).
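
A standard source of examples (a reminder for non-experts rather than anything from the talk) is the mapping torus construction: given a surface S and a homeomorphism \phi:S\to S, one forms

\displaystyle M_\phi = S\times[0,1]/\bigl((x,1)\sim(\phi(x),0)\bigr),

and the projection (x,t)\mapsto t descends to a map M_\phi\to S^1 whose fibre over every point is a copy of S, so M_\phi fibres over the circle in the above sense.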

Let me state Agol’s main results without saying what they mean. In 2008 he proved that if M^3 is virtually special cubulated, then it is virtually fibred. In 2012 he proved that cubulations with hyperbolic fundamental group are virtually special, answering a 2011 conjecture of Wise. A corollary is that every closed hyperbolic 3-manifold virtually fibres over the circle, which answers questions 16-18.

There appears to be a missing step there, namely to show that every closed hyperbolic 3-manifold has a cubulation with hyperbolic fundamental group. That I think must have been the main message of what he said in a fairly long discussion about cubulations that preceded the statements of these big results, and about which I did not take detailed notes.

What I remember about the discussion was a number of pictures of cube complexes made up of cubes of different dimensions. An important aspect of these complexes was a kind of avoidance of positive curvature, which worked something like this. (I’ll discuss a low-dimensional situation, but it generalizes.) Suppose you have three squares that meet at a vertex just as they do if they are faces of a cube. Then at that vertex you’ve got some positive curvature, which is what you want to avoid. So to avoid it, you’re obliged to fill in the entire cube, and now the positive curvature is rendered harmless because it’s just the surface of some bit of 3D stuff. (This feels a bit like the way we don’t pay attention to embedded surfaces unless they are incompressible.)

I haven’t given the definition because I don’t remember it. The term CAT(0) came up a lot. At the time I felt I was following what was going on reasonably well, helped by the fact that I had seen an excellent talk by my former colleague Vlad Markovic on similar topics. (Markovic was mentioned in Agol’s talk, and himself was an invited speaker at the ICM.) The main message I remember now is that there is some kind of dictionary between cube complexes and 3-manifolds, so you try to find “cubulations” with particular properties that will enable you to prove that your 3-manifolds have corresponding properties. Note that although the manifolds are three-dimensional, the cubes in the corresponding cube complexes are not limited to three dimensions.

That’s about all I can remember, even with the help of notes. In case I have given the wrong impression, let me make clear that I very much enjoyed this lecture and thought it got the “working” part of the congress off to a great start. And it’s clear that the results of Agol and others are a big achievement. If you want to watch the lecture for yourself, it can be found here.

Update. I have found a series of three nice-looking blog posts by Danny Calegari about the virtual Haken conjecture and Agol’s proof. Here are the links: part 1, part 2 and part 3.


September 07, 2014

Tim GowersICM2014 — Bhargava laudatio

I ended up writing more than I expected to about Avila. I’ll try not to fall into the same trap with Bhargava, not because there isn’t lots to write about him, but simply because if I keep writing at this length then by the time I get on to some of the talks I’ve been to subsequently I’ll have forgotten about them.

Dick Gross also gave an excellent talk. He began with some of the basic theory of binary quadratic forms over the integers, that is, expressions of the form ax^2+bxy+cy^2. One assumes that they are primitive (meaning that a, b and c don’t have some common factor). The discriminant of a binary quadratic form is the quantity b^2-4ac. The group SL_2(\mathbb{Z}) then acts on these by a change of basis. For example, if we take the matrix \begin{pmatrix}2&1\\5&3\end{pmatrix}, we’ll replace (x,y) by (2x+y, 5x+3y) and end up with the form a(2x+y)^2+b(2x+y)(5x+3y)+c(5x+3y)^2, which can be rearranged to
(4a+10b+25c)x^2+(4a+11b+30c)xy+(a+3b+9c)y^2
(modulo any mistakes I may have made). Because the matrix is invertible over the integers, the new form can be transformed back to the old one by another change of basis, and hence takes the same set of values. Two such forms are called equivalent.
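
For anyone who wants to check that calculation, here is a short SymPy sketch (just a sanity check of the algebra above, not anything from the lecture) that performs the substitution and reads off the coefficients:

from sympy import symbols, expand, Poly

a, b, c, x, y = symbols('a b c x y')

# The form ax^2 + bxy + cy^2 with (x, y) replaced simultaneously by (2x + y, 5x + 3y).
form = a*x**2 + b*x*y + c*y**2
new_form = expand(form.subs([(x, 2*x + y), (y, 5*x + 3*y)], simultaneous=True))

p = Poly(new_form, x, y)
print(p.coeff_monomial(x**2))   # 4*a + 10*b + 25*c
print(p.coeff_monomial(x*y))    # 4*a + 11*b + 30*c
print(p.coeff_monomial(y**2))   # a + 3*b + 9*c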

For some purposes it is more transparent to write a binary quadratic form as
\begin{pmatrix}x&y\end{pmatrix}\begin{pmatrix}a&b/2\\b/2&c\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix}.
If we do that, then it is easy to see that replacing a form by an equivalent form does not change its discriminant since it is just -4 times the determinant of the matrix of coefficients, which gets multiplied by a couple of matrices of determinant 1 (the base-change matrix and its transpose).
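
In the same spirit, one can check the discriminant claim with the matrices themselves (again just a sketch, with M the base-change matrix from the example above):

from sympy import symbols, Matrix, simplify

a, b, c = symbols('a b c')

Q = Matrix([[a, b/2], [b/2, c]])     # the matrix of coefficients of ax^2 + bxy + cy^2
M = Matrix([[2, 1], [5, 3]])         # the base-change matrix used above; det(M) = 1

Q_new = M.T * Q * M                  # the matrix of the transformed form
print(simplify(-4*Q.det() - (-4*Q_new.det())))   # 0: the discriminant b^2 - 4ac is unchanged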

Given any equivalence relation it is good if one can find nice representatives of each equivalence class. In the case of binary quadratic forms, there is a unique representative such that -a<b<a<c or 0\leq b\leq a\leq c. From this it follows that up to equivalence there are finitely many forms with any given discriminant. The question of how many there are with discriminant D is a very interesting one.
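
To make the idea of a nice representative concrete, here is a minimal sketch of the classical reduction procedure in the positive definite case (negative discriminant, with a>0); the boundary conventions may differ slightly from the inequalities quoted above:

# Gauss's reduction of a positive definite binary quadratic form ax^2 + bxy + cy^2.
def reduce_form(a, b, c):
    assert a > 0 and b*b - 4*a*c < 0, "expects a positive definite form"
    while True:
        k = (a - b) // (2*a)                    # translate x -> x + ky so that -a < b <= a
        b, c = b + 2*a*k, a*k*k + b*k + c
        if a > c or (a == c and b < 0):
            a, b, c = c, -b, a                  # the substitution (x, y) -> (-y, x)
        else:
            return a, b, c                      # now -a < b <= a <= c

print(reduce_form(4, 11, 8))   # (1, 1, 2): the unique reduced form of discriminant 121 - 128 = -7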

Even more interesting is that the equivalence classes form an Abelian group under a certain composition law that was defined by Gauss. Apparently it occupied about 30 pages of the Disquisitiones, which are possibly the most difficult part of the book.

Going back to the number of forms of discriminant D, Gauss did some calculations and stated (without proof) the formula

\displaystyle \sum_{|D|<T}H(D)\sim\frac\pi{18}T^{3/2}

There was, however, a heuristic justification for the formula. (I can’t remember whether Dick Gross said that Gauss had explicitly stated this justification or whether it was simply a reconstruction of what he must have been thinking.) It turns out that the sum on the left-hand side works out as the number of integer points in a certain region of \mathbb{R}^3 (or at least I assume it is \mathbb{R}^3 since the binary form has three coefficients), and this region has volume (\pi/18)T^{3/2}. Unfortunately, however, the region is not convex, or even bounded, so this does not by itself prove anything. What one has to do is show that certain cusps don’t accidentally contain lots of integer points, and that is quite delicate.
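
One can also watch the heuristic at work numerically. The sketch below counts reduced triples directly and compares the count with the volume term; it glosses over primitivity and boundary conventions, and the convergence is slow because of the cusps just mentioned:

import math

# Count reduced triples (a, b, c) with -a < b <= a <= c and 0 < 4ac - b^2 < T,
# and compare with the volume (pi/18) T^(3/2).  A rough illustration only.
def count_reduced(T):
    total = 0
    a = 1
    while 3*a*a < T:                            # a reduced form has |D| = 4ac - b^2 >= 3a^2
        for b in range(-a + 1, a + 1):          # -a < b <= a
            c_max = (T - 1 + b*b) // (4*a)      # largest c with 4ac - b^2 < T
            if c_max >= a:
                total += c_max - a + 1
        a += 1
    return total

T = 10**6
print(count_reduced(T) / ((math.pi/18) * T**1.5))   # the ratio creeps towards 1 as T grows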

One rather amazing thing that Bhargava did, though it isn’t his main result, was show that if a positive-definite quadratic form with integer coefficients (in any number of variables) represents all the positive integers up to 290 then it represents all positive integers, and that this bound is best possible. (I may have misremembered the numbers. Also, one doesn’t have to know that it represents every single number up to 290 in order to prove the result: there is some proper subset of \{1,2,\dots,290\} that does the job.)

But the first of his Fields-medal-earning results was quite extraordinary. As a PhD student, he decided to do what few people do, and actually read the Disquisitiones. He then did what even fewer people do: he decided that he could improve on Gauss. More precisely, he felt that Gauss’s definition of the composition law was hard to understand and that it should be possible to replace it by something better and more transparent.

I should say that there are more modern ways of understanding the composition law, but they are also more abstract. Bhargava was interested in a definition that would be computational but better than Gauss’s. I suppose it isn’t completely surprising that Gauss might have produced something suboptimal, but what is surprising is that it was suboptimal and nobody had improved it in 200 years.

The key insight came to Bhargava, if we are to believe the story he tells us, when he was playing with a Rubik’s cube. He realized that if he put the letters a to h at the vertices of the cube, then there were three ways of slicing the cube to produce two 2\times 2 matrices. One could then do something with their determinants, the details of which I have forgotten, and end up producing three binary quadratic forms that are related, and this relationship leads to a natural way of defining Gauss’s composition law. Unfortunately, I couldn’t keep the precise definitions in my head.
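
For what it is worth, here is one common way the slicing construction is presented (a sketch only, and I am not claiming these are exactly the conventions used in the talk). Label the vertices of a 2\times 2\times 2 cube a, b, \dots, h, slice the cube in each of the three coordinate directions to get pairs of 2\times 2 matrices (M_i,N_i), and set Q_i(x,y)=-\det(M_i x - N_i y). The three binary quadratic forms produced this way share a common discriminant, which one can verify symbolically:

from sympy import symbols, Matrix, Poly, expand, simplify

a, b, c, d, e, f, g, h, x, y = symbols('a b c d e f g h x y')

# The three ways of slicing the cube into a pair of opposite faces.
slicings = [
    (Matrix([[a, b], [c, d]]), Matrix([[e, f], [g, h]])),
    (Matrix([[a, c], [e, g]]), Matrix([[b, d], [f, h]])),
    (Matrix([[a, e], [b, f]]), Matrix([[c, g], [d, h]])),
]

def disc(form):
    """The discriminant (xy coefficient)^2 - 4 (x^2 coefficient)(y^2 coefficient)."""
    p = Poly(form, x, y)
    return expand(p.coeff_monomial(x*y)**2 - 4*p.coeff_monomial(x**2)*p.coeff_monomial(y**2))

forms = [expand(-(M*x - N*y).det()) for M, N in slicings]
print([simplify(disc(F) - disc(forms[0])) for F in forms])   # [0, 0, 0]: a common discriminant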

Here’s a fancier way that Dick Gross put it. Bhargava reinvented the composition law by studying the action of SL_2(\mathbb{Z})^3 on M=\mathbb{Z}^2\otimes\mathbb{Z}^2\otimes\mathbb{Z}^2. The orbits are in bijection with triples of ideal classes (I_1,I_2,I_3) for the ring R=\mathbb{Z}[(D+\sqrt{D})/2] that satisfy I_1.I_2.I_3=1. That’s basically the abstract way of thinking about what Bhargava did computationally.

In this way, Bhargava found a symmetric reformulation of Gauss composition. And having found the right way of thinking about it, he was able to do what Gauss couldn’t, namely generalize it. He found 14 more integral representations on objects like M above, which gave composition laws for higher degree forms.

He was also able to enumerate number fields of small degree, showing that the number of fields of degree n and discriminant less than D grows like c_n|D|. This Gross described as a fantastic generalization of Gauss’s work.

I spent the academic years 2000-2002 at Princeton and as a result had the privilege of attending Bhargava’s thesis defence, at which he presented these results. It must have been one of the best PhD theses ever written. Are there any reasonable candidates for better ones? Perhaps Simon Donaldson’s would offer decent competition.

It’s not clear whether those results would have warranted a Fields medal on their own, but the matter was put beyond the slightest doubt when Bhargava and Shankar proved a spectacular result about elliptic curves. Famously, an elliptic curve comes with a group law: given two points, you take the line through them, see where it cuts the elliptic curve again, and define that to be the inverse of the product. This gives an Abelian group. (Associativity is not obvious: it can be proved by direct computation, but I don’t know what the most conceptual argument is.) The group law takes rational points to rational points, and a famous theorem of Mordell states that the rational points form a finitely generated subgroup. The structure theorem for finitely generated Abelian groups tells us that for some d it must be a product of \mathbb{Z}^d with a finite group. The integer d is called the rank of the curve.
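
To make the chord-and-tangent description concrete, here is a minimal sketch of the group law for a curve in short Weierstrass form y^2=x^3+Ax+B over the rationals (the curve y^2=x^3+17 and the point (2,5) below are just convenient illustrations, not anything to do with Bhargava and Shankar’s work):

from fractions import Fraction

def add(P, Q, A):
    """Chord-and-tangent addition on y^2 = x^3 + A*x + B; None plays the role of the identity."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and y1 == -y2:
        return None                             # the points are inverse to each other
    if P == Q:
        lam = (3*x1*x1 + A) / (2*y1)            # slope of the tangent line
    else:
        lam = (y2 - y1) / (x2 - x1)             # slope of the chord
    x3 = lam*lam - x1 - x2                      # third intersection with the curve...
    y3 = lam*(x1 - x3) - y1                     # ...reflected in the x-axis
    return (x3, y3)

P = (Fraction(2), Fraction(5))                  # a rational point on y^2 = x^3 + 17
print(add(P, P, A=Fraction(0)))                 # (-64/25, 59/125), another rational point

Iterating this operation on a point of infinite order produces rational points whose numerators and denominators grow rapidly, which gives some feeling for what having rank at least 1 means.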

It is conjectured that the rank can be arbitrarily large, but not everyone agrees with that conjecture. The record so far is held by the curve

y^2 + xy + y = x^3 - x^2 + 31368015812338065133318565292206590792820353345x + 302038802698566087335643188429543498624522041683874493555186062568159847

discovered by Noam Elkies (who else?) and shown to have rank 19. According to Wikipedia, from which I stole that formula, there are curves of unknown rank that are known to have rank at least 28, so in another sense the record is 28, since that is the largest integer for which there is proved to be an elliptic curve of rank at least that integer.

Bhargava and Shankar proved that the average rank is less than 1. Previously this was not even known to be finite. They also showed that at least 80% of elliptic curves have rank 0 or 1.

The Birch–Swinnerton-Dyer conjecture concerns ranks of elliptic curves, and one consequence of their results (or perhaps it is a further result — I’m not quite sure) is that the conjecture is true for at least 66% of elliptic curves. Gross said that there was some hope of improving 66% to 100%, but cautioned that that would not prove the conjecture, since 0% of all elliptic curves doesn’t mean no elliptic curves. But it is still a stunning advance. As far as I know, nobody had even thought of trying to prove average statements like these.

I think I also picked up that there were connections between the delicate methods that Bhargava used to enumerate number fields (which again involved counting lattice points in unbounded sets) and his more recent work with Shankar.

Finally, Gross reminded us that Faltings showed that for hyperelliptic curves of genus at least 2 (a curve of the form y^2=p(x) for a polynomial p of degree at least 5; when p is a cubic you get an elliptic curve, which can have infinitely many rational points) the number of rational points is finite. Another result of Bhargava is that for almost all hyperelliptic curves there are in fact no rational points.

While it is clear from what people have said about the work of the four medallists that they have all proved amazing results and changed their fields, I think that in Bhargava’s case it is easiest for the non-expert to understand just why his work is so amazing. I can’t wait to see what he does next.


Update. Andrew Granville emailed me some corrections to what I had written above, which I reproduce with his permission.

A couple of major things:

1/ Certainly composition was much better understood by Dirichlet (Gauss’s student) and his version is quite palatable (in fact rather easier to understand, I would say, than that of Bhargava). It also led, fairly easily, to re-interpretation in terms of ideals, and inspired Dedekind’s development of (modern) algebraic number theory. Where Bhargava’s version is interesting is that

1) It is the most extraordinarily surprising re-interpretation.

2) It is a beautiful example of an algebraic phenomenon (involving group actions on representations) that he has been able to develop in many extraordinary and surprising directions.

2/ 66% was proved by Bhargava, Skinner and Wei Zhang and goes some way beyond Bhargava/Shankar, involving some very deep ideas of Skinner (whereas most of Bhargava’s work is accessible to a widish audience).


ResonaancesWeekend Plot: ultimate demise of diphoton Higgs excess

This weekend's plot is the latest ATLAS measurement of the Higgs signal strength μ in the diphoton channel:

Together with the CMS paper posted earlier this summer, this is probably the final word on Higgs-to-2-photons decays in the LHC run-I. These measurements have had an eventful history. The diphoton final state was one of the Higgs discovery channels back in 2012. Initially, both ATLAS and CMS were seeing a large excess of the Higgs signal strength compared to the standard model prediction. That was very exciting, as it was hinting at new charged particles  with masses near 100 GeV. But in nature, sooner or later, everything has to converge to the standard model.  ATLAS and CMS chose different strategies to get there. In CMS, the central value μ(t) displays an oscillatory behavior, alternating between excess and deficit. Each iteration brings it closer to the standard model limit μ = 1, with the latest reported value of μ= 1.14 ± 0.26.  In ATLAS, on the other hand, μ(t) decreases monotonically, from μ = 1.8 ± 0.5 in summer 2012 down to μ = 1.17 ± 0.27 today (that precise value corresponds to the Higgs mass of 125.4 GeV, but from the plot one can see that the signal strength is similar anywhere in the 125-126 GeV range). At the end of the day, both strategies have led to almost identical answers :)

Scott AaronsonRaise a martini glass for Google and Martinis!

We’ve already been discussing this in the comments section of my previous post, but a few people emailed me to ask when I’d devote a separate blog post to the news.

OK, so for those who haven’t yet heard: this week Google’s Quantum AI Lab announced that it’s teaming up with John Martinis, of the University of California, Santa Barbara, to accelerate the Martinis group‘s already-amazing efforts in superconducting quantum computing.  (See here for the MIT Tech‘s article, here for Wired‘s, and here for the WSJ‘s.)  Besides building some of the best (if not the best) superconducting qubits in the world, Martinis, along with Matthias Troyer, was also one of the coauthors of two important papers that found no evidence for any speedup in the D-Wave machines.  (However, in addition to working with the Martinis group, Google says it will also continue its partnership with D-Wave, in an apparent effort to keep reality more soap-operatically interesting than any hypothetical scenario one could make up on a blog.)

I have the great honor of knowing John Martinis, even once sharing the stage with him at a “Physics Cafe” in Aspen.  Like everyone else in our field, I profoundly admire the accomplishments of his group: they’ve achieved coherence times in the tens of microseconds, demonstrated some of the building blocks of quantum error-correction, and gotten tantalizingly close to the fault-tolerance threshold for the surface code.  (When, in D-Wave threads, people have challenged me: “OK Scott, so then which experimental quantum computing groups should be supported more?,” my answer has always been some variant of: “groups like John Martinis’s.”)

So I’m excited about this partnership, and I wish it the very best.

But I know people will ask: apart from the support and well-wishes, do I have any predictions?  Alright, here’s one.  I predict that, regardless of what happens, commenters here will somehow make it out that I was wrong.  So for example, if the Martinis group, supported by Google, ultimately succeeds in building a useful, scalable quantum computer—by emphasizing error-correction, long coherence times (measured in the conventional way), “gate-model” quantum algorithms, universality, and all the other things that D-Wave founder Geordie Rose has pooh-poohed from the beginning—commenters will claim that still most of the credit belongs to D-Wave, for whetting Google’s appetite, and for getting it involved in superconducting QC in the first place.  (The unstated implication being that, even if there were little or no evidence that D-Wave’s approach would ever lead to a genuine speedup, we skeptics still would’ve been wrong to state that truth in public.)  Conversely, if this venture doesn’t live up to the initial hopes, commenters will claim that that just proves Google’s mistake: rather than “selling out to appease the ivory-tower skeptics,” they should’ve doubled down on D-Wave.  Even if something completely different happens—let’s say, Google, UCSB, and D-Wave jointly abandon their quantum computing ambitions, and instead partner with ISIS to establish the world’s first “Qualiphate,” ruling with a niobium fist over California and parts of Oregon—I would’ve been wrong for having failed to foresee that.  (Even if I did sort of foresee it in the last sentence…)

Yet, while I’ll never live to see the blog-commentariat acknowledge the fundamental reasonableness of my views, I might live to see scalable quantum computers become a reality, and that would surely be some consolation.  For that reason, even if for no others, I once again wish the Martinis group and Google’s Quantum AI Lab the best in their new partnership.


Unrelated Announcement: Check out a lovely (very basic) introductory video on quantum computing and information, narrated by John Preskill and Spiros Michalakis, and illustrated by Jorge Cham of PhD Comics.

September 06, 2014

Quantum DiariesDark Skies (Or, A Very Brief Guide to Indirect Detection)

For my inaugural post a few months ago I discussed dark matter direct detection and the search for WIMPs deep underground. As a graduate student on the Large Underground Xenon (LUX) experiment, this is the area that I am most familiar with, but it is by no means the only way to hunt for these elusive particles. The very idea of dark matter was first motivated by problems in astronomy (such as understanding the rotation curves of galaxies), so what better way to look for it than to turn our telescopes to the skies?

The best way to get an intuition for the physics behind dark matter detection is to look at the Feynman diagrams representing interactions between dark matter particles and standard model particles. For example, the relevant interactions in the direct detection of WIMPs have the general form:

[Feynman diagram: a WIMP χ and a standard model particle sm scattering off each other]

Feynman diagrams are conventionally drawn with time as the horizontal axis, increasing as you go from left to right. In this particular diagram a WIMP χ and a standard model particle, which I somewhat un-creatively call sm, come in from the left, interact at the vertex of the diagram, and then a WIMP χ and a standard model particle sm leave on the right. (Here I have deliberately obscured the vertex, since there are many possible interactions and combinations of interactions that yield Feynman diagrams with the same initial and final particle states.) More succinctly, we can think of this diagram as a WIMP χ and a standard model particle sm scattering off each other. Direct detection experiments like LUX or the Cryogenic Dark Matter Search (CDMS) look specifically for WIMPs scattering off protons or neutrons in an atomic nucleus, so the relevant Feynman diagrams are:

[Feynman diagrams: WIMPs scattering off the protons and neutrons in an atomic nucleus]

Feynman diagrams are kind of beautiful in that you can draw a diagram for most any particle interaction you can think of; you can flip it, rotate it, and smoosh it around; and because of certain symmetry considerations (this is the idea behind what physicists call crossing symmetry) you will in general still end up with something representing a completely valid, physically-allowed particle interaction.

Let’s do this with our direct detection diagram. If we just rotate it a quarter-turn, we end up with the following:

[Feynman diagram: the scattering diagram rotated a quarter-turn, describing two WIMPs annihilating into standard model particles]

We can interpret this as two WIMPs colliding and annihilating to form standard model particles in a way analogous to how electron-positron annihilation produces photons. WIMPs might be Majorana particles, i.e. their own antiparticles, or they might be Dirac particles, that is, distinct from anti-WIMPs, but the bottom line is still the same: the detection of the annihilation products can be used to deduce the presence of the initial WIMPs. (It might also be that WIMPs are unstable and therefore decay into standard model particles, in which case we could also look for their decay products.)

“Indirect” detection is the rather apt name for the technique of searching for WIMPs by trying to detect the products of their annihilation to standard model particles.

This strategy presents an entirely different set of challenges than direct detection. For one thing, you can’t shield against backgrounds in the same way that you can with direct detection experiments. After all, your signal consists of ordinary standard model particles, albeit standard model particles from an exotic origin, so any attempt to shield your experiment will just block out your desired signal along with the background. So where LUX is a “zero-background” experiment, indirect detection experiments look for signals that manifest themselves as tiny excesses of events over and above a large background. Additionally, indirect detection requires that WIMPs in the universe be both abundant enough and close enough together that there is a non-negligible probability for annihilation to occur. If in fact WIMPs are the answer to the dark matter problem then this was most certainly true in the early universe, but today, cosmologists estimate the local density of WIMPs to be approximately 0.3 GeV/c²/cm³. For a benchmark WIMP mass of around 100 GeV/c², this corresponds to only a few thousand WIMPs per cubic meter! This is a challenge indeed, but luckily there are a few places in the universe where gravity helps us out.
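
As a back-of-the-envelope check of just how dilute that is (the 100 GeV/c² mass used below is a commonly assumed benchmark, not a measured value):

# Local WIMP number density from the quoted mass density, for an assumed benchmark mass.
rho = 0.3               # local dark matter density in GeV/c^2 per cm^3
m_wimp = 100.0          # assumed WIMP mass in GeV/c^2

n_per_m3 = (rho / m_wimp) * 1e6    # 10^6 cm^3 per m^3
print(n_per_m3)                    # ~3000 WIMPs per cubic meter -- a very dilute gas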

First of all, we can look for WIMPs in the centers of galaxies, where gravity helps coalesce both standard model and exotic massive particles into higher-density clumps. Here there are a number of annihilation processes we can search for. For instance, we can look for WIMPs annihilating directly into gamma rays, in which case the signal would be a mono-energetic peak in the gamma ray spectrum:

[Feynman diagram: two WIMPs annihilating into gamma rays]

Note that as in my direct detection diagrams I have deliberately obscured the vertex of this diagram. Because WIMPs by definition do not interact electromagnetically they cannot convert directly into photons. However, the interaction represented in this particular diagram could take place if it contains an intermediate step where WIMPs convert first into a non-photon standard model particle. Then this intermediate particle could produce a photon final state.

The galactic center is not the easiest place to search for rare events. Here, the hunt for gammas from WIMP annihilations is complicated by the existence of many bright, diffuse gamma backgrounds from astrophysical processes that are not necessarily well-understood. In addition, the density profile of our WIMP halo is not well-understood near the center of our galaxy. It might be that our dark matter halo has a very dense “cusp” near the center; on the other hand it might very well be that the dark matter density in our halo increases up to a point but then plateaus to a constant density toward the center of the galaxy. Regarding the latter, understanding this density profile is an active area of research in computational and observational cosmology today. After all, if we don’t know how much dark matter is in the center of our galaxy, then how can we predict what an annihilation signal in that location might look like?

In order to mitigate the first of these complications, we can look to galaxies other than our own. In the Milky Way’s Local Group there are a number of galaxies called “dwarf spheroidals” which have extremely low luminosities, little to no interstellar gas and dust, and as a result, much less overall background than in our own galaxy. This sort of environment might therefore be very conducive to the indirect detection of WIMPs.

We can also look for WIMPs annihilating into heavy standard model particles. Generally these decay rapidly, producing jets that in turn yield a whole continuous spectrum of gammas and other particles. Schematically, we can summarize this process as:

[Feynman diagram: two WIMPs annihilating into heavy standard model particles, which decay into jets]

Perhaps the most interesting products of these annihilations are the antimatter particles produced in these jets. The matter/antimatter asymmetry in the universe is a whole other mystery to be solved, but it does provide for us a fairly conclusive smoking-gun WIMP signal. Antimatter in the universe is rare enough that a large flux of antimatter particles could suggest WIMP annihilation events are taking place. Some classes of indirect detection experiments look for positron excesses; others look for antiprotons or antideuterons. On the other hand, these experiments are also complicated by the existence of other cosmic-ray backgrounds and the diffusion of these annihilation products in the Earth’s atmosphere. Understanding and modeling the (non-WIMP-related) processes that produce cosmic rays is also a very active area of research.

Finally, we expect there to be high WIMP densities in the sun’s gravitational potential well. This means that we could conceivably hunt for WIMPs much closer to home and not have to worry about backgrounds from other sources in the galaxy. There is a catch, however. The sun is so incredibly dense that the mean free path of, say, a photon inside its center is only about a centimeter. Each particle that escapes to its surface can only do so after going through a random walk of many, many absorptions and re-emissions. On average, this can take as many as hundreds of thousands or even millions of years! Neutrinos are the sole exception: they interact so weakly with other standard model particles that for the most part they just zip straight through the sun with no problem. Searches for dark matter annihilations in the sun therefore focus on neutrino-producing processes.

[Feynman diagram: WIMPs annihilating in the sun into standard model particles whose decays produce neutrinos]

Neutrinos themselves are difficult to detect, but fortunately we do have technologies that are capable of doing so.

***

Over the next decade or so, I predict that indirect detection will be a very hot topic in particle physics (and not just because I really like dark matter!). There are a number of clever experiments that have already produced some interesting results, and several more scheduled to be constructed over the next few years. Stay tuned, because there will be a Part II to this article that will look at some of these experiments in detail.

Jordan EllenbergSo… yeah

Lately CJ has a habit of ending every story he tells by saying

“So… yeah.”

I first noticed it this summer, so I think he picked it up from his camp counselors. What does it mean? I tend to read it as something like

“I have told my story — what conclusions can we draw from it? Who can say? It is what it is.”

Is that roughly right? Per the always useful Urban Dictionary the phrase is

“used when relating a past event and teller is unsure or too lazy to think of a good way to conclude it”

but I feel like it has more semantic content than that. Though I just asked CJ and he says it’s just his way of saying “That’s all.” Like “Over and out.”

So yeah.


n-Category Café Ronnie Brown in Paris

Ronnie Brown has brought to my attention a talk he gave recently at the Workshop Constructive Mathematics and Models of Type Theory, IHP Paris, 02 June 2014 - 06 June 2014.

Title: Intuitions for cubical methods in nonabelian algebraic topology

Abstract: The talk will start from the 1-dimensional Seifert-van Kampen Theorem for the fundamental group, then groupoid, and so to a use of strict double groupoids for higher versions. These allow for some precise nonabelian calculations of some homotopy types, obtained by a gluing process. Cubical methods are involved because of the ease of writing multiple compositions, leading to “algebraic inverses to subdivision”, relevant to higher dimensional local-to-global problems. Also the proofs involve some ideas of 2-dimensional formulae and rewriting. The use of strict multiple groupoids is essential to obtain precise descriptions as colimits and hence precise calculations. Another idea is to use both a “broad” and a “narrow” model of a particular kind of homotopy types, where the broad model is used for conjectures and proofs, while the narrow model is used for calculations and relation to classical methods. The algebraic proof of the equivalence of the two models then gives a powerful tool.

Slides are available from his preprint page.

September 05, 2014

Tim GowersICM2014 — Kollár, Conlon, Katz, Krivelevich, Milnor

As the ICM recedes further into the past, these posts start to feel less and less fresh. I’ve had an enforced break from them as over the course of three days I drove my family from the south of France back to Cambridge. So I think I’ll try to do what I originally intended to do with all these posts, and be quite a lot briefer about each talk.

As I’ve already mentioned, Day 3 started with Jim Arthur’s excellent lecture on the Langlands programme. (In a comment on that post, somebody questioned my use of “Jim” rather than “James”. I’m pretty sure that’s how he likes to be known, but I can’t find any evidence of that on the web.) The next talk was by Demetrios Christodoulou, famous for some extraordinarily difficult results he has proved in general relativity. I’m not going to say anything about the talk, other than that I didn’t follow much of it, because he had a series of dense slides that he read word for word. The slides may even have been a suitably chopped up version of his article for the ICM proceedings, but I have not been able to check that. Anyhow, after a gentle introduction of about three or four minutes, I switched off.

I switched on again for János Kollár’s lecture, which was, like some of the others, what I feel a plenary lecture should be: a lecture that gives the non-expert a feel for what is important in the area being talked about. The first thing I wrote down was his brief description of the minimal model problem, one of the central questions in algebraic geometry. I think that by that time he had spent a while telling us what algebraic sets were, explaining why the picture you get if you just work over the reals is somewhat incomplete (for example, you may get a graph with two components, when if you work over the extended complex plane you have a torus), and so on.

The minimal model problem is this: given an algebraic variety X, find a variety X^m (the “minimal model” of X) such that the space of meromorphic functions on X is isomorphic to the space of meromorphic functions on X^m and the geometry of X^m is as simple as possible. The condition that the function spaces are isomorphic seems (from a glance at Wikipedia) to be another way of saying that the two varieties are birationally equivalent, which is a fundamental notion of equivalence in algebraic geometry. So one is trying to find a good representative of each equivalence class.

The problem was solved for curves by Riemann in 1851, for surfaces by Enriques in 1914 and by Kodaira in 1966 (I don’t know exactly what that means, but I suppose Enriques made major inroads into the problem and Kodaira finished it off). And for higher dimensions there was the Mori program of 1981. As I understand it, Mori made huge progress towards understanding the three-dimensional case, and Christopher Hacon and James McKernan, much more recently, made huge progress in higher dimensions.

Another major focus of research is the moduli problem. This, Kollár told us, asks what are the simplest families of algebraic varieties, and how can we transform any family into a simplest one? I don’t know what this means, but I would guess that when he said “families of algebraic varieties” he was talking about some kind of moduli space (partly because that seems the most likely meaning, and partly because of the word “moduli” in the name of the problem). So perhaps the problem is sort of like a “family version” of the minimal model problem: you want to find a simplest moduli space that is in some sense similar to the one you started with.

Anyhow, whatever the problem is, it was done for curves by Deligne and Mumford in 1969, for surfaces by Kollár and Shepherd-Barron in 1988 and Alexeev in 1996 (again I don’t know who did what), and apparently in higher dimensions the Kollár-Shepherd-Barron-Alexeev method works, but there are technical details. (Does that mean that Kollár is confident that the method works but that a full proof has not yet been written out? He may well have told us, but my notes don’t tell me now.)

Kollár then explained to us a third problem. A general technique for studying a variety X is to find a variety Y that is birationally equivalent to X and study the question for Y instead. Under these circumstances, there will be lower dimensional subvarieties Z\subset X and W\subset Y such that X\setminus Z\cong Y\setminus W. So one is left needing to answer a similar question for Z and W, and since these are of lower dimension, one has the basis for an inductive proof. But for that to work, we want Y to be adapted to the problem, so the question, “When is a variety simple?” arises.

Apparently this was not even a precisely formulated question until work of Mori and Reid (1980-2) and Kollár, Miyaoka and Mori (1992). The precise formulation involves the first Chern class.

And that’s all I have, other than a general memory that this lecture continued the generally high standard of plenary lectures at the congress.

At 2pm, Avila gave his Fields medallist’s lecture. As with Hairer, I don’t feel I have much to say that I have not already said when describing the laudatio, so I’ll move on to 3pm, or rather 3:05 — by today the conference organizers had realized that it took a non-zero amount of time to get from one talk to another — when David Conlon was speaking.

David Conlon at the start of his lecture

David is a former student and collaborator of mine, and quite a bit of what he talked about concerned that collaboration. I’ll very briefly describe our main result.

There are many combinatorial theorems that can be regarded as questions about arbitrary subsets of nice structures such as the complete graph on n vertices or the cyclic group of order p. For example, Ramsey’s theorem says that if you 2-colour the edges of the complete graph on n vertices, then (as long as n is large enough) one of the colour classes will contain a complete graph on r vertices. And Szemerédi’s theorem is equivalent to the assertion that for every \delta>0 and every positive integer k there exists p such that for every subset A of the integers mod p of size at least \delta p there exist a and d\ne 0 such that all of a, a+d,\dots, a+(k-1)d belong to A.

For many such questions, one can generalize them from the “nice” structures to arbitrary structures. For instance, one can ask of a given graph G whether if you colour its edges with two colours then one of those colours must contain a complete subgraph with r vertices. Obviously, the answer will be yes for some G and no for others, but to make it an interesting question, one can ask what happens for a random G. More precisely, how sparse can a random graph G be and still have the Ramsey property?

This question was answered in full by Rödl and Rucinski, but our method gives a new proof of the upper bound (on how dense the random graph needs to be), and also gives a very general method that solves many problems of this type that were previously unsolved. For example, for Szemerédi’s theorem it tells us the following. Define a subset A of \mathbb{Z}_N to be (\delta,k)-Szemerédi if every subset B\subset A of size at least \delta|A| contains an arithmetic progression of length k. Then if C is large enough (depending on \delta and k only), a random subset of \mathbb{Z}_N where elements are chosen independently with probability p=CN^{-1/(k-1)} is (\delta,k)-Szemerédi with high probability.

This bound is within a constant of best possible, since if the probability dips below N^{-1/(k-1)}/2, around half the elements of the random set will not even belong to an arithmetic progression of length k, so those elements form a dense set that proves that A is not (\delta,k)-Szemerédi.
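
Here, in crude form, is the expected-count calculation behind that remark (the constants are only rough):

# Expected number of k-term APs in a random subset of Z_N at p = N^{-1/(k-1)}/2,
# compared with the expected number of elements.  Rough constants only.
N, k = 10**6, 3
p = N ** (-1.0/(k - 1)) / 2

expected_size = p * N                   # expected number of chosen elements
expected_aps = (N*N/2) * p**k           # roughly N^2/2 progressions, each kept with probability p^k
print(expected_aps / expected_size)     # about 1/2^k, i.e. 0.125 here, so APs are much rarer than elements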

The method David and I used was inspired by the “transference principle” that Green and Tao used to prove their famous result about arithmetic progressions in the primes, though it involved several additional ingredients. A completely different approach was discovered independently by Mathias Schacht. Like ours, his approach established a large number of previously open “sparse random versions” of well-known combinatorial theorems.

David always gives very nice talks, and this one was no exception.

After his talk, I went to hear Nets Katz — with some regret as it meant missing Maria Chudnovsky, who followed on from David in the combinatorics section. I must try to watch the video of her talk some time, though I’m bad at finding time to watch videos on the internet if they last for more than about three minutes.

Nets talked about work related to his famous solution with Larry Guth of the Erdős distance problem. That problem asks how many distinct distances there must be if you have n points in the plane. If you put them evenly spaced along a line, you get n-1 distinct distances. You can do a bit better than that by putting them in a \sqrt n \times \sqrt n grid: because the density of numbers that can be expressed as a sum of two squares is roughly 1/\sqrt{\log n}, one gets around n/\sqrt{\log n} distinct distances this way.

Erdős asked whether this was anywhere near to being best possible. More precisely, he asked whether there was a lower bound of n^{1-o(1)}, and that is what Guth and Katz proved. This was a great result that answered a question that many people had worked on, but it is also notable because the proof was very interesting. One of the main tools they used was the polynomial method, which I will not attempt to describe here, but if you are curious, then Terence Tao has posted on it several times. Nets Katz’s talk is here.

Then it was back (quite some way) to the combinatorics room to hear Michael Krivelevich talking about games. (This link is quite hard to find because they’ve accidentally put his name as Michael Krivelerich.) By “games” I mean two-player positional games, which are defined as follows. You have a set X (the board) and a collection \mathcal A of subsets of X (the winning positions). There are then two kinds of games that are typically studied. In both kinds, a move consists in choosing a point of X that has not yet been chosen. In the first kind of game, the players alternate choosing points and the winner is the first player who can make a set in \mathcal A out of his/her points. (If neither player can do this by the time the entire board is filled up, then the result is a draw.) Noughts and crosses (or tic-tac-toe) is an example of this: X is a 3-by-3 grid and \mathcal A consists of all lines of three points in that grid.
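
To see these definitions in action, here is a tiny brute-force sketch that computes the outcome of a strong positional game under optimal play; the board and winning sets are hard-coded for 3-by-3 noughts and crosses, which is about as large an example as such a direct search can handle:

from functools import lru_cache

CELLS = range(9)
LINES = [frozenset(L) for L in (
    [0, 1, 2], [3, 4, 5], [6, 7, 8],    # rows
    [0, 3, 6], [1, 4, 7], [2, 5, 8],    # columns
    [0, 4, 8], [2, 4, 6],               # diagonals
)]

@lru_cache(maxsize=None)
def value(mine, theirs):
    """+1 / 0 / -1 for the player to move, given frozensets of already-occupied cells."""
    if any(line <= theirs for line in LINES):
        return -1                                         # the opponent has completed a line
    free = [c for c in CELLS if c not in mine and c not in theirs]
    if not free:
        return 0                                          # board full: a draw
    return max(-value(theirs, mine | {c}) for c in free)  # try every move; roles then swap

print(value(frozenset(), frozenset()))   # 0: noughts and crosses is a draw with best play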

A well-known argument that goes back (at least) to John Nash when he was thinking about the game of Hex proves that the second player cannot have a winning strategy for this game. The argument, referred to as strategy stealing is as follows. Suppose that the second player does have a winning strategy. Then the first player has a winning strategy as well, which works like this. First choose an arbitrary x\in X. Then ignore x, pretend that your opponent is the first player and play the second player’s winning strategy. If you ever find that you have already played the point that the strategy dictates, then play an arbitrary unoccupied point instead.

This contradiction (a contradiction since it is not possible for both players to have winning strategies) proves that the first player can guarantee a draw, but it is a highly inexplicit argument, so it gives no clue about how the first player can do that. An interesting open problem that Krivelevich mentioned relates this to the Hales-Jewett theorem. A consequence of the Hales-Jewett theorem is that if you play noughts and crosses on an n-dimensional board where each side has length k, then provided n is large enough in terms of k, it is not possible for the outcome to be a draw — since there is no 2-colouring of the points of the grid that does not give rise to a monochromatic line. So we know that the first player has a winning strategy. However, no explicit strategy is known, even if n is allowed to be ridiculously large. (I am talking here about general k: for small k such as 3, and perhaps even 4, a winning strategy is known for fairly small n.)

I asked Krivelevich about this problem, and his opinion was that it was probably very hard. The difficulty is that the first player has to devote too much attention to stopping the second player from winning, so cannot concentrate on trying to build up a line.

Another open problem is to find an explicit strategy that proves the following statement: there exist positive integers t and n_0 such that for every n\geq n_0, if the game is played on the complete graph on n vertices (that is, players are alternately choosing edges), then the first player can create the first clique of size 5 in at most t moves.

A moment I enjoyed in the talk was when Krivelevich mentioned something called the extra set paradox, which is the statement that if you add extra sets to the collection of winning positions, a game that was previously a win for the first player can become a draw.

At first that seemed to me obviously false. When that happens, it is always interesting to try to analyse one’s thoughts and formulate the incorrect proof that has sprung to mind. The argument I had was something like this: adding an extra set only increases the options available to the first player, so it cannot make it harder to win. And that argument is complete garbage, because it increases the options for the second player too. So if, for example, the first player plays as though the extra winning positions didn’t exist, the second player could potentially win by reaching one of those positions. The extra effort required to stop this can potentially (and sometimes does) kill the first player’s winning strategy.

Games of the kind I’ve just been discussing seem to be very hard to analyse, so attention has turned to a different kind of game, called a maker-breaker game. Here, the first player’s objective is to occupy a winning position, and the second player’s objective is to stop that happening. Also, the number of moves allotted to the two players is often different: we may allow one player to take m moves for each move that the other player takes.

A typical question looked at is to take a graph property such as “contains a Hamilton cycle” and to try to find the threshold at which breaker can win. That is, if breaker gets m moves for each move of maker, how large does m need to be in order for breaker to be able to stop maker from making a Hamilton cycle? The answer to this, discovered by Krivelevich in 2011, is that the threshold is at m=n/\log n, in the sense that if m\leq (1-\epsilon)n/\log n then maker wins, while if m\geq (1+\epsilon)n/\log n then breaker wins.

What makes this result particularly interesting is that the threshold occurs when the number of edges that maker gets to put down is (approximately) equal to the number of edges a random graph needs to have in order to contain a Hamilton cycle. This is the so-called random paradigm that allows one to guess the answers to many of these questions. (It was Erdős who first conjectured that this paradigm should hold.) It seems to be saying that if both players play optimally, then the graph formed by maker will end up looking like a random graph. It is rather remarkable that this has in some sense actually been proved.
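
Here is the one-line calculation that makes that remark plausible (only to leading order, of course):

import math

# At the threshold bias m = n/log n, maker claims one edge in every m+1, so ends up
# with about (n log n)/2 edges -- the usual threshold for a random graph to be Hamiltonian.
n = 10**6
m = n / math.log(n)
maker_edges = (n*(n - 1)/2) / (m + 1)
hamiltonicity_threshold = 0.5 * n * math.log(n)
print(maker_edges / hamiltonicity_threshold)   # very close to 1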

Next up, at 6pm (this was a very long day) was the Abel lecture. This is a tradition started in 2010, where one of the last four Abel Prize winners gives a lecture at the ICM. The chosen speaker this time was John Milnor, whose title was “Topology through four centuries.” I did not take notes during this lecture, so I have to rely on my memory. Here’s what I remember. First of all, he gave us a lot of very interesting history. A moment I enjoyed was when he discussed the proof of a certain result and said that he liked it because it was the first example he knew of the use of Morse theory. A long time ago, when I had very recently got my PhD, I thought about a problem about convex bodies that caused me to look at Milnor’s famous book on Morse theory. I can’t now remember what the problem was, but I think I was trying to think hard about what happens if you take the surface of a symmetric convex body with a sphere inside, gradually shrink it until it is inside the sphere, and look at the intersection of the two surfaces. That gives you (generically) a codimension-1 subset of the sphere that appears, moves about, and eventually vanishes again. That’s exactly the kind of situation studied by Morse theory.

Much more recently, indeed, since the talk, I have had acute personal experience of Morse theory in the outdoor unheated swimming pool where I was staying in France. Because I am worried about setting my heart out of rhythm if I give it too much of a shock, I get into cold swimming pools very slowly, rather than jumping in and getting the discomfort over all at once. This results in what my father describes as a ring of pain: the one-dimensional part of the surface of your body that is not yet used to the water and not safely outside it. Of course, the word “ring” is an oversimplification. Ignoring certain details that are inappropriate for a family post such as this, what I actually experience is initially two rings that after a while fuse to become a figure of eight, which then instantly opens out into a single large ring, to be joined by two more small rings that fuse with the large ring to make a yet larger ring that then becomes a lot smaller before increasing in size for a while and finally shrinking down to a point.

It is clear that if you are given the cross-sections of a surface with all the planes in a certain direction that intersect it, then you can reconstruct the surface. As I understand it, the basic insight of Morse theory is that what really matters if you want to know about the topology of the surface is what happens at the various singular moments such as when there is a figure of eight, or when a ring first appears, etc. The bits in between where the rings are just moving about and minding their own business don’t really affect anything. How this insight plays out in detail I don’t know.

As one would expect from Milnor, the talk was a beautiful one. In traditional fashion, he talked about surfaces, then 3-manifolds, and finally 4-manifolds. I think he may even have started in one dimension with a discussion of the bridges-of-Königsberg problem, but my memory of that is hazy. Anyhow, an indication of just how beautiful the talk was is what happened at the end. He misjudged the time, leaving himself about two minutes to discuss 4-manifolds. So he asked the chairman what he should do about it, and the chairman (who was Helge Holden) told him to take as much time as he wanted. Normally that would be the cause for hate rays to emanate towards the chairman and the speaker from the brains of almost the entire audience. But with this talk, the idea of missing out on the 4-manifold equivalent of what we had just heard for 2-manifolds and 3-manifolds was unthinkable, and there was a spontaneous burst of applause for the decision. I’ve never seen anything like it.

The one other thing I remember was a piece of superhuman modesty. When Milnor discussed examples of extraordinary facts about differentiable structures on 4-manifolds, the one he mentioned was the fact that there are uncountably many distinct such structures on \mathbb{R}^4, which was discovered by Cliff Taubes. The way Milnor presented it, one could have been forgiven for thinking that the fact that there can be distinct differentiable structures on a differentiable manifold was easy, and the truly remarkable thing was getting uncountably many, whereas in fact one of Milnor’s most famous results was the first example of a manifold with more than one differentiable structure. (The result of Taubes is remarkable even given what went before it: the first exotic structures on \mathbb{R}^4 were discovered by Freedman and Kirby.)

Just to finish off the description of the day, I’ll mention that in the evening I went to a reception hosted by the Norwegians (so attending the Abel lecture was basically compulsory, though I’d have done so anyway). Two things I remember about that are a dish that contained a high density of snails and the delightful sight of Maryam Mirzakhani’s daughter running about in a forest of adult legs. Then it was back to my hotel room to try to gather energy for one final day.


Terence TaoNarrow progressions in the primes

Tamar Ziegler and I have just uploaded to the arXiv our paper “Narrow progressions in the primes“, submitted to the special issue “Analytic Number Theory” in honor of the 60th birthday of Helmut Maier. The results here are vaguely reminiscent of the recent progress on bounded gaps in the primes, but use different methods.

About a decade ago, Ben Green and I showed that the primes contained arbitrarily long arithmetic progressions: given any {k}, one could find a progression {n, n+r, \dots, n+(k-1)r} with {r>0} consisting entirely of primes. In fact we showed the same statement was true if the primes were replaced by any subset of the primes of positive relative density.

A little while later, Tamar Ziegler and I obtained the following generalisation: given any {k} and any polynomials {P_1,\dots,P_k: {\bf Z} \rightarrow {\bf Z}} with {P_1(0)=\dots=P_k(0)}, one could find a “polynomial progression” {n+P_1(r),\dots,n+P_k(r)} with {r>0} consisting entirely of primes. Furthermore, we could make this progression somewhat “narrow” by taking {r = n^{o(1)}} (where {o(1)} denotes a quantity that goes to zero as {n} goes to infinity). Again, the same statement also applies if the primes were replaced by a subset of positive relative density. My previous result with Ben corresponds to the linear case {P_i(r) = (i-1)r}.
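
Just to make the object of study concrete, here is a toy search (nothing to do with the actual arguments) for the quadratic pattern {n, n+r, n+r^2} inside the primes:

from sympy import isprime

# Small examples of the polynomial progression n, n + r, n + r^2 consisting of primes.
examples = [(n, r) for n in range(3, 200) for r in range(1, 15)
            if isprime(n) and isprime(n + r) and isprime(n + r*r)]
print(examples[:5])   # e.g. (3, 2) gives the triple 3, 5, 7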

In this paper we were able to make the progressions a bit narrower still: given any {k} and any polynomials {P_1,\dots,P_k: {\bf Z} \rightarrow {\bf Z}} with {P_1(0)=\dots=P_k(0)}, one could find a “polynomial progression” {n+P_1(r),\dots,n+P_k(r)} with {r>0} consisting entirely of primes, and such that {r \leq \log^L n}, where {L} depends only on {k} and {P_1,\dots,P_k} (in fact it depends only on {k} and the degrees of {P_1,\dots,P_k}). The result is still true if the primes are replaced by a subset of positive density {\delta}, but unfortunately in our arguments we must then let {L} depend on {\delta}. However, in the linear case {P_i(r) = (i-1)r}, we were able to make {L} independent of {\delta} (although it is still somewhat large, of the order of {k 2^k}).

The polylogarithmic factor is somewhat necessary: using an upper bound sieve, one can easily construct a subset of the primes of density, say, {90\%}, whose arithmetic progressions {n,n+r,\dots,n+(k-1)r} of length {k} all obey the lower bound {r \gg \log^{k-1} n}. On the other hand, the prime tuples conjecture predicts that if one works with the actual primes rather than dense subsets of the primes, then one should have infinitely many length {k} arithmetic progressions of bounded width for any fixed {k}. The {k=2} case of this is precisely the celebrated theorem of Yitang Zhang that was the focus of the recently concluded Polymath8 project here. The higher {k} case is conjecturally true, but appears to be out of reach of known methods. (Using the multidimensional Selberg sieve of Maynard, one can get {m} primes inside an interval of length {O( \exp(O(m)) )}, but this is such a sparse set of primes that one would not expect to find even a progression of length three within such an interval.)

The argument in the previous paper was unable to obtain a polylogarithmic bound on the width of the progressions, due to the reliance on a certain technical “correlation condition” on a certain Selberg sieve weight {\nu}. This correlation condition required one to control arbitrarily long correlations of {\nu}, which was not compatible with a bounded value of {L} (particularly if one wanted to keep {L} independent of {\delta}).

However, thanks to recent advances in this area by Conlon, Fox, and Zhao (who introduced a very nice “densification” technique), it is now possible (in principle, at least) to delete this correlation condition from the arguments. Conlon-Fox-Zhao did this for my original theorem with Ben; and in the current paper we apply the densification method to our previous argument to similarly remove the correlation condition. This method does not fully eliminate the need to control arbitrarily long correlations, but allows most of the factors in such a long correlation to be bounded, rather than merely controlled by an unbounded weight such as {\nu}. This turns out to be significantly easier to control, although in the non-linear case we still unfortunately had to make {L} large compared to {\delta} due to a certain “clearing denominators” step arising from the complicated nature of the Gowers-type uniformity norms that we were using to control polynomial averages. We believe though that this is an artefact of our method, and one should be able to prove our theorem with an {L} that is uniform in {\delta}.

Here is a simple instance of the densification trick in action. Suppose that one wishes to establish an estimate of the form

\displaystyle  {\bf E}_n {\bf E}_r f(n) g(n+r) h(n+r^2) = o(1) \ \ \ \ \ (1)

for some real-valued functions {f,g,h} which are bounded in magnitude by a weight function {\nu}, but which are not expected to be bounded; this average will naturally arise when trying to locate the pattern {(n,n+r,n+r^2)} in a set such as the primes. Here I will be vague as to exactly what range the parameters {n,r} are being averaged over. Suppose that the factor {g} (say) has enough uniformity that one can already show a smallness bound

\displaystyle  {\bf E}_n {\bf E}_r F(n) g(n+r) H(n+r^2) = o(1) \ \ \ \ \ (2)

whenever {F, H} are bounded functions. (One should think of {F,H} as being like the indicator functions of “dense” sets, in contrast to {f,h} which are like the normalised indicator functions of “sparse” sets). The bound (2) cannot be directly applied to control (1) because of the unbounded (or “sparse”) nature of {f} and {h}. However one can “densify” {f} and {h} as follows. Since {f} is bounded in magnitude by {\nu}, we can bound the left-hand side of (1) as

\displaystyle  {\bf E}_n \nu(n) | {\bf E}_r g(n+r) h(n+r^2) |.

The weight function {\nu} will be normalised so that {{\bf E}_n \nu(n) = O(1)}, so by the Cauchy-Schwarz inequality it suffices to show that

\displaystyle  {\bf E}_n \nu(n) | {\bf E}_r g(n+r) h(n+r^2) |^2 = o(1).

The left-hand side expands as

\displaystyle  {\bf E}_n {\bf E}_r {\bf E}_s \nu(n) g(n+r) h(n+r^2) g(n+s) h(n+s^2).

Now, it turns out that after an enormous (but finite) number of applications of the Cauchy-Schwarz inequality to steadily eliminate the {g,h} factors, as well as a certain “polynomial forms condition” hypothesis on {\nu}, one can show that

\displaystyle  {\bf E}_n {\bf E}_r {\bf E}_s (\nu-1)(n) g(n+r) h(n+r^2) g(n+s) h(n+s^2) = o(1).

(Because of the polynomial shifts, this requires a method known as “PET induction”, but let me skip over this point here.) In view of this estimate, we now just need to show that

\displaystyle  {\bf E}_n {\bf E}_r {\bf E}_s g(n+r) h(n+r^2) g(n+s) h(n+s^2) = o(1).

Now we can reverse the previous steps. First, we collapse back to

\displaystyle  {\bf E}_n | {\bf E}_r g(n+r) h(n+r^2) |^2 = o(1).

One can bound {|{\bf E}_r g(n+r) h(n+r^2)|} by {{\bf E}_r \nu(n+r) \nu(n+r^2)}, which can be shown to be “bounded on average” in a suitable sense (e.g. bounded {L^4} norm) via the aforementioned polynomial forms condition. Because of this and the Hölder inequality, the above estimate is equivalent to

\displaystyle  {\bf E}_n | {\bf E}_r g(n+r) h(n+r^2) | = o(1).

By setting {F} to be the signum of {{\bf E}_r g(n+r) h(n+r^2)}, this is equivalent to

\displaystyle  {\bf E}_n {\bf E}_r F(n) g(n+r) h(n+r^2) = o(1).

This is halfway between (1) and (2); the sparsely supported function {f} has been replaced by its “densification” {F}, but we have not yet densified {h} to {H}. However, one can shift {n} by {r^2} and repeat the above arguments to achieve a similar densification of {h}, at which point one has reduced (1) to (2).


Filed under: math.NT, paper Tagged: densification, Tamar Ziegler

September 04, 2014

Clifford JohnsonMeanwhile, Somewhere Down South…

hotel_down_south_1st_Sept_2014So while at a hotel somewhere down South for a few days (pen and watercolour pencil sketch on the right), I finally found time to sit and read Graham Farmelo's book "The Strangest Man", a biography of Dirac. (It has a longer subtitle as well, but the book is way over in the next room far from my cosy spot...) You may know from reading here (or maybe even have guessed) that if I were to list a few of my favourite 20th century physicists, in terms of the work they did and their approach and temperament, Dirac would be a strong contender for being at the top of the list. I am not a fan of the loudmouth and limelight-seeking school of doing physics that seems all so popular, and I much prefer the approach of quietly chipping away at interesting (not always fashionable) problems to see what might turn up, guided by a mixture of physical intuition, aesthetics, and a bit of pattern-spotting. It works, as Dirac showed time and again. I've read a lot about Dirac over the years, and was, especially in view of the title of the book, a little wary of reading the book when I got it four years ago, as I am not a fan of going for the "weren't they weird?" approach to biographies of scientists since they serve too [...] Click to continue reading this post

ResonaancesHiggs Recap

On the occasion of summer conferences the LHC experiments dumped a large number of new Higgs results. Most of them have already been advertised on blogs, see e.g. here or here or here. In case you missed anything, here I summarize the most interesting updates of the last few weeks.

1. Mass measurements.
Both ATLAS and CMS recently presented improved measurements of the Higgs boson mass in the diphoton and 4-lepton final states. The errors shrink to 400 MeV in ATLAS and 300 MeV in CMS. The news is that Higgs has lost some weight (the boson, not Peter). A naive combination of the ATLAS and CMS results yields the central value 125.15 GeV. The profound consequence is that, for another year at least,  we will call it the 125 GeV particle, rather than the 125.5 GeV particle as before ;)

While the central values of the Higgs mass combinations quoted by ATLAS and CMS are very close, 125.36 vs 125.03 GeV, the individual inputs are still a bit apart from each other. Although the consistency of the ATLAS measurements in the diphoton and 4-lepton channels has improved, these two independent mass determinations differ by 1.5 GeV, which corresponds to a 2 sigma tension. Furthermore, the central values of the Higgs mass quoted by ATLAS and CMS differ by 1.3 GeV in the diphoton channel and by 1.1 GeV in the 4-lepton channel, which also amount to 2 sigmish discrepancies. This could be just bad luck, or maybe the systematic errors are slightly larger than the experimentalists think.
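(As a sanity check of the naive combination mentioned above, here is a minimal inverse-variance weighting of the quoted numbers, 125.36 ± 0.40 GeV and 125.03 ± 0.30 GeV, treating the errors as uncorrelated and Gaussian; a real combination is of course more involved.)

# Naive inverse-variance-weighted average of the quoted Higgs mass measurements.
# Assumes uncorrelated Gaussian errors; illustration only.
measurements = [(125.36, 0.40), (125.03, 0.30)]   # (central value [GeV], error [GeV])

weights = [1.0 / err**2 for _, err in measurements]
combined = sum(w * m for w, (m, _) in zip(weights, measurements)) / sum(weights)
error = (1.0 / sum(weights)) ** 0.5

print("combined mass = %.2f +/- %.2f GeV" % (combined, error))   # ~125.15 GeV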

2. Diphoton rate update.
CMS finally released a new value of the Higgs signal strength in the diphoton channel. This CMS measurement was a bit of a roller-coaster: initially they measured an excess, then with the full dataset they reported a small deficit. After more work and more calibration they settled on the value 1.14 +0.26/−0.23 relative to the standard model prediction, in perfect agreement with the standard model. Meanwhile ATLAS is also revising the signal strength in this channel towards the standard model value. The number 1.29±0.30 quoted on the occasion of the mass measurement is not yet the final one; there will soon be a dedicated signal strength measurement with, most likely, a slightly smaller error. Nevertheless, we can safely announce that the celebrated Higgs diphoton excess is no more.

3. Off-shell Higgs.
Most of the LHC searches are concerned with an on-shell Higgs, that is, when its 4-momentum squared is very close to its mass squared. This is where Higgs is most easily recognizable, since it can show up as a bump in invariant mass distributions. However Higgs, like any quantum particle, can also appear as a virtual particle off mass shell and influence, in a subtler way, the cross section or differential distributions of various processes. One place where an off-shell Higgs may visibly contribute is the pair production of on-shell Z bosons. In this case, the interference between the gluon-gluon → Higgs → ZZ process and the non-Higgs one-loop Standard Model contribution to the gluon-gluon → ZZ process can influence the cross section in a non-negligible way. At the beginning, these off-shell measurements were advertised as a model-independent Higgs width measurement, although now it is recognized that the "model-independent" claim does not stand. Nevertheless, measuring the ratio of off-shell and on-shell Higgs production provides qualitatively new information about the Higgs couplings and, under some specific assumptions, can be interpreted as an indirect constraint on the Higgs width. Now both ATLAS and CMS quote constraints on the Higgs width at the level of 5 times the Standard Model value. Currently, these results are not very useful in practice. Indeed, it would require a tremendous conspiracy to reconcile the current data with a Higgs width larger than 1.3 times the standard model one. But a new front has been opened, and one hopes for much more interesting results in the future.


4. Tensor structure of Higgs couplings.
Another front that is being opened as we speak is constraining higher-order Higgs couplings with a different tensor structure. So far, we have been given the so-called spin/parity measurements. That is to say, the LHC experiments imagine a 125 GeV particle with a different spin and/or parity than the Higgs, and with couplings to matter consistent with that hypothesis. Then they test whether this new particle or the standard model Higgs better describes the observed differential distributions of Higgs decay products. This has some appeal to the general public and Nobel committees but little practical meaning. That's because the current data, especially the Higgs signal strength measured in multiple channels, clearly show that the Higgs is, to a first approximation, the standard model one. New physics, if it exists, may only be a small perturbation on top of the standard model couplings. The relevant question is how well we can constrain these perturbations. For example, possible couplings of the Higgs to the Z boson are
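(Schematically, and with the normalization of the a_i coefficients varying between conventions, these take the form below, where Z_{\mu\nu} is the Z field strength and \tilde{Z}_{\mu\nu} its dual; only the first structure is present in the standard model.)

\displaystyle \mathcal{L} \supset \frac{h}{v}\left[ (1+a_1)\, m_Z^2\, Z_\mu Z^\mu + a_2\, Z_{\mu\nu} Z^{\mu\nu} + a_3\, Z_{\mu\nu} \tilde{Z}^{\mu\nu} \right]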

In the standard model only the first type of coupling is present in the Lagrangian, and all the a coefficients are zero. New heavy particles coupled to the Higgs and Z bosons could be indirectly detected by measuring non-zero a's. In particular, a3 violates the parity symmetry and could arise from mixing of the standard model Higgs with a pseudoscalar particle. The presence of non-zero a's would show up, for example, as a modification of the lepton momentum distributions in the Higgs decay to 4 leptons. This was studied by CMS in this note. What they do is not perfect yet, and the results are presented in an unnecessarily complicated fashion. In any case it's a step in the right direction: as the analysis improves and more statistics is accumulated in the next runs, these measurements will become an important probe of new physics.

5. Flavor violating decays.
In the standard model, the Higgs couplings conserve flavor, in both the quark and the lepton sectors. This is a consequence of the assumption that the theory is renormalizable and that only 1 Higgs field is present.  If either of these assumptions is violated, the Higgs boson may mediate transitions between different generations of matter. Earlier, ATLAS and CMS  searched for top quark decay to charm and Higgs. More recently, CMS turned to lepton flavor violation, searching for Higgs decays to τμ pairs. This decay cannot occur in the standard model, so the search is a clean null test. At the same time, the final state is relatively simple from the experimental point of view, thus this decay may be a sensitive probe of new physics. Amusingly, CMS sees a 2.5 sigma significant  excess corresponding to the h→τμ branching fraction of order 1%. So we can entertain a possibility that Higgs holds the key to new physics and flavor hierarchies, at least until ATLAS comes out with its own measurement.

Sean CarrollTroublesome Speech and the UIUC Boycott

Self-indulgently long post below. Short version: Steven Salaita, an associate professor of English at Virginia Tech who had been offered and accepted a faculty job at the University of Illinois Urbana-Champaign, had his offer rescinded when the administration discovered that he had posted inflammatory tweets about Israel, such as “At this point, if Netanyahu appeared on TV with a necklace made from the teeth of Palestinian children, would anybody be surprised? #Gaza.” Many professors in a number of disciplines, without necessarily agreeing with Salaita’s statements, believe strongly that academic norms give him the right to say them without putting his employment in jeopardy, and have organized a boycott of UIUC in response. Alan Sokal of NYU is supporting the boycott, and has written a petition meant specifically for science and engineering faculty, who are welcome to sign if they agree.

Everyone agrees that “free speech” is a good thing. We live in a society where individual differences are supposed to be respected, and we profess admiration for the free market of ideas, where competing claims are discussed and subjected to reasonable critique. (Thinking here of the normative claim that free speech is a good thing, not legalistic issues surrounding the First Amendment and government restrictions.) We also tend to agree that such freedom is not absolute; you don’t have the right to come into my house (or the comment section of my blog) and force me to listen to your new crackpot theory of physics. A newspaper doesn’t have an obligation to print something just because you wrote it. Biology conferences don’t feel any need to give time to young-Earth creationists. In a classroom, teachers don’t have to sit quietly if a student wants to spew blatantly racist invective (and likewise for students while teachers do so).

So there is a line to be drawn, and figuring out where to draw it isn’t an easy task. It’s not hard to defend people’s right to say things we agree with; the hard part is defending speech we disagree with. And some speech, in certain circumstances, really isn’t worth defending — organizations have the right to get rid of employees who are (for example) consistently personally abusive to their fellow workers. The hard part — and it honestly is difficult — is to distinguish between “speech that I disagree with but is worth defending” and “speech that is truly over the line.”

To complicate matters, people who disagree often become — how to put this delicately? — emotional and polemical rather than dispassionate and reasonable. People are very people-ish that way. Consequently, we are often called upon to defend speech that we not only disagree with, but whose tone and connotation we find off-putting or even offensive. Those who would squelch disagreeable speech therefore have an easy out: “I might not agree with what they said, but what I really can’t countenance is the way they said it.” If we really buy the argument that ideas should be free and rational discourse between competing viewpoints is an effective method of discovering truth and wisdom, we have to be especially willing to defend speech that is couched in downright objectionable terms.

As an academic and writer, in close cases I will almost always fall on the side of defending speech even if I disagree with it (or how it is said). Recently several different cases have illustrated just how tricky this is — but in each case I think that the people in question have been unfairly punished for things they have said.

The first case is Ashutosh Jogalekar, who writes the Curious Wavefunction blog at Scientific American. Or “wrote,” I should say, since Ash has been fired, and is now blogging independently. (Full disclosure: I don’t know Ash, but he did write a nice review of my book; my wife Jennifer is also a blogger at SciAm, although I’m not privy to any inside info.)

The offenses for which he was let go amount to three controversial posts. One was actually a guest post by Chris Martin, arguing that Larry Summers was right when he said that “innate ability” “intrinsic aptitude” (my mistake — see comments) was a major determinant of women’s underrepresentation in science and math. It’s a dispiritingly self-righteous and sloppy argument, as well as one that has been thoroughly debunked; there may very well be innate differences, but the idea that they explain underrepresentation is laughably contradicted by the data, while the existence of discrimination is frighteningly demonstrable. The next post was a positive review of Nicholas Wade’s book A Troublesome Inheritance. Again, not very defensible on the merits; Wade’s book lurches incoherently from “genetic markers objectively correlate with geographical populations” to “therefore Chinese people may be clever, but they’ll never really understand democracy.” In a great Marshall McLuhan moment, over a hundred population geneticists — many of whom Wade relied on for the “scientific” parts of his book — wrote a scathing letter to the New York Times to make sure everyone understood “there is no support from the field of population genetics for Wade’s conjectures.” Finally, a post on Richard Feynman chronicled Ash’s feelings about the physicist, from young hero-worshiper to the eventual realization that Feynman could be quite disturbingly sexist, to ultimately feeling that we should understand Feynman’s foibles in the context of his time and not let his personal failings detract from our admiration for his abilities as a scientist. I didn’t really object to this one myself; I read it as someone grappling in good faith with the contradictions of a complex human being, even if in spots it came off as offering excuses for Feynman’s bad behavior. Others disagreed.

So SciAm decided to deal with the problem by letting Jogalekar go. In my mind, a really dumb decision. I disagree very strongly with some of the stuff Ash (or his guest poster) has said, but I never thought it came close to some standard of horribleness and offensiveness that would countenance firing him. I want to be challenged by people I disagree with, not just surrounded by fellow-travelers. I didn’t find much of interest in Ash’s three controversial posts, but overall his blog was often thought-provoking and enjoyable. SciAm had every “right” to fire him, as a legal matter. They are under no obligation to stand by their employees when those employees take controversial stances. But it was still the wrong thing to do; nothing Ash said was anywhere close to falling outside the realm of reasonable things to talk about, disagree with them though I may.

It breaks my heart. In the interminable arguments about gender and IQ and genetics, a favorite strategy of people who like to promote lazy arguments in favor of genetic determinism is to bemoan their victimhood status, claiming that even asking such questions is deemed unacceptable by the liberal thought police. (Wade’s defenders, for example, eagerly jumped on a rumor that he had been fired from the Times because of his book, when the truth is he had left the paper some years earlier.) Usually I have just laughed in response, pointing out that these questions are investigated all the time; the only real danger these people face is that others point out how superficial their arguments are, not that they are punished or lose their jobs for reaching the wrong conclusions. But I was wrong, and they were right, at least to some extent. You really can lose your job for holding the wrong view of these issues. (Sometimes the attitude is completely out in the open, as in this Harvard Crimson op-ed urging that we “give up on academic freedom in favor of justice.”) As a liberal and a feminist myself, I think we should be the ones who protect speech rights most vociferously, the ones who are happy to counter arguments with which we disagree with better arguments rather than blunt instruments of punishment. It’s a difficult standard to live up to.

The second case is less specific: the growing penchant for disinviting speakers with whom we disagree. It’s been bugging me for a while, but I won’t say too much about it here, especially since Massimo Pigliucci has already done a good job. I’m not a big fan of Condoleezza Rice’s contributions to US foreign policy, and I can understand that it might be disillusioning to hear that she was scheduled to be the featured speaker at your commencement at Rutgers. But I would advocate putting up with this mild inconvenience — unless you think that conservative students should also have the right to veto commencement talks by Democratic politicians.

I’ve expressed similar feelings before, in the even more straightforward case where Larry Summers (he keeps popping up in these conversations) was disinvited from giving a talk to the Regents of the University of California. That seemed completely wrong to me — the idea apparently being that Summers, having once said incorrect things about one topic, should be prevented from speaking to any audience about any topic. At the same time, I had no problem at all with Harvard faculty working to remove Summers as President of the university after his problematic speech. The difference being that what he said had a direct bearing on his performance in the office. He clearly misunderstood the situation of many women in modern academia, especially the sciences, and that’s something a modern university president really needs to understand. And it turns out that — shockingly, I know — the number of women hired as senior faculty under Summers was in fact noticeably smaller than it had been under his predecessors. But that, of course, doesn’t mean he should have been fired from his position as a tenured professor of economics — as indeed he was not. (Although I have to say his teaching load seems pretty light.)

The third case, most recent and newsworthy, is Salaita. The specifics were listed up at the top: he was offered a position, accepted it, and had it withdrawn before he could actually move, when the administration learned about some inflammatory tweets. I am not completely conversant with all of the details of his contract — apparently the job had been offered, and he had resigned from Virginia Tech, but the step of having his contract approved by the Board of Trustees (usually a formality) hadn’t yet gone through, giving the UIUC Chancellor an opportunity to step in by deciding not to submit the appointment to the board. (I had read about the issue in various places, but special thanks to Paul Boghossian at NYU for nudging me to pay attention to it.)

There’s little question in my mind that some of Salaita’s remarks were ugly. In addition to the Netanyahu tweet at the top, he said things like “Let’s cut to the chase: If you’re defending Israel right now you’re an awful human being” and “You may be too refined to say it, but I’m not: I wish all the fucking West Bank settlers would go missing.” Statements like this don’t have anything very useful to offer in terms of rational discourse and the free market of ideas. (Even if, as always, context matters.) But I’m perfectly willing to believe that his other work has something to offer. We don’t judge academics by their least-academic utterances. And one-liners like this, as off-putting as they might be when read in isolation, shouldn’t disqualify someone from participating in the wider discourse. (Salaita has also tweeted things like “I refuse to conceptualize #Israel/#Palestine as Jewish-Arab acrimony. I am in solidarity with many Jews and in disagreement with many Arabs” and “#ISupportGaza because I believe that Jewish and Arab children are equal in the eyes of God.”)

When a professor has already been vetted and approved by a department and essentially offered a position, there should be an extremely high bar indeed for the administration to step in at the last minute and attempt to reverse the decision. It would be one thing if new evidence had come to light that indicated the person would be incompetent at their job; in Salaita’s case there was nothing of the sort, and indeed he received excellent teaching evaluations at Virginia Tech. And what would be really bad would be if administrators were making decisions for non-academic reasons, in ways that threaten to truly undermine academic freedom. That seems to be the case here. It is clear, at least, that UIUC Chancellor Phyllis Wise was directly contacted by prominent donors who threatened to stop donating if Salaita were hired.

Most worrisome of all was Chancellor Wise’s statement about the controversy, which included remarks such as this:

What we cannot and will not tolerate at the University of Illinois are personal and disrespectful words or actions that demean and abuse either viewpoints themselves or those who express them.

It’s easy to let your eyes glaze over that, but the statement itself is clear: the UIUC administration thinks it is not permissible to “demean and abuse … viewpoints themselves.” At Urbana-Champaign, you can be fired if you make fun of creationism, racism, or sexism. Those are viewpoints, and viewpoints cannot be demeaned or abused!

Perhaps Wise, or whoever drafted the statement, dashed something off and wasn’t thinking about it too carefully. If that’s their defense, it’s not much of one; these are crucially important issues for a university, which warrant some careful thought and precise formulation. And if they stand by it, a statement like this is straightforwardly antithetical to everything that universities are supposed to stand for.

I am therefore in support of the call for a boycott, until Salaita’s position is restored, even if (and in fact, especially because of the fact that) I don’t agree with his positions. I don’t really like boycotts in general — again, always preferring to err on the side of engagement rather than disengagement. I think the idea of an academic boycott of Israel is silly and counterproductive, for example. But a boycott is one of the few things that academics can do to put pressure on a university administration. And in this case there are signs it might actually be having an effect on UIUC — it seems that the Chancellor has forwarded the case to the Board of Trustees after all, although they have yet to actually vote on it. So if you are a science/engineering faculty member and inclined to support it, feel free to sign the petition. (There are also petitions organized by other disciplines, of course.)

Having said all that, it’s not like I’m very gleeful in my support of all these people with whom I disagree. It’s necessary and, I think, honorable, but also uncomfortable. Some people disagree with my own stance, of course: while the American Association of University Professors is firmly in Salaita’s corner, former AAUP president Cary Nelson takes a dissenting view.

One of the reasons why these issues are difficult is because of the emotional component I’ve already mentioned. In particular, it’s easy for me to say “You people who are being denigrated by these statements should just buck up and take it in the name of free speech and inquiry.” I have never personally had to suffer from sexism, racism, or anti-Semitism (or having my neighborhood bombed, for that matter). I’m a straight, white, upper-class, lapsed-Episcopalian Anglo-American male — in the sweepstakes of being privileged, I just about hit the jackpot. It’s no problem for me to sit back from this position of comfort and extoll the virtues of unfettered speech. The way our society is set up, it would be very difficult for anyone to write a blog post or series of tweets that would call into question my self-worth simply because of a group to which I happened to belong.

And yet, I don’t think that my position disqualifies me from having an opinion. It just means that I should try to be cognizant of my biases, be thoughtful about how other people might feel, and try especially hard to actually listen to what they have to say.

So to me, the most effective statements I’ve read on the Salaita controversy have been those from Jonathan Judaken and Bonny Honig. Here is Judaken:

As a scholar familiar with Judeophobic imagery, Salaita’s one-liner veered dangerously close to the myth of blood-libel. For a thousand years, Jews have been accused of desiring the blood of non-Jewish children. If the depiction of Netanyahu as savage and barbaric was applied to President Obama (as it has been) the racism would be patent.

Having grown up as a Jewish person in South Africa under apartheid — a dominant racial group and a religious minority — Judaken understands this language when he hears it. But here’s the thing: he supports the boycott, “on the basis of the principles of faculty governance, academic freedom, and freedom of speech.” Protecting the right of scholars to have and express unpopular opinions is too important to compromise. And here’s Honig, a political theorist at Brown:

I found that tweet painful and painfully funny. It struck home with me, a Jew raised as a Zionist. Too many of us are too committed to being uncritical of Israel. Perhaps tweets like Prof. Salaita’s, along with images of violence from Gaza and our innate sense of fair play, could wake us from our uncritical slumbers. It certainly provoked ME, and I say “provoked” in the best way – awakened to thinking.

That’s such a great statement of the true academic mindset. Provoke me! Say something I disagree with, even in an intentionally disagreeable way. Make me think, force me to re-examine my cherished presuppositions. Probably I will come out with my basic opinions intact — as an empirical matter, that’s usually what happens. But if you provoke me well, I’ll understand those presuppositions even better, and be more prepared to defend them next time. And who knows? You might even change my mind. One way or another, let the disagreements fly.

Very short version: I wish I lived in a world where I could spend my time disagreeing with people whose views I found disagreement-worthy, rather than fighting for their right to say disagreeable things.

Quantum DiariesIs the Understandability of the Universe a Mirage?

Isaac Asimov (1920 – 1992) “expressed a certain gladness at living in a century in which we finally got the basis of the universe straight”. Albert Einstein (1879 – 1955) claimed: “The most incomprehensible thing about the world is that it is comprehensible”. Indeed there is general consensus in science that not only is the universe comprehensible but that it is mostly well described by our current models. However, Daniel Kahneman counters: “Our comforting conviction that the world makes sense rests on a secure foundation: our almost unlimited ability to ignore our ignorance”.

Well, that puts a rather different perspective on Asimov’s and Einstein’s claims.  So who is this person that is raining on our parade? Kahneman is a psychologist who won the 2002 Nobel Prize in economics for his development of prospect theory. A century ago everyone quoted Sigmund Freud (1856 – 1939) to show how modern they were. Today, Kahneman seems to have assumed that role.[1]

Kahneman’s Nobel Prize winning prospect theory, developed with Amos Tversky (1937 –1996), replaced expected utility theory. The latter assumed that people made economic choices based on the expected utility of the results, that is they would behave rationally. In contrast, Kahneman and company have shown that people are irrational in well-defined and predictable ways. For example, it is understood that the phrasing of a question can (irrationally) change how people answer, even if the meaning of the question is the same.

Kahneman’s book, Thinking, Fast and Slow, really should be required reading for everyone. It explains a lot of what goes on (gives the illusion of comprehension?) and provides practical tips for thinking rationally. For example, when I was on a visit in China, the merchants would hand me a calculator to type in what I would pay for a given item. Their response to the number I typed in was always the same: You’re joking, right?  Kahneman would explain that they were trying to remove the anchor set by the first number entered in the calculator. Anchoring is a common aspect of how we think.

Since, as Kahneman argues, we are inherently irrational, one has to wonder about the general validity of the philosophic approach to knowledge, an approach based largely on rational argument. Science overcomes our inherent irrationality by constraining our rational arguments with frequent, independently repeated observations. Much as with project management, we tend to be irrationally overconfident of our ability to estimate resource requirements. Estimates of project resource requirements not constrained by real-world observations lead to projects being over budget and delivered past their deadlines. Even Kahneman was not immune to this trap of being overly optimistic.

Kahneman’s cynicism has been echoed by others. For example, H.L. Mencken (1880 –1956) said:  “The most common of all follies is to believe passionately in the palpably not true. It is the chief occupation of mankind”. Are the cynics correct? Is our belief that the universe is comprehensible, and indeed mostly understood, a mirage based on our unlimited ability to ignore our ignorance? A brief look at history would tend to support that claim.  Surely the Buddha, after having achieved enlightenment, would have expressed relief and contentment for living in a century in which we finally got the basis of the universe straight. Saint Paul, in his letters, echoes the same claim that the universe is finally understood. René Descartes, with the method laid out in the Discourse on the Method and Principles of Philosophy, would have made the same claim.  And so it goes, almost everyone down through history believes that he/she comprehends how the universe works. I wonder if the cow in the barn has the same illusion. Unfortunately, each has a different understanding of what it means to comprehend how the universe works, so it is not even possible to compare the relative validity of the different claims. The unconscious mind fits all it knows into a coherent framework that gives the illusion of comprehension in terms of what it considers important. In doing so, it assumes that what you see is all there is.  Kahneman refers to this as WYSIATI (What You See Is All There Is).

To a large extent the understandability of the universe is a mirage based on WYSIATI—our ignorance of our ignorance. We understand as much as we are aware of and capable of understanding, blissfully ignoring the rest. We do not know how quantum gravity works, if there is intelligent life elsewhere in the universe[2], or for that matter what the weather will be like next week. While our scientific models correctly describe much about the universe, they are, in the end, only models and leave much beyond their scope, including the ultimate nature of reality.

To receive a notice of future posts follow me on Twitter: @musquod.

[1] Let’s hope time is kinder to Kahneman than it was to Freud.

[2] Given our response to global warming, one can debate if there is intelligent life on earth.

John PreskillThe Graphene Effect

Spyridon Michalakis, Eryn Walsh, Benjamin Fackrell, Jackie O'Sullivan

Lunch with Spiros, Eryn, and Jackie at the Athenaeum (left to right).

Sitting and eating lunch in the room where Einstein and many others of turbo-charged, ultra-powered acumen sat and ate lunch excites me. So, I was thrilled when lunch was arranged for the teachers participating in IQIM’s Summer Research Internship at the famed Athenaeum on Caltech’s campus. Spyridon Michalakis (Spiros), Jackie O’Sullivan, Eryn Walsh and I were having lunch when I asked Spiros about one of the renowned “Millennium” problems in Mathematical Physics I heard he had solved. He told me about his 18-month epic journey (surely an extremely condensed version) to solve a problem pertaining to the quantum Hall effect. Understandably, within this journey lay many trials and tribulations, ranging from feelings of self-loathing and pessimistic resignation to the tragic disappointment that comes from realizing that a victory celebration was much ado about nothing because the solution wasn’t correct. It was an unveiling of one’s true humanity and of the lengths one can push oneself to find a solution. Three points struck me from this conversation. First, there’s a necessity for a love of the pain that tends to accompany a dogged determination to reach a solution. Secondly, the idea that a person’s humanity is exposed, at least to some degree, when accepting a challenge of this caliber and then refusing to accept failure, with an almost supernatural steadfastness towards a solution. Lastly, the quantum Hall effect. The first two on the list are ideas I often ponder as a teacher and student, and they probably lend themselves to more of a philosophical discussion, which I do find very interesting but which will not be the focus of this post.

The Yeh research group, which I gratefully have been allowed to join the last three summers, researches (among other things) different applications of graphene encompassing the growth of graphene, high efficiency graphene solar cells, graphene component fabrication and strain engineering of graphene where, coincidentally for the latter, the quantum Hall effect takes center stage. The quantum Hall effect now had my attention and I felt it necessary to learn something, anything, about this recently recurring topic. The quantum Hall effect is something I had put very little thought into and if you are like I was, you’ve heard about it, but surely couldn’t explain even the basics to someone. I now know something on the subject and, hopefully, after reading this post you too will know something about the very basics of both the classical and the quantum Hall effect, and maybe experience a spark of interest regarding graphene’s fascinating ability to display the quantum Hall effect in a magnetic field-free environment.

Let’s start at the beginning with the Hall effect. Edwin Herbert Hall discovered the appropriately named effect in 1879. The Hall element in the diagram is a flat piece of conducting metal with a longitudinal current running through it. When a magnetic field is introduced normal to the Hall element, the charge carriers moving through the Hall element experience a Lorentz force. If we think of the current as being conventional (the direction of flow of positive charge), then the electrons (negative charge carriers) are traveling in the opposite direction of the green arrow shown in the diagram. Referring to the diagram and using the right-hand rule, you can conclude that electrons build up at the long bottom edge of the Hall element running parallel to the longitudinal current, leaving an opposing positively charged edge at the long top edge of the Hall element. This separation of charge produces a transverse potential difference, labeled on the diagram as the Hall voltage (VH). Once the electric force (acting towards the positively charged edge, perpendicular to both current and magnetic field) from the charge build-up balances the Lorentz force (opposing the electric force), the result is a negative charge carrier with a straight-line trajectory in the opposite direction of the green arrow. Essentially, the Hall conductance is the longitudinal current divided by the Hall voltage.
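(A rough numerical illustration of these relations, using the textbook formula V_H = IB/(nqt); the sample parameters below are invented, copper-like values for illustration only.)

# Classical Hall effect: V_H = I * B / (n * q * t), and Hall conductance = I / V_H.
# The sample parameters below are invented, copper-like values for illustration.
I = 1.0e-3      # longitudinal current [A]
B = 0.5         # magnetic field normal to the Hall element [T]
n = 8.5e28      # charge-carrier density [1/m^3]
q = 1.602e-19   # elementary charge [C]
t = 1.0e-4      # thickness of the Hall element [m]

V_H = I * B / (n * q * t)
G_H = I / V_H

print("Hall voltage     = %.3e V" % V_H)
print("Hall conductance = %.3e S" % G_H)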

Now, let’s take a look at the quantum Hall effect. On February 5th, 1980, Klaus von Klitzing was investigating the Hall effect, in particular the Hall conductance of a two-dimensional electron gas (2DEG) at very low temperatures, around 4 kelvin (about −452 degrees Fahrenheit). von Klitzing found that when a magnetic field is applied normal to the 2DEG and the Hall conductance is graphed as a function of magnetic field strength, a staircase-looking graph emerges. The discovery, which earned von Klitzing the Nobel Prize in 1985, was as unexpected as it is intriguing. For each step in the staircase the value of the function was an integer multiple of e²/h, where e is the elementary charge and h is Planck’s constant. Since conductance is the reciprocal of resistance, we can view this data as h/(ie²). When i (the integer that labels each plateau) equals one, h/(ie²) is approximately 26,000 ohms, and it serves as a superior standard of electrical resistance used worldwide to maintain and compare the unit of resistance.
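(The plateau values are easy to reproduce from the constants; a quick check using the values of e and h:)

# Quantum Hall resistance plateaus h / (i * e^2); i = 1 gives the von Klitzing
# constant, roughly the 26,000 ohms mentioned above.
e = 1.602176634e-19   # elementary charge [C]
h = 6.62607015e-34    # Planck's constant [J s]

for i in range(1, 4):
    print("i = %d: h/(i e^2) = %.1f ohms" % (i, h / (i * e**2)))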

Before discussing where graphene and the quantum Hall effect cross paths, let’s examine some extraordinary characteristics of graphene. Graphene is truly an amazing material for many reasons. We’ll look at size and scale things up a bit for fun. Graphene is one carbon atom thick: 0.345 nanometers (0.000000000345 meters). Envision a one square centimeter sized graphene sheet, which is now regularly grown. Imagine, somehow, we could thicken the monolayer graphene sheet to the thickness of a piece of printer paper (0.1 mm) while appropriately scaling up the area coverage. The graphene sheet that originally covered only one square centimeter would now cover an area of about 2900 meters by 2900 meters, or roughly 1.8 miles by 1.8 miles: a paper-thin sheet covering just over 3 square miles. The Royal Swedish Academy of Sciences at nobelprize.org has an interesting way of scaling the tiny up to everyday experience. They want you to picture a one square meter hammock made of graphene suspending a 4 kg cat, which represents the maximum weight such a sheet of graphene could support. The hammock would be nearly invisible, would weigh as much as one of the cat’s whiskers, and incredibly, would possess the strength to keep the cat suspended. If it were possible to make the same hammock out of steel, its maximum load would be less than 1/100 the weight of the cat. Graphene is more than 100 times stronger than the strongest steel!
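(The scaling in that thought experiment is a one-line calculation from the numbers quoted above:)

# Scale a 1 cm x 1 cm graphene monolayer so its thickness matches printer paper.
t_graphene = 0.345e-9   # thickness of graphene [m]
t_paper    = 0.1e-3     # thickness of printer paper [m]
side       = 0.01       # original side length [m]

scale = t_paper / t_graphene            # ~290,000
side_m = side * scale                   # ~2900 m
side_miles = side_m / 1609.34           # ~1.8 miles

print("scale factor ~ %.0f" % scale)
print("scaled sheet ~ %.0f m (%.1f miles) on a side" % (side_m, side_miles))
print("covered area ~ %.1f square miles" % side_miles**2)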

Graphene sheets possess many fascinating characteristics certainly not limited to mere size and strength. Experiments are being conducted at Caltech to study the electrical properties of graphene when draped over a field of gold nanoparticles; a discipline appropriately termed “strain engineering.” The peaks and valleys that form create strain in the graphene sheet, changing its electrical properties. The greater the curvature of the graphene over the peaks, the greater the strain. The electrons in graphene in regions experiencing strain behave as if they are in a magnetic field despite the fact that they are not. The electrons in regions experiencing the greatest strain behave as they would in extremely strong magnetic fields exceeding 300 tesla. For some perspective, the largest magnetic field ever created has been near 100 tesla and it only lasted for a few milliseconds. Additionally, graphene sheets under strain experience conductance plateaus very similar to those observed in the quantum Hall effect. This allows for great control of electrical properties by simply deforming the graphene sheet, effectively changing the amount of strain. The pseudo-magnetic field generated at room temperature by mere deformation of graphene is an extremely promising and exotic property that is bound to make graphene a key component in a plethora of future technologies.

Graphene and its incredibly fascinating properties make it very difficult to think of an area of technology where it won’t have a huge impact once incorporated. Caltech is at the forefront in research and development for graphene component fabrication, as well as the many aspects involved in the growth of high quality graphene. This summer I was involved in the latter and contributed a bit in setting up an experiment that will attempt to grow graphene in a unique way. My contribution included the set-up of the stepper motor (pictured to the right) and its controls, so that it would very slowly travel down the tube in an attempt to grow a long strip of graphene. If Caltech scientist David Boyd and graduate student Chen-Chih Hsu are able to grow the long strips of graphene, this will mark yet another landmark achievement for them and Caltech in graphene research, bringing all of us closer to technologies such as flexible electronics, synthetic nerve cells, 500-mile range Tesla cars and batteries that allow us to stream Netflix on smartphones for weeks on end.


Jordan EllenbergSqueeze! Squeeze!

I hope the world never runs out of awesome Earl Weaver stories.

I saw Earl Weaver put on a suicide squeeze bunt, in Milwaukee. It worked. Everybody asked him, ‘Wait, we thought you told us you didn’t even have a sign for a suicide squeeze, because you hated it so much.’ Earl said, ‘I still don’t.’ I asked him, ‘How did you put it on then?’ He said, ‘I whistled at Cal Ripken, Sr., my third base coach. Then I shouted at him, ‘Squeeze! Squeeze!’ Then I motioned a bunt.’ I said, ‘Paul Molitor was playing third. Didn’t he hear you?’ Earl said, ‘If he did, I’m sure he thought there was no way we were putting it on, or I wouldn’t have been yelling for it.’

This is from the Fangraphs interview with the greatest announcer of our time, Jon Miller. His memoir, Confessions of a Baseball Purist, is full of great stuff like this. I didn’t know until just this second that it had been reissued by Johns Hopkins University Press.


September 03, 2014

Tim GowersICM2014 — Hairer laudatio

I haven’t kept up anything like the frequency of posts at this ICM that I managed at the last one. There are at least three reasons for this. One is that I was in the middle of writing up a result, so I devoted some of my rare free moments to that. Another is that the schedule was so full of good talks that I hardly skipped any sessions. And the third is that on the last day I was taken ill: I won’t go into too much detail, but let’s say that what I had sort of rhymed with “Korea”, but also left me feeling fairly terrible. So I didn’t much enjoy the conference banquet — at least from the food point of view — and then the next day, which I can’t quite believe was actually yesterday, when I got up at 5am in order to catch the bus from the hotel to the airport in time for my 9:30 flight back to Paris, I felt sufficiently terrible that I wasn’t quite sure how I would get through the 11-hour flight, four-hour stopover in Paris and four-and-a-half-hour train journey from Paris to Béziers.

I was rescued by an extraordinary piece of luck. When I got to the gate with my boarding card, the woman who took it from me tore it up and gave me another one, curtly informing me that I had been upgraded. I have no idea why. I wonder whether it had anything to do with the fact that in order to avoid standing any longer than necessary I waited until almost the end before boarding. But perhaps the decision had been made well before that: I have no idea how these things work. Anyhow, it meant that I could make my seat pretty well horizontal and I slept for quite a lot of the journey. Unfortunately, I wasn’t feeling well enough to make full use of all the perks, one of which was a bar where one could ask for single malt whisky. I didn’t have any alcohol or coffee and only picked at my food. I also didn’t watch a single film or do any work. If I’d been feeling OK, the day would have been very different. However, perhaps the fact that I wasn’t feeling OK meant that the difference it made to me to be in business class was actually greater than it would have been otherwise. I rather like that way of looking at it.

An amusing thing happened when we landed in Paris. We landed out on the tarmac and were met by buses. They let the classy people off first (even we business-class people had to wait for the first-class people, just in case we got above ourselves), so that they wouldn’t have to share a bus with the riff raff. One reason I had been pleased to be travelling business class was that it meant that I had after all got to experience the top floor of an Airbus 380. But when I turned round to look, there was only one row of windows, and then I saw that it had been a Boeing 777. Oh well. It was operated by Air France. I’ve forgotten the right phrase: something like “shared code”. A number of little anomalies resolved themselves, such as that that take-off didn’t feel like the one in Paris, that the slope of the walls didn’t seem quite correct if we were on the top floor, etc.

I thought that as an experiment I would see what I could remember about the laudatio for Martin Hairer without the notes I took, and then after that I would see how much more there was to say with the notes. So here goes. The laudatio was given by Ofer Zeitouni, one of the people on the Fields Medal committee. Early on, he made a link with what Ghys had said about Avila, by saying that Hairer too studied situations where physicists don’t know what the equation is. However, these situations were somewhat different: instead of studying typical dynamical systems, Hairer studied stochastic PDEs. As I understand it, an important class of stochastic PDEs is conventional PDEs with a noise term added, which is often some kind of Brownian motion term.

Unfortunately, Brownian motion can’t be differentiated, but that isn’t by itself a huge problem because it can be differentiated if you allow yourself to work with distributions. However, while distributions are great for many purposes, there are certain things you can’t do with them — notably multiply them together.

Hairer looked at a stochastic PDE that modelled a physical situation that gives rise to a complicated fractal boundary between two regions. I think the phrase “interface dynamics” may have been one of the buzz phrases here. The naive approach to this stochastic PDE led quickly to the need to multiply two distributions together, so it didn’t work. So Hairer added a “mollifier” — that is, he smoothed the noise slightly. Associated with this mollifier was a parameter \epsilon: the smaller \epsilon was, the less smoothing took place. So he then solved the smoothed system, let \epsilon tend to zero, showed that the smoothed solutions tended to a limit, and defined that limit to be the solution of the original equation.
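(A cartoon of the mollification idea, purely as a numerical illustration of smoothing rough noise at a scale \epsilon and nothing specific to Hairer's construction: convolve white-noise samples with a narrow bump of width \epsilon, and watch the smoothed signal get rougher as \epsilon shrinks.)

# Cartoon of mollifying white noise at scale epsilon: convolve rough noise with
# a narrow normalised bump. Illustration only; nothing here is Hairer-specific.
import numpy as np

def mollified_noise(n_points, epsilon, dx=1e-3, seed=0):
    rng = np.random.default_rng(seed)
    xi = rng.normal(0.0, 1.0 / np.sqrt(dx), size=n_points)   # rough white-noise samples
    width = max(int(epsilon / dx), 1)
    bump = np.hanning(2 * width + 1)
    bump /= bump.sum()                                        # the mollifier, integral 1
    return np.convolve(xi, bump, mode="same")                 # xi_epsilon

for eps in (0.1, 0.01, 0.001):
    print("epsilon = %g: std of mollified noise = %.1f" % (eps, mollified_noise(10000, eps).std()))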

The way I’ve described it, that sounds like a fairly obvious thing to do, so what was so good about it?

A first answer is that in this particular case it was far from obvious that the smoothed solutions really did tend to a limit. In order to show this, it was necessary to do a renormalization (another thematic link with Avila), which involved subtracting a constant C_\epsilon. The only other thing I remember was that the proof also involved something a bit like a Taylor expansion, but that a key insight of Hairer was that instead of expanding with respect to a fixed basis of functions, one should let the basis of functions depend on the function one was expanding — or something like that anyway.

I was left with the feeling that a lot of people are very excited about what Hairer has done, because with his new theoretical framework he has managed to go a long way beyond what people thought was possible.

OK, now let me look at the notes and see whether I want to add anything.

My memory seems to have served me quite well. Here are a couple of extra details. An important one is that Zeitouni opened with a brief summary of Hairer’s major contributions, which makes them sound like much more than a clever trick to deal with one particular troublesome stochastic PDE. These were

1. a theory of regularity structures, and

2. a theory of ergodicity for infinite-dimensional systems.

I don’t know how those two relate to the solution of the differential equation, which, by the way, is called the KPZ equation, and is the following.

\partial_t h = \Delta h + (\partial_x h)^2 + \xi.

It models the evolution of interfaces. (So maybe “interface dynamics” was not after all the buzz phrase.)

When I said that the noise was Brownian, I should have said that the noise was completely uncorrelated in time, and therefore makes no sense pointwise, but it integrates to Brownian motion.

The mollifiers are functions \xi_\epsilon that replace the noise term \xi. The constants C_\epsilon I mentioned earlier depend on your choice of mollifier, but the limit doesn’t (which is obviously very important).

What Zeitouni actually said about Taylor expansion was that one should measure smoothness by expansions that are tailored (his word not mine) to the equation, rather than with respect to a universal basis. This was a key insight of Hairer.

One of the major tools introduced by Hairer is a generalization of something called rough-path theory, due to Terry Lyons. Another is his renormalization procedure.

Zeitouni summarized by saying that Hairer had invented new methods for defining solutions to PDEs driven by rough noise, and that these methods were robust with respect to mollification. He also said something about quantitative behaviour of solutions.

If you find that account a little vague and unsatisfactory, bear in mind that my aim here is not to give the clearest possible presentation of Hairer’s work, but rather to discuss what it was like to be at the ICM, and in particular to attend this laudatio. One doesn’t usually expect to come out of a maths talk understanding it so well that one could give the same talk oneself. As I’ve mentioned in another post, there are some very good accounts of the work of all the prizewinners here. (To see them, follow the link and then follow further links to press releases.)

Update: if you want to appreciate some of these ideas more fully, then here is a very nice blog post: it doesn’t say much more about Hairer’s work, but it does a much better job than this post of setting his work in context.


Mark Chu-CarrollGÖDEL PART 4: The Payoff

After a bit of a technical delay, it’s time to finish the repost of incompleteness! Finally, we’re at the end of our walkthrough of Gödel’s great incompleteness proof. As a refresher, the basic proof sketch is:

  1. Take a simple logic. We’ve been using a variant of the Principia Mathematica’s logic, because that’s what Gödel used.
  2. Show that any statement in the logic can be encoded as a number using an arithmetic process based on the syntax of the logic. The process of encoding statements numerically is called Gödel numbering.
  3. Show that you can express meta-mathematical properties of logical statements in terms of arithmetic properties of their Gödel numbers. In particular, we need to build up the logical infrastructure that we need to talk about whether or not a statement is provable.
  4. Using meta-mathematical properties, show how you can create an unprovable statement encoded as a Gödel number.

What came before:

  1. Gödel numbering: The logic of the Principia, and how to encode it as numbers. This covered steps 1 and 2 in the sketch.
  2. Arithmetic Properties: what it means to say that a property can be expressed arithmetically. This set the groundwork for step 3 in the proof sketch.
  3. Encoding meta-math arithmetically: how to take meta-mathematical properties of logical statements, and define them as arithmetic properties of the Gödel numberings of the statements. This was step 3 proper.

So now we can move on to the final step, where we actually see why mathematical logic is necessarily incomplete.

What I did in the last post was walk through a very laborious process that showed how we could express meta-mathematical properties of logical statements as primitive recursive functions and relations. Using that, we were able to express a non-primitive-recursive predicate provable, which is true for a particular number if and only if that number is the Gödel number representation of a statement which is provable.

pred provable(x) = {
  some y {
    proofFor(y, x)
  }
}

The reason for going through all of that was that we really needed to show how we could capture all of the necessary properties of logical statements in terms of arithmetic properties of their Gödel numbers.

Now we can get to the target of Gödel’s effort. What Gödel was trying to do was show how to defeat the careful stratification of the Principia’s logic. In the Principia, Russell and Whitehead had tried to avoid problems with self-reference by creating a very strict type-theoretic stratification, where each variable or predicate had a numeric level, and could only reason about objects from lower levels. So if natural numbers were the primitive objects in the domain being reasoned about, then level-1 objects would be things like specific natural numbers, and level-1 predicates could reason about specific natural numbers, but not about sets of natural numbers or predicates over the natural numbers. Level-2 objects would be sets of natural numbers, and level-2 predicates could reason about natural numbers and sets of natural numbers, but not about predicates over sets of natural numbers, or sets of sets of natural numbers. Level-3 objects would be sets of sets of natural numbers… and so on.

The point of this stratification was to make self-reference impossible. You couldn’t make a statement of the form “This predicate is true”: the predicate would be a level-N predicate, and only a level N+1 predicate could reason about a level-N predicate.

What Gödel did in the arithmetic process we went through in the last post is embed a model of logical statements in the natural numbers. That’s the real trick: the logic of the Principia is designed to work with a collection of objects that are a model of the natural numbers. By embedding a model of logical statements in the natural numbers, he made it possible for a level-1 predicate (a predicate about a specific natural number) to reason about any logical statement or object. A level-1 predicate can now reason about a level-7 object! A level-1 predicate can reason about the set defined by a level-1 predicate: a level-1 predicate can reason about itself! A level-1 predicate can now reason about any logical statement at all – itself, a level-2 predicate, or a level-27 predicate. Gödel found a way to break the stratification.

Now, we can finally start getting to the point of all of this: incompleteness! We’re going to use our newfound ability to nest logical statements into numbers to construct an unprovable true statement.

In the last post, one of the meta-mathematical properties that we defined for the Gödel-numbered logic was immConseq, which defines when some statement x is an immediate consequence of a set of statements S. As a reminder, that means that x can be inferred from statements in S in one inference step.

We can use that property to define what it means to be a consequence of a set of statements: it’s the closure of immediate consequence. We can define it in pseudo-code as:

def conseq(κ) = {
  K = κ + axioms
  added_to_k = false
  do {
    added_to_k = false
    for all c in immConseq(K) {
      if c not in K {
        add c to K
        added_to_k = true
      }
    }
  } while added_to_k
  return K
}

In other words, Conseq(κ) is the complete set of everything that can possibly be inferred from the statements in κ and the axioms of the system. We can say that there’s a proof for a statement x in κ if and only if x ∈ Conseq(κ).
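
For what it’s worth, here is a small runnable rendering of that pseudo-code in Python (my own sketch, not anything from Gödel or from the earlier posts): statements are opaque Python values, and the caller supplies the axioms and an immConseq-style function.

def conseq(kappa, axioms, imm_conseq):
    # Close kappa plus the axioms under immediate consequence until
    # nothing new appears (a least fixed point). Terminates only when
    # the closure happens to be finite, which is enough to convey the idea.
    K = set(kappa) | set(axioms)
    added = True
    while added:
        added = False
        for c in imm_conseq(K):
            if c not in K:
                K.add(c)
                added = True
    return K

# Toy usage: statements are strings, implications are ('->', a, b) tuples,
# and imm_conseq applies one round of modus ponens to the current set.
def modus_ponens(K):
    return {b for s in K if isinstance(s, tuple) and s[0] == '->'
            for (_, a, b) in [s] if a in K}

facts = {'p', ('->', 'p', 'q'), ('->', 'q', 'r')}
print(conseq(facts, axioms=set(), imm_conseq=modus_ponens))  # includes 'q' and 'r'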

We can take the idea of Conseq and use it to define a stronger version of what it means for a logical system with a set of facts to be consistent. A system is ω-consistent if and only if there is no statement a such that: a ∈ Conseq(κ) ∧ not(forall(v, a)) ∈ Conseq(κ).

In other words, the system is ω-consistent as long as it’s never the case that both a statement and the negation of its universal closure are consequences of κ. That’s a slightly stronger requirement than plain consistency. But for our purposes, we can treat it as being pretty much the same thing. (Yes, that’s a bit hand-wavy, but I’m not trying to write an entire book about Gödel here!)

(Gödel’s version of the definition of ω-consistency is harder to read than this, because he’s very explicit about the fact that Conseq is a property of the numbers. I’m willing to fuzz that, because we’ve shown that the statements and the numbers are interchangeable.)

Using the definition of ω-consistency, we can finally get to the actual statement of the incompleteness theorem!

Gödel’s First Incompleteness Theorem: For every ω-consistent primitive recursive set κ of formulae, there is a primitive-recursive predicate r(v) such that neither forall(v, r) nor not(forall(v, r)) is provable.

To prove that, we’ll construct the predicate r.

First, we need to define a version of our earlier isProofFigure that’s specific to the set of statements κ:

pred isProofFigureWithKappa(x, kappa) = {
  all n in 1 to length(x) {
    isAxiom(item(n, x)) or
    item(n, x) in kappa or
    some p in 0 to n {
      some q in 0 to n {
        immConseq(item(n, x), item(p, x), item(q, x))
      }
    }
  } and length(x) > 0
}

This is the same as the earlier definition – just specialized so that it ensures that every statement in the proof figure is either an axiom, or a member of κ.
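
Rendered as runnable Python (again my own sketch, with the proof figure given as a plain list and is_axiom and imm_conseq supplied by the caller), the check looks like this:

def is_proof_figure_with_kappa(figure, kappa, is_axiom, imm_conseq):
    # figure is a list of statements; each entry must be an axiom, a member
    # of kappa, or an immediate consequence of two earlier entries.
    if not figure:
        return False
    for n, stmt in enumerate(figure):
        justified = (
            is_axiom(stmt)
            or stmt in kappa
            or any(imm_conseq(stmt, figure[p], figure[q])
                   for p in range(n) for q in range(n))
        )
        if not justified:
            return False
    return True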

We can do the same thing to specialize the predicates proofFor and provable:

pred proofForStatementWithKappa(x, y, kappa) = {
  isProofFigureWithKappa(x, kappa) and
  item(length(x), x) = y
}

pred provableWithKappa(x, kappa) = {
  some y {
    proofForStatementWithKappa(y, x, kappa)
  }
}

If κ is the set of basic truths that we can work with, then provableWithKappa is equivalent to provable.

Now, we can define a predicate NotAProofWithKappa:

pred NotAProofWithKappa(x, y, kappa) = {
  not (proofForStatementWithKappa(x, subst(y, 19, number(y)), kappa))
}

Based on everything that we’ve done so far, NotAProofWithKappa is primitive recursive.

This is tricky, but it’s really important. We’re getting very close to the goal, and it’s subtle, so let’s take the time to understand this.

  • Remember that in a Gödel numbering, each prime number is a variable. So 19 here is just the name of a free variable in y.
  • Using the Principia’s logic, the fact that variable 19 is free means that the statement is parametric in variable 19. For the moment, it’s an incomplete statement, because it’s got an unbound parameter.
  • What we’re doing in NotAProofWithKappa is substituting the numeric coding of y for the value of y’s parameter. When that’s done, y is no longer incomplete: its unbound variable has been replaced by a binding.
  • With that substitution, NotAProofWithKappa(x, y, kappa) is true when x is not a proof of y(y).

What NotAProofWithKappa does is give us a way to check whether a specific sequence of statements x is not a proof of y.
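
The subst(y, 19, number(y)) move above is the entire engine of self-reference, and it can feel slippery. Here is a toy illustration using Python strings in place of Gödel numbers; it is obviously not the real construction, just the shape of the trick: take a template with one free slot and plug a quoted copy of the template into its own slot.

# A formula with variable 19 free is played here by a template with one slot.
template = "no sequence of statements is a proof of: {y}"

# The analogue of subst(y, 19, number(y)): fill the slot with a quoted copy
# of the template itself. str.format inserts the value literally.
diagonal = template.format(y=repr(template))
print(diagonal)
# no sequence of statements is a proof of: 'no sequence of statements is a proof of: {y}'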

We want to expand NotAProofWithKappa to something universal. Instead of just saying that a specific sequence of statements x isn’t a proof for y, we want to be able to say that no possible sequence of statements is a proof for y. That’s easy to do in logic: you just wrap the statement in a “∀ a ( )”. In Gödel numbering, we defined a function that does exactly that. So the universal form of unprovability is: ∀ a (NotAProofWithKappa(a, y, kappa)).

In terms of the Gödel numbering, if we assume that the Gödel number for the variable a is 17, and the variable y is numbered as 19, we’re talking about the statement p = forall(17, NotAProofWithKappa(17, 19, kappa)).

p is the statement that, for a given logical statement (the value of variable 19, or y in our definition), there is no possible value of variable 17 (a) such that a is a proof of y(y) in κ.

All we need to do now is show that we can make p become self-referential. No problem: we can just put number(p) in as the value of y. If we let q be the Gödel number of the body NotAProofWithKappa(a, y, kappa) (so that p = forall(17, q)), then:

r = subst(q, 19, number(p))

i = subst(p, 19, number(p)) = forall(17, r)

i says that there is no possible value of a that proves subst(p, 19, number(p)). But that statement is i itself: in other words, i asserts its own unprovability. The statement says of itself that it has no proof!

This is what we’ve been trying to get at all this time: self-reference! We’ve got a statement, i, which expresses a property of itself. Worse, it expresses a negative property of itself!

Now we’re faced with two possible choices. Either i is provable – in which case, κ is inconsistent! Or else i is unprovable – in which case κ is incomplete, because we’ve identified a true statement that can’t be proven!

That’s it: we’ve shown that in the Principia’s logic, using nothing but arithmetic, we can create a true statement that cannot be proven. If, somehow, it were to be proven, the entire logic would be inconsistent. So the Principia’s logic is incomplete: there are true statements that cannot be proven true.

We can go a bit further: the process that we used to produce this result about the Principia’s logic is actually applicable to other logics. There’s no magic here: if your logic is powerful enough to do Peano arithmetic, you can use the same trick that we demonstrated here, and show that the logic must be either incomplete or inconsistent. (Gödel proved this formally, but we’ll just handwave it.)

Looking at this with modern eyes, it doesn’t seem quite as profound as it did back in Gödel’s day.

When we look at it through the lens of today, what we see is that in the Principia’s logic, proof is a mechanical process: a computation. If every true statement was provable, then you could take any statement S, and write a program to search for a proof of either S or ¬ S, and eventually, that program would find one or the other, and stop.
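
Spelled out as code, that hypothetical decision procedure would look something like the sketch below, where is_proof_of and nth_proof_figure stand in for the mechanical proof checker and an enumeration of all finite proof figures (neither is implemented here):

from itertools import count

def decide(statement, negation, is_proof_of, nth_proof_figure):
    # Enumerate every finite candidate proof figure and check it mechanically.
    # If the logic were complete (and consistent), one of the two checks
    # would eventually succeed, so the loop would always halt.
    for n in count():
        candidate = nth_proof_figure(n)
        if is_proof_of(candidate, statement):
            return True       # found a proof of S
        if is_proof_of(candidate, negation):
            return False      # found a proof of not-S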

In short, you’d be able to solve the halting problem. The proof of the halting problem is really an amazingly profound thing: on a very deep level, it’s the same thing as incompleteness, only it’s easier to understand.

But at the time that Gödel was working, Turing hadn’t written his paper about the halting problem. Incompleteness was published in 1931; Turing’s halting paper was published in 1936. Incompleteness was a totally unprecedented idea when it was published. Gödel produced one of the most profound and surprising results in the entire history of mathematics, showing that the efforts of the best mathematicians in the world to produce the perfection of mathematics were completely futile.

September 02, 2014

n-Category Café Why It Matters

One interesting feature of the Category Theory conference in Cambridge last month was that lots of the other participants started conversations with me about the whole-population, suspicionless surveillance that several governments are now operating. All but one were enthusiastically supportive of the work I’ve been doing to try to get the mathematical community to take responsibility for its part in this, and I appreciated that very much.

The remaining one was a friend who wasn’t unsupportive, but said to me something like “I think I probably agree with you, but I’m not sure. I don’t see why it matters. Persuade me!”

Here’s what I replied.

“A lot of people know now that the intelligence agencies are keeping records of almost all their communications, but they can’t bring themselves to get worked up about it. And in a way, they might be right. If you, personally, keep your head down, if you never do anything that upsets anyone in power, it’s unlikely that your records will end up being used against you.

“But that’s a really self-centred attitude. What about people who don’t keep their heads down? What about protesters, campaigners, activists, people who challenge the establishment — people who exercise their full democratic rights? Freedom from harassment shouldn’t depend on you being a quiet little citizen.

“There’s a long history of intelligence agencies using their powers to disrupt legitimate activism. The FBI recorded some of Martin Luther King’s extramarital liaisons and sent the tape to his family home, accompanied by a letter attempting to blackmail him into suicide. And there have been many many examples since then (see below).

“Here’s the kind of situation that worries me today. In the UK, there’s a lot of debate at the moment about the oil extraction technique known as fracking. The government has just given permission for the oil industry to use it, and environmental groups have been protesting vigorously.

“I don’t have strong opinions on fracking myself, but I do think people should be free to organize and protest against it without state harassment. In fact, the state should be supporting people in the exercise of their democratic rights. But actually, any anti-fracking group would be sensible to assume that it’s the object of covert surveillance, and that the police are working against it, perhaps by employing infiltrators — because they’ve been doing that to other environmental groups for years.

“It’s the easiest thing in the world for politicians to portray anti-fracking activists as a danger to the UK’s economic well-being, as a threat to national energy security. That’s virtually terrorism! And once someone’s been labelled with the T word, it immediately becomes trivial to justify using all that surveillance data that the intelligence agencies routinely gather. And I’m not exaggerating — anti-terrorism laws really have been used against environmental campaigners in the recent past.

“Or think about gay rights. Less than fifty years ago, sex between men in England was illegal. This law was enforced, and it ruined people’s lives. For instance, my academic great-grandfather Alan Turing was arrested under this law and punished with chemical castration. He’s widely thought to have killed himself as a direct result. But today, two men in England can not only have sex legally, they can marry with the full endorsement of the state.

“How did this change so fast? Not by people writing polite letters to the Times, or by going through official parliamentary channels (at least, not only by those means). It was mainly through decades of tough, sometimes dangerous, agitation, campaigning and protest, by small groups and by courageous individual citizens.

“By definition, anyone campaigning for anything to be decriminalized is siding with criminals against the establishment. It’s the easiest thing in the world for politicians to portray campaigners like this as a menace to society, a grave threat to law and order. Any nation state with the ability to monitor, infiltrate, harass and disrupt such “menaces” will be very sorely tempted to use it. And again, that’s no exaggeration: in the US at least, this has happened to gay rights campaigners over and over again, from the 1950s to nearly the present day, even sometimes — ludicrously — in the name of fighting terrorism (1, 2, 3, 4).

“So government surveillance should matter to you in a very direct way if you’re involved in any kind of activism or advocacy or protest or campaigning or dissent. It should also matter to you if you’re not, but you quietly support any of this activism — or if you reap its benefits. Even if you don’t (which is unlikely), it matters if you simply want to live in a society where people can engage in peaceful activism without facing disruption or harassment by the state. And it matters more now than it ever did before, because government surveillance powers are orders of magnitude greater than they’ve ever been before.”


That’s roughly what I said. I think we then talked a bit about mathematicians’ role in enabling whole-population surveillance. Here’s Thomas Hales’s take on this:

If privacy disappears from the face of the Earth, mathematicians will be some of the primary culprits.

Of course, there are lots of other reasons why the activities of the NSA, GCHQ and their partners might matter to you. Maybe you object to industrial espionage being carried out in the name of national security, or the NSA supplying data to the CIA’s drone assassination programme (“we track ‘em, you whack ‘em”), or the raw content of communications between Americans being passed en masse to Israel, or the NSA hacking purely civilian infrastructure in China, or government agencies intercepting lawyer-client and journalist-source communications, or that the existence of mass surveillance leads inevitably to self-censorship. Or maybe you simply object to being watched, for the same reason you close the bathroom door: you’re not doing anything to be ashamed of, you just want some privacy. But the activism point is the one that resonates most deeply with me personally, and it seemed to resonate with my friend too.

You may think I’m exaggerating or scaremongering — that the enormous power wielded by the US and UK intelligence agencies (among others) could theoretically be used against legitimate citizen activism, but hasn’t been so far.

There’s certainly an abstract argument against this: it’s simply human nature that if you have a given surveillance power available to you, and the motive to use it, and the means to use it without it being known that you’ve done so, then you very likely will. Even if (for some reason) you believe that those currently wielding these powers have superhuman powers of self-restraint, there’s no guarantee that those who wield them in future will be equally saintly.

But much more importantly, there’s copious historical evidence that governments routinely use whatever surveillance powers they possess against whoever they see as troublemakers, even if this breaks the law. Without great effort, I found 50 examples in the US and UK alone — read on.

Six overviews

If you’re going to read just one thing on government surveillance of activists, I suggest you make it this:

Among many other interesting points, it reminds us that this isn’t only about “leftist” activism — three of the plaintiffs in this case are pro-gun organizations.

Here are some other good overviews:

And here’s a short but incisive comment from journalist Murtaza Hussain.

50 episodes of government surveillance of activists

Disclaimer   Journalism about the activities of highly secretive organizations is, by its nature, very difficult. Even obtaining the basic facts can be a major feat. Obviously, I can’t attest to the accuracy of all these articles — and the entries in the list below are summaries of the articles linked to, not claims I’m making myself. As ever, whether you believe what you read is a judgement you’ll have to make for yourself.

1940s

1. FBI surveillance of War Resisters League (1, 2), continuing in 2010 (1)

1950s

2. FBI surveillance of the National Association for the Advancement of Colored People (1)

3. FBI “surveillance program against homosexuals” (1)

1960s

4. FBI’s Sex Deviate programme (1)

5. FBI’s Cointelpro projects aimed at “surveying, infiltrating, discrediting, and disrupting domestic political organizations”, and NSA’s Project Minaret targeted leading critics of Vietnam war, including senators, civil rights leaders and journalists (1)

6. FBI attempted to blackmail Martin Luther King into suicide with surveillance tape (1)

7. NSA intercepted communications of antiwar activists, including Jane Fonda and Dr Benjamin Spock (1)

8. Harassment of California student movement (including Stephen Smale’s free speech advocacy) by FBI, with support of Ronald Reagan (1, 2)

1970s

9. FBI surveillance and attempted deportation of John Lennon (1)

10. FBI burgled the office of the psychiatrist of Pentagon Papers whistleblower Daniel Ellsberg (1)

1980s

11. Margaret Thatcher had the Canadian national intelligence agency CSEC surveil two of her own ministers (1, 2, 3)

12. MI5 tapped phone of founder of Women for World Disarmament (1)

13. Ronald Reagan had the NSA tap the phone of congressman Michael Barnes, who opposed Reagan’s Central America policy (1)

1990s

14. NSA surveillance of Greenpeace (1)

15. UK police’s “undercover work against political activists” and “subversives”, including future home secretary Jack Straw (1)

16. UK undercover policeman Peter Francis “undermined the campaign of a family who wanted justice over the death of a boxing instructor who was struck on the head by a police baton” (1)

17. UK undercover police secretly gathered intelligence on 18 grieving families fighting to get justice from police (1, 2)

18. UK undercover police spied on lawyer for family of murdered black teenager Stephen Lawrence; police also secretly recorded friend of Lawrence and his lawyer (1, 2)

19. UK undercover police spied on human rights lawyers Bindmans (1)

20. GCHQ accused of spying on Scottish trade unions (1)

2000s

21. US military spied on gay rights groups opposing “don’t ask, don’t tell” (1)

22. Maryland State Police monitored nonviolent gay rights groups as terrorist threat (1)

23. NSA monitored email of American citizen Faisal Gill, including while he was running as Republican candidate for Virginia House of Delegates (1)

24. NSA surveillance of Rutgers professor Hooshang Amirahmadi and ex-California State professor Agha Saeed (1)

25. NSA tapped attorney-client conversations of American lawyer Asim Ghafoor (1)

26. NSA spied on American citizen Nihad Awad, executive director of the Council on American-Islamic Relations, the USA’s largest Muslim civil rights organization (1)

27. NSA analyst read personal email account of Bill Clinton (date unknown) (1)

28. Pentagon counterintelligence unit CIFA monitored peaceful antiwar activists (1)

29. Green party peer and London assembly member Jenny Jones was monitored and put on secret police database of “domestic extremists” (1, 2)

30. MI5 and UK police bugged member of parliament Sadiq Khan (1, 2)

31. Food Not Bombs (volunteer movement giving out free food and protesting against war and poverty) labelled as terrorist group and infiltrated by FBI (1, 2, 3)

32. Undercover London police infiltrated green activist groups (1)

33. Scottish police infiltrated climate change activist organizations, including anti-airport expansion group Plane Stupid (1)

34. UK undercover police had children with activists in groups they had infiltrated (1)

35. FBI infiltrated Muslim communities and pushed those with objections to terrorism (and often mental health problems) to commit terrorist acts (1, 2, 3)

2010s

36. California gun owners’ group Calguns complains of chilling effect of NSA surveillance on members’ activities (1, 2, 3)

37. GCHQ and NSA surveilled Unicef and head of Economic Community of West African States (1)

38. NSA spying on Amnesty International and Human Rights Watch (1)

39. CIA hacked into computers of Senate Intelligence Committee, whose job it is to oversee the CIA
(1, 2, 3, 4, 5, 6; bonus: watch CIA director John Brennan lie that it didn’t happen, months before apologizing)

40. CIA obtained legally protected, confidential email between whistleblower officials and members of congress, regarding CIA torture programme (1)

41. Investigation suggests that CIA “operates an email surveillance program targeting senate intelligence staffers” (1)

42. FBI raided homes and offices of Anti-War Committee and Freedom Road Socialist Organization, targeting solidarity activists working with Colombians and Palestinians (1)

43. Nearly half of US government’s terrorist watchlist consists of people with no recognized terrorist group affiliation (1)

44. FBI taught counterterrorism agents that mainstream Muslims are “violent” and “radical”, and used presentations about the “inherently violent nature of Islam” (1, 2, 3)

45. GCHQ has developed tools to manipulate online discourse and activism, including changing outcomes of online polls, censoring videos, and mounting distributed denial of service attacks (1, 2)

46. Green member of parliament Caroline Lucas complains that GCHQ is intercepting her communications (1)

47. GCHQ collected IP addresses of visitors to Wikileaks websites (1, 2)

48. The NSA tracks web searches related to privacy software such as Tor, as well as visitors to the website of the Linux Journal (calling it an “extremist forum”) (1, 2, 3)

49. UK police attempt to infiltrate anti-racism, anti-fascist and environmental groups, anti-tax-avoidance group UK Uncut, and politically active Cambridge University students (1, 2)

50. NSA surveillance impedes work of investigative journalists and lawyers (1, 2, 3, 4, 5).

Back to mathematics

As mathematicians, we spend much of our time studying objects that don’t exist anywhere in the world (perfect circles and so on). But we exist in the world. So, being a mathematician sometimes involves addressing real-world concerns.

For instance, Vancouver mathematician Izabella Laba has for years been writing thought-provoking posts on sexism in mathematics. That’s not mathematics, but it’s a problem that implicates every mathematician. On this blog, John Baez has written extensively on the exploitative practices of certain publishers of mathematics journals, the damage it does to the universities we work in, and what we can do about it.

I make no apology for bringing political considerations onto a mathematical blog. The NSA is a huge employer of mathematicians — over 1000 of us, it claims. Like it or not, it is part of our mathematical environment. Both the American Mathematical Society and London Mathematical Society are now regularly publishing articles on the role of mathematicians in enabling government surveillance, in recognition of our responsibility for it. As a recent New York Times article put it:

To say mathematics is political is not to diminish it, but rather to recognize its greater meaning, promise and responsibilities.

Terence TaoAvila, Bhargava, Hairer, Mirzakhani

The 2014 Fields medallists have just been announced as (in alphabetical order of surname) Artur Avila, Manjul Bhargava, Martin Hairer, and Maryam Mirzakhani (see also these nice video profiles for the winners, which is a new initiative of the IMU and the Simons foundation). This time four years ago, I wrote a blog post discussing one result from each of the 2010 medallists; I thought I would try to repeat the exercise here, although the work of the medallists this time around is a little bit further away from my own direct area of expertise than last time, and so my discussion will unfortunately be a bit superficial (and possibly not completely accurate) in places. As before, I am picking these results based on my own idiosyncratic tastes, and they should not be viewed as necessarily being the “best” work of these medallists. (See also the press releases for Avila, Bhargava, Hairer, and Mirzakhani.)

Artur Avila works in dynamical systems and in the study of Schrödinger operators. The work of Avila that I am most familiar with is his solution with Svetlana Jitomirskaya of the ten martini problem of Kac, the solution to which (according to Barry Simon) Kac offered ten martinis for, hence the name. The problem involves perhaps the simplest example of a Schrödinger operator with non-trivial spectral properties, namely the almost Mathieu operator {H^{\lambda,\alpha}_\omega: \ell^2({\bf Z}) \rightarrow \ell^2({\bf Z})} defined for parameters {\alpha,\omega \in {\bf R}/{\bf Z}} and {\lambda>0} by a discrete one-dimensional Schrödinger operator with cosine potential:

\displaystyle (H^{\lambda,\alpha}_\omega u)_n := u_{n+1} + u_{n-1} + 2\lambda \cos(2\pi(\omega+n\alpha)) u_n.

This is a bounded self-adjoint operator and thus has a spectrum {\sigma( H^{\lambda,\alpha}_\omega )} that is a compact subset of the real line; it arises in a number of physical contexts, most notably in the theory of the integer quantum Hall effect, though I will not discuss these applications here. Remarkably, the structure of this spectrum depends crucially on the Diophantine properties of the frequency {\alpha}. For instance, if {\alpha = p/q} is a rational number, then the operator is periodic with period {q}, and then basic (discrete) Floquet theory tells us that the spectrum is simply the union of {q} (possibly touching) intervals. But for irrational {\alpha} (in which case the spectrum is independent of the phase {\omega}), the situation is much more fractal in nature, for instance in the critical case {\lambda=1} the spectrum (as a function of {\alpha}) gives rise to the Hofstadter butterfly. The “ten martini problem” asserts that for every irrational {\alpha} and every choice of coupling constant {\lambda > 0}, the spectrum is homeomorphic to a Cantor set. Prior to the work of Avila and Jitomirskaya, there were a number of partial results on this problem, notably the result of Puig establishing Cantor spectrum for a full measure set of parameters {(\lambda,\alpha)}, as well as results requiring a perturbative hypothesis, such as {\lambda} being very small or very large. The result was also already known for {\alpha} being either very close to rational (i.e. a Liouville number) or very far from rational (a Diophantine number), although the analyses for these two cases failed to meet in the middle, leaving some cases untreated. The argument uses a wide variety of existing techniques, both perturbative and non-perturbative, to attack this problem, as well as an amusing argument by contradiction: they assume (in certain regimes) that the spectrum fails to be a Cantor set, and use this hypothesis to obtain additional Lipschitz control on the spectrum (as a function of the frequency {\alpha}), which they can then use (after much effort) to improve existing arguments and conclude that the spectrum was in fact Cantor after all!
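
If you want a rough numerical feel for this spectrum (my own illustration, not something from the press releases), you can diagonalize a large finite truncation of the operator for a given coupling {\lambda}, frequency {\alpha} and phase {\omega}. Finite truncations only approximate the true spectrum and say nothing rigorous about its Cantor structure, but sweeping {\alpha} over rationals p/q at {\lambda = 1} and plotting the eigenvalues does reproduce a crude Hofstadter butterfly.

import numpy as np

def almost_mathieu_eigs(alpha, lam, omega=0.0, n=400):
    # Finite n-by-n truncation of the almost Mathieu operator: nearest-
    # neighbour hopping off the diagonal, cosine potential on the diagonal.
    diag = 2.0 * lam * np.cos(2.0 * np.pi * (omega + np.arange(n) * alpha))
    H = np.diag(diag) + np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
    return np.linalg.eigvalsh(H)

# Example: critical coupling lam = 1 at the golden-mean frequency.
eigs = almost_mathieu_eigs(alpha=(np.sqrt(5) - 1) / 2, lam=1.0)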

Manjul Bhargava produces amazingly beautiful mathematics, though most of it is outside of my own area of expertise. One part of his work that touches on an area of my own interest (namely, random matrix theory) is his ongoing work with many co-authors on modeling (both conjecturally and rigorously) the statistics of various key number-theoretic features of elliptic curves (such as their rank, their Selmer group, or their Tate-Shafarevich groups). For instance, with Kane, Lenstra, Poonen, and Rains, Manjul has proposed a very general random matrix model that predicts all of these statistics (for instance, predicting that the {p}-component of the Tate-Shafarevich group is distributed like the cokernel of a certain random {p}-adic matrix, very much in the spirit of the Cohen-Lenstra heuristics discussed in this previous post). But what is even more impressive is that Manjul and his coauthors have been able to verify several non-trivial fragments of this model (e.g. showing that certain moments have the predicted asymptotics), giving for the first time non-trivial upper and lower bounds for various statistics, for instance obtaining lower bounds on how often an elliptic curve has rank {0} or rank {1}, leading most recently (in combination with existing work of Gross-Zagier and of Kolyvagin, among others) to his amazing result with Skinner and Zhang that at least {66\%} of all elliptic curves over {{\bf Q}} (ordered by height) obey the Birch and Swinnerton-Dyer conjecture. Previously it was not even known that a positive proportion of curves obeyed the conjecture. This is still a fair ways from resolving the conjecture fully (in particular, the situation with the presumably small number of curves of rank {2} and higher is still very poorly understood, and the theory of Gross-Zagier and Kolyvagin that this work relies on, which was initially only available for {{\bf Q}}, has only been extended to totally real number fields thus far, by the work of Zhang), but it certainly does provide hope that the conjecture could be within reach in a statistical sense at least.

Martin Hairer works at the interface between probability and partial differential equations, and in particular in the theory of stochastic partial differential equations (SPDEs). The result of his that is closest to my own interests is his remarkable demonstration with Jonathan Mattingly of a unique invariant measure for the two-dimensional stochastically forced Navier-Stokes equation

\displaystyle \partial_t u + (u \cdot \nabla) u = \nu \Delta u - \nabla p + \xi

\displaystyle \nabla \cdot u = 0

on the two-torus {({\bf R}/{\bf Z})^2}, where {\xi} is a Gaussian field that forces a fixed set of frequencies. It is expected that for any reasonable choice of initial data, the solution to this equation should asymptotically be distributed according to Kolmogorov’s power law, as discussed in this previous post. This is still far from established rigorously (although there are some results in this direction for dyadic models, see e.g. this paper of Cheskidov, Shvydkoy, and Friedlander). However, Hairer and Mattingly were able to show that there was a unique probability distribution to which almost every choice of initial data would converge asymptotically; by the ergodic theorem, this is equivalent to demonstrating the existence and uniqueness of an invariant measure for the flow. Existence can be established using standard methods, but uniqueness is much more difficult. One of the standard routes to uniqueness is to establish a “strong Feller property” that enforces some continuity on the transition operators; among other things, this would mean that two ergodic probability measures with intersecting supports would in fact have a non-trivial common component, contradicting the ergodic theorem (which forces different ergodic measures to be mutually singular). Since all ergodic measures for Navier-Stokes can be seen to contain the origin in their support, this would give uniqueness. Unfortunately, the strong Feller property is unlikely to hold in the infinite-dimensional phase space for Navier-Stokes; but Hairer and Mattingly develop a clean abstract substitute for this property, which they call the asymptotic strong Feller property, which is again a regularity property on the transition operator; this in turn is then demonstrated by a careful application of Malliavin calculus.

Maryam Mirzakhani has mostly focused on the geometry and dynamics of Teichmüller-type moduli spaces, such as the moduli space of Riemann surfaces with a fixed genus and a fixed number of cusps (or with a fixed number of boundaries that are geodesics of a prescribed length). These spaces have an incredibly rich structure, ranging from geometric structure (such as the Kähler geometry given by the Weil-Petersson metric), to dynamical structure (through the action of the mapping class group on this and related spaces), to algebraic structure (viewing these spaces as algebraic varieties), and are thus connected to many other objects of interest in geometry and dynamics. For instance, by developing a new recursive formula for the Weil-Petersson volume of this space, Mirzakhani was able to asymptotically count the number of simple prime geodesics of length up to some threshold {L} in a hyperbolic surface (or more precisely, she obtained asymptotics for the number of such geodesics in a given orbit of the mapping class group); the answer turns out to be polynomial in {L}, in contrast to the much larger class of non-simple prime geodesics, whose asymptotics are exponential in {L} (the “prime number theorem for geodesics”, developed in a classic series of works by Delsarte, Huber, Selberg, and Margulis); she also used this formula to establish a new proof of a conjecture of Witten on intersection numbers that was first proven by Kontsevich. More recently, in two lengthy papers with Eskin and with Eskin-Mohammadi, Mirzakhani established rigidity theorems for the action of {SL_2({\bf R})} on such moduli spaces that are close analogues of Ratner’s celebrated rigidity theorems for unipotently generated groups (discussed in this previous blog post). Ratner’s theorems are already notoriously difficult to prove, and rely very much on the polynomial stability properties of unipotent flows; in this even more complicated setting, the unipotent flows are no longer tractable, and Mirzakhani instead uses a recent “exponential drift” method of Benoist and Quint as a substitute. Ratner’s theorems are incredibly useful for all sorts of problems connected to homogeneous dynamics, and the analogous theorems established by Mirzakhani, Eskin, and Mohammadi have a similarly broad range of applications, for instance in counting periodic billiard trajectories in rational polygons.



Quantum DiariesFinding tomorrow’s scientists

Last week I was at a family reunion where I had the chance to talk to one of my more distant relations, Calvin. At 10 years old he seems to know more about particle physics and cosmology than most adults I know. We spent a couple of hours talking about the LHC, the big bang, trying to solve the energy crisis, and even the role of women in science. It turns out that Calvin had wanted to speak with a real scientist for quite a while, so I agreed to have a chat next time I was in the area. To be honest when I first agreed I was rolling my eyes at the prospect. I’ve had so many parents tell me about their children who are “into science” only to find out that they merely watch Mythbusters, or enjoyed reading a book about dinosaurs. However when I spoke to Calvin I found he had huge concentration and insight for someone of his age, and that he was enthusiastically curious about physics to the point where I felt he would never tire of the subject. Each question would lead to another; in the meantime he’d wait patiently for the answer, giving the discussion his full attention. He seemed content with the idea that we don’t have answers to some of these questions yet, or that it can take decades for someone to understand just one of the answers properly. The road to being a scientist is a long one and you’ve got to really want it and work hard to get there, and Calvin has what it takes.

Real scientists don’t merely observe, they don’t merely interact, they create. (Child at the Science Museum London, studying an optical exhibit. Nevit Dilmen 2008)

Next month Calvin will start his final year in primary school and his teacher will be the same teacher I had at that age, Mark (a great name for a teacher!). From an early age I was fascinated by mathematics and computation, and without Mark I would not have discovered how much fun it was to play with numbers and shapes, something I’ve enjoyed ever since. Without his influence I probably would not have chosen to be a scientist. So once I found out Mark was going to teach Calvin I got in touch and told him that Calvin had the spark within him to get to university, but only if he had the right help along the way. In the area we are from, an industrial town in the North West of England, it is not usual for children to go to university, and there’s often strong peer pressure to not study hard. In this kind of environment it’s important to give encouragement to the children who can do well in academia. (Of course it would be better to change the environments in schools, but changing attitudes and cultures takes decades.)

All this made me think about my own experiences on the way to university, and I’m sure everyone had their own memories of the teachers who inspired them, and the frustrations of how much of high school focuses on learning facts instead of critical thinking. At primary school I had exhausted the mathematics textbooks very early on, under the guidance of Maggie Miller. From there Mark took over and taught me puzzles that went beyond anything I was taught in maths classes at high school. It was unfortunate that I was assigned a rather uninspiring maths teacher who would struggle to understand what I said at times, and it took the school about four years to organise classes that stretched its top students. This was more a matter of finding the resources than anything else; the school was caught in the middle of a regional educational crisis, and five small schools were fighting to stay open in a region that could only support four larger schools. One of the schools had to close and that would mean a huge upheaval for everyone. Challenging the brightest students became one of the ways that the school could show its worth and boost its statistics, so the pupils and school worked together to improve both their prospects. Since then the school has encouraged pupils to take on extra subjects and exams if they want to, and I’m glad to say that not only has it stayed open but it’s now going from strength to strength, and I’m glad to have played a very small part in that success.

By the time I was at college there was a whole new level of possibilities, as they had teams dedicated to helping students get to university, and some classes were arranged to fit around the few students that needed them, rather than the other way around. Some of the support still depended on individuals putting in extra effort though, including staff pulling strings to arrange a visit to Oxford where we met with tutors and professors who could give us practice interviews. I realised there was quite a coincidence, because one of the people who gave a practice interview, Bobbie Miller, was the son of Maggie Miller, one of my primary school teachers. At the same time one of my older and more dedicated tutors, Lance, had to take time off for ill health. He invited me and two others over to his house in the evenings for extra maths lessons, some of which went far beyond the scope of the syllabus and instead explored critical and creative mathematical thinking to give us a much deeper understanding of what we were studying. After one of my exams I heard the sad news that he’d passed away, but we knew that he was confident of our success and all three of us got the university positions we wanted, largely thanks to his help.

Unable to thank Lance, I went to visit Maggie Miller and thanked her. It was a surreal experience to go into her classroom and see how small the tables and chairs were, but it brings me back to the main point. Finding tomorrow’s scientists means identifying and encouraging them from an early age. The journey from primary school to university is long, hard, full of distractions and it’s easy to become unmotivated. It’s only through the help of dozens of people putting in extra effort that I got to where I am today, and I’m going to do what I can to help Calvin have the same opportunities. Looking back I am of course very grateful for this, but I also shudder to think of all the pupils who weren’t so lucky, and never got a chance to stretch their intellectual muscles. It doesn’t benefit anyone to let these children fall through the cracks of the educational system simply because it’s difficult to identify those who have the drive to be scientists, or because it’s hard work to give them the support they need. Once we link them up to the right people it’s a pleasure to give them the support they need.

There have always been scientists who have come from impoverished or unlikely backgrounds, from Michael Faraday to Sophie Germain, who fought hard to find their own way, often educating themselves. Who knows how many more advances we would have today if more of their contemporaries had access to a university education? In many cases the knowledge of children quickly outpaces that of their parents, and since parents can’t be expected to find the right resources the support must come from the schools. On the other hand there are many parents who desperately want their children to do well at school and encourage them to excel in as many subjects as possible (hence my initial skepticism when I first heard Calvin was “into science”.) This means that we also need to be wary of imposing our own biases on children. I can talk about particle physics with Calvin all day, but if he wants to study acoustic engineering then nobody should try to dissuade him from that. Nobody has a crystal ball that can tell them what path Calvin will choose to take, not even Calvin, so he needs the freedom to explore his interests in his own way.

Michael Faraday, a self-taught physicist from a poor background, giving a Royal Society Christmas Lecture, perhaps inspiring aspiring scientists in the audience. (Alexander Blaikley)

So how can we encourage young scientists-in-the-making? It can be a daunting task, but from my own experience the key is to find the right people to help encourage the child. Finding someone who can share their joy and experiences of science is not easy, and it may mean second or third hand acquaintances. At the same time, there are many resources online you can use. Give a child a computer, a book of mathematical puzzles, and some very simple programming knowledge, and see them find their own solutions. Take them to museums, labs, and universities where they can meet real scientists who love to talk about their work. The key is to engage them and allow them to take part in the process. They can watch all the documentaries and read all the science books in the world, but that’s a passive exercise, and being a scientist is never passive. If a child wants to be an actor it’s not enough to ask them to read plays, they want to perform them. You’ll soon find out if your child is interested in science because they won’t be able to stop themselves being interested. The drive to solve problems and seek answers is not something that can be taught or taken away, but it can be encouraged or frustrated. Encouraging these interests is a long term investment, but one that is well worth the effort in every sense. Hopefully Calvin will be one of tomorrow’s scientists. He certainly has the ability, but more importantly he has the drive, and that means given the right support he’ll do great things.


“Girls aren’t good at science!”, Calvin said. So I told him that some of the best physicists I know are women. I explained how Marie Curie migrated from Poland to France about a century ago to study the new science of radioactivity, how she faced fierce sexism, and despite all that still became the first person in history to win two Nobel Prizes, for chemistry and physics. If a 10-year-old thinks that only men can be good scientists then either the message isn’t getting through properly, or as science advocates we’re failing in our role to make it accessible to everyone. We need a public image of science that goes beyond Einstein, Feynman, Cox, and Tyson.

Matt StrasslerBe Careful Waking Up a Sleeping Blog

After a very busy few months, in which a move to a new city forced me to curtail all work on this website, I’m looking to bring the blog gradually out of hibernation.  [Wordsmiths and Latinists: what is the summer equivalent?] Even so, a host of responsibilities, requirements, grant applications, etc. will force me to ramp up the frequency of posts rather slowly.  In the meantime I will be continuing for a second year as a Visiting Scholar at the Harvard physics department, where I am doing high-energy physics research, most of it related to the Large Hadron Collider [LHC].

Although the LHC won’t start again until sometime next year (at 60% more energy per proton-proton collision than in 2012), the LHC experimenters have not been sleeping through the summer of 2014… far from it.  The rich 2011-2012 LHC data set is still being used for new particle physics measurements by ATLAS, CMS, and LHCb. These new and impressive results are mostly aimed at answering a fundamental question that faces high-energy physics today: Is the Standard Model* the full description of particle physics at the energies accessible to the LHC?  Our understanding of nature at the smallest distances, and the future direction of high-energy physics, depend crucially on the answer.  But an answer can only be obtained by searching for every imaginable chink in the Standard Model’s armor, and thus requires a great diversity of measurements. Many more years of hard and clever work lie ahead, and — at least for the time being — this blog will help you follow the story.

———————

*The “Standard Model” is the theory — i.e., the set of mathematical equations — used to describe and predict the behavior of all the known elementary particles and forces of nature, excepting gravity. We know the Standard Model doesn’t describe everything, not only because of gravity’s absence, but because dark matter and neutrino masses aren’t included; and also the Standard Model fails to explain lots of other things, such as the overall strengths of the elementary forces, and the pattern of elementary particle types and particle masses. But its equations might be sufficient, with those caveats, to describe everything the LHC experiments can measure.  There are profound reasons that many physicists will be surprised if it does… but so far the Standard Model is working just fine, thank you.



September 01, 2014

John PreskillMacroscopic quantum teleportation: the story of my chair

In the summer of 2000, a miracle occurred: The National Science Foundation decided to fund a new Institute for Quantum Information at Caltech with a 5 million dollar award from their Information Technology Research program. I was to be the founding director of the IQI.

Jeff Kimble explained to me why we should propose establishing the IQI. He knew I had used my slice of our shared DARPA grant to bring Alexei Kitaev to Caltech as a visiting professor, which had been wonderful. Recalling how much we had both benefited from Kitaev’s visit, Jeff remarked emphatically that “This stuff’s not free.” He had a point. To have more fun we’d need more money. Jeff took the lead in recruiting a large team of Caltech theorists and experimentalists to join the proposal we submitted, but the NSF was primarily interested in supporting the theory of quantum computation rather than the experimental part of the proposal. That was how I wound up in charge, though I continued to rely on Jeff’s advice and support.

This was a new experience for me and I worried a lot about how directing an institute would change my life. But I had one worry above all: space. We envisioned a thriving institute brimming over with talented and enthusiastic young scientists and visitors drawn from the physics, computer science, and engineering communities. But how could we carve out a place on the Caltech campus where they could work and interact?

To my surprise and delight, Jeff and I soon discovered that someone else at Caltech shared our excitement over the potential of IQI — Richard Murray, who was then the Chair of Caltech’s Division of Engineering and Applied Science. Richard arranged for the IQI to occupy office space in Steele Laboratory and some space we could configure as we pleased in Jorgensen Laboratory. The hub of the IQI became the lounge in Jorgensen, which we used for our seminar receptions, group meetings, and innumerable informal discussions, until our move to the beautiful Annenberg Center when it opened in 2009.

I sketched a rough plan for the Jorgensen layout, including furniture for the lounge. The furniture, I was told, was “NIC”. Though I was too embarrassed to ask, I eventually inferred this meant “Not in Contract” — I would need to go furniture shopping, one of my many burgeoning responsibilities as Director.

By this time, Ann Harvey was in place as IQI administrator, a huge relief. But furniture was something I thought I knew about, because I had designed and furnished a common area for the particle theory group a couple of years earlier. As we had done on that previous occasion, my wife Roberta and I went to Krause’s Sofa Factory to order a custom-made couch, love seat, and lounge chair, in a grayish green leather which we thought would blend well with the carpeting.

Directing an institute is not as simple as it sounds, though. Before the furniture was delivered, Krause’s declared bankruptcy! We had paid in full, but I had some anxious moments wondering whether there would be a place to sit down in the IQI lounge. In the end, after some delay, our furniture was delivered in time for the grand opening of the new space in September 2001. A happy ending, but not really the end of the story.

Before the move to Annenberg in 2009, I ordered furniture to fill our (much smaller) studio space, which became the new IQI common area. The Jorgensen furniture was retired, and everything was new! It was nice … But every once in a while I felt a twinge of sadness. I missed my old leather chair, from which I had pontificated at eight years worth of group meetings. That chair and I had been through a lot together, and I couldn’t help but feel that my chair’s career had been cut short before its time.

I don’t recall mentioning these feelings to anyone, but someone must have sensed my regrets. Because one day not long after the move another miracle occurred … my chair was baaack! Sitting in it again felt … good. For five years now I’ve been pontificating from my old chair in our new studio, just like I used to. No one told me how my chair had been returned to me, and I knew better than to ask.

My chair today. Like me, a bit worn but still far from retirement.

Eventually the truth comes out. At my 60th birthday celebration last year, Stephanie Wehner and Darrick Chang admitted to being the perpetrators, and revealed the whole amazing story in their article on “Macroscopic Quantum Teleportation” in a special issue of Nature Relocations. Their breakthrough article was enhanced by Stephanie’s extraordinary artwork, which you really have to see to believe. So if your curiosity is piqued, please follow this link to find out more.

Why, you may wonder, am I reminiscing today about the story of my chair? Well, is an excuse really necessary? But if you must know, it may be because, after two renewals and 14 years of operation, I submitted the IQI Final Report to the NSF this week. Don’t worry — the Report is not really Final, because the IQI has become part of an even grander vision, the IQIM (which has given birth to this blog among other good things). Like my chair, the IQI is not quite what it was, yet it lives on.

The nostalgic feelings aroused by filing the Final Report led me to reread the wonderful volume my colleagues put together for my birthday celebration, which recounts not only the unforgettable exploits of Stephanie and Darrick, but many other stories and testimonials that deeply touched me.

Browsing through that book today, one thing that struck me is the ways we sometimes have impact on others without even being aware of it. For example, Aram Harrow, Debbie Leung, Joe Renes and Stephanie all remember lectures I gave when they were undergraduate students (before I knew them), which might have influenced their later research careers. Knowing this will make it a little harder to say no the next time I’m invited to give a talk. Yaoyun Shi has vivid memories of the time I wore my gorilla mask to the IQI seminar on Halloween, which inspired him to dress up as “a butcher threatening to cut off the ears of my students with a bloody machete if they were not listening,” thus boosting his teaching evaluations. And Alexios Polychronakos, upon hearing that I had left particle theory to pursue quantum computing, felt it “was a bit like watching your father move to Las Vegas and marry a young dancer after you leave for college,” while at the same time he appreciated “that such reinventions are within the spectrum of possibilities for physicists who still have a pulse.”

I’m proud of what the IQI(M) has accomplished, but we’re just getting started. After 14 years, I still have a pulse, and my chair has plenty of wear left. Together we look forward to many more years of pontification.