Planet Musings

August 01, 2015

Terence TaoDeducing the inverse theorem for the multidimensional Gowers norms from the one-dimensional version

This week I have been at a Banff workshop “Combinatorics meets Ergodic theory“, focused on the combinatorics surrounding Szemerédi’s theorem and the Gowers uniformity norms on one hand, and the ergodic theory surrounding Furstenberg’s multiple recurrence theorem and the Host-Kra structure theory on the other. This was quite a fruitful workshop, and directly inspired the various posts this week on this blog. Incidentally, BIRS being as efficient as it is, videos for this week’s talks are already online.

As mentioned in the previous two posts, Ben Green, Tamar Ziegler, and myself proved the following inverse theorem for the Gowers norms:

Theorem 1 (Inverse theorem for Gowers norms) Let {N \geq 1} and {s \geq 1} be integers, and let {\delta > 0}. Suppose that {f: {\bf Z} \rightarrow [-1,1]} is a function supported on {[N] := \{1,\dots,N\}} such that

\displaystyle \frac{1}{N^{s+2}} \sum_{n,h_1,\dots,h_{s+1} \in {\bf Z}} \prod_{\omega \in \{0,1\}^{s+1}} f(n+\omega_1 h_1 + \dots + \omega_{s+1} h_{s+1}) \geq \delta.

Then there exists a filtered nilmanifold {G/\Gamma} of degree {\leq s} and complexity {O_{s,\delta}(1)}, a polynomial sequence {g: {\bf Z} \rightarrow G}, and a Lipschitz function {F: G/\Gamma \rightarrow {\bf R}} of Lipschitz constant {O_{s,\delta}(1)} such that

\displaystyle \frac{1}{N} \sum_n f(n) F(g(n) \Gamma) \gg_{s,\delta} 1.

There is a higher dimensional generalisation, which first appeared explicitly (in a more general form) in this preprint of Szegedy (which used a slightly different argument than the one of Ben, Tammy, and myself; see also this previous preprint of Szegedy with related results):

Theorem 2 (Inverse theorem for multidimensional Gowers norms) Let {N \geq 1} and {s,d \geq 1} be integers, and let {\delta > 0}. Suppose that {f: {\bf Z}^d \rightarrow [-1,1]} is a function supported on {[N]^d} such that

\displaystyle \frac{1}{N^{d(s+2)}} \sum_{n,h_1,\dots,h_{s+1} \in {\bf Z}^d} \prod_{\omega \in \{0,1\}^{s+1}} f(n+\omega_1 h_1 + \dots + \omega_{s+1} h_{s+1}) \geq \delta. \ \ \ \ \ (1)

Then there exists a filtered nilmanifold {G/\Gamma} of degree {\leq s} and complexity {O_{s,\delta,d}(1)}, a polynomial sequence {g: {\bf Z}^d \rightarrow G}, and a Lipschitz function {F: G/\Gamma \rightarrow {\bf R}} of Lipschitz constant {O_{s,\delta,d}(1)} such that

\displaystyle \frac{1}{N^d} \sum_{n \in {\bf Z}^d} f(n) F(g(n) \Gamma) \gg_{s,\delta,d} 1.

The {d=2} case of this theorem was recently used by Wenbo Sun. One can replace the polynomial sequence with a linear sequence if desired by using a lifting trick (essentially due to Furstenberg, but which appears explicitly in Appendix C of my paper with Ben and Tammy).

In this post I would like to record a very neat and simple observation of Ben Green and Nikos Frantzikinakis, that uses the tool of Freiman isomorphisms to derive Theorem 2 as a corollary of the one-dimensional theorem. Namely, consider the linear map {\phi: {\bf Z}^d \rightarrow {\bf Z}} defined by

\displaystyle \phi( n_1,\dots,n_d ) := \sum_{i=1}^d (10 N)^{i-1} n_i,

that is to say {\phi} is the digit string base {10N} that has digits {n_d \dots n_1}. This map is a linear map from {[N]^d} to a subset of {[d 10^d N^d]} of density {1/(d10^d)}. Furthermore it has the following “Freiman isomorphism” property: if {n, h_1,\dots,h_{s+1}} lie in {{\bf Z}} with {n + \omega_1 h_1 + \dots + \omega_{s+1} h_{s+1}} in the image set {\phi( [N]^d )} of {[N]^d} for all {\omega}, then there exist (unique) lifts {\tilde n \in {\bf Z}^d, \tilde h_1,\dots,\tilde h_{s+1} \in {\bf Z}} such that

\displaystyle \tilde n + \omega_1 \tilde h_1 + \dots + \omega_{s+1} \tilde h_{s+1} \in [N]^d


\displaystyle \phi( \tilde n + \omega_1 \tilde h_1 + \dots + \omega_{s+1} \tilde h_{s+1} ) = n + \omega_1 h_1 + \dots + \omega_{s+1} h_{s+1}

for all {\omega}. Indeed, the injectivity of {\phi} on {[N]^d} uniquely determines the sum {\tilde n + \omega_1 \tilde h_1 + \dots + \omega_{s+1} \tilde h_{s+1}} for each {\omega}, and one can use base {10N} arithmetic to verify that the alternating sum of these sums on any {2}-facet of the cube {\{0,1\}^{s+1}} vanishes, which gives the claim. (In the language of additive combinatorics, the point is that {\phi} is a Freiman isomorphism of order (say) {8} on {[N]^d}.)

Now let {\tilde f: {\bf Z} \rightarrow [-1,1]} be the function defined by setting {\tilde f( \phi(n) ) := f(n)} whenever {n \in [N]^d}, with {\tilde f} vanishing outside of {\phi([N]^d)}. If {f} obeys (1), then from the above Freiman isomorphism property we have

\displaystyle \frac{1}{N^{d(s+2)}} \sum_{n, h_1,\dots,h_{s+1} \in {\bf Z}} \prod_{\omega \in \{0,1\}^{s+1}} \tilde f(n+\omega_1 h_1 + \dots + \omega_{s+1} h_{s+1}) \geq \delta.

Applying the one-dimensional inverse theorem (Theorem 1), with {\delta} reduced by a factor of {d 10^d} and {N} replaced by {d 10^d N^d}, this implies the existence of a filtered nilmanifold {G/\Gamma} of degree {\leq s} and complexity {O_{s,\delta,d}(1)}, a polynomial sequence {g: {\bf Z} \rightarrow G}, and a Lipschitz function {F: G/\Gamma \rightarrow {\bf R}} of Lipschitz constant {O_{s,\delta,d}(1)} such that

\displaystyle \frac{1}{N^{d(s+2)}} \sum_{n \in {\bf Z}} \tilde f(n) F(g(n) \Gamma) \gg_{s,\delta,d} 1

which by the Freiman isomorphism property again implies that

\displaystyle \frac{1}{N^{d(s+2)}} \sum_{n \in {\bf Z}^d} f(n) F(g(\phi(n)) \Gamma) \gg_{s,\delta,d} 1.

But the map {n \mapsto g(\phi(n))} is clearly a polynomial map from {{\bf Z}^d} to {G} (the composition of two polynomial maps is polynomial, see e.g. Appendix B of my paper with Ben and Tammy), and the claim follows.

Remark 3 This trick appears to be largely restricted to the case of boundedly generated groups such as {{\bf Z}^d}; I do not see any easy way to deduce an inverse theorem for, say, {\bigcup_{n=1}^\infty {\mathbb F}_p^n} from the {{\bf Z}}-inverse theorem by this method.

Remark 4 By combining this argument with the one in the previous post, one can obtain a weak ergodic inverse theorem for {{\bf Z}^d}-actions. Interestingly, the Freiman isomorphism argument appears to be difficult to implement directly in the ergodic category; in particular, there does not appear to be an obvious direct way to derive the Host-Kra inverse theorem for {{\bf Z}^d} actions (a result first obtained in the PhD thesis of Griesmer) from the counterpart for {{\bf Z}} actions.

Filed under: expository, math.CO Tagged: Ben Green, Freiman isomorphism, Gowers uniformity norms, Nikos Frantzikinakis

Terence TaoAnalytic Number Theory program at MSRI: Jan-May 2017

Chantal David, Andrew Granville, Emmanuel Kowalski, Phillipe Michel, Kannan Soundararajan, and I are running a program at MSRI in the Spring of 2017 (more precisely, from Jan 17, 2017 to May 26, 2017) in the area of analytic number theory, with the intention to bringing together many of the leading experts in all aspects of the subject and to present recent work on the many active areas of the subject (the discussion on previous blog posts here have mostly focused on advances in the study of the distribution of the prime numbers, but there have been many other notable recent developments too, such as refinements of the circle method, a deeper understanding of the asymptotics of bounded multiplicative functions and of the “pretentious” approach to analytic number theory, more “analysis-friendly” formulations of the theorems of Deligne and others involving trace functions over fields, and new subconvexity theorems for automorphic forms, to name a few).  Like any other semester MSRI program, there will be a number of workshops, seminars, and similar activities taking place while the members are in residence.  I’m personally looking forward to the program, which should be occurring in the midst of a particularly productive time for the subject.  Needless to say, I (and the rest of the organising committee) plan to be present for most of the program.

Applications for Postdoctoral Fellowships, Research Memberships, and Research Professorships for this program (and for other MSRI programs in this time period, namely the companion program in Harmonic Analysis and the Fall program in Geometric Group Theory, as well as the complementary program in all other areas of mathematics) have just opened up today.  Applications are open to everyone (until they close on Dec 1), but require supporting documentation, such as a CV, statement of purpose, and letters of recommendation from other mathematicians; see the application page for more details.

Filed under: advertising, math.NT, non-technical Tagged: Andrew Granville, Chantal David, Emmanuel Kowalski, Kannan Soundararajan, MSRI, Phillipe Michel

David Hoggstellar ages and fish-eye cameras

Ness and I are getting red-giant ages from APOGEE spectroscopy using The Cannon and a training set from Kepler. We worked on making plots that would help us understand where this age information is coming from. Options include: Chromospheric activity (which decreases with time in stars as the magnetic field decays), dredge-up of C, N, O from the nucleosynthetic core (which pollutes the surface abundances over time), trace element abundances (which might indicate birth place and time beyond the information in gross indicators like metallicity and alpha-enhancement), and non-LTE effects (which might be different in different stars since convection patterns and scale are a function of mass).

Tom Herbst (MPIA) showed me his fish-eye all-sky camera and data acquisition system, and we discussed science projects for it with Markus Pössel (MPIA). The whole system is on the roof, so its computer and controller and everything are isolated from all the building systems to protect the building and its IT infrastructure from lightning strikes!

Rix and I asked Yuan-Sen Ting (Harvard) to compute derivatives with respect to elements for the stellar models being used to analyze the APOGEE data. The idea is that we want to ask how well we can linearize the models around fiducial points and then model (and therefore make measurements from) the spectra.

David Hoggstellar ages in the Milky Way disk

OMG I am so stoked! I made plots of the ages of the APOGEE red-clump stars as a function of position in the Milky Way and got the image below (one plot of many). You can see the thin disk jump out but you can also see a flaring of the young disk at large radius that Rix and Bovy told Ness and me we should expect.

Jonathan Bird (Vanderbilt) crashed our party and asked if he could plot the age-metallicity relation that is implied by our red-clump-star ages. We said “yes” of course! Bird has ideas about the thickness of the disk being set not just by heating or scattering but also in part by the way disks form, which should be dynamically hotter at higher redshifts.

David Hoggnot very much

I gave the Königstuhl Colloquium on data-driven models. With preparing for that and a long lunch, no other research today!

David Hogg#streams2015, day 5

Today was the last (and half) day of the Ringberg meeting. At breakfast, we discussed an argument James Binney (Oxford) had made the previous day about n-body simulations not being the answer to physics questions: His view is that simulations should just inspire theories that are continuous, large-scale, and manipulable with other tools. He gave some examples. I, in the end, disagreed with Binney's argument and his examples: There are many fields of science where all of the predictions come from small-scale (n-body or molecular or finite-element) simulations and all the “understanding” is pure story-time about those simulations. For example, structural and aerospace engineering is all about finite-element analysis. And climate and weather are all about simulations. And many of the beautiful, effective theories for solid-state physics and so on are not working very well as we push to new regimes; we have to go back to molecular-level simulations. All that said, I am sympathetic to Binney's point, which is that we don't always get a lot of understanding out of the crank-turning on n-body simulations.

Today's talks were very lively, with Carlberg (Toronto) and Küpper talking about stream substructure created by dark-matter substructure, and Bonaca talking about our work on streams in realistic (rather than analytic) potentials. Bonaca shows that our conlusions from analytic potential models can be very wrong, even in gross properties, but that things improve as we include more and more streams into the analysis. She got some lively questioning from Evans, who was not buying it. I'm with Bonaca, of course!

The drive home was a long discussion between Rix and me about all that had happened and what to do next. We are fired up about new approaches to stream searching, to use now on SDSS and PanSTARRS data, but also to have very ready for (what we are calling) Gaia DR2.

July 31, 2015

Tommaso DorigoBang! Meet The Highest-Energy Hadron Collision Ever Imaged!

The 13 TeV data from LHC collisions taken this summer is quickly going through analysis programs and being used for new physics results, and everybody is wondering whether there are surprises in store... Of course that will require some more time to be ascertained.
For the time being, I can offer a couple of very inspiring pictures. CMS recorded a spectacular event featuring two extremely high-energy jets in the first 40 inverse picobarns of data that was collected and reconstructed by the experiment with all detector component properly working.

read more

Sean CarrollSpacetime, Storified

I had some spare minutes the other day, and had been thinking about the fate of spacetime in a quantum universe, so I took to the internet to let my feelings be heard. Only a few minutes, though, so I took advantage of Twitter rather than do a proper blog post. But through the magic of Storify, I can turn the former into the latter!

Obviously the infamous 140-character limit of Twitter doesn’t allow the level of precision and subtlety one would always like to achieve when talking about difficult topics. But restrictions lead to creativity, and the results can actually be a bit more accessible than unfettered prose might have been.

Anyway, spacetime isn’t fundamental, it’s just a useful approximation in certain regimes. Someday we hope to know what it’s an approximation to.

Clifford JohnsonAlmost Done

So Tuesday night, I decided that it was imperative that I paid a visit to one really good restaurant (at least) before leaving Santiago. My duties at ICMP2015 were over, and I was tired, so did not want to go too far, but I'd heard there were good ones in the area, so I asked the main organizer and he made a recommendation. imageIt was an excellent choice. One odd thing: the hotel is in two separate towers, and I'd noticed this upon arrival and started calling it The Two Towers in my mind for the time was there. Obviously, right? Well, anyway, the restaurant is right around the corner from it plus a two minute walk, and.... Wait for is called Le Due Tonni, which translates into The Two Towers, but apparently it has nothing to do with my observation about the hotel, since it got that name from a sister restaurant in a different part of town, I am told. So... An odd coincidence. I will spare you the details of what I had for dinner save to say that if you get the fettuccini con salmon you're on to a sure thing, and to warn that you don't end up accidentally ordering a whole bottle of wine instead of a glass of it because you're perhaps used to over-inflated wine prices in LA restaurants (caught it before it was opened and so saved myself having to polish off a whole bottle on my own)... Another amusing note is that one of my problems with getting my rusty Spanish out for use only occasionally is that I get logjams in my head because vocabulary from Spanish, French, and Italian all come to me mid sentence and I freeze sometimes. I'd just been getting past doing that by Tuesday, but then got very confused in the restaurant at one point until I realized my waiter was, oddly, speaking to me in Italian at times. I still am not sure why. It was a good conference to come to, I think, because I connected [...] Click to continue reading this post

July 30, 2015

Tommaso Dorigo(Well)-Paid PhD Position In Physics Offered In Padova, Italy

Are you a post-lauream student in Physics, interested in pursuing a career in particle physics, and maybe with interest in advanced machine learning applications, with an eye to a great job after your PhD ? Then this posting is for you.

read more

David Hogg#streams2015, day 4

[As a result of crippling disorganization, blog posting has gone pear-shaped. I will be posting out-of-date posts with correct dates (and 23:59 time stamps), but possibly considerably after the fact and out of chronological order.]

Another great day at Ringberg Castle today. For me, the highlight was an unconference session called Gaia Zero-Day Exploit, which was about what we could do immediately upon release of Gaia data in its first and second releases. So many good ideas came up, and most of them are summarized in incredibly telegraphic form on the crowd-sourced meeting minutes (search down the document for "Shovel-ready"). Gaia DR2 is much more interesting than DR1 for Galactic dynamics. Some great ideas included: Find the alleged fountain of hyper-velocity stars coming from the Galactic Center and use them to infer the shape of the MW halo. And: Look at the vertical velocities in the disk and see the pattern speed of disk warping and precession (see mention below of Widrow). And: Look for direct evidence of the process of dynamical friction—that is, look for the stellar wakes. And, simply: Get the proper motions of all Milky Way companions and streams. The list is long. After the session, Rix and I felt like we should find some way to get some of these fleshed out and quasi-published in some community forum.

David Martinez-Delgado (Heidelberg) kicked off the day with his absolutely magnificent images of galaxies, in which he uses small telescopes and long exposure times to get amazingly sensitive images of tidal features. He compared similar images from his small telescopes to those taken with huge telescopes, and showed that smaller is better. There was some discussion afterwards of "why". Some of it is that his calibration requirements are high, and he spent more time on his own instruments (he may be the best flat-fielder in all the land). But some of it is that big telescopes have big cameras with many optical surfaces, which make it hard to go for large dynamic range in intensity.

Larry Widrow (Queen's) talked about galactic plane distortions and coined the term "galactoseismology".

After watching various talks in which streams were simulated, Price-Whelan and I noticed that for the first few orbits (even in chaotic potentials), the stream stars make a "bowtie" shape when very near apocenter. And the stars spend a lot of time out there. So perhaps we should be looking for these: There might be many more bowties than streams! We discussed.

July 29, 2015

n-Category Café Internal Languages of Higher Categories

(guest post by Chris Kapulkin)

I recently posted the following preprint to the arXiv:

Btw if you’re not the kind of person who likes to read mathematical papers, I also gave a talk about the above mentioned work in Oxford, so you may prefer to watch it instead. (-:

I see this work as contributing to the idea/program of HoTT as the internal language of higher categories. In the last few weeks, there has been a lot of talk about it, prompted by somewhat provocative posts on Michael Harris’ blog.

My goal in this post is to survey the state of the art in the area, as I know it. In particular, I am not going to argue that internal languages are a solution to many of the problems of higher category theory or that they are not. Instead, I just want to explain the basic idea of internal languages and what we know about them as far as HoTT and higher category theory are concerned.

Disclaimer. The syntactic rules of dependent type theory look a lot like a multi-sorted essentially algebraic theory. If you think of sorts called types and terms then you can think of rules like Σ\Sigma-types and Π\Pi-types as algebraic operations defined on these sorts. Although the syntactic presentation of type theory does not quite give an algebraic theory (because of complexities such as variable binding), it is possible to formulate dependent type theory as an essentially algebraic theory. However, actually showing that these two presentations are equivalent has proven complicated and it’s a subject of ongoing work. Thus, for the purpose of this post, I will take dependent type theories to be defined in terms of contextual categories (a.k.a. C-systems), which are the models for this algebraic theory (thus leaving aside the Initiality Conjecture). Ultimately, we would certainly like to know that these statements hold for syntactically-presented type theories; but that is a very different question from the \infty-categorical aspects I will discuss here.

A final comment before we begin: this post derives greatly from my (many) conversations with Peter Lumsdaine. In particular, the two of us together went through the existing literature to understand precisely what’s known and what’s not. So big thanks to Peter for all his help!

Internal languages of categories

First off, what is the internal language? Without being too technical, let me say that it is typically understood as a correspondence:

(1){theories}LangCl{categories} \left\{\text{theories}\right\} \overset{\mathrm{Cl}}{\underset{\mathrm{Lang}}{\rightleftarrows}} \left\{\text{categories}\right\}

On the right hand side of this correspondence, we have a category of categories with some extra structure and functors preserving this structure. On the left hand side, we have certain type theories, which are extensions of a fixed core one, and their interpretations (which are, roughly speaking, maps taking types to types and terms to terms, preserving typing judgements and the constructors of the core theory). Notice that the core theory is the initial object in the category of theories in the above picture.

The functor Cl\mathrm{Cl} takes a theory to its initial model, built directly out of syntax of the theory: the objects are contexts and the morphisms are (sequences of) terms of the theory (this category is often called the classifying category, hence the notation Cl\mathrm{Cl}). The functor Lang\mathrm{Lang} takes a category to the theory whose types are generated in a suitable sense by the objects and whose terms are generated by the morphisms of the category. In particular, constructing Lang(C)\mathrm{Lang}(C) for some category CC from the right hand side is the same as establishing a model of the core theory in CC.

Finally, these functors are supposed to form some kind of an adjoint equivalence (with ClLang\mathrm{Cl} \dashv \mathrm{Lang}), be it an equivalence of categories, bi-equivalence, or \infty-equivalence, depending on whether the two sides of the correspondence are categories, 22-categories, or \infty-categories.

The cleanest example of this phenomenon is the correspondence between λ\lambda-theories (that is, theories in simply typed λ\lambda-calculus) and cartesian closed categories:

(2){theories in λ-calculus}LangCl{cartesian closed categories} \left\{\begin{array}{c}\text{theories in}\\ \lambda\text{-calculus}\end{array}\right\} \overset{\mathrm{Cl}}{\underset{\mathrm{Lang}}{\rightleftarrows}} \left\{\begin{array}{c}\text{cartesian closed}\\ \text{categories}\end{array}\right\}

which you can read about in Part I of the standard text by Jim Lambek and Phil Scott.

Extensional Martin-Löf Type Theory

Unfortunately, as soon as we move to dependent type theory, things are getting more complicated. Starting with the work of Robert Seely, it has become clear that one should expect Extensional Martin-Löf Type Theory (with dependent products, dependent sums, and extensional Identity types) to be the internal language of locally cartesian closed categories:

(3){Extensional Martin-Löf type theories}LangCl{locally cartesian closed categories} \left\{\begin{array}{c}\text{Extensional Martin-Löf}\\ \text{type theories}\end{array}\right\} \overset{\mathrm{Cl}}{\underset{\mathrm{Lang}}{\rightleftarrows}} \left\{\begin{array}{c}\text{locally cartesian}\\ \text{closed categories}\end{array}\right\}

Seely overlooked however an important coherence problem: since types are now allowed to depend on other types, we need to make coherent choices of pullbacks. The reason for that is that type-theoretically substitution (into a type) is a strictly functorial operation, whereas its categorical counterpart, pullback, without making any choices, is functorial only up to isomorphism. (If you find the last sentence slightly too brief, I recommend Peter Lumsdaine’s talk explaining the problem and known solutions.) The fix was later found by Martin Hofmann; but the resulting pair of functors does not form an equivalence of categories, only a biequivalence of 22-categories, as was proven in 2011 by Pierre Clairambault and Peter Dybjer.

Intensional Martin-Löf Type Theory and locally cartesian closed \infty-categories

Next let us consider Intensional Martin-Löf Type Theory with dependent products and sums, and the identity types; additionally, we will assume the (definitional) eta rule for dependent functions and functional extensionality.

Such type theory has been, at least informally, conjectured to be the internal language of locally cartesian closed \infty-categories and thus, we expect the following correspondence:

(4){Intensional Martin-Löf type theories}LangCl{locally cartesian closed -categories} \left\{\begin{array}{c}\text{Intensional Martin-Löf}\\ \text{type theories}\end{array}\right\} \overset{\mathrm{Cl}}{\underset{\mathrm{Lang}}{\rightleftarrows}} \left\{\begin{array}{c}\text{locally cartesian}\\ \text{closed }\infty\text{-categories}\end{array}\right\}

where the functors Cl\mathrm{Cl} and Lang\mathrm{Lang} should form an adjunction (an adjoint (,1)(\infty,1)-equivalence? or maybe even (,2)(\infty,2)-equivalence?) between the type-theoretic and categorical sides.

Before I summarize the state of the art, let me briefly describe what the two functors ought to be. Starting with a type theory, one can take its underlying category of contexts and regard it as category with weak equivalences (where the weak equivalences are syntactically-defined equivalences), to which one can then apply the simplicial localization. This gives a well-defined functor from type theories to \infty-categories. Of course, it is a priori not clear what the (essential) image of such a functor would be.

Conversely, given a locally cartesian closed \infty-category CC, one can look for a category with weak equivalences (called the presentation of CC) whose simplicial localization is CC and then try to establish the structure of a categorical model of type theory on such a category.

What do we know? The verification that Cl\mathrm{Cl} takes values in locally cartesian closed \infty-categories can be found in my paper. The other functor is known only partially. More precisely, if CC is a locally presentable locally cartesian closed \infty-category, then one can construct Lang(C)\mathrm{Lang}(C). As mentioned above, the construction is done in two steps. The first, that is presenting such an \infty-category by a type-theoretic model category (which is, in particular, a category with weak equivalences) was given by Denis-Charles Cisinski and Mike Shulman in these blog comments, and independently in Theorem 7.10 of this paper by David Gepner and Joachim Kock. The second step (the structure of a model of type theory) is precisely Example of the local universe model paper by Peter Lumsdaine and Michael Warren.

What don’t we know? First off, how to define Lang(C)\mathrm{Lang}(C) when CC is not locally presentable and whether the existing definition of Lang(C)\mathrm{Lang}(C) for locally presentable quasicategories is even functorial? We also need to understand what the homotopy theory of type theories is (if we’re hoping for an equivalence of homotopy theories, we need to understand the homotopy theory of the left hand side!)? In particular, what are the weak equivalences of type theories? Next in line: what is the relation between Cl\mathrm{Cl} and Lang\mathrm{Lang}? Are they adjoint and if so, can we hope that they will yield an equivalence of the corresponding \infty-categories?

Univalence, Higher Inductive Types, and (elementary) \infty-toposes

Probably the most interesting part of this program is the connection between Homotopy Type Theory and higher topos theory (HoTT vs HTT?). Conjecturally, we should have a correspondence:

(5){Homotopy Type Theories}LangCl{elementary -toposes} \left\{\begin{array}{c}\text{Homotopy}\\ \text{Type Theories}\end{array}\right\} \overset{\mathrm{Cl}}{\underset{\mathrm{Lang}}{\rightleftarrows}} \left\{\begin{array}{c}\text{elementary}\\ \infty\text{-toposes}\end{array}\right\}

This is however not a well-defined problem as it depends on one’s answer to the following two questions:

  1. What is HoTT? Obviously, it should be a system that extends Intensional Martin-Löf Type Theory, includes at least one, but possibly infinitely many univalent universes, and some Higher Inductive Types, but what exactly may largely depend on the answer to the next question…
  2. What is an elementary \infty-topos? While there exist some proposals (for example, that presented by Andr'e Joyal in 2014), this question also awaits a definite answer. By analogy with the 11-categorical case, every Grothendieck \infty-topos should be an elementary \infty-topos, but not the other way round. Moreover, the axioms of an elementary \infty-topos should imply (maybe even explicitly include) that it is locally cartesian closed and that it has finite colimits, but should not imply local presentability.

What do we know? As of today, only partial constructions of the functor Lang\mathrm{Lang} exist. More precisely, there are:

  • Theorem 6.4 of Mike Shulman’s paper contains the construction of Lang(C)\mathrm{Lang}(C) if CC is a Grothendieck \infty-topos that admits a presentation as simplicial presheaves on an elegant Reedy category and HoTT is taken the extension of Intensional Martin-Löf Type Theory but as many univalent universes 'a la Tarski as there are inaccessible cardinals greater than the cardinality of the site.
  • Remark 1.1 of the same paper can be interpreted as saying that if one considers HoTT with weak (univalent) universes instead, then the construction of Lang(C)\mathrm{Lang}(C) works for an arbitrary \infty-topos CC.
  • A forthcoming paper by Peter Lumsdaine and Mike Shulman will supplement the above two points: for some reasonable range of higher toposes, the resulting type theory Lang(C)\mathrm{Lang}(C) will be also shown to possess certain Higher Inductive Types (e.g. homotopy pushouts and truncations), although the details remain to be seen.

What don’t we know? It still remains to define Lang(C)\mathrm{Lang}(C) outside of the presentable setting, as well as give the construction of Cl\mathrm{Cl} in this case (or rather, check that the obvious functor from type theories to \infty-categories takes values in higher toposes). The formal relation between these functors (which are yet to be defined) remains wide open.

Clifford JohnsonExistence Proof

imageThe picture is evidence that bug-free Skype seminars are possible! Well, I suppose it only captured an instant, and not the full hour's worth of two separate bug-free talks each with their own Q&A, but that is what happened. The back story is that two of our invited speakers, Lara Anderson and James Gray, had flight delays that prevented them from arriving in Santiago on time and so I spent a bit of time (at the suggestion of my co-organizer Wati Taylor, who also could not make the trip) figuring out how we could save the schedule by having them give Skype seminars. (We already had to make a replacement elsewhere in the schedule since another of our speakers was ill and had to cancel his trip.) Two Skype talks seemed a long shot back on Sunday when Wati had the idea, but after some local legwork on my part it gradually because more likely, and by lunchtime today I had the local staff fully on board with the idea and we tested it all and it worked! It helps that you can send the whole of your computer screen as the video feed, and so the slides came out nicely (I'd originally planned a more complicated arrangement where we'd have the [...] Click to continue reading this post

July 28, 2015

John PreskillThe mentors that shape us

Three years and three weeks ago I started my first blog. I wasn’t quite sure what to call it, so I went to John for advice. I had several names in mind, but John quickly zeroed in on one: Quantum Frontiers. The url was available, the name was simple and to the point, it had the word quantum in it, and it was appropriate for a blog that was to provide a vantage point from which the public could view the frontiers of quantum science. But there was a problem; we had no followers and when I first asked John if he would write something for the blog, he had said: I don’t know… I will see…maybe some day… let me think about it. The next day John uploaded More to come, the first real post on Quantum Frontiers after the introductory Hello quantum world! We had agreed on a system in order to keep the quality of the posts above some basic level: we would send each other our posts for editing before we made them public. That way, we could catch any silly typos and have a second pair of eyes do some fact-checking. So, when John sent me his first post, I went to task editing away typos. But the power that comes with being editor-in-chief corrupts. So, when I saw the following sentence in More to come…

I was in awe of Wheeler. Some students thought he sucked.

I immediately changed it to…

I was in awe of Wheeler. Some students thought less of him.

And next, when I saw John write about himself,

Though I’m 59, few students seemed awed. Some thought I sucked. Maybe I did sometimes.

I massaged it into…

Though I’m 59, few students seemed awed. Some thought I was not as good. Maybe I wasn’t sometimes.

When John published the post, I read it again for any typos I might have missed. There were no typos. I felt useful! But when I saw that all mentions of sucked had been restored to their rightful place, I felt like an idiot. John did not fire a strongly-worded email back my way asking for an explanation as to my taking liberties with his own writing. He simply trusted that I would get the message in the comfort of my own awkwardness. It worked beautifully. John had set the tone for Quantum Frontier’s authentic voice with his very first post. It was to be personal, even if the subject matter was as scientifically hardcore as it got.

So when the time came for me to write my first post, I made it personal. I wrote about my time in Los Alamos as a postdoc, working on a problem in mathematical physics that almost broke me. It was Matt Hastings, an intellectual tornado, that helped me through these hard times. As my mentor, he didn’t say things like Well done! Great progress! Good job, Spiro! He said, You can do this. And when I finally did it, when I finally solved that damn problem, Matt came back to me and said: Beyond some typos, I cannot find any mistakes. Good job, Spiro. And it meant the world to me. The sleepless nights, the lonely days up in the Pajarito mountains of New Mexico, the times I had resolved to go work for my younger brother as a waiter in his first restaurant… those were the times that I had come upon a fork on the road and my mentor had helped me choose the path less traveled.

When the time came for me to write my next post, I ended by offering two problems for the readers to solve, with the following text as motivation:

This post is supposed to be an introduction to the insanely beautiful world of problem solving. It is not a world ruled by Kings and Queens. It is a world where commoners like you and me can become masters of their domain and even build an empire.


Doi-Inthananon temple in Chiang Mai, Thailand. A breathtaking city, host of this year’s international math olympiad.

It has been way too long since my last “problem solving” post, so I leave you with a problem from this year’s International Math Olympiad, which took place in gorgeous Chiang Mai, Thailand. FiverThirtyEight‘s recent article about the dominance of the US math olympic team in this year’s competition, gives some context about the degree of difficulty of this problem:

Determine all triples (a, b, c) of positive integers such that each of the numbers: ab-c, bc-a, ca-b is a power of two.

Like Fermat’s Last Theorem, this problem is easy to describe and hard to solve. Only 5 percent of the competitors got full marks on this question, and nearly half (44 percent) got no points at all.

But, on the triumphant U.S. squad, four of the six team members nailed it.

In other words, only 1 in 20 kids in the competition solved this problem correctly and about half of the kids didn’t even know where to begin. For more perspective, each national team is comprised of the top 6 math prodigies in that country. In China, that means 6 out of something like 100 million kids. And only 3-4 of these kids solved the problem.

The coach of the US national team, Po-Shen Loh, a Caltech alum and an associate professor of mathematics at Carnegie Mellon University (give him tenure already) deserves some serious props. If you think this problem is too hard, I have this to say to you: Yes, it is. But, who cares? You can do this.

Note: I will work out the solution in detail in an upcoming post, unless one of you solves it in the comments section before then!

Update: Solution posted in comments below (in response to Anthony’s comment). Thank you all who posted some of the answers below. The solution is far from trivial, but I still wonder if an elegant solution exists that gives all four triples. Maybe the best solution is geometric? I hope one of you geniuses can figure that out!

Doug NatelsonWhat are Weyl fermions? Do they really move charge 1000x faster?

A new paper (published online in Science) with experimental work from Zahid Hasan's group at Princeton has made a splash this week.  In it, they argue that they see evidence in crystals of the compound TaAs for (quasi)particles that have the properties of so-called Weyl fermions.  What does this mean?

I've written before about quasiparticles.  The idea is, in a big system of interacting  degrees of freedom (electrons for example), you can ask, how would we best describe the low energy excitations of that system?  Often the simplest, most natural description of the excitations involves entities with well-defined quantum numbers like momentum, angular momentum, electric charge, magnetic moment, and even something analogous to mass.  These low energy excitations are quasiparticles - they're "quasi" because they don't exist outside of the material medium in which they're defined, but they're "particles" because they have all these descriptive parameters that we usually think of as properties of material particles.  In this situation, when we say that a quasiparticle has a certain mass, this is code for a discussion about how the energy of the excitation depends upon its momentum.  For a non-relativistic, classical particle like a baseball, the kinetic energy \(E = p^{2}/2m\), where \(p\) is the magnitude of the momentum.  So, if a quasiparticle has an energy roughly quadratic in its momentum, we can look at the number in front of the \(p^{2}\) and define it to be \(1/2m^{*}\), where \(m^{*}\) is an "effective mass".

In some materials under certain circumstances, you end up with quasiparticles with a kinetic energy that depends linearly on the momentum, \(E \sim p\).  This is reminiscent of the situation for light in the vacuum, where \(E = p c\), with \(c\) the speed of light.  A quasiparticle with this "linear dispersion" is said to act like it's "massless", in the same way that light has no mass yet still has energy and momentum.  This doesn't mean that something in the material is truly massless - it just means that those quasiparticles propagate at a fixed speed (given by the constant of proportionality between energy and momentum).  If the quasiparticle happens to have spin-1/2 (and therefore is a fermion),  then it would be a "massless fermion".  It turns out that graphene is an example of a material where, near certain energies, the quasiparticles act like this, and mathematically are well-described by a formulation dreamed up by Dirac and others - these are "massless Dirac fermions".

Wait - it gets richer.  In materials with really strong spin-orbit coupling, you can have a situation where the massless, charged fermions have a spin that is really locked to the momentum of the quasiparticle.  That is, you can have a situation where the quasiparticles are either right-handed (picture the particle as a bullet, spinning clockwise about an axis along its direction of motion when viewed from behind) or left-handed.  If this does not happen only at particular momenta (or only at a material surface), but can happen over a more general energy and momentum range (and in the material bulk), these quasiparticles can be described in language formulated by Weyl, and are "Weyl fermions".   Thanks to their special properties, the Weyl particles are also immune to some back-scattering (the kind of thing that increases electrical resistance).  I'm being deliberately vague here rather than delving into the math.  If you are very motivated, this paper is a good pedagogical guide.

So, what did the authors actually do?  Primarily, they used a technique called angle-resolved photoemission spectroscopy (ARPES) to measure, in 3d and very precisely, the relationship between energy and momentum for the various quasiparticle excitations in really high quality crystals of TaAs.  They found all the signatures expected for Weyl fermion-like quasiparticles, which is pretty cool.

Will this lead to faster computers, with charge moving 1000x faster, as claimed in various mangled versions of the press release?  No.  I'm not even sure where the writers got that number, unless it's some statement about the mobility of charge carriers in TaAs relative to their mobility in silicon.    This system is a great example of how profound mathematical descriptions (formulated originally to deal with hypothetical "fundamental" high energy physics) can apply to emergent properties of many-body systems.  It's the kind of thing that makes you wonder how fundamental are some of the properties we see in particle physics.  Conceivably there could be some use of this material in some technology, but it is silly (and in my view unnecessary) to claim that it will speed up computers.

July 27, 2015

Terence TaoCycles of a random permutation, and irreducible factors of a random polynomial

In analytic number theory, there is a well known analogy between the prime factorisation of a large integer, and the cycle decomposition of a large permutation; this analogy is central to the topic of “anatomy of the integers”, as discussed for instance in this survey article of Granville. Consider for instance the following two parallel lists of facts (stated somewhat informally). Firstly, some facts about the prime factorisation of large integers:

  • Every positive integer {m} has a prime factorisation

    \displaystyle  m = p_1 p_2 \dots p_r

    into (not necessarily distinct) primes {p_1,\dots,p_r}, which is unique up to rearrangement. Taking logarithms, we obtain a partition

    \displaystyle  \log m = \log p_1 + \log p_2 + \dots + \log p_r

    of {\log m}.

  • (Prime number theorem) A randomly selected integer {m} of size {m \sim N} will be prime with probability {\approx \frac{1}{\log N}} when {N} is large.
  • If {m \sim N} is a randomly selected large integer of size {N}, and {p = p_i} is a randomly selected prime factor of {m = p_1 \dots p_r} (with each index {i} being chosen with probability {\frac{\log p_i}{\log m}}), then {\log p_i} is approximately uniformly distributed between {0} and {\log N}. (See Proposition 9 of this previous blog post.)
  • The set of real numbers {\{ \frac{\log p_i}{\log m}: i=1,\dots,r \}} arising from the prime factorisation {m = p_1 \dots p_r} of a large random number {m \sim N} converges (away from the origin, and in a suitable weak sense) to the Poisson-Dirichlet process in the limit {N \rightarrow \infty}. (See the previously mentioned blog post for a definition of the Poisson-Dirichlet process, and a proof of this claim.)

Now for the facts about the cycle decomposition of large permutations:

  • Every permutation {\sigma \in S_n} has a cycle decomposition

    \displaystyle  \sigma = C_1 \dots C_r

    into disjoint cycles {C_1,\dots,C_r}, which is unique up to rearrangement, and where we count each fixed point of {\sigma} as a cycle of length {1}. If {|C_i|} is the length of the cycle {C_i}, we obtain a partition

    \displaystyle  n = |C_1| + \dots + |C_r|

    of {n}.

  • (Prime number theorem for permutations) A randomly selected permutation of {S_n} will be an {n}-cycle with probability exactly {1/n}. (This was noted in this previous blog post.)
  • If {\sigma} is a random permutation in {S_n}, and {C_i} is a randomly selected cycle of {\sigma} (with each {i} being selected with probability {|C_i|/n}), then {|C_i|} is exactly uniformly distributed on {\{1,\dots,n\}}. (See Proposition 8 of this blog post.)
  • The set of real numbers {\{ \frac{|C_i|}{n} \}} arising from the cycle decomposition {\sigma = C_1 \dots C_r} of a random permutation {\sigma \in S_n} converges (in a suitable sense) to the Poisson-Dirichlet process in the limit {n \rightarrow \infty}. (Again, see this previous blog post for details.)

See this previous blog post (or the aforementioned article of Granville, or the Notices article of Arratia, Barbour, and Tavaré) for further exploration of the analogy between prime factorisation of integers and cycle decomposition of permutations.

There is however something unsatisfying about the analogy, in that it is not clear why there should be such a kinship between integer prime factorisation and permutation cycle decomposition. It turns out that the situation is clarified if one uses another fundamental analogy in number theory, namely the analogy between integers and polynomials {P \in {\mathbf F}_q[T]} over a finite field {{\mathbf F}_q}, discussed for instance in this previous post; this is the simplest case of the more general function field analogy between number fields and function fields. Just as we restrict attention to positive integers when talking about prime factorisation, it will be reasonable to restrict attention to monic polynomials {P}. We then have another analogous list of facts, proven very similarly to the corresponding list of facts for the integers:

  • Every monic polynomial {f \in {\mathbf F}_q[T]} has a factorisation

    \displaystyle  f = P_1 \dots P_r

    into irreducible monic polynomials {P_1,\dots,P_r \in {\mathbf F}_q[T]}, which is unique up to rearrangement. Taking degrees, we obtain a partition

    \displaystyle  \hbox{deg} f = \hbox{deg} P_1 + \dots + \hbox{deg} P_r

    of {\hbox{deg} f}.

  • (Prime number theorem for polynomials) A randomly selected monic polynomial {f \in {\mathbf F}_q[T]} of degree {n} will be irreducible with probability {\approx \frac{1}{n}} when {q} is fixed and {n} is large.
  • If {f \in {\mathbf F}_q[T]} is a random monic polynomial of degree {n}, and {P_i} is a random irreducible factor of {f = P_1 \dots P_r} (with each {i} selected with probability {\hbox{deg} P_i / n}), then {\hbox{deg} P_i} is approximately uniformly distributed in {\{1,\dots,n\}} when {q} is fixed and {n} is large.
  • The set of real numbers {\{ \hbox{deg} P_i / n \}} arising from the factorisation {f = P_1 \dots P_r} of a randomly selected polynomial {f \in {\mathbf F}_q[T]} of degree {n} converges (in a suitable sense) to the Poisson-Dirichlet process when {q} is fixed and {n} is large.

The above list of facts addressed the large {n} limit of the polynomial ring {{\mathbf F}_q[T]}, where the order {q} of the field is held fixed, but the degrees of the polynomials go to infinity. This is the limit that is most closely analogous to the integers {{\bf Z}}. However, there is another interesting asymptotic limit of polynomial rings to consider, namely the large {q} limit where it is now the degree {n} that is held fixed, but the order {q} of the field goes to infinity. Actually to simplify the exposition we will use the slightly more restrictive limit where the characteristic {p} of the field goes to infinity (again keeping the degree {n} fixed), although all of the results proven below for the large {p} limit turn out to be true as well in the large {q} limit.

The large {q} (or large {p}) limit is technically a different limit than the large {n} limit, but in practice the asymptotic statistics of the two limits often agree quite closely. For instance, here is the prime number theorem in the large {q} limit:

Theorem 1 (Prime number theorem) The probability that a random monic polynomial {f \in {\mathbf F}_q[T]} of degree {n} is irreducible is {\frac{1}{n}+o(1)} in the limit where {n} is fixed and the characteristic {p} goes to infinity.

Proof: There are {q^n} monic polynomials {f \in {\mathbf F}_q[T]} of degree {n}. If {f} is irreducible, then the {n} zeroes of {f} are distinct and lie in the finite field {{\mathbf F}_{q^n}}, but do not lie in any proper subfield of that field. Conversely, every element {\alpha} of {{\mathbf F}_{q^n}} that does not lie in a proper subfield is the root of a unique monic polynomial in {{\mathbf F}_q[T]} of degree {f} (the minimal polynomial of {\alpha}). Since the union of all the proper subfields of {{\mathbf F}_{q^n}} has size {o(q^n)}, the total number of irreducible polynomials of degree {n} is thus {\frac{q^n - o(q^n)}{n}}, and the claim follows. \Box

Remark 2 The above argument and inclusion-exclusion in fact gives the well known exact formula {\frac{1}{n} \sum_{d|n} \mu(\frac{n}{d}) q^d} for the number of irreducible monic polynomials of degree {n}.

Now we can give a precise connection between the cycle distribution of a random permutation, and (the large {p} limit of) the irreducible factorisation of a polynomial, giving a (somewhat indirect, but still connected) link between permutation cycle decomposition and integer factorisation:

Theorem 3 The partition {\{ \hbox{deg}(P_1), \dots, \hbox{deg}(P_r) \}} of a random monic polynomial {f= P_1 \dots P_r\in {\mathbf F}_q[T]} of degree {n} converges in distribution to the partition {\{ |C_1|, \dots, |C_r|\}} of a random permutation {\sigma = C_1 \dots C_r \in S_n} of length {n}, in the limit where {n} is fixed and the characteristic {p} goes to infinity.

We can quickly prove this theorem as follows. We first need a basic fact:

Lemma 4 (Most polynomials square-free in large {q} limit) A random monic polynomial {f \in {\mathbf F}_q[T]} of degree {n} will be square-free with probability {1-o(1)} when {n} is fixed and {q} (or {p}) goes to infinity. In a similar spirit, two randomly selected monic polynomials {f,g} of degree {n,m} will be coprime with probability {1-o(1)} if {n,m} are fixed and {q} or {p} goes to infinity.

Proof: For any polynomial {g} of degree {m}, the probability that {f} is divisible by {g^2} is at most {1/q^{2m}}. Summing over all polynomials of degree {1 \leq m \leq n/2}, and using the union bound, we see that the probability that {f} is not squarefree is at most {\sum_{1 \leq m \leq n/2} \frac{q^m}{q^{2m}} = o(1)}, giving the first claim. For the second, observe from the first claim (and the fact that {fg} has only a bounded number of factors) that {fg} is squarefree with probability {1-o(1)}, giving the claim. \Box

Now we can prove the theorem. Elementary combinatorics tells us that the probability of a random permutation {\sigma \in S_n} consisting of {c_k} cycles of length {k} for {k=1,\dots,r}, where {c_k} are nonnegative integers with {\sum_{k=1}^r k c_k = n}, is precisely

\displaystyle  \frac{1}{\prod_{k=1}^r c_k! k^{c_k}},

since there are {\prod_{k=1}^r c_k! k^{c_k}} ways to write a given tuple of cycles {C_1,\dots,C_r} in cycle notation in nondecreasing order of length, and {n!} ways to select the labels for the cycle notation. On the other hand, by Theorem 1 (and using Lemma 4 to isolate the small number of cases involving repeated factors) the number of monic polynomials of degree {n} that are the product of {c_k} irreducible polynomials of degree {k} is

\displaystyle  \frac{1}{\prod_{k=1}^r c_k!} \prod_{k=1}^r ( (\frac{1}{k}+o(1)) q^k )^{c_k} + o( q^n )

which simplifies to

\displaystyle  \frac{1+o(1)}{\prod_{k=1}^r c_k! k^{c_k}} q^n,

and the claim follows.

This was a fairly short calculation, but it still doesn’t quite explain why there is such a link between the cycle decomposition {\sigma = C_1 \dots C_r} of permutations and the factorisation {f = P_1 \dots P_r} of a polynomial. One immediate thought might be to try to link the multiplication structure of permutations in {S_n} with the multiplication structure of polynomials; however, these structures are too dissimilar to set up a convincing analogy. For instance, the multiplication law on polynomials is abelian and non-invertible, whilst the multiplication law on {S_n} is (extremely) non-abelian but invertible. Also, the multiplication of a degree {n} and a degree {m} polynomial is a degree {n+m} polynomial, whereas the group multiplication law on permutations does not take a permutation in {S_n} and a permutation in {S_m} and return a permutation in {S_{n+m}}.

I recently found (after some discussions with Ben Green) what I feel to be a satisfying conceptual (as opposed to computational) explanation of this link, which I will place below the fold.

To put cycle decomposition of permutations and factorisation of polynomials on an equal footing, we generalise the notion of a permutation {\sigma \in S_n} to the notion of a partial permutation {\sigma = (\sigma,S)} on a fixed (but possibly infinite) domain {X}, which consists of a finite non-empty subset {S} of the set {X}, together with a bijection {\sigma: S \rightarrow S} on {S}; I’ll call {S} the support of the partial permutation. We say that a partial permutation {\sigma} is of size {n} if the support {S} is of cardinality {n}, and denote this size as {|\sigma|}. And now we can introduce a multiplication law on partial permutations that is much closer to that of polynomials: if two partial permutations {\sigma, \sigma'} on the same domain {X} have disjoint supports {S, S'}, then we can form their disjoint union {\sigma \uplus \sigma'}, supported on {S \cup S'}, to be the bijection on {S \cup S'} that agrees with {\sigma} on {S} and with {\sigma'} on {S'}. Note that this is a commutative and associative operation (where it is defined), and is the disjoint union of a partial permutation of size {n} and a partial permutation of size {m} is a partial permutation of size {n+m}, so this operation is much closer in behaviour to the multiplication law on polynomials than the group law on {S_n}. There is the defect that the disjoint union operation is sometimes undefined (when the two partial permutations have overlapping support); but in the asymptotic regime where the size {n} is fixed and the set {X} is extremely large, this will be very rare (compare with Lemma 4).

Note that a partial permutation is irreducible with respect to disjoint union if and only if it is a cycle on its support, and every partial permutation {\sigma} has a decomposition {\sigma = C_1 \uplus \dots \uplus C_r} into such partial cycles, unique up to permutations. If one then selects some set {{\mathcal P}} of partial cycles on the domain {X} to serve as “generalised primes”, then one can define (in the spirit of Beurling integers) the set {{\mathcal N}} of “generalised integers”, defined as those partial permutations that are the disjoint union {\sigma = C_1 \uplus \dots \uplus C_r} of partial cycles in {{\mathcal P}}. If one lets {{\mathcal N}_n} denote the set of generalised integers of size {n}, one can (assuming that this set is non-empty and finite) select a partial permutation {\sigma} uniformly at random from {{\mathcal N}_n}, and consider the partition {\{ |C_1|, \dots, |C_r| \}} of {n} arising from the decomposition into generalised primes.

We can now embed both the cycle decomposition for (complete) permutations and the factorisation of polynomials into this common framework. We begin with the cycle decomposition for permutations. Let {q} be a large natural number, and set the domain {X} to be the set {\{1,\dots,q\}}. We define {{\mathcal P}_n} to be the set of all partial cycles on {X} of size {n}, and let {{\mathcal P}} be the union of the {{\mathcal P}_n}, that is to say the set of all partial cycles on {X} (of arbitrary size). Then {{\mathcal N}} is of course the set of all partial permutations on {X}, and {{\mathcal N}_n} is the set of all partial permutations on {X} of size {n}. To generate an element of {{\mathcal N}_n} uniformly at random for {1 \leq n \leq q}, one simply has to randomly select an {n}-element subset {S} of {X}, and then form a random permutation of the {n} elements of {S}. From this, it is obvious that the partition {\{ |C_1|, \dots, |C_r|\}} of {n} coming from a randomly chosen element of {{\mathcal N}_n} has exactly the same distribution as the partition {\{ |C_1|, \dots, |C_r|\}} of {n} coming from a randomly chosen element of {S_n}, as long as {q} is at least as large as {n} of course.

Now we embed the factorisation of polynomials into the same framework. The domain {X} is now taken to be the algebraic closure {\overline{{\mathbf F}_q}} of {{\mathbf F}_q}, or equivalently the direct limit of the finite fields {{\mathbf F}_{q^n}} (with the obvious inclusion maps). This domain has a fundamental bijection on it, the Frobenius map {\hbox{Frob}: x \mapsto x^q}, which is a field automorphism that has {{\mathbf F}_q} as its fixed points. We define {{\mathcal N}} to be the set of partial permutations on {X} formed by restricting the Frobenius map {\hbox{Frob}} to a finite Frobenius-invariant set. It is easy to see that the irreducible Frobenius-invariant sets (that is to say, the orbits of {\hbox{Frob}}) arise from taking an element {x} of {X} together with all of its Galois conjugates, and so if we define {{\mathcal P}} to be the set of restrictions of Frobenius to a single such Galois orbit, then {{\mathcal N}} are the generalised integers to the generalised primes {{\mathcal P}} in the sense above. Next, observe that, when the characteristic {p} is sufficiently large, every squarefree monic polynomial {f \in {\mathbf F}_q[T]} of degree {n} generates a generalised integer of size {n}, namely the restriction of the Frobenius map to the {n} roots of {f} (which will be necessarily distinct when the characteristic is large and {f} is squarefree). This generalised integer will be a generalised prime precisely when {f} is irreducible. Conversely, every generalised integer of size {n} generates a squarefree monic polynomial in {{\mathbf F}_q[T]}, namely the product of {T-x} as {x} ranges over the support of the integer. This product is clearly monic, squarefree, and Frobenius-invariant, thus it lies in {{\mathbf F}_q[T]}. Thus we may identify {{\mathcal N}_n} with the monic squarefree polynomials of {{\mathbf F}_q} of degree {n}. With this identification, the (now partially defined) multiplication operation on monic squarefree polynomials coincides exactly with the disjoint union operation on partial permutations. As such, we see that the partition {\{ \hbox{deg} P_1, \dots, \hbox{deg} P_r \}} associated to a randomly chosen squarefree monic polynomial {f = P_1\dots P_r} of degree {n} has exactly the same distribution as the partition {\{ |C_1|, \dots, |C_r| \}} associated to a randomly chosen generalised integer {\sigma = C_1 \uplus \dots \uplus C_r} of size {n}. By Lemma 4, one can drop the condition of being squarefree while only distorting the distribution by {o(1)}.

Now that we have placed cycle decomposition of permutations and factorisation of polynomials into the same framework, we can explain Theorem 3 as a consequence of the following universality result for generalised prime factorisations:

Theorem 5 (Universality) Let {{\mathcal P}, {\mathcal N}} be collections of generalised primes and integers respectively on a domain {X}, all of which depend on some asymptotic parameter {q} that goes to infinity. Suppose that for any fixed {n,m} and {q} going to infinity, the sets {{\mathcal N}_n, {\mathcal N}_m, {\mathcal N}_{n+m}} are non-empty with cardinalities obeying the asymptotic

\displaystyle  |{\mathcal N}_{n+m}| = (1+o(1)) |{\mathcal N}_n| |{\mathcal N}_m|. \ \ \ \ \ (1)

Also, suppose that only {o( |{\mathcal N}_n| |{\mathcal N}_m|)} of the pairs {(\sigma,\sigma') \in {\mathcal N}_n \times {\mathcal N}_m} have overlapping supports (informally, this means that {\sigma \uplus \sigma'} is defined with probability {1-o(1)}). Then, for fixed {n} and {q} going to infinity, the distribution of the partition {\{ |C_1|, \dots, |C_r|\}} of a random generalised integer from {{\mathcal N}_n} is universal in the limit; that is to say, the limiting distribution does not depend on the precise choice of {X, {\mathcal P}, {\mathcal N}}.

Note that when {{\mathcal N}_n} consists of all the partial permutations of size {n} on {\{1,\dots,q\}} we have

\displaystyle  |{\mathcal N}_n| = \binom{q}{n} n! = (1+o(1)) q^n

while when {{\mathcal N}_n} consists of the monic squarefree polynomials of degree {n} in {{\mathbf F}_q[T]} then from Lemma 4 we also have

\displaystyle  |{\mathcal N}_n| = (1+o(1)) q^n

so in both cases the first hypothesis (1) is satisfied. The second hypothesis is easy to verify in the former case and follows from Lemma 4 in the latter case. Thus, Theorem 5 gives Theorem 3 as a corollary.

Remark 6 An alternate way to interpret Theorem 3 is as an equidistribution theorem: if one randomly labels the {n} zeroes of a random degree {n} polynomial as {1,\dots,n}, then the resulting permutation on {1,\dots,n} induced by the Frobenius map is asymptotically equidistributed in the large {q} (or large {p}) limit. This is the simplest case of a much more general (and deeper) result known as the Deligne equidistribution theorem, discussed for instance in this survey of Kowalski. See also this paper of Church, Ellenberg, and Farb concerning more precise asymptotics for the number of squarefree polynomials with a given cycle decomposition of Frobenius.

It remains to prove Theorem 5. The key is to establish an abstract form of the prime number theorem in this setting.

Theorem 7 (Prime number theorem) Let the hypotheses be as in Theorem 5. Then for fixed {n} and {q \rightarrow \infty}, the density of {{\mathcal P}_n} in {{\mathcal N}_n} is {\frac{1}{n}+o(1)}. In particular, the asymptotic density {1/n} is universal (it does not depend on the choice of {X, {\mathcal P}_n, {\mathcal N}_n}).

Proof: Let {a_n := n |{\mathcal P}_n| / |{\mathcal N}_n|} (this may only be defined for {q} sufficiently large depending on {n}); our task is to show that {a_n = 1+o(1)} for each fixed {n}.

Consider the set of pairs {(\sigma, x)} where {\sigma} is an element of {{\mathcal N}_n} and {x} is an element of the support of {\sigma}. Clearly, the number of such pairs is {n |{\mathcal N}_n|}. On the other hand, given such a pair {(\sigma,x)}, there is a unique factorisation {\sigma = C \uplus \sigma'}, where {C} is the generalised prime in the decomposition of {\sigma} that contains {x} in its support, and {\sigma'} is formed from the remaining components of {\sigma}. {C} has some size {1 \leq m \leq n}, {\sigma'} has the complementary size {n-m} and has disjoint support from {C}, and {x} has to be one of the {m} elements of the support of {C}. Conversely, if one selects {1 \leq m \leq n}, then selects a generalised prime {C \in {\mathcal P}_m}, and a generalised integer {\sigma' \in {\mathcal C}_{n-m}} with disjoint support from {C}, and an element {x} in the support of {C}, we recover the pair {(\sigma,x)}. Using the hypotheses of Theorem 5, we thus obtain the double counting identity

\displaystyle  n |{\mathcal N}_n| = \sum_{m=1}^n m |{\mathcal P}_m| |{\mathcal N}_{n-m}| - o( |{\mathcal N}_m| |{\mathcal N}_{n-m}| )

\displaystyle  = (\sum_{m=1}^n a_m + o(1)) |{\mathcal N}_n|

and thus {\sum_{m=1}^n a_m = n+o(1)} for every fixed {n}, and so {a_n = 1+o(1)} for fixed {n} as claimed. \Box

Remark 8 One could cast this argument in a language more reminiscent of analytic number theory by forming generating series of {{\mathcal N}_n} and {{\mathcal P}_n} and treating these series as analogous to a zeta function and its log-derivative (in close analogy to what is done with Beurling primes), but we will not do so here.

We can now finish the proof of Theorem 5. To show asymptotic universality of the partition {\{ |C_1|,\dots,|C_r|\}} of a random generalised integer {\sigma \in {\mathcal N}_n}, we may assume inductively that asymptotic universality has already been shown for all smaller choices of {n}. To generate a uniformly random generalised integer {\sigma} of size {n}, we can repeat the process used to prove Theorem 7. It of course suffices to generate a uniformly random pair {(\sigma,x)}, where {\sigma} is a generalised integer of size {n} and {x} is an element of the support of {\sigma}, since on dropping {x} we would obtain a uniformly drawn {\sigma}.

To obtain the pair {(\sigma,x)}, we first select {m \in \{1,\dots,n\}} uniformly at random, then select a generalised prime {C} randomly from {{\mathcal P}_m} and a generalised integer {\sigma'} randomly from {{\mathcal C}_{n-m}} (independently of {C} once {m} is fixed). Finally, we select {x} uniformly at random from the support of {C}, and set {\sigma := C \uplus \sigma'}. The pair {(\sigma,x)} is certainly a pair of the required form, but this random variable is not quite uniformly distributed amongst all such pairs. However, by repeating the calculations in Theorem 5 (and in particular relying on the conclusion {a_m=1+o(1)}), we see that this distribution is is within {o(1)} of the uniform distribution in total variation norm. Thus, the distribution of the cycle partition {\{ |C_1|,\dots,|C_r|\}} of a uniformly chosen {\sigma} lies within {o(1)} in total variation of the distribution of the cycle partition of a {\sigma = C \uplus \sigma'} chosen by the above recipe. However, the cycle partition of {\sigma = C \uplus \sigma'} is simply the union (with multiplicity) of {\{m\}} with the cycle partition of {\sigma'}. As the latter was already assumed to be asymptotically universal, we conclude that the former is also, as required.

Remark 9 The above analysis helps explain why one could not easily link permutation cycle decomposition with integer factorisation – to produce permutations here with the right asymptotics we needed both the large {q} limit and the Frobenius map, both of which are available in the function field setting but not in the number field setting.

Filed under: expository, math.CO, math.NT, math.PR Tagged: finite fields, permutations, prime number theorem

Tommaso DorigoNew Results From The LHC At 13 TeV!

Well, as some of you may have heard, the restart of the LHC has not been as smooth as we had hoped. In a machine as complex as this the chance that something gets in the way of a well-followed schedule is quite significant. So there have been slight delays, but the important thing is that the data at 13 TeV centre-of-mass energy are coming, and the first results are being extracted from them.

read more

July 26, 2015

Clifford JohnsonSantiago

imageI'm in Santiago, Chile, for a short stay. My first thought, in a very similar thought process to the one I had over ten years ago in a similar context, is one of surprise as to how wonderfully far south of the equator I now am! Somehow, just like last time I was in chile (even further south in Valdivia), I only properly looked at the latitude on a map when I was most of the way here (due to being somewhat preoccupied with other things right up to leaving), and it is a bit of a jolt. You will perhaps be happy to know that I will refrain from digressions about the Coriolis force and bathtubs, hurricanes and typhoons, and the like. I arrived too early to check into my hotel and so after leaving my bag there I went wandering for a while using the subway, finding a place to sit and have lunch and coffee while watching the world go by for a while. It happened to be at Plaza de Armaz. I sketched a piece of what I saw, and that's what you see in the snap above. I think the main building I sketched is in fact the Central Post Office... And that is a bit of some statuary in front of the Metropolitan Cathedral to the left. I like that the main cathedral and post office are next to each other like that. And yes, [...] Click to continue reading this post

July 25, 2015

Doug NatelsonThe nanoscale: The edge of emergence

One of the most profound concepts to come out of condensed matter physics is the idea of emergent properties - nontrivial properties of a system that are not trivially deducible from the microscopic aspects and interactions of the underlying degrees of freedom, and that become even better defined as the system size grows.  One example is the rigidity of solids:  A single carbon atom is not rigid; a small cluster of carbon atoms has a countable number of discrete vibrational modes; but a large number of carbon atoms coupled by sp3 bonds becomes a diamond, one of the hardest, most mechanically rigid solids there is, so stiff that compressive sound travels at 12 km/s, 35 times faster than in air.  Somehow, going from one atom to many, the concept of rigidity acquires meaning, and the speed of sound in diamond approaches a precise value.   

This idea, that something remarkable, exact, yet not at all obvious can emerge collectively and even generically, is why condensed matter physics is profound and not just "mopping up the details".  This is the heart of Bob Laughlin's first book, A Different Universe:  Reinventing Physics from the Bottom Down, and was articulated concisely by Laughlin and Pines in their "Theory of Everything" paper.  

I was recently rereading that book, and one chapter articulates Laughlin's basically dismissive take on nanoscience.  He refers to it as a "carnival of baubles" - his view is that otherwise smart people get sucked into playing around at the nanoscale because it's diverting and involves fun, cool toys (i.e., everything looks cool under an electron microscope!), instead of spending their time and effort actually trying to think about deep, fundamental questions.   Well, everyone is entitled to their opinion, but it won't surprise you that I disagree with much of that take.  Working at the nanoscale allows us to examine how emergence works in specific cases, sets the ground work for the materials and devices of future technologies (two topics I touch on in my book), and allows us to develop new probes and techniques precisely for asking (some subset of) deep questions.   Like being able to probe matter on ultrafast timescales, or over a huge temperature range, or in systems of unprecedented purity, pushing our control and manipulation of materials to the nano regime lets us ask new and different questions, and that's how we make progress and find surprises.  This isn't an infatuation with baubles (though everything does look cool under an electron microscope).  

BackreactionIs the Fermi gamma-ray excess evidence for dark matter or due to milli-second pulsars?

Fermi Satellite, 3d model.
Image Source: NASA

The Large Area Telescope on board the Fermi spacecraft looks out for the most extreme events in the cosmos. Launched 2008, it scans the whole sky for gamma rays in the very high energy range, from 20 MeV to about 300 GeV; a full scan takes about 3 hours. One of Fermi’s most interesting findings is an unexpectedly large amount of gamma-rays, ordinary light but at enormous energy, stemming from the center of our galaxy.

The Fermi gamma-ray excess has proved difficult to explain with standard astrophysical processes. The spectral distribution of the observed gamma-rays over energies bulges at about 2 GeV and it is hard to come up with a mechanism that produces particles at such high energies that prefer this particular spectral feature. The other puzzle is that whatever the sources of the gamma-rays, they seem homogeneously distributed in the galactic center, out to distances of more than ten degrees, which is about 5,000 light years – a huge range to span.

The most exciting proposal to solve the riddle is that the gamma-rays are produced by dark matter annihilation. Annihilation spectra often bulge at energies that depend on both the mass of the particles and their velocity. And the dark matter distribution is known to be denser towards centers of galaxies, so one would indeed expect more emission from there. While dark matter isn’t entirely homogeneous but has substructures, the changes in its density are small, which would result in an overall smooth emission. All of this fits very well with the observations.

For this reason, many particle physicists have taken their dark matter models to see whether they can fit the Fermi data, and it is possible indeed without too much difficulty. If the Fermi gamma-ray excess was due to dark matter annihilation, it would speak for heavy dark matter particles with masses of about 30 to 100 GeV, which might also show up at the LHC, though nothing has been seen so far.

Astrophysicists meanwhile haven’t been lazy and have tried to come up with other explanations for the gamma-ray excess. One of the earliest and still most compelling proposals is a population of millisecond pulsars. Such objects are thought to be created in some binary systems, where two stars orbit around a common center. When a neutron star succeeds in accreting mass from a companion star, it spins up enormously and starts to emit large amounts of particles including gamma rays reaching up to highest energies. This emission goes into one particular direction due to the rapid rotation of the system, and since we only observe it when it points at our telescopes the source seems to turn on and off in regular intervals: A pulsar has been created.

Much remains to be understood about millisecond pulsars, including details of their formation and exact radiation characteristics, but several thousands of them have been observed so far and their properties are observationally well documented. Most of the millisecond pulsars on record are the ones in our galactic neighborhood. From this we know that they tend to occur in globular clusters where particularly many stars are close together. And the observations also tell us the millisecond pulsar spectrum peaks at about 2 GeV, which makes them ideal candidates to explain the Fermi gamma-ray excess. The only problem is that no such pulsars have been seen in the center of the galaxy where the excess seems to originate, at least so far.

There are plausible reasons for this lack of observation. Millisecond pulsars tend to be detected in the radio range, at low energies. Only after astronomers have an idea of where exactly to look, they can aim precisely with telescopes to confirm the pulsation in the higher energy range. But such detections are difficult if not impossible in the center of the galaxy because observations are shrouded by electron gas. So it is quite plausible that the millisecond pulsars are in the center, they just haven’t been seen. Indeed, model-based estimates indicate that millisecond pulsars should also be present in the galactic center, as laid out for example in this recent paper. It would seem odd indeed if they weren’t there. On the other hand it has been argued that if millisecond pulsars were the source of the gamma-ray excess, then Fermi should also have been able to pinpoint a few of the pulsars in the galactic center already, which has not been the case. So now what?

The relevant distinction between the both scenarios to explain the gamma ray excess, dark matter annihilation or millisecond pulsars, is whether the emission comes from point sources or whether the sources are indeed homogeneously distributed. This isn’t an easy question to answer because Fermi basically counts single photons and their distribution is noisy and inevitably sometimes peaks here or there just by coincidence. Making estimates based on such measurements is difficult and requires sophisticated analysis.

In two recent papers now, researchers have taken a closer look at the existing Fermi data to see whether it gives an indication for point-like sources that have so far remained below the detection threshold beyond which they would be identified as stellar objects. For this they have to take the distribution of the measured signals, extract peaks ordered by magnitude, and test this measured distribution against a random distribution.
    Strong support for the millisecond pulsar origin of the Galactic center GeV excess
    Richard Bartels, Suraj Krishnamurthy, Christoph Weniger
    arXiv:1506.05104 [astro-ph.HE]

    Evidence for Unresolved Gamma-Ray Point Sources in the Inner Galaxy
    Samuel K. Lee, Mariangela Lisanti, Benjamin R. Safdi, Tracy R. Slatyer, Wei Xue
    arXiv:1506.05124 [astro-ph.HE]
The difference between these papers is the method they used to identify the point sources. The first paper by Bartels et al uses a wavelet analysis on the data, that is somewhat like a Fourier transform with a local profile, to pick up potential sources with low statistical significance. The Lee et al paper tries to generate a pattern close to the observed one by using various mixtures of noise and point sources of particular spectra. In both papers the researchers find that the data indicates the origin of the gamma-rays is point sources and not entirely smoothly distributed. In the first paper, the authors moreover extract the necessary density of pulsars in the galactic center to explain the observations, and demonstrate that it is possible the pulsars give rise to the observed excess and might so far have stayed just below the detection threshold for point sources.

Taken together, it looks like the evidence has now shifted in favor of millisecond pulsars. As Christopher Weniger from the University of Amsterdam put it “[The pulsars] are there, we know they are there, and they have the right spectrum. We first have to rule out that this isn’t what we see.” Rather than ruling out astrophysical sources as origin of the gamma-ray excess however, the researchers are now well on the way to confirm it’s the pulsars that cause the signal.

Finding definite evidence that the Fermi gamma-ray excess is due to millisecond pulsars is difficult but not impossible. What is needed is more statistics that will allow resolving the point sources better, and more time will bring more statistics. The puzzle isn’t solved yet, but chances are good it will be solved within the next years. What constitutes dark matter however at least for now remains a mystery.

Jordan EllenbergThe adventures of Terry Tao in the 21st century

Great New York Times profile of Terry Tao by Gareth Cook, an old friend of mine from Boston Phoenix days.

I’ve got a quote in there:

‘‘Terry is what a great 21st-­century mathematician looks like,’’ Jordan Ellenberg, a mathematician at the University of Wisconsin, Madison, who has collaborated with Tao, told me. He is ‘‘part of a network, always communicating, always connecting what he is doing with what other people are doing.’’

I thought it would be good to say something about the context in which I told Gareth this.  I was explaining how happy I was he was profiling Terry, because Terry is at the same time extraordinary and quite typical as a mathematician.  Outlier stories, like those of Nash, and Perelman, and more recently Mochizuki, get a lot of space in the general press.  And they’re important stories.  But they’re stories because they’re so unrepresentative of the main stream of mathematical work.  Lone bearded men working in secret, pitched battles over correctness and priority, madness, etc.  Not a big part of our actual lives.

Terry’s story, on the other hand, is what new, deep, amazing math actually usually looks like.  Many minds cooperating, enabled by new technology.  Blogging, traveling, talking, sharing.  That’s the math world I know.  I’m happy as hell to see it in the New York Times.


July 24, 2015

Clifford JohnsonPage Samples…!

second_set_samples_editThere's something really satisfying about getting copies of printed pages back from the publisher. Makes it all seem a bit more real. This is a second batch of samples (first batch had some errors resulting from miscommunication, so don't count), and already I think we are converging. The colours are closer to what I intended, although you can't of course see that since the camera I used to take the snap, and the screen you are using, have made changes to them (I'll spare you lots of mumblings about CMYK vs RGB and monitor profiles and various PDF formats and conventions and so forth) and this is all done with pages I redid to fit the new page sizes I talked about in the last post on the book project. Our next step is to work on more paper choices, keeping in mind that this will adjust colours a bit again, and so forth - and we must also keep an eye on things like projected production costs and so forth. Some samples have been mailed to me and I shall get them next week. Looking forward to seeing them. For those who care, the pages you can see have a mixture of digital colours (most of it in fact) and analogue colours (Derwent watercolour pencils, applied [...] Click to continue reading this post

Tommaso DorigoInterna

I apologize for my lack of posting in the past few days. I will resume it very soon... So as a means of apology I thought I would explain what I have been up to this week.

read more

Sean CarrollGuest Post: Aidan Chatwin-Davies on Recovering One Qubit from a Black Hole

47858f217602be036c32e8ac76271a75_400x400 The question of how information escapes from evaporating black holes has puzzled physicists for almost forty years now, and while we’ve learned a lot we still don’t seem close to an answer. Increasingly, people who care about such things have been taking more seriously the intricacies of quantum information theory, and learning how to apply that general formalism to the specific issues of black hole information.

Now two students and I have offered a small contribution to this effort. Aidan Chatwin-Davies is a grad student here at Caltech, while Adam Jermyn was an undergraduate who has now gone on to do graduate work at Cambridge. Aidan came up with a simple method for getting out one “quantum bit” (qubit) of information from a black hole, using a strategy similar to “quantum teleportation.” Here’s our paper that just appeared on arxiv:

How to Recover a Qubit That Has Fallen Into a Black Hole
Aidan Chatwin-Davies, Adam S. Jermyn, Sean M. Carroll

We demonstrate an algorithm for the retrieval of a qubit, encoded in spin angular momentum, that has been dropped into a no-firewall unitary black hole. Retrieval is achieved analogously to quantum teleportation by collecting Hawking radiation and performing measurements on the black hole. Importantly, these methods only require the ability to perform measurements from outside the event horizon and to collect the Hawking radiation emitted after the state of interest is dropped into the black hole.

It’s a very specific — i.e. not very general — method: you have to have done measurements on the black hole ahead of time, and then drop in one qubit, and we show how to get it back out. Sadly it doesn’t work for two qubits (or more), so there’s no obvious way to generalize the procedure. But maybe the imagination of some clever person will be inspired by this particular thought experiment to come up with a way to get out two qubits, and we’ll be off.

I’m happy to host this guest post by Aidan, explaining the general method behind our madness.

If you were to ask someone on the bus which of Stephen Hawking’s contributions to physics he or she thought was most notable, the answer that you would almost certainly get is his prediction that a black hole should glow as if it were an object with some temperature. This glow is made up of thermal radiation which, unsurprisingly, we call Hawking radiation. As the black hole radiates, its mass slowly decreases and the black hole decreases in size. So, if you waited long enough and were careful not to enlarge the black hole by throwing stuff back in, then eventually it would completely evaporate away, leaving behind nothing but a bunch of Hawking radiation.

At a first glance, this phenomenon of black hole evaporation challenges a central notion in quantum theory, which is that it should not be possible to destroy information. Suppose, for example, that you were to toss a book, or a handful of atoms in a particular quantum state into the black hole. As the black hole evaporates into a collection of thermal Hawking particles, what happens to the information that was contained in that book or in the state of (what were formerly) your atoms? One possibility is that the information actually is destroyed, but then we would have to contend with some pretty ugly foundational consequences for quantum theory. Instead, it could be that the information is preserved in the state of the leftover Hawking radiation, albeit highly scrambled and difficult to distinguish from a thermal state. Besides being very pleasing on philosophical grounds, we also have evidence for the latter possibility from the AdS/CFT correspondence. Moreover, if the process of converting a black hole to Hawking radiation conserves information, then a stunning result of Hayden and Preskill says that for sufficiently old black holes, any information that you toss in comes back out almost a fast as possible!

Even so, exactly how information leaks out of a black hole and how one would go about converting a bunch of Hawking radiation to a useful state is quite mysterious. On that note, what we did in a recent piece of work was to propose a protocol whereby, under very modest and special circumstances, you can toss one qubit (a single unit of quantum information) into a black hole and then recover its state, and hence the information that it carried.

More precisely, the protocol describes how to recover a single qubit that is encoded in the spin angular momentum of a particle, i.e., a spin qubit. Spin is a property that any given particle possesses, just like mass or electric charge. For particles that have spin equal to 1/2 (like those that we consider in our protocol), at least classically, you can think of spin as a little arrow which points up or down and says whether the particle is spinning clockwise or counterclockwise about a line drawn through the arrow. In this classical picture, whether the arrow points up or down constitutes one classical bit of information. According to quantum mechanics, however, spin can actually exist in a superposition of being part up and part down; these proportions constitute one qubit of quantum information.


So, how does one throw a spin qubit into a black hole and get it back out again? Suppose that Alice is sitting outside of a black hole, the properties of which she is monitoring. From the outside, a black hole is characterized by only three properties: its total mass, total charge, and total spin. This latter property is essentially just a much bigger version of the spin of an individual particle and will be important for the protocol.

Next, suppose that Alice accidentally drops a spin qubit into the black hole. First, she doesn’t panic. Instead, she patiently waits and collects one particle of Hawking radiation from the black hole. Crucially, when a Hawking particle is produced by the black hole, a bizarro version of the same particle is also produced, but just behind the black hole’s horizon (boundary) so that it falls into the black hole. This bizarro ingoing particle is the same as the outgoing Hawking particle, but with opposite properties. In particular, its spin state will always be flipped relative to the outgoing Hawking particle. (The outgoing Hawking particle and the ingoing particle are entangled, for those in the know.)


The picture so far is that Alice, who is outside of the black hole, collects a single particle of Hawking radiation whilst the spin qubit that she dropped and the ingoing bizarro Hawking particle fall into the black hole. When the dropped particle and the bizarro particle fall into the black hole, their spins combine with the spin of the black hole—but remember! The bizarro particle’s spin was highly correlated with the spin of the outgoing Hawking particle. As such, the new combined total spin of the black hole becomes highly correlated with the spin of the outgoing Hawking particle, which Alice now holds. So, Alice measures the black hole’s new total spin state. Then, essentially, she can exploit the correlations between her held Hawking particle and the black hole to transfer the old spin state of the particle that she dropped into the hole to the Hawking particle that she now holds. Alice’s lost qubit is thus restored. Furthermore, Alice didn’t even need to know the precise state that her initial particle was in to begin with; the qubit is recovered regardless!

That’s the protocol in a nutshell. If the words “quantum teleportation” mean anything to you, then you can think of the protocol as a variation on the quantum teleportation protocol where the transmitting party is the black hole and measurement is performed in the total angular momentum basis instead of the Bell basis. Of course, this is far from a resolution of the information problem for black holes. However, it is certainly a neat trick which shows, in a special set of circumstances, how to “bounce” a qubit of quantum information off of a black hole.

Terence TaoDeducing a weak ergodic inverse theorem from a combinatorial inverse theorem

Note: this post is of a particularly technical nature, in particular presuming familiarity with nilsequences, nilsystems, characteristic factors, etc., and is primarily intended for experts.

As mentioned in the previous post, Ben Green, Tamar Ziegler, and myself proved the following inverse theorem for the Gowers norms:

Theorem 1 (Inverse theorem for Gowers norms) Let {N \geq 1} and {s \geq 1} be integers, and let {\delta > 0}. Suppose that {f: {\bf Z} \rightarrow [-1,1]} is a function supported on {[N] := \{1,\dots,N\}} such that

\displaystyle  \frac{1}{N^{s+2}} \sum_{n,h_1,\dots,h_{s+1}} \prod_{\omega \in \{0,1\}^{s+1}} f(n+\omega_1 h_1 + \dots + \omega_{s+1} h_{s+1}) \geq \delta.

Then there exists a filtered nilmanifold {G/\Gamma} of degree {\leq s} and complexity {O_{s,\delta}(1)}, a polynomial sequence {g: {\bf Z} \rightarrow G}, and a Lipschitz function {F: G/\Gamma \rightarrow {\bf R}} of Lipschitz constant {O_{s,\delta}(1)} such that

\displaystyle  \frac{1}{N} \sum_n f(n) F(g(n) \Gamma) \gg_{s,\delta} 1.

This result was conjectured earlier by Ben Green and myself; this conjecture was strongly motivated by an analogous inverse theorem in ergodic theory by Host and Kra, which we formulate here in a form designed to resemble Theorem 1 as closely as possible:

Theorem 2 (Inverse theorem for Gowers-Host-Kra seminorms) Let {s \geq 1} be an integer, and let {(X, T)} be an ergodic, countably generated measure-preserving system. Suppose that one has

\displaystyle  \lim_{N \rightarrow \infty} \frac{1}{N^{s+1}} \sum_{h_1,\dots,h_{s+1} \in [N]} \int_X \prod_{\omega \in \{0,1\}^{s+1}} f(T^{\omega_1 h_1 + \dots + \omega_{s+1} h_{s+1}}x)\ d\mu(x)

\displaystyle  > 0

for all non-zero {f \in L^\infty(X)} (all {L^p} spaces are real-valued in this post). Then {(X,T)} is an inverse limit (in the category of measure-preserving systems, up to almost everywhere equivalence) of ergodic degree {\leq s} nilsystems, that is to say systems of the form {(G/\Gamma, x \mapsto gx)} for some degree {\leq s} filtered nilmanifold {G/\Gamma} and a group element {g \in G} that acts ergodically on {G/\Gamma}.

It is a natural question to ask if there is any logical relationship between the two theorems. In the finite field category, one can deduce the combinatorial inverse theorem from the ergodic inverse theorem by a variant of the Furstenberg correspondence principle, as worked out by Tamar Ziegler and myself, however in the current context of {{\bf Z}}-actions, the connection is less clear.

One can split Theorem 2 into two components:

Theorem 3 (Weak inverse theorem for Gowers-Host-Kra seminorms) Let {s \geq 1} be an integer, and let {(X, T)} be an ergodic, countably generated measure-preserving system. Suppose that one has

\displaystyle  \lim_{N \rightarrow \infty} \frac{1}{N^{s+1}} \sum_{h_1,\dots,h_{s+1} \in [N]} \int_X \prod_{\omega \in \{0,1\}^{s+1}} T^{\omega_1 h_1 + \dots + \omega_{s+1} h_{s+1}} f\ d\mu

\displaystyle  > 0

for all non-zero {f \in L^\infty(X)}, where {T^h f := f \circ T^h}. Then {(X,T)} is a factor of an inverse limit of ergodic degree {\leq s} nilsystems.

Theorem 4 (Pro-nilsystems closed under factors) Let {s \geq 1} be an integer. Then any factor of an inverse limit of ergodic degree {\leq s} nilsystems, is again an inverse limit of ergodic degree {\leq s} nilsystems.

Indeed, it is clear that Theorem 2 implies both Theorem 3 and Theorem 4, and conversely that the two latter theorems jointly imply the former. Theorem 4 is, in principle, purely a fact about nilsystems, and should have an independent proof, but this is not known; the only known proofs go through the full machinery needed to prove Theorem 2 (or the closely related theorem of Ziegler). (However, the fact that a factor of a nilsystem is again a nilsystem was established previously by Parry.)

The purpose of this post is to record a partial implication in reverse direction to the correspondence principle:

Proposition 5 Theorem 1 implies Theorem 3.

As mentioned at the start of the post, a fair amount of familiarity with the area is presumed here, and some routine steps will be presented with only a fairly brief explanation.

To show that {(X,T)} is a factor of another system {(Y,S)} up to almost everywhere equivalence, it suffices to obtain a unital algebra homomorphism from {L^\infty(X)} to {L^\infty(Y)} that intertwines {T} with {S}, and which is measure-preserving (or more precisely, integral-preserving). On the other hand, by hypothesis, {L^\infty(X)} is generated (as a von Neumann algebra) by the dual functions

\displaystyle  {\mathcal D} f := \lim_{N \rightarrow \infty} {\mathcal D}_N f

for {f \in L^\infty(X)}, where

\displaystyle  {\mathcal D}_N f := \frac{1}{N^{s+1}} \sum_{h_1,\dots,h_{s+1} \in [N]} \prod_{\omega \in \{0,1\}^{s+1} \backslash \{0\}} T^{\omega_1 h_1 + \dots + \omega_{s+1} h_{s+1}}f;

indeed we may restrict {f} to a countable sequence {f_n} that is dense in {L^\infty(X)} in the {L^2} (say) topology, together with their shifts. To obtain such a factor representation, it thus suffices to find a “model” {\tilde f_n \in L^\infty(Y)} associated to each dual function {{\mathcal D} f_n} in such a fashion that

\displaystyle  \int_Y P( T^{h_1} \tilde f_{n_1}, \dots, T^{h_d} \tilde f_{n_d} ) = \int_X P( T^{h_1} {\mathcal D} f_{n_1}, \dots, T^{h_d} {\mathcal D} f_{n_d} ) \ \ \ \ \ (1)

for all {n_1,\dots,n_d} and {h_1,\dots,h_d}, and all polynomials {P}. Of course it suffices to do so for those polynomials with rational coefficients (so now there are only a countable number of constraints to consider).

We may normalise all the {f_n} to take values in {[-1,1]}. For any {n,m}, we can find a scale {N_{n,m} \geq m} such that

\displaystyle  \| {\mathcal D} f_n - {\mathcal D}_{N_{n,m}} f_n \|_{L^2(X)} \leq 2^{-100(n+m)}.

If we then define the exceptional set

\displaystyle  E_{n,m} := \{ x: |{\mathcal D} f_n(x) - {\mathcal D}_{N_{n,m}} f_n(x)| \geq 2^{-(n+m)} \}

then {E_{n,m}} has measure at most {2^{-10(n+m)}} (say), and so the function {\sum_{n,m} 2^{9(n+m)} 1_{E_{n,m}}} is absolutely integrable. By the maximal ergodic theorem, we thus see that for almost every {x}, there exists a finite {C_x} such that

\displaystyle  |\{ h \in [H]: T^h x \in E_{n,m} \}| \leq C_x 2^{-8(n+m)} H \ \ \ \ \ (2)

for all {n,m} and all {H \geq 1}. Informally, we thus have the approximation

\displaystyle  {\mathcal D} f_n(T^h x ) = {\mathcal D}_{N_{n,m}} f_n(T^h x) + O( 2^{-(n+m)} )

for “most” {h,n,m}.

Next, we observe from the Cauchy-Schwarz-Gowers inequality that for almost every {x \in X}, the dual function {h \mapsto {\mathcal D}_{N_{n,m}} f_n(T^h x)} is anti-uniform in the sense that

\displaystyle  {\bf E}_{h \in [N_{n,m}]} {\mathcal D}_{N_{n,m}} f_n(T^h x) g(h) \ll_s \|g\|_{U^{s+1}[N_{n,m}]}

for any function {g: [N_{n,m}] \rightarrow [-1,1]}. By the usual structure theorems (e.g. Theorem 1.2 of this paper of Ben Green and myself) this shows that for almost every {x \in X} and every {k \geq 1}, there exists a degree {\leq s} nilsequence {h \mapsto F_{n,m,k}( g_{n,m,k}(h) \Gamma_{n,m,k} )} of complexity {O_{s,k,n}(1)} such that

\displaystyle  {\bf E}_{h \in [N_{n,m}]} | {\mathcal D}_{N_{n,m}} f_n(T^h x) - F_{n,m,k}( g_{n,m,k}(h) \Gamma_{n,m,k}) |^2 \ll 2^{-100(n+k)}

(say). (Sketch of proof: standard structure theorems give a decomposition of the form

\displaystyle {\mathcal D}_{N_{n,m}} f_n(T^h x) = f_{str} + f_{sml} + f_{unf}

where {f_{str}} is a nilsequence as above, {f_{sml}} is small in {L^2} norm, and {f_{unf}} is very small in {U^{s+1}[N]} norm; {f_{unf}} has small inner product with {{\mathcal D}_{N_{n,m}} f_n(T^h x)}, {f_{str}}, and {f_{sml}}, and thus with {f_{unf}} itself, and so {f_{sml}} and {f_{unf}} are both small in {L^2}, giving the claim.)

For each {n,m,k}, let {E'_{n,m,k}} denote the set of all {x} such that there exists a degree {\leq s} nilsequence {h \mapsto F_{n,m,k}( g_{n,m,k}(h) \Gamma_{n,m,k} )} (depending on {x,n,m,k}) of complexity {O_{s,k,n}(1)} such that

\displaystyle  \sup_{1 \leq H \leq 2^{-10k} N_{n,m}} \frac{1}{H} {\bf E}_{h \in [H]} | {\mathcal D}_{N_{n,m}} f_n(T^h x) - F_{n,m,k}( g_{n,m,k}(h) \Gamma_{n,m,k}) |^2

\displaystyle \geq 2^{-10(n+k)}.

From the Hardy-Littlewood maximal inequality (and the measure-preserving nature of {T}) we see that {E'_{n,m,k}} has measure {O( 2^{-10(n+k)} )}. This implies that the functions

\displaystyle  \frac{1}{M} \sum_{m=1}^M \sum_{n=1}^\infty \sum_{k=1}^\infty 2^{5(n+k)} 1_{E'_{n,m,k}}

are uniformly bounded in {L^1} as {M \rightarrow \infty}, which by Fatou’s lemma implies that

\displaystyle  \liminf_{M \rightarrow \infty} \frac{1}{M} \sum_{m=1}^M \sum_{n=1}^\infty \sum_{k=1}^\infty 2^{5(n+k)} 1_{E'_{n,m,k}}

is also absolutely integrable. In particular, for almost every {x}, we have

\displaystyle  \liminf_{M \rightarrow \infty} \frac{1}{M} \sum_{m=1}^M \sum_{n=1}^\infty \sum_{k=1}^\infty 2^{5(n+k)} 1_{E'_{n,m,k}} < C'_x

for some finite {C'_x}, which implies that

\displaystyle  \sum_{n=1}^\infty \sum_{k=1}^\infty 2^{5(n+k)} 1_{E'_{n,m,k}} \leq C'_x

for an infinite sequence of {m} (the exact choice of sequence depends on {x}); in particular, there is a {K_x} such that for all {m} in this sequence, one has

\displaystyle  x \not \in E'_{n,m,k}

for all {n} and all {k \geq K_x}. Thus

\displaystyle  \sup_{1 \leq H \leq 2^{-10k} N_{n,m}} \frac{1}{H} {\bf E}_{h \in [H]} | {\mathcal D}_{N_{n,m}} f_n(T^h x) - F_{n,m,k}( g_{n,m,k}(h) \Gamma_{n,m,k}) |^2

\displaystyle \leq 2^{-10(n+k)}

for all {m} in this sequence, all {n}, and all {k \geq K_x}; combining with (2) we see (for almost every {x}) that

\displaystyle  \sup_{1 \leq H \leq 2^{-10k} N_{n,m}} \frac{1}{H} {\bf E}_{h \in [H]} | {\mathcal D} f_n(T^h x) - F_{n,m,k}( g_{n,m,k}(h) \Gamma_{n,m,k}) |^2

\displaystyle \ll 2^{-10(n+k)} + 2^{-(n+m)}

and thus for all {n}, all {k \geq K_x}, and all {H \geq 1} we have

\displaystyle  \limsup_{m \rightarrow \infty} \frac{1}{H} {\bf E}_{h \in [H]} | {\mathcal D} f_n(T^h x) - F_{n,m,k}( g_{n,m,k}(h) \Gamma_{n,m,k}) |^2 \ll 2^{-10(n+k)}

where the limit is along the sequence.

For given {n, k}, there are only finitely many possibilities for the nilmanifold {G_{n,m,k}/\Gamma_{n,m,k}}, so by the usual diagonalisation argument we may pass to a subsequence of {m} and assume that {G_{n,m,k}/\Gamma_{n,m,k} = G_{n,k}/\Gamma_{n,k}} does not depend on {m} for any {n,k}. By Arzela-Ascoli we may similarly assume that the Lipschitz function {F_{n,m,k}} converges uniformly to {F_{n,k}}, so we now have

\displaystyle  \limsup_{m \rightarrow \infty} \frac{1}{H} {\bf E}_{h \in [H]} | {\mathcal D} f_n(T^h x) - F_{n,k}( g_{n,m,k}(h) \Gamma_{n,k}) |^2 \ll 2^{-10(n+k)}

along the remaining subsequence for all {n}, all {k \geq K_x}, and all {H \geq 1}.

By repeatedly breaking the coefficients of the polynomial sequence {g_{n,m,k}} into fractional parts and integer parts, and absorbing the latter in {\Gamma_{n,k}}, we may assume that these coefficients are bounded. Thus, by Bolzano-Weierstrass and refining the sequence of {m} further, we may assume that {g_{n,m,k}} converges locally uniformly in {h} as {m} goes to infinity to a polynomial sequence {g_{n,k}}, for every {n,k}. We thus have (for almost every {x}) that

\displaystyle  \frac{1}{H} {\bf E}_{h \in [H]} | {\mathcal D} f_n(T^h x) - F_{n,k}( g_{n,k}(h) \Gamma_{n,k}) |^2 \ll 2^{-10(n+k)}

for all {n}, all {k \geq K_x}, and all {H \geq 1}. Henceforth we shall cease to keep control of the complexity of {F_{n,k}} or {G_{n,k}/\Gamma_{n,k}}.

We can lift the polynomial sequence up to a linear sequence (enlarging {G_{n,k}} as necessary), thus

\displaystyle  \limsup_{H \rightarrow \infty} \frac{1}{H} {\bf E}_{h \in [H]} | {\mathcal D} f_n(T^h x) - F_{n,k}( g_{n,k}^h \Gamma_{n,k}) |^2 \ll 2^{-10(n+k)} \ \ \ \ \ (3)

for all {n}, all {k \geq K_x}, and some {g_{n,k} \in G_{n,k}}. By replacing various nilsystems with Cartesian powers, we may assume that the nilsystems {(G_{n,k}/\Gamma_{n,k}, x \mapsto g_{n,k} x)} are increasing in {n} and {k} in the sense that the nilsystem for {n,k} is a factor of that for {n,k+1} or {n+1,k}, with the origin mapping to the origin. Then, by restricting to the orbit of the origin, we may assume that all the nilsystems are ergodic (and thus also uniquely ergodic, by the special properties of nilsystems). The nilsystems then have an ergodic inverse limit {(Y,S)} with an origin {0}, and each function {F_{n,k}} lifts up to a continuous function {\tilde f_{n,k}} on {Y}, with {F_{n,k}(g_{n,k}^h \Gamma_{n,k}) = \tilde f_{n,k}(S^h 0)}. Thus

\displaystyle  \limsup_{H \rightarrow \infty} \frac{1}{H} {\bf E}_{h \in [H]} | {\mathcal D} f_n(T^h x) - \tilde f_{n,k}(S^h 0) |^2 \ll 2^{-10(n+k)} \ \ \ \ \ (4)

From the triangle inequality we see in particular that

\displaystyle  \limsup_{H \rightarrow \infty} \frac{1}{H} {\bf E}_{h \in [H]} | \tilde f_{n,k'}(S^h 0) - \tilde f_{n,k}(S^h 0) |^2 \ll 2^{-10(n+k)}

for all {n} and all {K_x \leq k \leq k'}, which by unique ergodicity of the nilsystems implies that

\displaystyle  \int_Y | \tilde f_{n,k'} - \tilde f_{n,k} |^2 \ll 2^{-10(n+k)}.

Thus the sequence {\tilde f_{n,k}} is Cauchy in {L^2} and tends to a some limit {\tilde f_n \in L^\infty(Y)}.

If {x} is generic for {{\mathcal D} f_n} (which is true for almost every {x}), we conclude from (4) and unique ergodicity of nilsystems that

\displaystyle  | \int_X {\mathcal D} f_n - \int_Y \tilde f_{n,k} | \ll 2^{-(n+k)}

for {k \geq K_x}, which on taking limits as {k \rightarrow \infty} gives

\displaystyle  \int_X {\mathcal D} f_n = \int_Y \tilde f_{n}.

A similar argument gives (1) for almost every {x}, for each choice of {n_1,\dots,n_d,h_1,\dots,h_d,P}. Since one only needs to verify a countable number of these conditions, we can find an {x} for which all the (1) hold simultaneously, and the claim follows.

Remark 6 In order to use the combinatorial inverse theorem to prove the full ergodic inverse theorem (and not just the weak version), it appears that one needs an “algorithmic” or “measurable” version of the combinatorial inverse theorem, in which the nilsequence produced by the inverse theorem can be generated in a suitable “algorithmic” sense from the original function {f}. In the setting of the inverse {U^3} theorem over finite fields, a result in this direction was established by Tulsiani and Wolf (building upon a well-known paper of Goldreich and Levin handling the {U^2} case). It is thus reasonable to expect that a similarly algorithmic version of the combinatorial inverse conjecture is true for higher Gowers uniformity norms, though this has not yet been achieved in the literature to my knowledge.

Filed under: expository, math.CO, math.DS Tagged: characteristic factor, Gowers uniformity norms, inverse conjecture, nilmanifolds, nilsequences

July 23, 2015

John PreskillAnt-Man and the Quantum Realm

ant-man-2015-marvel-movieIt was the first week of August last summer and I was at LAX for a trip to North Carolina as a guest speaker at Project Scientist’s STEM summer camp for young women. I had booked an early morning flight and had arrived at my gate with time to spare, so I decided to get some breakfast. I walked by a smart-looking salad bar and thought: Today is the day. Moving past the salad bar, I ordered a juicy cheeseburger with fries at the adjacent McDonald’s. Growing up in Greece, eating McDonald’s was a rare treat; as was playing video games with my brothers and reading comic books late at night. Yet, through a weird twist of fate, it was these last two guilty pleasures that were to become my Hadouken!, my Hulk, Smash!, my secret weapons of choice for breaking down the barriers between the world of quantum mechanics and the everyday reality of our super-normal, super-predictable lives.

I finished my burger, stuffing the last few delicious fries in my mouth, when my phone buzzed – I had mail from The Science and Entertainment Exchange, a non-profit organization funded by the National Academy of Sciences, whose purpose is to bring leading scientists in contact with Hollywood in order to elevate the level of science in the movies. I was to report to Atlanta, GA for a movie consult on a new superhero movie: Ant-Man. As I read halfway through the email, I grumbled to myself: Why can’t I be the guy who works on Thor? Who is this Ant-Man anyways? But, in typical Hollywood fashion, the email had a happy ending: “Paul Rudd is playing Ant-Man. He may be at the meeting, but we cannot promise anything.” Marvel would cover my expenses, so I sent in my reluctant reply. It went something like this:

Dear Marvel Secret-ary Agent,
Hell yeah.

The meeting was in three days time. I would finish my visit to Queens University in Charlotte, NC and take the next flight out to Atlanta. But first, some fun and games were in order. As part of my visit to Project Scientist’s camp, I was invited to teach quantum mechanics to a group of forty young women, ages 11-14, all of whom were interested in science, engineering and mathematics and many of whom did not have the financial means to pursue these interests outside of the classroom. So, I went to Queens University with several copies of MinecraftEDU, the educational component to one of the most popular video games of all time: Minecraft. As I described in “Can a game teach kids quantum mechanics”, I spent the summer of 2013 designing qCraft, a modification (mod) to Minecraft that allows players to craft blocks imbued with quantum superpowers such as quantum superposition and quantum entanglement. The mod, developed in collaboration with Google and TeacherGaming, became really popular, amassing millions of downloads around the world. But it is one thing to look at statistics as a measure of success and another to look into the eyes of young women who have lost themselves in a game (qCraft is free to download and comes with an accompanying curriculum) designed to teach them such heady concepts that inspired Richard Feynman to quip: If you think you understand quantum theory, you don’t.

girls_qcraftMy visit to Charlotte was so wonderful that I nearly decided to cancel my trip to Atlanta in order to stay with the girls and their mentors until the end of the week. But Mr. Rudd deserved the very best science could offer in making-quantum-stuff-up, so I boarded my flight and resolved to bring Project Scientist to Caltech the next summer. On my way to Atlanta, I used the in-flight WiFi to do some research on Ant-Man. He was part of the original Avengers, a founding member, in fact (nice!) His name was Dr. Hank Pym. He had developed a particle, aptly named after himself, which allowed him to shrink the space between atoms (well now…) He embedded vials of that particle in a suit that allowed him to shrink to the size of an ant (of course, that makes sense.) In short, he was a mad, mad scientist. And I was called in to help his successor, Scott Lang (Paul Rudd’s character), navigate his way through quantum land. Holy guacamole Ant-man! How does one shrink the space between atoms? As James Kakalios, author of The Physics of Superheroes, puts it in a recent article on Nate Silver’s FiveThirtyEight:

We’re made of atoms, and the neighboring atoms are all touching each other. One method of changing your size that’s out: Just squeeze the atoms closer together.

So the other option: What determines the size of atoms anyway?

We can calculate this with quantum mechanics, and it turns out to be the ratio of fundamental constants: Planck’s constant and the mass of an electron and the charge of the electron and this and that. The thing that all these constants have in common is that they’re constant. They don’t change.

Wonderful. Just peachy. How am I supposed to come up with a way that will allow Ant-Man to shrink to the size of an ant, if one of the top experts in movie science magic thinks that our best bet is to somehow change fundamental constants of nature?

The shrinking

Naturally, I could not, umm, find the time last summer to read last week’s article during my flight (time travel issues), so like any young Greek of my generation who still hopes that our national debt will just go poof, I attacked the problem of shrinking someone’s volume without shrinking their mass con pasión. The answer was far from obvious… but, it was possible. If one could convert electrons into muons, the atomic radius would shrink 200 times, shrinking a human to the size of an ant without changing any of the chemical properties of the atoms (muons have the same charge as the electrons, but are 200 times heavier). The problem then was the lifetime of the muonic atoms. Muons decay into electrons in about 2 millionths of a second, on average. That is indeed a problem. Could we somehow extend the half-life of a muon about a million times? Yes, if the muon has relativistic mass close to 20 TeV (near the current energy of the Large Hadron Collider in Geneva), the effect of Einstein’s relativistic time-dilation (which is how actual high-energy muons from cosmic radiation have time to reach our detectors before decaying) would allow our hero to shrink for a few seconds at a time with high probability. To shrink beyond that, or for longer periods of time, would require knowledge of physics beyond the standard model. Which is precisely what the Mu2e experiment at Fermilab is looking at right now. It’s like Christmas for Ant-Man fans!

So the famed Pym particle is a neutral particle that adds mass to an electron, converting it to a high-energy muon… When did I become a particle physicist? Oh well, fake it until you make it. Oh, hey, they are bringing pretzels! I love these little pretzel bites!

Enter Pinewood Studios

The flight was longer than I expected, which gave me time to think. A limo was waiting for me at the airport; I was to be taken directly to Pinewood Studios, luggage in hand and all. Once I was at my destination, I was escorted to the 3rd floor of a nondescript building, accessible only through special authorization (nice touch, Marvel). I was shown to what seemed like the main conference room, an open area with a large conference table. I expected that I would have to wait an hour before the assistant (to the) general manager showed up, so I started fiddling with my phone, making myself look busy and important. The next time I looked up, Paul Rudd was tapping my shoulder, dressed in sweats after what seemed like a training session for The 300. I am not sure what happened next, but Paul and I were in deep conversation about… qCraft? Someone must have told him that I was involved with designing the quantum mod for Minecraft and suddenly our roles were reversed. His son was a huge Minecraft fan and I was the celebrity in this boy’s eyes, and by parental transitivity, an associative, but highly non-commutative group action, in his dad’s eyes. I promised Paul that I would teach him how to install mods in Minecraft so his kids could enjoy qCraft and teach him about quantum entanglement when I wasn’t around. To my delight, I found myself listening to Mr. Rudd talk about his son’s sense of humor and his daughter’s intelligence with such pride, that I forgot for a moment the reason I was there; instead, it felt like I was catching up with an old friend and all we could talk about was our kids (I don’t have any, so I mostly listened).

The Meeting

Within five minutes, the director (Peyton Reed), the writers, producers, VFX specialists, computer playback experts (I became friends with their supervisor, Mr. Matthew Morrissey, who went to great lengths to represent the atoms on-screen as clouds of probability, in the s, p, d, f orbital configurations you see flashing in quantum superposition behind Hank Pym at his lab) and everyone else with an interest in quantum mechanics was in the room. I sat at the head of the long table with Paul next to me. He asked most of the questions along with the director, but at the time I didn’t know Paul was involved with writing the script. We discussed a lot of things, but what got everyone excited was the idea that the laws of physics as we know them may break down as we delve deeper and deeper into the quantum realm. You see, all of the other superheroes, no matter how strong and super, had powers that conformed to the laws of physics (stretching them from time to time, but never breaking them). But if someone could go to a place where the laws of physics as we know them were not yet formed, at a place where the arrow of time was broken and the fabric of space was not yet woven, the powers of such a master of the quantum realm would only be constrained by their ability to come back to the same (or similar) reality from which they departed. All the superheroes of Marvel and DC Comics combined would stand no chance against Ant-Man with a malfunctioning regulator…

The Quantum Realm

The birth of the term itself is an interesting story. Brad Winderbaum, co-producer for the movie, emailed me a couple of weeks after the meeting with the following request: Could I come up with a term describing Ant-Man going to the “microverse”? The term “microverse” carried legal baggage, so something fresh was needed. I offered “going nano”, “going quantum”, “going atomic”, or… “quantum realm”. I didn’t know how small the “microverse” scale was supposed to be in a writer’s mind (in a physicist’s mind it is exactly 10^{-6} meters – one thousandth of a millimeter), hence the many options. The reply was quick:

Thanks Spiros! Quantum Realm is a pretty great term.

Et voilà. Ant-Man was going to the quantum realm, a place where time and space dissolve and the only thing strong enough to anchor Scott Lang to reality is… You have to watch the movie to see what that is – it was an off-the-cuff remark I made at the meeting… At the end of the meeting, Paul, Peyton and the others thanked me and asked me if I could fly to San Francisco the next week for the first week of shooting. There, I would have met Michael Douglas and Evangeline Lilly, but I declined the generous offer. It was the week of Innoworks Academy at Caltech, an award-winning summer camp for middle school kids on the free/reduced lunch program. As the camp’s adviser, I resolved to never miss a camp as long as I was in the country and San Francisco is in the same state as Caltech. My mom would be proud of my decision (I hope), though an autograph from Mr. Douglas would have fetched me a really good hug.

The Movie

I just watched the movie yesterday (it is actually good!) and the feeling was surreal. Because I had no idea what to expect. Because I never thought that the people in that room would take what I was saying seriously enough to use it in the movie. I never got a copy of the script and during the official premiere three weeks ago, I was delivering a lecture on the future of quantum computing in a monastery in Madrid, Spain. When I found out that Kevin Feige, president of Marvel Studios, said this at a recent interview, my heart skipped several beats:

But the truth is, there is so much in Ant-Man: introducing a new hero, introducing a very important part of technology in the Marvel universe, the Pym particles. Ant-Man getting on the Avengers’ radar in this film and even – this is the weirdest part, you shouldn’t really talk about it because it won’t be apparent for years – but the whole notion of the quantum realm and the whole notion of going to places that are so out there, they are almost mind-bendingly hard to fathom. It all plays into Phase Three.

The third phase of the Marvel Cinematic Universe is about to go quantum and all I can think of is: I better start reading comic books again. But first, I have to teach a bunch of 11-14 year-old girls quantum physics through Minecraft. It is, after all, the final week of Project Scientist here at Caltech this summer and the theme is coding. With quantum computers at the near horizon, these young women need to learn how to program Asimov’s laws of quantum robotics into our benevolent quantum A.I. overlords. These young women are humanity’s new hope…

Sean CarrollWhy is the Universe So Damn Big?

I love reading io9, it’s such a fun mixture of science fiction, entertainment, and pure science. So I was happy to respond when their writer George Dvorsky emailed to ask an innocent-sounding question: “Why is the scale of the universe so freakishly large?”

You can find the fruits of George’s labors at this io9 post. But my own answer went on at sufficient length that I might as well put it up here as well. Of course, as with any “Why?” question, we need to keep in mind that the answer might simply be “Because that’s the way it is.”

Whenever we seem surprised or confused about some aspect of the universe, it’s because we have some pre-existing expectation for what it “should” be like, or what a “natural” universe might be. But the universe doesn’t have a purpose, and there’s nothing more natural than Nature itself — so what we’re really trying to do is figure out what our expectations should be.

The universe is big on human scales, but that doesn’t mean very much. It’s not surprising that humans are small compared to the universe, but big compared to atoms. That feature does have an obvious anthropic explanation — complex structures can only form on in-between scales, not at the very largest or very smallest sizes. Given that living organisms are going to be complex, it’s no surprise that we find ourselves at an in-between size compared to the universe and compared to elementary particles.

What is arguably more interesting is that the universe is so big compared to particle-physics scales. The Planck length, from quantum gravity, is 10^{-33} centimeters, and the size of an atom is roughly 10^{-8} centimeters. The difference between these two numbers is already puzzling — that’s related to the “hierarchy problem” of particle physics. (The size of atoms is fixed by the length scale set by electroweak interactions, while the Planck length is set by Newton’s constant; the two distances are extremely different, and we’re not sure why.) But the scale of the universe is roughly 10^29 centimeters across, which is enormous by any scale of microphysics. It’s perfectly reasonable to ask why.

Part of the answer is that “typical” configurations of stuff, given the laws of physics as we know them, tend to be very close to empty space. (“Typical” means “high entropy” in this context.) That’s a feature of general relativity, which says that space is dynamical, and can expand and contract. So you give me any particular configuration of matter in space, and I can find a lot more configurations where the same collection of matter is spread out over a much larger volume of space. So if we were to “pick a random collection of stuff” obeying the laws of physics, it would be mostly empty space. Which our universe is, kind of.

Two big problems with that. First, even empty space has a natural length scale, which is set by the cosmological constant (energy of the vacuum). In 1998 we discovered that the cosmological constant is not quite zero, although it’s very small. The length scale that it sets (roughly, the distance over which the curvature of space due to the cosmological constant becomes appreciable) is indeed the size of the universe today — about 10^26 centimeters. (Note that the cosmological constant itself is inversely proportional to this length scale — so the question “Why is the cosmological-constant length scale so large?” is the same as “Why is the cosmological constant so small?”)

This raises two big questions. The first is the “coincidence problem”: the universe is expanding, but the length scale associated with the cosmological constant is a constant, so why are they approximately equal today? The second is simply the “cosmological constant problem”: why is the cosmological constant scale so enormously larger than the Planck scale, or event than the atomic scale? It’s safe to say that right now there are no widely-accepted answers to either of these questions.

So roughly: the answer to “Why is the universe so big?” is “Because the cosmological constant is so small.” And the answer to “Why is the cosmological constant so small?” is “Nobody knows.”

But there’s yet another wrinkle. Typical configurations of stuff tend to look like empty space. But our universe, while relatively empty, isn’t *that* empty. It has over a hundred billion galaxies, with a hundred billion stars each, and over 10^50 atoms per star. Worse, there are maybe 10^88 particles (mostly photons and neutrinos) within the observable universe. That’s a lot of particles! A much more natural state of the universe would be enormously emptier than that. Indeed, as space expands the density of particles dilutes away — we’re headed toward a much more natural state, which will be much emptier than the universe we see today.

So, given what we know about physics, the real question is “Why are there so many particles in the observable universe?” That’s one angle on the question “Why is the entropy of the observable universe so small?” And of course the density of particles was much higher, and the entropy much lower, at early times. These questions are also ones to which we have no good answers at the moment.

July 22, 2015

Terence TaoAn inverse theorem for the continuous Gowers uniformity norm

A few years ago, Ben Green, Tamar Ziegler, and myself proved the following (rather technical-looking) inverse theorem for the Gowers norms:

Theorem 1 (Discrete inverse theorem for Gowers norms) Let {N \geq 1} and {s \geq 1} be integers, and let {\delta > 0}. Suppose that {f: {\bf Z} \rightarrow [-1,1]} is a function supported on {[N] := \{1,\dots,N\}} such that

\displaystyle  \frac{1}{N^{s+2}} \sum_{n,h_1,\dots,h_{s+1}} \prod_{\omega \in \{0,1\}^{s+1}} f(n+\omega_1 h_1 + \dots + \omega_{s+1} h_{s+1}) \geq \delta.

Then there exists a filtered nilmanifold {G/\Gamma} of degree {\leq s} and complexity {O_{s,\delta}(1)}, a polynomial sequence {g: {\bf Z} \rightarrow G}, and a Lipschitz function {F: G/\Gamma \rightarrow {\bf R}} of Lipschitz constant {O_{s,\delta}(1)} such that

\displaystyle  \frac{1}{N} \sum_n f(n) F(g(n) \Gamma) \gg_{s,\delta} 1.

For the definitions of “filtered nilmanifold”, “degree”, “complexity”, and “polynomial sequence”, see the paper of Ben, Tammy, and myself. (I should caution the reader that this blog post will presume a fair amount of familiarity with this subfield of additive combinatorics.) This result has a number of applications, for instance to establishing asymptotics for linear equations in the primes, but this will not be the focus of discussion here.

The purpose of this post is to record the observation that this “discrete” inverse theorem, together with an equidistribution theorem for nilsequences that Ben and I worked out in a separate paper, implies a continuous version:

Theorem 2 (Continuous inverse theorem for Gowers norms) Let {s \geq 1} be an integer, and let {\delta>0}. Suppose that {f: {\bf R} \rightarrow [-1,1]} is a measurable function supported on {[0,1]} such that

\displaystyle  \int_{{\bf R}^{s+1}} \prod_{\omega \in \{0,1\}^{s+1}} f(t+\omega_1 h_1 + \dots + \omega_{s+1} h_{s+1})\ dt dh_1 \dots dh_{s+1} \geq \delta. \ \ \ \ \ (1)

Then there exists a filtered nilmanifold {G/\Gamma} of degree {\leq s} and complexity {O_{s,\delta}(1)}, a (smooth) polynomial sequence {g: {\bf R} \rightarrow G}, and a Lipschitz function {F: G/\Gamma \rightarrow {\bf R}} of Lipschitz constant {O_{s,\delta}(1)} such that

\displaystyle  \int_{\bf R} f(t) F(g(t) \Gamma)\ dt \gg_{s,\delta} 1.

The interval {[0,1]} can be easily replaced with any other fixed interval by a change of variables. A key point here is that the bounds are completely uniform in the choice of {f}. Note though that the coefficients of {g} can be arbitrarily large (and this is necessary, as can be seen just by considering functions of the form {f(t) = \cos( \xi t)} for some arbitrarily large frequency {\xi}).

It is likely that one could prove Theorem 2 by carefully going through the proof of Theorem 1 and replacing all instances of {{\bf Z}} with {{\bf R}} (and making appropriate modifications to the argument to accommodate this). However, the proof of Theorem 1 is quite lengthy. Here, we shall proceed by the usual limiting process of viewing the continuous interval {[0,1]} as a limit of the discrete interval {\frac{1}{N} \cdot [N]} as {N \rightarrow \infty}. However there will be some problems taking the limit due to a failure of compactness, and specifically with regards to the coefficients of the polynomial sequence {g: {\bf N} \rightarrow G} produced by Theorem 1, after normalising these coefficients by {N}. Fortunately, a factorisation theorem from a paper of Ben Green and myself resolves this problem by splitting {g} into a “smooth” part which does enjoy good compactness properties, as well as “totally equidistributed” and “periodic” parts which can be eliminated using the measurability (and thus, approximate smoothness), of {f}.

We now prove Theorem 2. Firstly observe from Hölder’s inequality that the Gowers norm expression in the left-hand side of (1) is continuous in {f} in the {L^{2^{s+1}}({\bf R})} topology. As such, it suffices to prove the theorem for a dense class of {f}, such as the Lipschitz-continuous {f}, so long as the bounds remain uniform in {f}. Thus, we may assume without loss of generality that {f} is Lipschitz continuous.

Now let {N} be a large integer (which will eventually be sent to infinity along a subsequence). As {f} is Lipschitz continuous, the integral in (1) is certainly Riemann integrable, and so for sufficiently large {N} (where we allow “sufficiently large” to depend on the Lipschitz constant) we will have

\displaystyle  \frac{1}{N^{s+1}} \sum_{n,h_1,\dots,h_{s+1}} \prod_{\omega \in \{0,1\}^{s+1}} f(\frac{n+\omega_1 h_1 + \dots + \omega_{s+1} h_{s+1}}{N}) \geq \delta/2

(say). Applying Theorem 2, we can thus find for sufficiently large {N}, a filtered nilmanifold {G_N/\Gamma_N} of degree {s} and complexity {O_{s,\delta}(1)}, a polynomial sequence {g_N: {\bf Z} \rightarrow G_N}, and a Lipschitz function {F_N: G_N/\Gamma_N \rightarrow {\bf R}} of Lipschitz constant {O_{s,\delta}(1)} such that

\displaystyle  \frac{1}{N} \sum_n f(\frac{n}{N}) F_N(g_N(n) \Gamma_N) \gg_{s,\delta} 1.

Now we prepare to take limits as {N \rightarrow \infty}, passing to subsequences as necessary. Using Mal’cev bases, one can easily check that there are only finitely many filtered nilmanifolds {G_N/\Gamma_N} of a fixed degree and complexity, hence by passing to a subsequence of {N} we may assume that {G_N/\Gamma_N = G/\Gamma} is independent of {N}. The Lipschitz functions {F_N} are now equicontinuous on a fixed compact domain {G/\Gamma}, so by the Arzelá-Ascoli theorem and further passage to a subsequence we may assume that {F_N} converges uniformly to a Lipschitz function {F: G/\Gamma \rightarrow {\bf C}} of Lipschitz constant {O_{s,\delta}(1)}. In particular (passing to a further subsequence as necessary) we have

\displaystyle  \frac{1}{N} \sum_n f(\frac{n}{N}) F(g_N(n) \Gamma) \gg_{s,\delta} 1.

We have removed a lot of the dependencies of the nilsequence {F_N(g_N(n) \Gamma_N)} on {N}, however there is still a serious lack of compactness in the remaining dependency of the polynomial sequence {g_N(n)} on {N}. Fortunately, we can argue as follows. Let {M_0, A} be large quantities (depending on {s,\delta}, and the Lipschitz constant of {f}) to be chosen later. Applying the factorisation theorem for polynomial sequences (see Theorem 1.19 of this paper of Ben Green and myself), we may find for each {N} in the current subsequence, an integer {M_0 \leq M_N \ll_{M_0,d,s,\delta} 1}, a rational subgroup {G'_N} of {G} whose associated filtered nilmanifold {G'_N/\Gamma'_N} has structure constants that are {M_N}-rational with respect to the Mal’cev bais of {G/\Gamma}, and a decomposition

\displaystyle  g_N(n) = \varepsilon_N(n) g'_N(n) \gamma_N(n)


  • {\varepsilon_N: {\bf Z} \rightarrow G} is a polynomial sequence which is {(M_N,N)}-smooth;
  • {g'_N: {\bf Z} \rightarrow G'} is a polynomial sequence with {n \mapsto g'_N(n) \Gamma'_N} is totally {1/M_N^A}-equidistributed in {G'_N/\Gamma'_N}; and
  • {\gamma_N: {\bf Z} \rightarrow G} is a polynomial sequence which is {M}-rational, and {n \mapsto \gamma_N(n) \Gamma} is periodic with period at most {M}.

See the above referenced paper for a definition of all the terminology used here.

Once again we can make a lot of the data here independent of {N} by passing to a subsequence. Firstly, {M_N} takes only finitely many values so by passing to a subsequence we may assume that {M_N=M} is independent of {N}. Then the number of rational subgroups {G'_N} with {M}-rational structure constants is also finite, so by passing to a further subsequence we may take {G'_N = G'} independent of {N}, so {\Gamma'_N = \Gamma' = G' \cap \Gamma} is also independent of {N}. Up to right multiplication by polynomial sequences from {{\bf Z}} to {\Gamma} (which do not affect the value of {g_N(n) \Gamma}), there are only finitely many {M}-rational polynomial sequences {\gamma_N} that are periodic with period at most {N}, so we may take {\gamma_N = \gamma} independent of {N}. Finally, using coordinates one can write {\varepsilon_N(n) = \tilde \varepsilon_N( \frac{n}{N} )} where {\tilde \varepsilon_N: {\bf R} \rightarrow G} is a continuous polynomial sequence whose coefficients are bounded uniformly in {N}. By Bolzano-Weierstrass, we may assume on passing to a subsequence that {\tilde \varepsilon_N} converges locally uniformly to a limit {\tilde \varepsilon: {\bf R} \rightarrow G}, which is again a continuous polynomial sequence. Thus, on passing to a further subsequence, we have

\displaystyle  \frac{1}{N} \sum_n f(\frac{n}{N}) F(\tilde \varepsilon(\frac{n}{N}) g'_N(n) \gamma(n) \Gamma) \gg_{s,\delta} 1.

Let {q} be the period of {\gamma}. By the pigeonhole principle (and again passing to a subsequence) we may find a residue class {a \hbox{ mod } q} independent of {N} such that

\displaystyle  \frac{q}{N} \sum_{n = a \hbox{ mod } q} f(\frac{n}{N}) F(\tilde \varepsilon(\frac{n}{N}) g'_N(n) \gamma(a) \Gamma) \gg_{s,\delta} 1. \ \ \ \ \ (2)

Because {g'_N(n)} is totally {1/M^A}-equidistributed in {G'/\Gamma'}, and {\gamma(a)} is {M}-rational, the conjugate {\gamma(a)^{-1} g'_N(n) \gamma(a)} is totally {1/M^{A-O_{s,\delta}(1)}}-equidistributed in {G''/\Gamma''}, where {G'' := \gamma(a)^{-1} G' \gamma(a)} and {\Gamma'' := G'' \cap \Gamma}; see Section 2 of this paper of Ben and myself for a derivation of this fact. From this, we have the approximation

\displaystyle  \frac{q}{N} \sum_{n = a \hbox{ mod } q} 1_I( \frac{n}{N}) \tilde F( \gamma(a)^{-1} g'_N(n) \gamma(a) \Gamma) = |I| \int_{G''/\Gamma''} \tilde F + O(\kappa)

for any {\kappa > 0} and any fixed interval {I}, where {|I|} is the length of {I} and the integral is with respect to Haar measure, and {M_0,A} are sufficiently large depending on {\kappa,s,\delta,I}. Using Riemann integration, we thus see that the left-hand side of (2) is thus of the form

\displaystyle  \int_{\bf R} f(t) (\int_{G''/\Gamma''} F( \tilde \varepsilon(t) \gamma(a) \cdot ))\ dt + O(\kappa)

for sufficiently large {N} if {M_0,A} are sufficiently large depending on {\kappa,s,\delta}, and the Lipschitz constant of {f}, and so (writing {g_0(t) := \tilde \varepsilon(t) \gamma(a)}) we have

\displaystyle  \int_{\bf R} f(t) (\int_{G''/\Gamma''} F( g_0(t) \cdot ))\ dt \gg_{s,\delta} 1

if {M_0,A} are sufficiently large depending on {s,\delta}, and the Lipschitz constant of {f}. If we let {h: {\bf R} \rightarrow G''} be a continuous polynomial sequence that is equidistributed in {G''} (which will happen as soon as the sequence is equidistributed with respect to the abelianisation of {G''}, by an old result of Leon Green), then a similar argument shows that

\displaystyle  \lim_{T \rightarrow \infty} \int_{\bf R} f(t) F(g_0(t) h(\frac{t}{T}))\ dt = \int_{\bf R} f(t) (\int_{G''/\Gamma''} F( g(t) \cdot ))\ dt

and thus there exists {T} such that

\displaystyle  \int_{\bf R} f(t) F(g_0(t) h(\frac{t}{T}))\ dt \gg_{s,\delta} 1.

Setting {g(t) := g_0(t) h(\frac{t}{T})}, we obtain the claim.

I thank Ben Green for helpful conversations that inspired this post.

Filed under: expository, math.CO, math.DS Tagged: Ben Green, Gowers uniformity norms, inverse conjecture, regularity lemma, Tamar Ziegler

Scott AaronsonIntroducing some British people to P vs. NP

Here’s a 5-minute interview that I did with The Naked Scientists (a radio show syndicated by the BBC, and no, I’m not sure why it’s called that), explaining the P vs. NP problem.  For readers of this blog, there won’t be anything new here, but, well … you might enjoy the rest of the hour-long programme [sic], which also includes segments about a few other Clay Millennium problems (the Riemann Hypothesis, Navier-Stokes, and Poincaré), as well as a segment about fat fish that live in caves and gorge themselves on food on the rare occasions when it becomes available, and which turn out to share a gene variant with humans with similar tendencies.

Chad OrzelHave Laptop, Will Travel

Having mentioned in yesterday’s post that I’ll be on sabbatical for the next academic year, this would probably be a good time to point out that this means I’m somewhat more flexible than usual in terms of going places and giving talks. And I enjoy going places and giving talks. About lots of different things.

So, if you’re at a place that might be interested in a science-y speaker on quantum physics, relativity, science communication, science in general, or something related to those, drop me a line. I’d be happy to talk about the possibility of visiting new places and talking to people about science.

July 21, 2015

Clifford JohnsonIan McKellen on Fresh Air!

Ian McKellen twitter profile imageI had a major treat last night! While making myself an evening meal I turned on the radio to find Ian McKellen (whose voice and delivery I love so very much I can listen to him slowly reading an arbitrarily long list of random numbers) being interviewed by Dave Davies on NPR's Fresh Air. It was of course delightful, and some of the best radio I've enjoyed in a while (and I listen to a ton of good radio every day, between having either NPR or BBC Radio 4 on most of the time) since it was the medium at its simple best - a splendid conversation with an interesting, thoughtful, well-spoken person. They also played and discussed a number of clips from his work, recent (I've been hugely excited to see Mr. Holmes, just released) and less recent (less well known delights such as Gods and Monsters -you should see it if you have not- and popular material like the first Hobbit film), and spoke at length about his private and public life and the intersection between the two, for example how his coming out as gay in 1988 positively affected his acting, and why.... There's so much in that 35 minutes! [...] Click to continue reading this post

Mark Chu-CarrollArabic numerals have nothing to do with angle counting!

There’s an image going around that purports to explain the origin of the arabic numerals. It’s cute. It claims to show why the numerals that we use look the way that they do. Here it is:

According to this, the shapes of the numbers was derived from a notation where for each numeral contains its own number of angles. It’s a really interesting idea, and it would be really interesting if it were true. The problem is, it isn’t.

Look at the numerals in that figure. Just by looking at them, you can see quite a number of problems with them.

For a couple of obvious examples:

  • Look at the 7. The crossed seven is a recent invention made up to compensate for the fact that in cursive roman lettering, it can be difficult to distinguish ones from sevens, the mark was added to clarify. The serifed foot on the 7 is even worse: there’s absolutely no tradition of writing a serifed foot on the 7; it’s just a font decoration. The 7’s serifed foot is no more a part of the number than serifed foot on the lowercase letter l is an basic feature of the letter ls.
  • Worse is the curlique on the 9: the only time that curly figures like that appear in writing is in calligraphic documents, where they’re an aesthetic flourish. That curly thing has never been a part of the number 9. But if you want to claim this angle-counting nonsense, you’ve got to add angles to a 9 somewhere. It’s not enough to just add a serifed foot – that won’t get you enough angles. So you need the curlique, no matter how obviously ridiculous it is.

You don’t even need to notice stuff like that to see that this is rubbish. We actually know quite a lot about the history of arabic numeral notation. We know what the “original” arabic numerals looked like. For example, this wikipedia image shows the standard arabic numerals (this variant is properly called the Bakshali numerals) from around the second century BC:


It’s quite fascinating to study the origins of our numeric notation. It’s true that we – “we” meaning the scholarly tradition that grew out of Europe – learned the basic numeric notation from the Arabs. But they didn’t invent it – it predates them by a fair bit. The notation originally came from India, where Hindu scholars, who wrote in an alphabet derived from Sanskrit, used a sanskrit-based numeric notation called Brahmi numerals (which, in turn, were derived from an earlier notation, Karosthi numerals, which weren’t used quite like the modern numbers, so the Brahmi numerals are considered the earliest “true” arabic numeral.) That notation moved westward, and was adopted by the Persians, who spread it to the Arabs. As the arabs adopted it, they changed the shapes to work with their calligraphic notations, producing the Bakshali form.

In the Brahmi numerals, the numbers 1 through 4 are written in counting-based forms: one is written as one horizontal line; 2 as two lines; 3 as three lines. Four is written as a pair of crossed lines, giving four quadrants. 5 through 9 are written using sanskrit characters: their “original” form had nothing to do with counting angles or lines.

The real history of numerical notations is really interesting. It crosses through many different cultures, and the notations reform each time it migrates, keeping the same essential semantics, but making dramatic changes in the written forms of individual numerals. It’s so much more interesting – and the actual numeral forms are so much more beautiful – than you’d ever suspect from the nonsense of angle-counting.

Chad Orzel14 Years Before the Class(room)

This past academic year was my 14th as a professor at Union, and my last as department chair. I’m on sabbatical for the 2015-16 academic year, doing my very best to avoid setting foot in an academic building, so it will be September 2016 before I’m teaching a class again. This seems like a good opportunity to reflect a bit on my experiences to this point, which in turn is a good excuse for a blog post. So, here are some things I’ve found out over the last 14 years of being a college professor:

— Teaching is really hard. My first year, when colleagues from other schools asked how I was finding it, my standard reply was “It’s a lot more work than it looks like from out in the classroom.” Fourteen years in, that’s still true. Even a class I’ve taught a dozen times before still involves a huge amount of outside-of-class time. I’ve gotten more efficient at this over the years, but it’s still a whole bunch of work.

— Shaking things up is good. I’ve taught our intro sequence more times than I care to think about, and found that I’m good for about three passes through a given course before I need to change things up in a big way. By the third time using the same notes, I find that I’m starting to bore myself, and need to blow it up and start over. This is part of the motivation behind moving to more “active learning” stuff in my classes– I was losing interest in the lectures I was giving, and needed to do something new.

— Students are generally very good. I told both my intro mechanics sections this past Spring that I am consistently amazed at how cheerfully they do the huge amount of stuff we ask of them. This isn’t true for all classes, mind, but generally speaking, I find that most students are willing to put in a significant amount of effort. Particularly if you explain why it matters.

— Not all “kids these days” stories are wrong. Having said that students are, in general, very good about doing what we ask, there are some subgroups who are not. And that’s changed a lot just in the time I’ve been here. This is most pronounced in the pre-med class– they’ve always been highly motivated by grades, but my first few years, they were often willing to do extra work in search of that good grade, while more recently, they’ve shifted that effort to lobbying to get out of things. The number of students who expect us to re-work our class schedule to make it easier for them to take organic chemistry just boggles my mind.

— Most “kids these days stories” are garbage. While there have been a number of changes in attitudes over the years, a lot of what gets said about “millennials” is just crap. In both directions. They’re no worse about paying attention or doing work in general than students of my era– it’s just slightly more obvious when a student is tuning out of a lecture by web surfing on their phone than when they start doodling in their notebook. They’re also not significantly more tech-savvy than any other crop of students– they’re very good with the narrow range of things they use a lot, which happens to be different than what faculty use a lot, but this does not magically transfer into an ability to use technology in general.

— Student course evaluations are just this thing, y’know? There are lots of arguments in faculty circles about whether student course evaluations are a threat, a menace, or the WORST THING EVER. In reality, though, they’re just kind of… there. While there are always a few outliers, they generally track pretty well with my general sense of how a class went, when I think carefully about it. There are occasional surprises– I ran afoul of a particular set of expectations this past fall that I didn’t realize was a Thing– but mostly, when my numerical evaluation scores come in low, I can generally understand why.

The written comments are generally more useful, but also harder to sort through. For every student who writes something good and thoughtful there are three or four who write “Fine,” or just scribble illegibly. But as much as I roll my eyes about some of the useless comments, I’ve also found good and perceptive stuff in there.

(For the record, I should note that in the last three years of looking at other people’s evaluations for performance reviews, I find the same general pattern holds. I also haven’t seen any significant number of racist or sexist comments directed at our faculty, not even of the “she needs to dress better” sort of variety. From horror stories online, I was expecting much worse, but again, our students are generally good.)

— Research with students is a great thing. My original main research project has been pretty dormant over the last few years, because being Chair, having kids, and writing books didn’t leave much time. this is the first summer since I’ve been at Union, though, when I haven’t had at least one research students. Most years I’ve had at least two. I’ve been putting them on smaller projects that are tangential to my original research plan, for the most part– developing new labs, building optical tweezers, etc.– things that require somewhat less direct intervention from me, but are still enough to give a sense of how experimental AMO physics works.

While it’s kind of exhausting, I also find these research projects really rewarding, and think they work out well for the students. And that kind of one-on-one in-the-lab experience is what put undergrad-me on a path toward my current career, so I’m glad to pay it forward a bit. For reasons that don’t really bear talking about, I’m not likely to restart my laser cooling project in the very near future, but I’m going to give some thought to finding new and improved “side projects” to keep students in the lab.

So, that’s some of what I’ve been thinking about as I head into an extended break. I have Opinions about things more on the administrative side, too, as I wrap up my time as Chair, but the vast majority of that wouldn’t be appropriate to discuss in public under my own name. So, you’ll just have to try to figure out how to find my angry pseudonymous Tumblr blog if you want to know what I think about that…


(Note: I do not have an angry pseudonymous Tumblr blog. Though I’m sure if you went looking, you could find someone saying more or less what I think about issues relating to academic administration…)

July 20, 2015

Jordan EllenbergI can’t even tell when I’m on the Internet anymore

I asked the barista in the Starbucks in Target to settle an argument between CJ and me — should it be referred to as “Tarbucks” or “Starget?”

Barista:  I don’t know, I think both.

Other barista:  No, it’s Tarbucks.

Barista:  It’s Tarbucks?

Other barista:  That’s what they call it on Reddit.

It’s true!

BackreactionThe number-crunchers. How we learned to stop worrying and love to code.

My grandmother was a calculator, and I don’t mean to say I’m the newest from Texas Instruments. I mean my grandmother did calculations for a living, with pencil on paper, using a slide rule and logarithmic tables. She calculated the positions of stars on the night sky, for five minute intervals, day by day, digit by digit.

Today you can download one of a dozen free apps to display the night sky for any position on Earth, any time, any day. Not that you actually need to know stellar constellations to find True North. Using satellite signals, your phones can now tell your position to within a few meters, and so can 2 million hackers in Russia.

My daughters meanwhile are thoroughly confused as to what a phone is, since we use the phone to take photos but make calls on the computer. For my four-year old a “phone” is pretty much anything that beeps, including the microwave, which for all I know by next year might start taking photos of dinner and upload them to facebook. And the landline. Now that you say it. Somebody called in March and left a voicemail.

Jack Myers dubbed us the “gap generation,” the last generation to remember the time before the internet. Myers is a self-described “media ecologist” which makes you think he’d have heard of search engine optimization. Unfortunately, when queried “gap generation” it takes Google 0.31 seconds to helpfully bring up 268,000,000 hits for “generation gap.” But it’s okay. I too recall life without Google, when “viral” meant getting a thermometer stuffed between your lips rather than being on everybody’s lips.

I wrote my first email in 1995 with a shell script called “mail” when the internet was chats and animated gifs. Back then, searching a journal article meant finding a ladder and blowing dust off thick volumes with yellowish pages. There were no keyword tags or trackbacks; I looked for articles by randomly browsing through journals. If I had an integral to calculate, there were Gradshteyn and Ryzhik’s tables, or Abramovitz and Stegun's Handbook of Special Functions, and else, good luck.

Our first computer software for mathematical calculations, one of the early Maple versions, left me skeptical. It had an infamous error in one of the binomial equations that didn’t exactly instill trust. The program was slow and stalled the machine for which everybody hated me because my desk computer was also the institute’s main server (which I didn’t know until I turned it off, but then learned very quickly). I taught myself fortran and perl and java script and later some c++, and complained it wasn’t how I had imagined being a theoretical physicist. I had envisioned myself thinking deep thoughts about the fundamental structure of reality, not chasing after missing parentheses.

It turned out much of my masters thesis came down to calculating a nasty integral that wasn’t tractable numerically, by computer software, because it was divergent. And while I was juggling generalized hypergeometric functions and Hermite polynomials, I became increasingly philosophic about what exactly it meant to “solve an integral.”

We say an integral is solved if we can write it down as a composition of known functions. But this selection of functions, even the polynomials, are arbitrary choices. Why not take the supposedly unsolvable integral, use it to define a function and be done with it? Why are some functions solutions and others aren’t? We prefer particular functions because their behaviors are well understood. But that again is a matter of how much they are used and studied. Isn’t it in the end all a matter of habit and convention?

After two years I managed to renormalize the damned integral and was left with an expression containing incomplete Gamma functions, which are themselves defined by yet other integrals. The best thing I knew to do with this was to derive some asymptotic limits and then plot the full expression. Had there been any way to do this calculation numerically all along, I’d happily have done it, saved two years of time, and gotten the exact same result and insight. Or would I? I doubt the paper would even have gotten published.

Twenty years ago, I like most physicists considered numerical results inferior to analytical, pen-on-paper, derivations. But this attitude has changed, changed so slowly I almost didn’t notice it changing. Today numerical studies are still often considered suspicious, fighting a prejudice of undocumented error. But it has become accepted practice to publish results merely in forms of graphs, figures, and tables, videos even, for (systems of) differential equations that aren’t analytically tractable. Especially in General Relativity, where differential equations tend to be coupled, non-linear, and with coordinate-dependent coefficients – ie as nasty as it gets – analytic solutions are the exception not the norm.

Numerical results are still less convincing, but not so much because of a romantic yearning for deep insights. They are less convincing primarily because we lack shared standards for coding, whereas we all know the standards of analytical calculation. We use the same functions and the same symbols (well, mostly), whereas deciphering somebody else’s code requires as much psychoanalysis as patience. For now. But imagine you could check code with the same ease you check algebraic manipulation. Would you ditch analytical calculations over purely numerical ones, given the broader applicability of the latter? How would insights obtained by one method be of any less value than those obtained by the other?

The increase of computing power has generated entirely new fields of physics by allowing calculations that previously just weren’t feasible. Turbulence in plasma, supernovae explosion, heavy ion collisions, neutron star mergers, or lattice qcd to study the strong nuclear interaction, these are all examples of investigations that have flourished only with the increase in processing speed and memory. Such disciplines tend to develop their own, unique and very specialized nomenclature and procedures that are difficult if not impossible to evaluate for outsiders.
Lattice QCD. Artist’s impression.

Then there is big data that needs to be handled. May that be LHC collisions or temperature fluctuations in the CMB or global fits of neutrino experiments, this isn’t data any human can deal with by pen on paper. In these areas too, subdisciplines have sprung up, dedicated to data manipulation and -handling. Postdocs specialized in numerical methods are high in demand. But even though essential to physics, they face the prejudice of somehow being “merely calculators.”

The maybe best example is miniscule corrections to probabilities of scattering events, like those taking place at the LHC. Calculating these next-to-next-to-next-to-leading-order contributions is an art as much as a science; it is a small subfield of high energy physics that requires summing up thousands or millions of Feynman diagrams. While there are many software packages available, few physicists know all the intricacies and command all the techniques; those who do often develop software along with their research. They are perfecting calculations, aiming for the tiniest increase in precision much like pole jumpers perfect their every motion aiming after the tiniest increase in height. It is a highly specialized skill, presently at the edge of theoretical physics. But while we admire the relentless perfection of professional athletes, we disregard the single-minded focus of the number-crunchers. What can we learn from it? What insight can be gained from moving the bar an inch higher?

What insight do you gain from calculating the positions of stars on the night sky, you could have asked my grandmother. She was the youngest of seven siblings, her father died in the first world war. Her only brother and husband were drafted for the second world war and feeding the family was left to the sisters. To avoid manufacturing weapons for a regime she detested, she took on a position in an observatory, calculating the positions of stars. This appointment came to a sudden stop when her husband was badly injured and she was called to his side at the war front to watch him die, or so she assumed. Against all expectations, my grandfather recovered from his skull fractures. He didn’t have to return to the front and my grandma didn’t return to her job. It was only when the war was over that she learned her calculations were to help the soldiers target bombs, knowledge that would haunt her still 60 years later.

What insight do we gain from this? Precision is the hallmark of science, and for much of our society science is an end for other means. But can mere calculation ever lead to true progress? Surely not with the computer codes we use today, which execute operations but do not look for simplified underlying principles, which is necessary to advance understanding. It is this lacking search for new theories that leaves physicists cynical about the value of computation. And yet some time in the future we might have computer programs doing exactly this, looking for underlying mathematical laws better suited than existing ones to match observation. Will physicists one day be replaced by software? Can natural law be extracted by computers from data? If you handed all the LHC output to an artificial intelligence, could it spit out the standard model?

In an influential 2008 essay “The End of Theory,” Chris Anderson argued that indeed computers will one day make human contributions to theoretical explanations unnecessary:
“The reason physics has drifted into theoretical speculation about n-dimensional grand unified models over the past few decades (the "beautiful story" phase of a discipline starved of data) is that we don't know how to run the experiments that would falsify the hypotheses — the energies are too high, the accelerators too expensive, and so on[...]

The new availability of huge amounts of data, along with the statistical tools to crunch these numbers, offers a whole new way of understanding the world. Correlation supersedes causation, and science can advance even without coherent models, unified theories, or really any mechanistic explanation at all.”
His future vision was widely criticized by physicists, me included, but I’ve had a change of mind. Much of the criticism Anderson took was due to vanity. We like to believe the world will fall into pieces without our genius and don’t want to be replaced by software. But I don’t think there’s anything special about the human brain that an artificial intelligence couldn’t do, in principle. And I don’t care very much who or what delivers insights, as long as they come by. In the end it comes down to trust.

If a computer came up with just the right string theory vacuum to explain the standard model and offered you the explanation that the world is made of strings to within an exactly quantified precision, what difference would it make whether the headline was made by a machine rather than a human? Wouldn’t you gain the exact same insight? Yes, we would still need humans to initiate the search, someone to write a code that will deliver to our purposes. And chances are we would celebrate the human, rather than the machine. But the rest is overcoming prejudice against “number crunching,” which has to be addressed by setting up reliable procedures that ensure a computer’s results are sound science. I’ll be happy if your AI delivers a theory of quantum gravity; bring it on.

My grandmother outlived her husband who died after multiple strokes. Well in her 90s she still had a habit of checking all the numbers on her receipts, bills, and account statements. Despite my conviction that artificial intelligences could replace physicists, I don’t think it’s likely to happen. The human brain is remarkable not so much for its sheer computing power, but for its efficiency, resilience, and durability. You show me any man-made machine that will still run after 90 years in permanent use.

John BaezThe Game of Googol

Here’s a puzzle from a recent issue of Quanta, an online science magazine:

Puzzle 1: I write down two different numbers that are completely unknown to you, and hold one in my left hand and one in my right. You have absolutely no idea how I generated these two numbers. Which is larger?

You can point to one of my hands, and I will show you the number in it. Then you can decide to either select the number you have seen or switch to the number you have not seen, held in the other hand. Is there a strategy that will give you a greater than 50% chance of choosing the larger number, no matter which two numbers I write down?

At first it seems the answer is no. Whatever number you see, the other number could be larger or smaller. There’s no way to tell. So obviously you can’t get a better than 50% chance of picking the hand with the largest number—even if you’ve seen one of those numbers!

But “obviously” is not a proof. Sometimes “obvious” things are wrong!

It turns out that, amazingly, the answer to the puzzle is yes! You can find a strategy to do better than 50%. But the strategy uses randomness. So, this puzzle is a great illustration of the power of randomness.

If you want to solve it yourself, stop now or read Quanta magazine for some clues—they offered a small prize for the best answer:

• Pradeep Mutalik, Can information rise from randomness?, Quanta, 7 July 2015.

Greg Egan gave a nice solution in the comments to this magazine article, and I’ll reprint it below along with two followup puzzles. So don’t look down there unless you want a spoiler.

I should add: the most common mistake among educated readers seems to be assuming that the first player, the one who chooses the two numbers, chooses them according to some probability distribution. Don’t assume that. They are simply arbitrary numbers.

The history of this puzzle

I’d seen this puzzle before—do you know who invented it? On G+, Hans Havermann wrote:

I believe the origin of this puzzle goes back to (at least) John Fox and Gerald Marnie’s 1958 betting game ‘Googol’. Martin Gardner mentioned it in his February 1960 column in Scientific American. Wikipedia mentions it under the heading ‘Secretary problem’. Gardner suggested that a variant of the game was proposed by Arthur Cayley in 1875.

Actually the game of Googol is a generalization of the puzzle that we’ve been discussing. Martin Gardner explained it thus:

Ask someone to take as many slips of paper as he pleases, and on each slip write a different positive number. The numbers may range from small fractions of 1 to a number the size of a googol (1 followed by a hundred 0s) or even larger. These slips are turned face down and shuffled over the top of a table. One at a time you turn the slips face up. The aim is to stop turning when you come to the number that you guess to be the largest of the series. You cannot go back and pick a previously turned slip. If you turn over all the slips, then of course you must pick the last one turned.

So, the puzzle I just showed you is the special case when there are just 2 slips of paper. I seem to recall that Gardner incorrectly dismissed this case as trivial!

There’s been a lot of work on Googol. Julien Berestycki writes:

I heard about this puzzle a few years ago from Sasha Gnedin. He has a very nice paper about this

• Alexander V. Gnedin, A solution to the game of Googol, Annals of Probability (1994), 1588–1595.

One of the many beautiful ideas in this paper is that it asks what is the best strategy for the guy who writes the numbers! It also cites a paper by Gnedin and Berezowskyi (of oligarchic fame). 

Egan’s solution

Okay, here is Greg Egan’s solution, paraphrased a bit:

Pick some function f : \mathbb{R} \to \mathbb{R} such that:

\displaystyle{ \lim_{x \to -\infty} f(x) = 0 }

\displaystyle{ \lim_{x \to +\infty} f(x) = 1 }

f is monotonically increasing: if x > y then f(x) > f(y)

There are lots of functions like this, for example

\displaystyle{f(x) = \frac{e^x}{e^x + 1} }

Next, pick one of the first player’s hands at random. If the number you are shown is x, compute f(x). Then generate a uniformly distributed random number z between 0 and 1. If z is less than or equal to f(x) guess that x is the larger number, but if z is greater than f(x) guess that the larger number is in the other hand.

The probability of guessing correctly can be calculated as the probability of seeing the larger number initially and then, correctly, sticking with it, plus the probability of seeing the smaller number initially and then, correctly, choosing the other hand.

This is

\frac{1}{2} f(x) + \frac{1}{2} (1 - f(y)) =  \frac{1}{2} + \frac{1}{2} (f(x) - f(y))

This is strictly greater than \frac{1}{2} since x > y so f(x) - f(y) > 0.

So, you have a more than 50% chance of winning! But as you play the game, there’s no way to tell how much more than 50%. If the numbers on the other players hands are very large, or very small, your chance will be just slightly more than 50%.

Followup puzzles

Here are two more puzzles:

Puzzle 2: Prove that no deterministic strategy can guarantee you have a more than 50% chance of choosing the larger number.

Puzzle 3: There are perfectly specific but ‘algorithmically random’ sequences of bits, which can’t predicted well by any program. If we use these to generate a uniform algorithmically random number between 0 and 1, and use the strategy Egan describes, will our chance of choosing the larger number be more than 50%, or not?

But watch out—here come Egan’s solutions to those!


Egan writes:

Here are my answers to your two puzzles on G+.

Puzzle 2: Prove that no deterministic strategy can guarantee you have a more than 50% chance of choosing the larger number.

Answer: If we adopt a deterministic strategy, that means there is a function S: \mathbb{R} \to \{0,1\} that tells us whether on not we stick with the number x when we see it. If S(x)=1 we stick with it, if S(x)=0 we swap it for the other number.

If the two numbers are x and y, with x > y, then the probability of success will be:

P = 0.5 + 0.5(S(x)-S(y))

This is exactly the same as the formula we obtained when we stuck with x with probability f(x), but we have specialised to functions S valued in \{0,1\}.

We can only guarantee a more than 50% chance of choosing the larger number if S is monotonically increasing everywhere, i.e. S(x) > S(y) whenever x > y. But this is impossible for a function valued in \{0,1\}. To prove this, define x_0 to be any number in [1,2] such that S(x_0)=0; such an x_0 must exist, otherwise S would be constant on [1,2] and hence not monotonically increasing. Similarly define x_1 to be any number in [-2,-1] such that S(x_1) = 1. We then have x_0 > x_1 but S(x_0) < S(x_1).

Puzzle 3: There are perfectly specific but ‘algorithmically random’ sequences of bits, which can’t predicted well by any program. If we use these to generate a uniform algorithmically random number between 0 and 1, and use the strategy Egan describes, will our chance of choosing the larger number be more than 50%, or not?

Answer: As Philip Gibbs noted, a deterministic pseudo-random number generator is still deterministic. Using a specific sequence of algorithmically random bits

(b_1, b_2, \dots )

to construct a number z between 0 and 1 means z takes on the specific value:

z_0 = \sum_i b_i 2^{-i}

So rather than sticking with x with probability f(x) for our monotonically increasing function f, we end up always sticking with x if z_0 \le f(x), and always swapping if z_0 > f(x). This is just using a function S:\mathbb{R} \to \{0,1\} as in Puzzle 2, with:

S(x) = 0 if x < f^{-1}(z_0)

S(x) = 1 if x \ge f^{-1}(z_0)

So all the same consequences as in Puzzle 2 apply, and we cannot guarantee a more than 50% chance of choosing the larger number.

Puzzle 3 emphasizes the huge gulf between ‘true randomness’, where we only have a probability distribution of numbers z, and the situation where we have a specific number z_0, generated by any means whatsoever.

We could generate z_0 using a pseudorandom number generator, radioactive decay of atoms, an oracle whose randomness is certified by all the Greek gods, or whatever. No matter how randomly z_0 is generated, once we have it, we know there exist choices for the first player that will guarantee our defeat!

This may seem weird at first, but if you think about simple games of luck you’ll see it’s completely ordinary. We can have a more than 50% chance of winning such a game even if for any particular play we make the other player has a move that ensures our defeat. That’s just how randomness works.

Jordan EllenbergThe dime of America

I take it as a near-certainty that, assuming we’re still using physical currency throughout my life, some denomination of that currency will eventually feature Ronald Reagan.  But where will he go?  You can’t really evict Jefferson or Washington or Lincoln.  Alexander Hamilton and Andrew Jackson seem more vulnerable, but somehow it’s the coins that really read as “inner-circle President” — would Reagan’s boosters really settle for grubby green pieces of linen, that get filthy and torn?

But here’s what would work.  Put Reagan on the dime.  Instead of Roosevelt?  No — in addition to Roosevelt.  Nobody cares about the shrubbery on the back of the dime.  Roosevelt on the obverse, Reagan on the reverse.  The two radical revisions of the American idea that shaped the 20th century, separated only by a thin disc of copper.  A government big enough to crush Hitler versus a government small enough to drown in a bathtub.  Now that’s a coin.  Flipping that coin has stakes.

Jordan EllenbergDane County Fair: Flippin’

1.  I realized today that the Dane County Fair might be the only place my kids ever go which is reasonably representative of the socioeconomic and demographic mix of Madison.

2.  The usual:  funnel cake, Ferris wheel, etc.  But something special this year was Flippin’, a steampunk-themed acrobatics show.  Boy does that sound unpromising.  But it was actually kind of amazing.  Especially the Wheel of Death.

I didn’t think the Wallendas, who we saw at the fair a couple of years ago, could be beat.  But these guys might have done it.  There’s a bizarre optical illusion when one of the wheelers leaps as the wheel reaches its crest; his motion is almost in sync with that of the wheel, so it feels to the eye like you’re watching something happening in slow motion.  I was floored.

Three of the four members of Flippin’ are Españas, members of a family that’s been in the circus for five generations.  The dad, Ivan, is in the video above with his brother.  His two kids were in the show we just saw.  The mom, in 2004, fell 30 feet doing a routine and was killed.  I think that’s kind of how it is in these families.



July 19, 2015

Mark Chu-CarrollMy version of Tsukune: Japanese Chicken Meatballs

A few weeks ago, my wife took me out to dinner at an Izakaya that she found. If you don’t know about Izakayas, you really need to find one and try it. An Izakaya is basically a Japanese sake bar – it’s sort-of the Japanese equivalent of a Tapas bar – you order a variety of small dishes, and eat them with your sake. Izakayas are really terrific – they’re a nice informal place to have dinner with good sake, and a variety of delicious dishes.

One of the dishes we had was Tsukune, which is a standard at Izakayas. It’s basically a meatball made from ground chicken, and served drizzled with a sauce that’s a lot like a thick, rich teriyaki. These suckers were the tenderest, most succulent meatballs I’ve ever eaten. I was strongly motivated to try to reproduce them.

So I did some research. And it turned out that there were two tricks to how really good tsukune come out that way.

One part is what goes in: pretty much any part of the bird that can get finely minced. Breast, thigh, neck, legs, wings. If the cartilage is soft enough to be minced, it goes right in with the meat.

The other part is how it’s treated. The meat has the living hell beaten out of it. It’s finely minced, and then the ground meat is kneaded and pounded until it’s all soft and firmly stuck together. That process both binds the meat without adding any extra binding agents like eggs, and tenderizes it.

I really don’t like working with cartilage. I’ve tried it in some recipes in the past, and I find it really unpleasant. It’s slimy, and it tends to sometimes form sharp edges that can cut you. Getting cut by raw meat is not something I like doing.

So I decided to try to reproduce as much of the flavor as I could, using ingredients that I had on hand. And I tried couple of tricks to get the tender texture. It’s not as tender as a true Izakaya tsukune, but it was awfully good.

I relied on three things to get the tenderness I wanted.

  1. Force: I beat the living hell out of the meat. I didn’t do it by kneading; instead, I took chicken thighs, cut the meat off the bones, and then beat the crap out of it with the flat of my meat tenderizer. By the time I was done, I wouldn’t describe the result as a particular thickness; I’d say that it had been reduced to more of a chicken paste smeared across my cutting board.
  2. Chemistry: I took that, and put it into a flat bottomed baking dish, and poured a cup of apple juice over it. There’s an enzyme in fruits from the apple family that tenderizes meat, so I let that do some work.
  3. Slow heat: Finally, I browned the meatball quickly in a pan, and then transferred them to a baking dish with the sauce, and put them into a low oven for a long time.

The result was, honestly, not very much like the tsukune from the Izakaya. But it was really, really delicious. I mean really, seriously delicious – like some of the best meatballs I’ve ever eaten. Not quite as good as the Izakaya ones, and quite different – but so, so good.

It’s not a difficult dish to make, but it is time consuming. Not really that different from other meatball recipes I’ve made: meatballs are often the kind of dish that cooks all day.

The sauce cooks for a while on its own, and then cooks more with the meatballs in it. But the chicken needs to marinate for an hour or two – so you really need to start both around the same time.

The Sauce

The sauce is basically just a slightly amped up teriyaki sauce – but homemade is so much better than the crap you get out of a bottle, and it’s really easy to make.

  • White part of one scallion, minced.
  • 3 cloves finely minced garlic.
  • 1 inch section of fresh ginger, finely minced.
  • 1/3 cup soy sauce (ideally good quality shoyu!).
  • 1/2 cup sake
  • 1/4 cup mirin
  • 1/2 cup apple juice
  • 1 cup chicken stock.
  • 2 tablespoons brown sugar.
  1. Heat a bit of oil in a pot. When it’s hot, add the garlic, ginger, and scallions, and cook until it’s nice and fragrant.
  2. Add the sake, and then let it reduce by half.
  3. Reduce the heat to medium-low, and add the rest of the liquids and the brown sugar.
  4. Simmer for about an hour.
  5. Remove from heat, and set aside.

The meatballs

  • 6 boneless chicken thighs
  • 1/4 onion, finely minced
  • 1/4 cup apple juice
  • 1/2 cup bread crumbs.
  • sliced scallion greens (the leftovers from the whites used in the sauce.)
  • flour
  1. With a meat tenderizer, pound the heck out of the chicken thighs until they’re severely mangled and very thin.
  2. Put them into a dish with the apple juice, and put that into the refrigerator for an hour or two.
  3. Put the marinated thighs into the food processor with 1/4 cup of the sauce, and pulse a couple of times until they’re minced.
  4. Fold in the onion and the bread crumbs.
  5. Chill the ground meat for an hour until it firms up.
  6. Separate into about 30 meatballs – it should be around a heaped teaspoon per meatball.
  7. Roll each meatball in flour; flatten slightly, and then brown the two flat sides in a hot pan with a bit of oil. The meatballs will not be cooked through by this – you’re just browning the meatballs, not fully cooking them.
  8. Put the meatballs into a baking dish, cover with the sauce, and bake for 1 1/2 hours.
  9. Remove the meatballs from the sauce, and reduce the sauce until there’s just a cup or so left. Thicken with a bit of cornstarch.
  10. Pour the sauce back over the meatballs, and sprinkle with scallion greens.

Serve this on a bed of simple steamed rice, and some stir fried veggies on the side. Home cooking doesn’t get better than this.

n-Category Café Category Theory 2015

Just a quick note: you can see lots of talk slides here:

Category Theory 2015, Aveiro, Portugal, June 14-19, 2015.

The Giry monad, tangent categories, Hopf monoids in duoidal categories, model categories, topoi… and much more!

n-Category Café What's so HoTT about Formalization?

In my last post I promised to follow up by explaining something about the relationship between homotopy type theory (HoTT) and computer formalization. (I’m getting tired of writing “publicity”, so this will probably be my last post for a while in this vein — for which I expect that some readers will be as grateful as I).

As a potential foundation for mathematics, HoTT/UF is a formal system existing at the same level as set theory (ZFC) and first-order logic: it’s a collection of rules for manipulating syntax, into which we can encode most or all of mathematics. No such formal system requires computer formalization, and conversely any such system can be used for computer formalization. For example, the HoTT Book was intentionally written to make the point that HoTT can be done without a computer, while the Mizar project has formalized huge amounts of mathematics in a ZFC-like system.

Why, then, does HoTT/UF seem so closely connected to computer formalization? Why do the overwhelming majority of publications in HoTT/UF come with computer formalizations, when such is still the exception rather than the rule in mathematics as a whole? And why are so many of the people working on HoTT/UF computer scientists or advocates of computer formalization?

To start with, note that the premise of the third question partially answers the first two. If we take it as a given that many homotopy type theorists care about computer formalization, then it’s only natural that they would be formalizing most of their papers, creating a close connection between the two subjects in people’s minds.

Of course, that forces us to ask why so many homotopy type theorists are into computer formalization. I don’t have a complete answer to that question, but here are a few partial ones.

  1. HoTT/UF is built on type theory, and type theory is closely connected to computers, because it is the foundation of typed functional programming languages like Haskell, ML, and Scala (and, to a lesser extent, less-functional typed programming languages like Java, C++, and so on). Thus, computer proof assistants built on type theory are well-suited to formal proofs of the correctness of software, and thus have received a lot of work from the computer science end. Naturally, therefore, when a new kind of type theory like HoTT comes along, the existing type theorists will be interested in it, and will bring along their predilection for formalization.

  2. HoTT/UF is by default constructive, meaning that we don’t need to assert the law of excluded middle or the axiom of choice unless we want to. Of course, most or all formal systems have a constructive version, but with type theories the constructive version is the “most natural one” due to the Curry-Howard correspondence. Moreover, one of the intriguing things about HoTT/UF is that it allows us to prove certain things constructively that in other systems require LEM or AC. Thus, it naturally attracts attention from constructive mathematicians, many of whom are interested in computable mathematics (i.e. when something exists, can we give an algorithm to find it?), which is only a short step away from computer formalization of proofs.

  3. One could, however, try to make similar arguments from the other side. For instance, HoTT/UF is (at least conjecturally) an internal language for higher topos theory and homotopy theory. Thus, one might expect it to attract an equal influx of higher topos theorists and homotopy theorists, who don’t care about computer formalization. Why hasn’t this happened? My best guess is that at present the traditional 1-topos theorists seem to be largely disjoint from the higher topos theorists. The former care about internal languages, but not so much about higher categories, while for the latter it is reversed; thus, there aren’t many of us in the intersection who care about both and appreciate this aspect of HoTT. But I hope that over time this will change.

  4. Another possible reason why the influx from type theory has been greater is that HoTT/UF is less strange-looking to type theorists (it’s just another type theory) than to the average mathematician. In the HoTT Book we tried to make it as accessible as possible, but there are still a lot of tricky things about type theory that one seemingly has to get used to before being able to appreciate the homotopical version.

  5. Another sociological effect is that Vladimir Voevodsky, who introduced the univalence axiom and is a Fields medalist with “charisma”, is also a very vocal and visible advocate of computer formalization. Indeed, his personal programme that he calls “Univalent Foundations” is to formalize all of mathematics using a HoTT-like type theory.

  6. Finally, many of us believe that HoTT is actually the best formal system extant for computer formalization of mathematics. It shares most of the advantages of type theory, such as the above-mentioned close connection to programming, the avoidance of complicated ZF-encodings for even basic concepts like natural numbers, and the production of small easily-verifiable “certificates” of proof correctness. (The advantages of some type theories that HoTT doesn’t yet share, like a computational interpretation, are work in progress.) But it also rectifies certain infelicious features of previously existing type theories, by specifying what equality of types means (univalence), including extensionality for functions and truth values, providing well-behaved quotient types (HITs), and so on, making it more comfortable for ordinary mathematicians. (I believe that historically, this was what led Voevodsky to type theory and univalence in the first place.)

There are probably additional reasons why HoTT/UF attracts more people interested in computer formalization. (If you can think of others, please share them in the comments.) However, there is more to it than this, as one can guess from the fact that even people like me, coming from a background of homotopy theory and higher category theory, tend to formalize a lot of our work on HoTT. Of course there is a bit of a “peer pressure” effect: if all the other homotopy type theorists formalize their papers, then it starts to seem expected in the subject. But that’s far from the only reason; here are some “real” ones.

  1. Computer formalization of synthetic homotopy theory (the “uniquely HoTT” part of HoTT/UF) is “easier”, in certain respects, than most computer formalization of mathematics. In particular, it requires less infrastructure and library support, because it is “closer to the metal” of the underlying formal system than is usual for actually “interesting” mathematics. Thus, formalizing it still feels more like “doing mathematics” than like programming, making it more attractive to a mathematician. You really can open up a proof assistant, load up no pre-written libraries at all, and in fairly short order be doing interesting HoTT. (Of course, this doesn’t mean that there is no value in having libraries and in thinking hard about how best to design those libraries, just that the barrier to entry is lower.)

  2. Precisely because, as mentioned above, type theory is hard to grok for a mathematician, there is a significant benefit to using a proof assistant that will automatically tell you when you make a mistake. In fact, messing around with a proof assistant is one of the best ways to learn type theory! I posted about this almost exactly four years ago.

  3. I think the previous point goes double for homotopy type theory, because it is an unfamiliar new world for almost everyone. The types of HoTT/UF behave kind of like spaces in homotopy theory, but they have their own idiosyncracies that it takes time to develop an intuition for. Playing around with a proof assistant is a great way to develop that intuition. It’s how I did it.

  4. Moreover, because that intuition is unique and recently developed for all of us, we may be less confident in the correctness of our informal arguments than we would be in classical mathematics. Thus, even an established “homotopy type theorist” may be more likely to want the comfort of a formalization.

  5. Finally, there is an additional benefit to doing mathematics with a proof assistant (as opposed to formalizing mathematics that you’ve already done on paper), which I think is particularly pronounced for type theory and homotopy type theory. Namely, the computer always tells you what you need to do next: you don’t need to work it out for yourself. A central part of type theory is inductive types, and a central part of HoTT is higher inductive types; both of which are characterized by an induction principle (or “eliminator”) which says that in order to prove a statement of the form “for all x:Wx:W, P(x)P(x)”, it suffices to prove some number of other statements involving the predicate PP. The most familiar example is induction on the natural numbers, which says that in order to prove “for all nn\in \mathbb{N}, P(n)P(n)” it suffices to prove P(0)P(0) and “for all nn\in \mathbb{N}, if P(n)P(n) then P(n+1)P(n+1)”. When using proof by induction, you need to isolate PP as a predicate on nn, specialize to n=0n=0 to check the base case, write down P(n)P(n) as the inductive hypothesis, then replace nn by n+1n+1 to find what you have to prove in the induction step. The students in an intro to proofs class have trouble with all of these steps, but professional mathematicians have learned to do them automatically. However, for a general inductive or higher inductive type, there might instead be four, six, ten, or more separate statements to prove when applying the induction principle, many of which involve more complicated transformations of PP, and it’s common to have to apply several such inductions in a nested way. Thus, when doing HoTT on paper, a substantial amount of time is sometimes spent simply figuring out what has to be proven. But a proof assistant equipped with a unification algorithm can do that for you automatically: you simply say “apply induction for the type WW” and it immediately decides what PP is and presents you with a list of the remaining goals that have to be proven.

To summarize this second list, then, I think it’s fair to say that compared to formalizing traditional mathematics, formalizing HoTT tends to give more benefit at lower cost. However, that cost is still high, especially when you take into account the time spent learning to use a proof assistant, which is often not the most user-friendly of software. This is why I always emphasize that HoTT can perfectly well be done without a computer, and why we wrote the book the way we did.

July 18, 2015

Noncommutative GeometryDaniel Kastler

Daniel Kastler played for many many years a key role as a leading Mathematical Physicist in developing Algebraic Quantum Field Theory. He laid the foundations of the subject as the famous "Haag-Kastler" axioms in his joint paper with Rudolf Haag in 1964. He gathered around him, in Bandol, a whole international school of mathematicians and physicists.  With the devoted help of his beloved wife

Noncommutative GeometryUffe Haagerup

Uffe Haagerup was a wonderful man, with a perfect kindness and openness of mind, and a mathematician of incredible power and insight.  His whole career is a succession of amazing achievements and of decisive and extremely influential contributions to the field of operator algebras, C*-algebras and von Neumann algebras.  His first work (1973-80) concerned the theory of weights and more generally

Scott AaronsonFive announcements

1. Sanjeev Arora sent me a heads-up that there’s a discussion about the future of the STOC conference  at the Windows on Theory blog—in particular, about the idea of turning STOC into a longer “CS theory festival.”  If you have opinions about this, don’t miss the chance to make your voice heard.

2. Back in January, I blogged about a new quantum optimization algorithm by Farhi, Goldstone, and Gutmann, which was notable for being, as far as anyone could tell, the first quantum algorithm to achieve a provably better approximation ratio than the best-known classical algorithm for an NP-hard optimization problem.  Today, I report that a fearsome list of authors—Boaz Barak, Ankur Moitra, Ryan O’Donnell, Prasad Raghavendra, Oded Regev, David Steurer, Luca Trevisan, Aravindan Vijayaraghavan, David Witmer, and John Wright—has put out an eagerly-awaited paper that gives a classical algorithm for the same problem, with better performance than the quantum algorithm’s.  (They write that this “improves both qualitatively and quantitatively” on Farhi et al.’s work; I assume “qualitatively” refers to the fact that the new algorithm is classical.)  What happened, apparently, is that after I blogged (with enthusiasm) about the Farhi et al. result, a bunch of classical complexity theorists read my post and decided independently that they could match or beat the quantum algorithm’s performance classically; then they found out about each other and decided to merge their efforts.  I’m proud to say that this isn’t the first example of this blog catalyzing actual research progress, though it’s probably the best example so far.  [Update: Luca Trevisan now has a great post explaining what happened in much more detail, entitled “How Many Theoreticians Does It Take to Approximate Max 3Lin?”]

Another update: Farhi et al. have posted a new version of their paper, in which they can almost match the performance of the classical algorithm using their quantum algorithm.

3. Jennifer Ouellette has a wonderful article in Quanta magazine about recent progress in AdS/MERA (i.e., “the emergence of spacetime from entanglement”), centered around the ideas of Brian Swingle.  This is one of the main things that I’d love to understand better right now—if I succeed even partially, you’ll know because I’ll write a blog post trying to explain it to others.  See also this blog post by Sean Carroll (about this paper by Ning Bao et al.), and this paper by Pastawski, Yoshida, Harlow, and Preskill, which explicitly mines the AdS/CFT correspondence for examples of quantum error-correcting codes.

4. Celebrity rationalist Julia Galef, who I had the great honor of meeting recently, has a podcast interview with Sean Carroll about why Carroll accepts the many-worlds interpretation.  (Or if, like me, you prefer the written word to the spoken one, click here for a full transcript.)  Unfortunately, Sean is given the opportunity at the end of the interview to recommend one science book to his listeners—just one!—but he squanders it by plugging some weird, self-indulgent thing called Quantum Computing Since Democritus.  Julia also has a YouTube video about what she learned from the interview, but I haven’t yet watched it (is there a transcript?).

5. I came across an insightful if meandering essay about nerd culture by Meredith L. Patterson.  In particular, noticing how the term “nerd” has been co-opted by normal, socially-skilled people, who’ve quickly set about remaking nerd social norms to make them identical to the rest of the world’s norms, Patterson coins the term “weird-nerd” to describe people like herself, who are still nerds in the original sense and who don’t see nerd culture as something horribly, irreparably broken.  As she writes: “We’ll start to feel less defensive when we get some indication — any indication — that our critics understand what parts of our culture we don’t want to lose and why we don’t want to lose them.”  (But is this the start of a linguistic treadmill?  Will we eventually need to talk about weird-weird-nerds, etc.?)

Noncommutative GeometryTwo great losses

It is with incommensurable sadness that we learned of the death of two great figures of the fields of operator algebras and mathematical physics. Daniel Kastler died on July 4-th in his house in Bandol. Uffe Haagerup died on July 5-th in a tragic accident while swimming near his summer house in Denmark. I will write on each of them separately.

Georg von HippelLATTICE 2015, Days Three and Four

Due to the one-day shift of the entire conference programme relative to other years, Thursday instead of Wednesday was the short day. In the morning, there were parallel sessions. The most remarkable thing to be reported from those (from my point of view) is that MILC are generating a=0.03 fm lattices now, which handily beats the record for the finest lattice spacing; they are observing some problems with the tunnelling of the topological charge at such fine lattices, but appear hopeful that they can be useful.

After the lunch break, excursions were offered. I took the trip to Himeji to see Himeji Castle, a very remarkable five-story wooden building that due to its white exterior is also known the "White Heron Castle". During the trip, typhoon Nangka approached, so the rains cut our enjoyment of the castle park a bit short (though seeing koi in a pond with the rain falling into it had a certain special appeal to it, the enjoyment of which I in my Western ignorance suppose might be considered a form of Japanese wabi aesthetics).

As the typhoon resolved into a rainstorm, the programme wasn't cancelled or changed, and so today's plenary programme started with a talk on some formal developments in QFT by Mithat Ünsal, who reviewed trans-series, Lefschetz thimbles, and Borel summability as different sides of the same coin. I'm far too ignorant of these more formal field theory topics to do them justice, so I won't try a detailed summary. Essentially, it appears that the expansion of certain theories around the saddle points corresponding to instantons is determined by their expansion around the trivial vacuum, and the ambiguities arising in the Borel resummation of perturbative series when the Borel transform has a pole on the positive real axis can in some way be connected to this phenomenon, which may allow for a way to resolve the ambiguities.

Next, Francesco Sannino spoke about the "bright, dark, and safe" sides of the lattice. The bright side referred to the study of visible matter, in particular to the study of technicolor models as a way of implementing the spontaneous breaking of electroweak symmetry, without the need for a fundamental scalar introducing numerous tunable parameters, and with the added benefits of removing the hierarchy problem and the problem of φ4 triviality. The dark side referred to the study of dark matter in the context of composite dark matter theories, where one should remember that if the visible 5% of the mass of the universe require three gauge groups for their description, the remaining 95% are unlikely to be described by a single dark matter particle and a homogeneous dark energy. The safe side referred to the very current idea of asymptotic safety, which is of interest especially in quantum gravity, but might also apply to some extension of the Standard Model, making it valid at all energy scales.

After the coffee break, the traditional experimental talk was given by Toru Iijima of the Belle II collaboration. The Belle II detector is now beginning commissioning at the upcoming SuperKEKB accelerator, which will greatly improved luminosity to allow for precise tests of the Standard Model in the flavour sector. In this, Belle II will be complementary to LHCb, because it will have far lower backgrounds allowing for precision measurements of rare processes, while not being able to access as high energies. Most of the measurements planned at Belle II will require lattice inputs to interpret, so there is a challenge to our community to come up with sufficiently precise and reliable predictions for all required flavour observables. Besides quark flavour physics, Belle II will also search for lepton flavour violation in τ decays, try to improve the phenomenological prediction for (g-2)μ by measuring the cross section for e+e- -> hadrons more precisely, and search for exotic charmonium- and bottomonium-like states.

Closely related was the next talk, a review of progress in heavy flavour physics on the lattice given by Carlos Pena. While simulations of relativistic b quarks at the physical mass will become a possibility in the not-too-distant future, for the time being heavy-quark physics is still dominated by the use of effective theories (HQET and NRQCD) and methods based either on appropriate extrapolations from the charm quark mass region, or on the Fermilab formalism, which is sort of in-between. For the leptonic decay constants of heavy-light mesons, there are now results from all formalisms, which generally agree very well with each other, indicating good reliability. For the semileptonic form factors, there has been a lot of development recently, but to obtain precision at the 1% level, good control of all systematics is needed, and this includes the momentum-dependence of the form factors. The z-expansion, and extended versions thereof allowing for simultaneous extrapolation in the pion mass and lattice spacing, has the advantage of allowing for a test of its convergence properties by checking the unitarity bound on its coefficients.

After the coffee break, there were parallel sessions again. In the evening, the conference banquet took place. Interestingly, the (excelleent) food was not Japanese, but European (albeit with a slight Japanese twist in seasoning and presentation).

Georg von HippelLATTICE 2015, Day Five

In a marked deviation from the "standard programme" of the lattice conference series, Saturday started off with parallel sessions, one of which featured my own talk.

The lunch break was relatively early, therefore, but first we all assembled in the plenary hall for the conference group photo (a new addition to the traditions of the lattice conference), and was followed by afternoon plenary sessions. The first of these was devoted to finite temperature and density, and started with Harvey Meyer giving the review talk on finite-temperature lattice QCD. The thermodynamic properties of QCD are by now relatively well-known: the transition temperature is agreed to be around 155 MeV, chiral symmetry restoration and the deconfinement transition coincide (as well as that can defined in the case of a crossover), and the number of degrees of freedom is compatible with a plasma of quarks and gluons above the transition, but the thermodynamic potentials approach the Stefan-Boltzmann limit only slowly, indicating that there are strong correlations in the medium. Below the transition, the hadron resonance gas model describes the data well. The Columbia plot describing the nature of the transition as a function of the light and strange quark masses is being further solidified: the size of the lower-left hand corner first-order region is being measured, and the nature of the left-hand border (most likely O(4) second-order) is being explored. Beyond these static properties, real-time properties are beginning to be studied through the finite-temperature spectral functions. One interesting point was that there is a difference between the screening masses (spatial correlation lengths) and quasiparticle masses (from the spectral function) in any given channel, which may even tend in opposite directions as functions of the temperature (as seen for the pion channel).

Next, Szabolcs Borsanyi spoke about fluctuations of conserved charges at finite temperature and density. While of course the sum of all outcoming conserved charges in a collision must equal the sum of the ingoing ones, when considering a subvolume of the fireball, this can be best described in the grand canonical ensemble, as charges can move into and out of the subvolume. The quark number susceptibilities are then related to the fluctuating phase of the fermionic determinant. The methods being used to avoid the sign problem include Taylor expansions, fugacity expansions and simulations at imaginary chemical potential, all with their own strengths and weaknesses. Fluctuations can be used as a thermometer to measure the freeze-out temperature.

Lastly, Luigi Scorzato reviewed the Lefschetz thimble, which may be a way out of the sign problem (e.g. at finite chemical potential). The Lefschetz thimble is a higher-dimensional generalization of the concept of steepest-descent integration, in which the integral of eS(z) for complex S(z) is evaluated by finding the stationary points of S and integrating along the curves passing through them along which the imaginary part of S is constant. On such Lefschetz thimbles, a Langevin algorithm can be defined, allowing for a Monte Carlo evaluation of the path integral in terms of Lefschetz thimbles. In quantum-mechanical toy models, this seems to work already, and there appears hope that this might be a way to avoid the sign problem of finite-density QCD.

After the coffee break, the last plenary session turned to physics beyond the Standard Model. Daisuke Kadoh reviewed the progress in putting supersymmetry onto the lattice, which is still a difficult problem due to the fact that the finite differences which replace derivatives on a lattice do not respect the Leibniz rule, introducing SUSY-breaking terms when discretizing. The ways past this are either imposing exact lattice supersymmetries or fine-tuning the theory so as to remove the SUSY-breaking in the continuum limit. Some theories in both two and four dimensions have been simulated successfully, including N=1 Super-Yang-Mills theory in four dimensions. Given that there is no evidence for SUSY in nature, lattice SUSY is of interesting especially for the purpose of verifying the ideas of gauge-dravity duality from the Super-Yang-Mills side, and in one and two dimensions, agreement with the predictions from gauge-gravity duality has been found.

The final plenary speaker was Anna Hasenfratz, who reviewed Beyond-the-Standard-Model calculations in technicolor-like theories. If the Higgs is to be a composite particle, there must be some spontaneously broken symmetry that keeps it light, either a flavour symmetry (pions) or a scale symmetry (dilaton). There are in fact a number of models that have a light scalar particle, but the extrapolation of these theories is rendered difficult by the fact that this scalar is (and for phenomenologically interesting models would have to be) lighter than the (techni-)pion, and thus the usual formalism of chiral perturbation theory may not work. Many models of strong BSM interactions have been and are being studied using a large number of different methods, with not always conclusive results. A point raised towards the end of the talk was that for theories with a conformal IR fixed-point, universality might be violated (and there are some indications that e.g. Wilson and staggered fermions seem to give qualitatively different behaviour for the beta function in such cases).

The conference ended with some well-deserved applause for the organizing team, who really ran the conference very smoothly even in the face of a typhoon. Next year's lattice conference will take place in Southampton (England/UK) from 24th to 30th July 2016. Lattice 2017 will take place in Granada (Spain).

July 17, 2015

Doug NatelsonA grand challenge for nano: Solar energy harvesting near the thermodynamic limit

As I'd mentioned earlier in the week, the US Office of Science and Technology Policy had issued a call for "Grand Challenges" for nanotechnology for the next decade, with a deadline of July 16, including guidelines about specific points that a response should address.  Here is my shot:  

Affordable solar energy collection/conversion that approaches the thermodynamic efficiency limit based on the temperature of the sun (efficiency approaching 85%).  

Physics, specifically the second law of thermodynamics, places very strict limits on how much useful energy we can extract from physical systems.  For example, if you have a big rock at temperature \(T_{\mathrm{hot}}\), and another otherwise identical big rock at temperature \(T_{\mathrm{cold}}\), you could let these two rocks just exchange energy, and they would eventually equilibrate to a temperature \(T_{0} = (T_{\mathrm{hot}}+T_{\mathrm{cold}})/2\), but we would not have gotten any useful energy out of the system.  From the standpoint of extracting useful energy, that process (just thermal conduction + equilibration) would have an efficiency of zero.  Instead, you could imagine running a heat engine:  You might warm gas in a cylinder using the hot rock, so that its pressure goes up and pushes a piston to turn a crank that you care about, and then cool the piston back to its initial condition (so that you can run this as a cycle) by letting the gas dump energy to the cold rock.  Carnot showed that the best you can do in terms of efficiency here is \( (1 - T_{\mathrm{cold}}/T_{\mathrm{hot}})\).  On a fundamental level, this is what limits the efficiency of car engines, gas turbines in power plants, etc.  If the "cold" side of your system is near room temperature (300 Kelvin), then the maximum efficiency permitted by physics is limited by how hot you can make the "hot" side.  

So, what about solar power?  The photosphere of the sun is pretty hot - around 5000 K.  We can get energy from the sun in the form of the photons it radiates.  Using 300 K for \(T_{\mathrm{cold}}\), that implies that the theoretical maximum efficiency for solar energy collection is over 90%.  How are we doing?  Rather badly.  The most efficient solar panels you can buy have efficiencies around 35%, and typical ones are more like 18%.  That means we are "throwing away" 60% - 80% of the energy that should be available for use.  Why is that?  This article (here is a non-paywall pdf) by Albert Polman and Harry Atwater has a very good discussion of the issues.   In brief:  There are many processes in conventional photovoltaics where energy is either not captured or is "lost" to heat and entropy generation.  However, manipulating materials down to the nm level offers possible avenues for avoiding these issues - controlling optical properties to enhance absorption; controlling the available paths for the energy (and charge carriers) so that energy is funneled where it can be harnessed.  On the scale of "grand challenges", this has a few virtues:  It's quantitative without being fantastical; there are actually ideas about how to proceed; it's a topical, important social and economic issue; and even intermediate progress would still be of great potential importance.  

Chad OrzelPhysics Blogging Round-Up: Condensed Matter, Magic, Navigation, and Late Nights

Another week, another set of posts at Forbes to link here:

Why Do Solids Have Energy Bands? A conceptual explanation of why putting together lots of atoms with electrons in well-defined energy levels leads to a solid with electrons filling broad energy bands.

This Is The Key Distinction Between Magic And Advanced Technology: Following up a fun panel at Readercon, and how the “magical thinking” involved in my grad school lab is distinct from real magic.

What Submarine Navigation Can Teach Us About Building Luxury Prison Tunnels: The editor at Forbes sent email asking if anybody could talk about the science of that Mexican drug lord’s escape tunnel. I used that as an excuse to talk about gyroscopes and accelerometers and why the Navy funds atomic physics.

Scientists Should Work The Hours When They Work Best: Science did another career advice thing that pissed people off, about how young scientists should work 16-hour days to make sure they get noticed. In response, I talk a bit about my experience with late nights in the lab, and how those are better viewed as True Lab Stories than career advice.

So, there you go. Another week in the books.

Sean CarrollYoichiro Nambu

yoichiro_nambu It was very sad to hear yesterday that Yoichiro Nambu has died. He was aged 94, so it was after a very long and full life.

Nambu was one of the greatest theoretical physicists of the 20th century, although not one with a high public profile. Among his contributions:

  • Being the first to really understand spontaneous symmetry breaking in quantum field theory, work for which he won a (very belated) Nobel Prize in 2008. We now understand the pion as a (pseudo-) “Nambu-Goldstone boson.”
  • Suggesting that quarks might come in three colors, and those colors might be charges for an SU(3) gauge symmetry, giving rise to force-carrying particles called gluons.
  • Proposing the first relativistic string theory, based on what is now called the Nambu-Goto action.

So — not too shabby.

But despite his outsized accomplishments, Nambu was quiet, proper, it’s even fair to say “shy.” He was one of those physicists who talked very little, and was often difficult to understand when he does talk, but if you put in the effort to follow him you would invariably be rewarded. One of his colleagues at the University of Chicago, Bruce Winstein, was charmed by the fact that Nambu was an experimentalist at heart; at home, apparently, he kept a little lab, where he would tinker with electronics to take a break from solving equations.

Any young person in science might want to read this profile of Nambu by his former student Madhusree Mukerjee. In it, Nambu tells of when he first came to the US from Japan, to be a postdoctoral researcher at the Institute for Advanced Study in Princeton. “Everyone seemed smarter than I,” Nambu recalls. “I could not accomplish what I wanted to and had a nervous breakdown.”

If Yoichiro Nambu can have a nervous breakdown because he didn’t feel smart enough, what hope is there for the rest of us?

Here are a few paragraphs I wrote about Nambu and spontaneous symmetry breaking in The Particle at the End of the Universe.

A puzzle remained: how do we reconcile the idea that photons have mass inside a superconductor with the conviction that the underlying symmetry of electromagnetism forces the photon to be massless?

This problem was tackled by a number of people, including American physicist Philip Anderson, Soviet physicist Nikolai Bogoliubov, and Japanese-American physicist Yoichiro Nambu. The key turned out to be that the symmetry was indeed there, but that it was hidden by a field with that took on a nonzero value in the superconductor. According to the jargon that accompanies this phenomenon, we say the symmetry is “spontaneously broken”: the symmetry is there in the underlying equations, but the particular solution to those equations in which we are interested doesn’t look very symmetrical.

Yoichiro Nambu, despite the fact that he won the Nobel Prize in 2008 and has garnered numerous other honors over the years, remains relatively unknown outside physics. That’s a shame, as his contributions are comparable to those of better-known colleagues. Not only was he one of the first to understand spontaneous symmetry breaking in particle physics, but he was also the first to propose that quarks carry color, to suggest the existence of gluons, and to point out that certain particle properties could be explained by imagining that the particles were really tiny strings, thus launching string theory. Theoretical physicists admire Nambu’s accomplishments, but his inclination is to avoid the limelight.

For several years in the early 2000’s I was a faculty member at the University of Chicago, with an office across the hall from Nambu’s. We didn’t interact much, but when we did he was unfailingly gracious and polite. My major encounter with him was one time when he knocked on my door, hoping that I could help him with the email system on the theory group computers, which tended to take time off at unpredictable intervals. I wasn’t much help, but he took it philosophically. Peter Freund, another theorist at Chicago, describes Nambu as a “magician”: “He suddenly pulls a whole array of rabbits out of his hat and, before you know it, the rabbits reassemble in an entirely novel formation and by God, they balance the impossible on their fluffy cottontails.” His highly developed sense of etiquette, however, failed him when he was briefly appointed as department chair: reluctant to explicitly say “no” to any question, he would indicate disapproval by pausing before saying “yes.” This led to a certain amount of consternation among his colleagues, once they realized that their requests hadn’t actually been granted.

After the BCS theory of superconductivity was proposed, Nambu began to study the phenomenon from the perspective of a particle physicist. He put his finger on the key role played by spontaneous symmetry breaking, and began to wonder about its wider applicability. One of Nambu’s breakthroughs was to show (partly in collaboration with Italian physicist Giovanni Jona-Lasinio) how spontaneous symmetry breaking could happen even if you weren’t inside a superconductor. It could happen in empty space, in the presence of a field with a nonzero value — a clear precursor to the Higgs field. Interestingly, this theory also showed how a fermion field could start out massless, but gain mass through the process of symmetry breaking.

As brilliant as it was, Nambu’s suggestion of spontaneous symmetry breaking came with a price. While his models gave masses to fermions, they also predicted a new massless boson particle — exactly what particle physicists were trying to avoid, since they didn’t see any such particles created by the nuclear forces. Soon thereafter, Scottish physicist Jeffrey Goldstone argued that this wasn’t just an annoyance: this kind of symmetry breaking necessarily gave rise to massless particles, now called “Nambu-Goldstone bosons.” Pakistani physicist Abdus Salam and American physicist Steven Weinberg then collaborated with Goldstone in promoting this argument to what seemed like an air-tight proof, now called “Goldstone’s theorem.”

One question that must be addressed by any theory of broken symmetry is, what is the field that breaks the symmetry? In a superconductor the role is played by the Cooper pairs, composite states of electrons. In the Nambu/Jona-Lasinio model, a similar effect happens with composite nucleons. Starting with Goldstone’s 1961 paper, however, physicists become comfortable with the idea of simply positing a set of new fundamental boson fields whose job it was to break symmetries by taking on a nonzero value in empty space. The kind of fields required are known as a “scalar” fields, which is a way of saying they have no intrinsic spin. The gauge fields that carry forces, although they are also bosons, have spin equal to one.

If the symmetry weren’t broken, all the fields in Goldstone’s model would behave in exactly the same way, as massive scalar bosons, due to the requirements of the symmetry. When the symmetry is broken, the fields differentiate themselves. In the case of a global symmetry (a single transformation all throughout space), which is what Goldstone considered, one field remains massive, while the others become massless Nambu-Goldstone bosons — that’s Goldstone’s theorem.

July 16, 2015

John BaezCarbon Budget

On Quora someone asked:

What is the most agreed-on figure for our future carbon budget?

My answer:

Asking “what is our future carbon budget?” is a bit like asking how many calories a day you can eat. There’s really no limit on how much you can eat if you don’t care how overweight and unhealthy you become. So, to set a carbon budget, you need to say how much global warming you will accept.

That said, here’s a picture of how we’re burning through our carbon budget:

It says that our civilization has burnt 60% of the carbon we’re allowed to while still having a 50-50 chance of keeping global warming below 2 °C.

This chart appears in the International Energy Agency report World Energy Outlook Special Report 2015, which is free and definitely worth reading.

The orange bars show CO2 emissions per year, in gigatonnes. The blue curve shows the fraction of the total carbon budget we have left to burn, based on data from the Intergovernmental Panel for Climate Change. The projection of future carbon emissions is based on the Intended Nationally Determined Contributions (INDC) that governments are currently submitting to the United Nations. So, based on what governments had offered to do by June 2015, we may burn through this carbon budget in 2040.

Our civilization’s total carbon budget for staying below 2 °C was about 1 trillion tonnes. We have now burnt almost 60% of that. You can watch the amount rise as we speak:

Trillionth Tonne.

Quoting the International Energy Agency report:

The transition away from fossil fuels is gradual in the INDC Scenario, with the share of fossil fuels in the world’s primary energy mix declining from more than 80% today to around three-quarters in 2030 […] The projected path for energy-related emissions in the INDC Scenario means that, based on IPCC estimates, the world’s remaining carbon budget consistent with a 50% chance of keeping a temperature increase of below 2 °C would be exhausted around 2040, adding a grace period of only around eight months, compared to the date at which the budget would be exhausted in the absence of INDCs (Figure 2.3). This date is already within the lifetime of many existing energy sector assets: fossil-fuelled power plants often operate for 30-40 years or more, while existing fossil-fuel resources could, if all developed, sustain production levels far beyond 2040. If energy sector investors believed that not only new investments but also existing fossil-fuel operations would be halted at that critical point, this would have a profound effect on investment even today.

Since we seem likely to go above 2 °C warming over pre-industrial levels, it would be nice to make a similar chart for a carbon budget based on 3 ° C warming. The Trillionth Tonne website projects that with current trends we’ll burn 1.5 trillion tonnes, for a warming of 3 °C in a cautious scenario, by 2056.

But: we would never burn the 1.5 trillionth tonne if emissions dropped by 1.2% per year from now on. And we’d not even burn the trillionth tonne if they dropped by 2.6% per year.

Doug NatelsonActive learning vs. lecturing: the annual hectoring

The new issue of Nature contains this article, discussing the active learning approach to teaching.  Actually, "discussing" is too neutral.  The title of the article and the subheadline flatly state that  lecture-based pedagogy is "wrong".  The article includes this quote about active learning: " 'At this point it is unethical to teach any other way' ", and presents that quote as a bold-face callout.   

The article says "Researchers often feel that a teacher's job is simply to communicate content: the factual knowledge covered in the course."  This is an assertion, and based on my experience, one that is incorrect.  If basic communication of facts is the job of teachers, we should just quit now, since books and more recently google have made us obsolete.  The whole point of in-person instruction is more than conveying a list of facts - in the case of physics, it's a matter of teaching people how to think about the world, how to think critically and translate concepts and ideas into the language of mathematics for the purposes of gaining predictive understanding and an appreciation for the beautiful way the universe works.

The article also implies that faculty are reluctant to migrate to active learning because it would require that we work harder (i.e., greater prep time, and therefore less time for research) to achieve its benefits.  I do think it was good for the author to raise the issue of incentives and rewards at the end:  If universities want to claim that they value teaching, they actually need to reward pedagogy.

By trying to cast active learning vs lecture-based pedagogy as a one-size-fits-all, good vs bad, modernists vs sticks-in-the-mud faceoff, the author does a disservice to the genuinely subtle questions at play here.  Yes, it looks like well-done active learning does enable large segments of the target audience (typically in intro courses) to retain concepts better.  Not all active learning approaches are implemented well, however; some lecturers can be outstanding, and the ones that engage the class in discussion and back-and-forth are blurring the line into active learning anyway; active learning definitely is a compromise in that the investment of personnel and time to achieve the benefits does mean leaving out some content; and different people learn best from different methods!  The author raises these issues, but the main thesis of the article is clear.
I want to raise a question that you will find in many physics departments around the US:  Who is the target audience in our classes, particularly beyond the large freshmen service teaching courses?   In a big intro class with 350 future engineers, or 400 pre-meds, maybe sacrificing some content for the sake of engaging a larger fraction of the students to better internalize and retain physical concepts is a good idea.   If we do this, however, in a way that bores or fails to challenge the top students, or leaves gaps in terms of content, is that a net good?  

My point:  Pedagogy is complicated, and in the sciences and engineering we are trying to do several competing tasks in parallel.  Oversimplifying to the level that "active learning = unalloyed good all the time; traditional lecture = unethical, abusive method clung to by lazy, hidebound, research-driven-teaching-averse faculty" is not helpful. 

Georg von HippelLATTICE 2015, Day Two

Hello again from Lattice 2015 in Kobe. Today's first plenary session began with a review talk on hadronic structure calculations on the lattice given by James Zanotti. James did an excellent job summarizing the manifold activities in this core area of lattice QCD, which is also of crucial phenomenological importance given situations such as the proton radius puzzle. It is now generally agreed that excited-state effects are one of the more important issues facing hadron structure calculations, especially in the nucleon sector, and that these (possibly together with finite-volume effects) are likely responsible for the observed discrepancies between theory and experiment for quantities such as the axial charge of the nucleon. Many groups are studying the charges and form factors of the nucleon, and some have moved on to more complicated quantities, such as transverse momentum distributions. Newer ideas in the field include the use of the Feynman-Hellmann theorem to access quantities that are difficult to access through the traditional three-point-over-two-point ratio method, such as form factors at very high momentum transfer, and quantities with disconnected diagrams (such as nucleon strangeness form factors).

Next was a review of progress in light flavour physics by Andreas Jüttner, who likewise gave an excellent overview of this also phenomenologically very important core field. Besides the "standard" quantities, such as the leptonic pion and kaon decay constants and the semileptonic K-to-pi form factors, more difficult light-flavour quantities are now being calculated, including the bag parameter BK and other quantities related to both Standard Model and BSM neutral kaon mixing, which require the incorporation of long-distance effects, including those from charm quarks. Given the emergence of lattice ensembles at the physical pion mass, the analysis strategies of groups are beginning to change, with the importance of global ChPT fits receding. Nevertheless, the lattice remains important in determining the low-energy constants of Chiral Perturbation Theory. Some groups are also using newer theoretical developments to study quantities once believed to be outside the purview of lattice QCD, such as final-state photon corrections to meson decays, or the timelike pion form factor.

After the coffee break, the Ken Wilson Award for Excellence in Lattice Field Theory was announced. The award goes to Stefan Meinel for his substantial and timely contributions to our understanding of the physics of the bottom quark using lattice QCD. In his acceptance talk, Stefan reviewed his recent work on determining |Vub|/|Vcb| from decays of Λb baryons measured by the LHCb collaboration. There has long been a discrepancy between the inclusive and exclusive (from B -> πlν) determinations of Vub, which might conceivably be due to a new (BSM) right-handed coupling. Since LHCb measures the decay widths for Λb to both pμν and Λcμν, combining these with lattice determinations of the corresponding Λb form factors allows for a precise determination of |Vub|/|Vcb|. The results agree well with the exclusive determination from B -> πlν, and fully agree with CKM unitarity. There are, however, still other channels (such as b -> sμ+μ- and b -> cτν) in which there is still potential for new physics, and LHCb measurements are pending.

This was followed by a talk by Maxwell T. Hansen (now a postdoc at Mainz) on three-body observables from lattice QCD. The well-known Lüscher method relates two-body scattering amplitudes to the two-body energy levels in a finite volume. The basic steps in the derivation are to express the full momentum-space propagator in terms of a skeleton expansion involving the two-particle irreducible Bethe-Salpeter kernel, to express the difference between the two-particle reducible loops in finite and infinite volume in terms of two-particle cuts, and to reorganize the skeleton expansion by the number of cuts to reveal that the poles of the propagator (i.e. the energy levels) in finite volume are related to the scattering matrix. For three-particle systems, the skeleton expansion becomes more complicated, since there can now be situations involving two-particle interactions and a spectator particle, and intermediate lines can go on-shell between different two-particle interactions. Treating a number of other technical issues such as cusps, Max and collaborators have been able to derive a Lüscher-like formula three-body scattering in the case of scalar particles with a Z2 symmetry forbidding 2-to-3 couplings. Various generalizations remain to be explored.

The day's plenary programme ended with a talk on the Standard Model prediction for direct CP violation in K-> ππ decays by Christopher Kelly. This has been an enormous effort by the RBC/UKQCD collaboration, who have shown that the ΔI=1/2 rule comes from low-energy QCD by way of strong cancellations between the dominant contributions, and have determined ε' from the lattice for the first time. This required the generation of ensembles with an unusual set of boundary conditions (G-parity boundary conditions on the quarks, requiring complex conjugation boundary conditions on the gauge fields) in space to enforce a moving pion ground state, as well as the precise evaluation of difficult disconnected diagrams using low modes and stochastic estimators, and treatment of finite-volume effects in the Lellouch-Lüscher formalism. Putting all of this together with the non-perturbative renormalization (in the RI-sMOM scheme) of ten operators in the electroweak Hamiltonian gives a result which currently still has three times the experimental error, but is systematically improvable, with better-than-experimental precision expected in maybe five years.

In the afternoon there were parallel sessions again, and in the evening, the poster session took place. Food ran out early, but it was pleasant to see free-form smearing begin improved upon and used to very good effect by Randy Lewis, Richard Woloshyn and students.

July 15, 2015

Jordan EllenbergRolling the dice on Iran

David Sanger in today’s NYT on the Iran deal:

Mr. Obama will be long out of office before any reasonable assessment can be made as to whether that roll of the dice paid off.

Which is true!  But something else that’s true: not having a deal would also be a roll of the dice.  We’re naturally biased to think of the status quo as the safest course.  But why?  There’s no course of political action that leads to a certain outcome.  We’re rolling the dice no matter what; all we get to do is choose which dice.

July 14, 2015

Tommaso DorigoMarek Karliner: Not A Pentaquark, But A Molecule - As He And Rosner Predicted

The reported observation of a resonant state of a J/psi meson and a proton in the decay of the Lambda_b baryon by the LHCb collaboration, broadcast by CERN today, is a very intriguing new piece of the puzzle of hadron spectroscopy - a topic on which many brilliant minds have spent their life in the course of the last half century.

read more

Sean CarrollInfinite Monkey Cage

The Infinite Monkey Cage is a British science/entertainment show put on by the dynamic duo of physicist Brian Cox and comedian Robin Ince. It exists as a radio program, a podcast, and an occasional live show. There are laughs, a bit of education, and some guests for the hosts to spar with. The popular-science ecosystem is a lot different in the UK than it is here in the US; scientists and science communicators can generally have a much higher profile, and a show like this can really take off.

So it was a great honor for me to appear as one of the guests when the show breezed through LA back in March. It was a terrific event, as you might guess from the other guests: comedian Joe Rogan, TV writer David X. Cohen, and one Eric Idle, who used to play in the Rutles. And now selected bits of the program can be listened to at home, courtesy of this handy podcast link, or directly on iTunes.


Be sure to check out the other stops on the IMC tour of the US, which included visits to NYC, Chicago, and San Francisco, featuring many friends-of-the-blog along the way.

These guys, of course, are heavy hitters, so you never know who is going to show up at one of these things. Their relationship with Eric Idle goes back quite a ways, and he actually composed and performed a theme song for the show (below). Naturally, since he was on stage in LA, they asked him to do a live version, which was a big hit. And there in the band, performing on ukulele for just that one song, was Jeff Lynne, of the Electric Light Orchestra. Maybe a bit under-utilized in this context, but why not get the best when you can?

Georg von HippelLATTICE 2015, Day One

Hello from Kobe, where I am attending the Lattice 2015 conference. The trip here was uneventful, as was the jetlag-day.

The conference started yesterday evening with a reception in the Kobe Animal Kingdom (there were no animals when we were there, though, with the exception of some fish in a pond and some cats in a cage, but there were lot of plants).

Today, the scientific programme began with the first plenary session. After a welcome address by Akira Ukawa, who reminded us of the previous lattice meetings held in Japan and the tremendous progress the field has made in the intervening twelve years, Leonardo Giusti gave the first plenary talk, speaking about recent progress on chiral symmetry breaking. Lattice results have confirmed the proportionality of the square of the pion mass to the quark mass (i.e. the Gell-Mann-Oakes-Renner (GMOR) relation, a hallmark of chiral symmetry breaking) very accurately for a long time. Another relation involving the chiral condensate is the Banks-Casher relation, which relates it to the eigenvalue density of the Dirac operator at zero. It can be shown that the eigenvalue density is renormalizable, and that thus the mode number in a given interval is renormalization-group invariant. Two recent lattice studies, one with twisted-mass fermions and one with O(a)-improved Wilson fermions, confirm the Banks-Casher relation, with the chiral condensates found agreeing very well with those inferred from GMOR. Another relation is the Witten-Veneziano relation, which relates the η' mass to the topological susceptibility, thus explaining how precisely the η' is not a Goldstone boson. The topological charge on the lattice can be defined through the index of the Neuberger operator or through chain of spectral porjectors, but a recently invented and much cheaper definition is through the topological charge density at finite flow time in Lüscher's Wilson flow formalism. The renormalization properties of the Wilson flow allow for a derivation of the universality of the topological susceptibility, and numerical tests using all three definitions indeed agree within errors in the continuum limit. Higher cumulants determined in the Wilson flow formalism agree with large-Nc predictions in pure Yang-Mills, and the suppression of the topological susceptibility in QCD relative to the pure Yang-Mills case is in line with expectations (which in principle can be considered an a posteriori determination of Nf in agreement with the value used in simulations).

The next speaker was Yu Nakayama, who talked about a related topic, namely the determination of the chiral phase transition in QCD from the conformal bootstrap. The chiral phase transition can be studied in the framework of a Landau effective theory in three dimensions. While the mean-field theory predicts a second-order phase transition in the O(4) universality class, one-loop perturbation theory in 4-ε dimensions predicts a first-order phase transition at ε=1. Making use of the conformal symmetry of the effective theory, one can apply the conformal bootstrap method, which combines an OPE with crossing relations to obtain results for critical exponents, and the results from this method suggest that the phase transition is in fact of second order. This also agrees with many lattice studies, but others disagree. The role of the anomalously broken U(1)A symmetry in this analysis appears to be unclear.

After the coffee break, Tatsumi Aoyama, a long-time collaborator in the heroic efforts of Kinoshita to calculate the four- and five-loop QED contributions to the electron and muon anomalous moments, gave a plenary talk on the determination of the QED contribution to lepton (g-2). For likely readers of this blog, the importance of (g-2) is unlikely to require an explanation: the current 3σ tension between theory and experiment for (g-2)μ is the strongest hint of physics beyond the Standard Model so far, and since the largest uncertainties on the theory side are hadronic, lattice QCD is challenged to either resolve the tension or improve the accuracy of the predictions to the point where the tension becomes an unambiguous, albeit indirect, discovery of new physics. The QED calculations are on the face of it simpler, being straightforward Feynman diagram evaluations. However, the number of Feynman diagrams grows so quickly at higher orders that automated methods are required. In fact, in a first step, the number of Feynman diagrams is reduced by using the Ward-Takahashi identity to relate the vertex diagrams relevant to (g-2) to self-energy diagrams, which are then subjected to an automated renormalization procedure using the Zimmermann forest formula. In a similar way, infrared divergences are subtracted using a more complicated "annotated forest"-formula (there are two kinds of IR subtractions needed, so the subdiagrams in a forest need to be labelled with the kind of subtraction). The resulting UV- and IR-finite integrands are then integrated using VEGAS in Feynman parameter space. In order to maintain the required precision, quadruple-precision floating-point numbers (or an emulation thereof) must be used. Whether these methods could cope with the six-loop QED contribution is not clear, but with the current and projected experimental errors, that contribution will not be required for the foreseeable future, anyway.

This was followed by another (g-2)-related plenary, with Taku Izubichi speaking about the determination of anomalous magnetic moments and nucleon electric dipole moments in QCD. In particular the anomalous magnetic moment has become such an active topic recently that the time barely sufficed to review all of the activity in this field, which ranges from different approaches to parameterizing the momentum dependence of the hadronic vacuum polarization, through clever schemes to reduce the noise by subtracting zero-momentum contributions, to new ways of extracting the vacuum polarization through the use of background magnetic fields, as well as simulations of QCD+QED on the lattice. Among the most important problems are finite-volume effects.

After the lunch break, there were parallel sessions in the afternoon. I got to chair the first session on hadron structure, which was devoted to determinations of hadronic contributions to (g-2)μ.

After the coffee break, there were more parallel sessions, another complete one of which was devoted to (g-2) and closely-related topics. A talk deserving to be highlighted was given by Jeremy Green, who spoke about the first direct calculation of the hadronic light-to-light scattering amplitude from lattice QCD.

July 13, 2015

BackreactionBook review: “Eureka” by Chad Orzel.

Eureka: Discovering Your Inner Scientist
By Chad Orzel
Basic Books (December 9, 2014)

It’s a good book, really. The problem isn’t the book, the problem is me.

Chad Orzel’s new book “Eureka” lays out how the scientific enterprise reflects in every-day activities. In four parts, “Look, Think, Test, Tell,” Chad connects examples from sports, cooking, collecting, crossword puzzles, mystery novels, and more, to the methodology used in scientific discovery. Along the way, he covers a lot of ground in astrophysics, cosmology, geology, and atomic physics. And the extinction of the dinosaurs.

It’s a well-written book, with relevant but not excessive references and footnotes; the scientific explanations are accurate yet non-technical; the anecdotes from the history of science and contemporary academia are nicely woven together with the books theme, that every one of us has an “inner scientist” who is waiting to be discovered.

To my big relief in this recent book Chad doesn’t talk to his dog. Because I’m really not a dog person. I’m the kind of person who feeds the neighbor’s cat. Sometimes. I’m also the kind of person who likes baking, running, house music, and doesn’t watch TV. I order frozen food and get it delivered to the door. Chad cooks, believes baking is black magic, plays basketball, likes rock music, and his writing draws a lot on contemporary US TV shows or movies. He might be a good science popularizer, but his sports popularization is miserable. It doesn’t seem to have occurred to him that somebody could read his book who, like me, doesn’t know anything about baseball, or bridge, or basketball, and therefore much of his explanations are entirely lost on me.

I don’t think I’ve ever read a book that made me feel so distinctly that I’m not the intended audience. Of course the whole idea of “Eureka” is totally backwards for me. You don’t have to convince me I’m capable of scientific reasoning. I have even proved capable of convincing others I’m capable of scientific reasoning. Sometimes. But I do not have the slightest idea why somebody would want to spend hours trying to throw a ball into a basket, or, even more bizarre, watch other people trying to throw balls into baskets. So some stretches of the book stretched indeed. Which is why it’s taken me so long to get through with it, since I had an advance proof more than a year ago.

Besides this, I have a general issue with the well-meant message that we were born to think scientifically, as I elaborated on in this recent post. Chad’s argument is that every one of us brings the curiosity and skills to be a scientist, and that we use many of these skills intuitively. I agree on this. Sometimes. I wish though that he had spent a few more words pointing out that being a scientist is a profession after all, and one that requires adequate education for a reason. While we do some things right intuitively, intuition can also mislead us. I understand that Chad addresses an existing cultural misconception, which is that science is only for a gifted few rather than a general way to understand the world. However, I’d rather not swap this misconception for another misconception, which is that scientific understanding comes with little guidance and effort.

In summary, it’s a great book to give to someone who is interested in sports but not in science, to help them discover their inner scientist. Chad does an excellent job pointing out how much scientific thought there is in daily life, and he gets across a lot of physics along with that. He tells the reader that approaching problems scientifically is not just helpful for researchers, but for every one to understand the world. Sometimes. Now somebody please explain me the infield fly rule.

Mark Chu-CarrollSilly φ and π crackpottery

Over time, I’ve come to really, really hate the number φ.

φ is the so-called golden ratio. It’s the number that is a solution for the equation (a+b)/a = (a/b). The reason that that’s interesting at all is because it’s got an interesting property when you draw it out: if you take a rectangle where the ratio of the length of the sides is 1:φ, then if you remove the largest possible square from it, you’ll get another rectangle whose sides have the ratio φ:1. If you take the largest square from that, you’ll get a rectangle whose sides have the ratio 1:φ. And so on.

The numeric value of it is (1+sqrt(5))/2, or about 1.618033988749895.

The problem with φ is that people are convinced that it’s some kind of incredibly profound thing, and find it all over the place. The problem is, virtually all of the places where people claim to find it are total rubbish. A number that’s just a tiny bit more that 1 1/2 is really easy to find if you go looking for it, and people go looking for it all over the place.

People claim it’s in all sorts of artwork. You can certainly find a ton of things in paintings whose size ratio is about 1 1/2, and people find it and insist that it was deliberately done to make it φ. People find it in musical scales, the diatonic and pentatonic scales, and the indian scales.

People claim it comes up all over the place in nature: in beehives, ant colonies, flowers, tree sizes, tree-limb positions, size of herds of animals, litters of young, body shapes, face shapes.

People claim it’s key to architecture.

And yet… it seems like if you actually take any of those and actually start to look at it in detail? The φ isn’t there. It’s just a number that’s kinda-sorta in the 1 1/2 range.

One example of that: there’s a common claim that human faces have proportions based on &phi. You can see a bunch of that nonsense here. The thing is, the “evidence” for the claim consists of rectangles drawn around photographs of faces – and if you look closely at those rectangles, what you find is that the placement of the corners isn’t consistent. When you define, say, “the distance between the eyes”, you can measure that as distances between inner-edges, or between pupils, or between outer edges. Most of these claims use outer edges. But where’s the outer edge of an eye? It’s not actually a well-defined point. You can pick a couple of different places in a photo as “the” edge. They’re all close together, so there’s not a huge amount of variation. But if you can fudge the width a little bit, and you can fudge other facial measurements just a little bit, you’ve got enough variation that if you’re looking for two measurements with a ratio close to φ, you’ll always find one.

Most of the φ nonsense is ultimately aesthetic: people claiming that the golden ratio has a fundamental beauty to it. They claim that facial features match it because it’s intrinsically beautiful, and so people whose faces have φ ratios are more beautiful, and that that led to sexual-selection which caused our faces to embody the ratio. I think that’s bunk, but it’s hard to make a mathematical argument against aesthetics.

But then, you get the real crackpots. There are people who think φ has amazing scientific properties. In the words of the crank I’m writing about today, understanding φ (and the “correct” value of π derived from it) will lead humanity to “enter into a veritable Space Age”.

I’m talking about a guy who calls himself “Jain 108″. I’m not quite sure what to call him. Mr. Jain? Mr. 108? Dr 108? Most of the time on his website, he just refers to himself as “Jain” (or sometimes “Jain of Oz”) so I’ll go with “Jain”).

Jain believes that φ is the key to mathematics, science, art, and human enlightenment. He’s a bit hard to pin down, because most of his website is an advertisement for his books and seminars: if you want to know “the truth”, you’ve got to throw Jain some cash. I’m not willing to give money to crackpots, so I’m stuck with just looking at what he’s willing to share for free. (But I do recommend browsing around his site. It’s an impressive combination of newage scammery, pomposity, and cluelessness.)

What you can read for free is more than enough to conclude that he’s a total idiot.

I’m going to focus my mockery on one page: “Is Pi a Lie?”.

On this page, Jain claims to be able to prove that the well-known value of π (3.14159265….) is wrong. In fact, that value is wrong, and the correct value of π is derived from φ! The correct value of π is \frac{4}{\sqrt{\phi}}, or about 3.144605511029693.

For reasons that will be soon explained, traditional Pi is deficient because historically it has awkwardly used logical straight lines to measure illogical curvature. Thus, by using the highest level of mathematics known as Intuitive Maths, the True Value of Pi must be a bit more than anticipated to compensate for the mysterious “Area Under The Curve”. When this is done, the value, currently known as JainPi, = 3.144… can be derived, by knowing the precise Height of the Cheops Pyramid which is based on the Divine Phi Proportion (1.618…). Instead of setting our diameter at 1 unit or 1 square, something magical happens when we set the diameter at the diagonal length of a Double Square = 2.236… which is the Square Root of 5 (meaning 2.236… x 2.236… = 5). This is the critical part of the formula that derives Phi \frac{1+\sqrt{5}}{2}, and was used by ancient vedic seers as their starting point to construct their most important diagram or ‘Yantra’ or power-art called the Sri Yantra. With a Root 5 diameter, the translation of the Phi’s formula into a geometric construct derives the royal Maltese Cross symbol, concluding that Phi is Pi, that Phi generates Pi, and that Pi must be derived with a knowledge of the Harmonics of Phi. When this is understood and utilized, we will collectively enter into a veritable Space Age.

How did we get the wrong value? It’s based on the “fact” that the computation of π is based on the use of “logical” straight lines to measure “illogical” curvurature. (From just that one sentence, we can already conclude that Jain knows nothing about logic, except what he learned from Mr. Spock on Star Trek.) More precisely, according to Jain:

In all due good respects, we must first honour Archimedes of Syracuse 2,225 years ago, who gave the world his system on how to calculate Pi, approximated to 22÷7, by cutting the circle into say 16 slices of a pizza, and measuring the 16 edge lengths of these 16 triangular polygons (fig 3), to get a good estimate for the circumference of a circle. The idea was that if we kept making the slices of pizza smaller and smaller, by subsequently cutting the circle into 32 slices, then 64, then 128 then 256 slices, we would get a better and more accurate representation for the circumference. The Fundamental Flawed Logic or Error with Archimede’s Increasing Polygon Method was that he failed to measure The Area Under The Curve. In fact, he assumed that The Area Under The Curve, just magically disappeared. Even in his time, Archimedes admitted that his value was a mere estimate!

This explanation does a beautiful job of demonstrating how utterly ignorant Jain is of math. Archimedes may have been the first person from the western tradition to have worked out a mechanism to compute a value for π – and his mechanism was a good one. But it’s far from the only one. But let’s ignore that for a moment. Jain’s supposed critique, if true, would mean that modern calculus doesn’t work. The wedge-based computation of π is a forerunner of the common methods of calculus. In reality, when we compute the value of almost any integral using calculus, our methods are based on the concept of drawing rectangles under the curve, and narrowing those rectangles until they’re infinitely small, at which point the “area under the curve” missed by the rectangles becomes zero. If the wedge computation of π is wrong because it misses are under the curve, then so will every computation using integral calculus.

Gosh, think we would have noticed that by now?

Let’s skip past that for a moment, and come back to the many ways that π comes into reality. π is the ratio of the diameter of a circle to its radius. Because circles are such a basic thing, there are many ways of deriving the value of π that come from its fundamental nature. Many of these have no relation to the wedge-method that Jain attributes to Archimedes.

For example, there is Viete’s product:

\frac{2}{\pi} = \left(\frac{\sqrt{2}}{2}\right)\left(\frac{\sqrt{2 + \sqrt{2}}}{2}\right)\left(\frac{\sqrt{2 + \sqrt{2 + \sqrt{2}}}}{2}\right)(...)

Or there’s the Gregory-Leibniz series:

\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \frac{1}{9} - ...

These have no relation to the wedge-method – they’re derived from the fundamental nature of π. And all of them produce the same value – and it’s got no connection at all to φ.

As supportive evidence for the incorrectness of π, Jain gives to apocryphal stories about NASA and the moon landings. First, he claims that the first moon landing was off by 20 kilometers, and that the cause of this was an incorrect value of π: that the value of π used in computing trajectories was off by 0.003:

NASA admitted that when the original Mooncraft landing occurred, the targeted spot was missed by about 20km?
What could have been wrong with the Calculations?
NASA subsequently adjusted their traditional mathematical value for Pi (3.141592…) by increasing it in the 3rd decimal by .003!

Let’s take just a moment, and consider that.

It’s a bit difficult to figure out how to address that, because he’s not mentioning what part of the trajectory was messed up. Was it the earth-to-moon transit of the full apollo system? Or was it the orbit-to-ground flight of the lunar lander? Since he doesn’t bother to tell us, we’ll look at both.

π does matter when computing the trajectory of the earth-to-moon trip – because it involves the intersection of two approximate circles – the orbit of the earth around the sun, and the orbit of the moon around the earth. (Both of these are approximations, but they’re quite useful ones; the apollo trajectory computations did rely on a value for π.

Let’s look at earth-to-moon. I’m going to oversimplify ridiculously – but I’m just trying to give us a ballpark order-of-magnitude guess as just how much of a difference Mr. Jain’s supposed error would cause. THe distance from the earth to the moon is about 384,000 kilometers. If we assume that π is a linear factor in the computation, then a difference in the value of pi of around 1 part in 1000 would cause a difference in distance computations of around 384 kilometers. Mr. Jain is alleging that the error only caused a difference of 20 kilometers. He’s off by a factor of 15. We can hand-wave this away, and say that the error that caused the lander to land in the “wrong” place wasn’t in the earth-moon trajectory computation – but we’re still talking about the apollo unit being in the wrong place by hundreds of kilometers – and no one noticing.

What if the problem was in the computation of the trajectory the lander took from the capsule to the surface of the moon? The orbit was a nearly circular one at about 110 kilometers above the lunar surface. How much of an error would the alleged π difference cause? About 0.1 kilometer – that is, about 100 meters. Less than what Jain claims by a factor of 200.

The numbers don’t work. These aren’t precise calculations by any stretch, but they’re ballpark. Without Jain providing more information about the alleged error, they’re the best we can do, and they don’t make sense.

Jain claims that in space work, scientists now use an adjusted value of π to cover the error. This piece I can refute by direct knowledge. My father was a physicist who worked on missiles, satellites, and space probes. (He was part of the Galileo team.) They used good old standard 3.14159 π. In fact, he explained how the value of π actually didn’t need to be that precise. In satellite work, you’re stuck with the measurement problems of reality. In even the highest precision satellite work, they didn’t use more that 4 significant digits of precision, because the manufacturing and measurement of components was only precise to that scale. Beyond that, it was always a matter of measure and adjust. Knowing that π was 3.14159265356979323 was irrelevant in practice, because anything beyond “about 3.1416″ was smaller that the errors in measurement.

Mr. Jain’s next claim is far worse.

Also, an ex-Engineer from NASA, “Smokey” admitted (via email) that when he was making metal cylinders for this same Mooncraft, finished parts just did not fit perfectly, so an adjusted value for Pi was also implemented. At the time, he thought nothing about it, but after reading an internet article called The True Value of Pi, by Jain 108, he made contact.

This is very, very simple to refute by direct experience. This morning, I got up, shaved with an electric razor (3 metal rotors), made myself iced coffee using a moka pot (three round parts, tight fitted, with circular-spiral threading). After breakfast, I packed my backpack and got in my car to drive to the train. (4 metal cylinders with 4 precisely-fitted pistons in the engine, running on four wheels with metal rims, precisely fitted to circular tires, and brakes clamping on circular disks.) I drove to the train station, and got on an electric train (around 200 electric motors on the full train, with circular turbines, driving circular wheels).

All those circles. According to Jain, every one of those circles isn’t the size we think it is. And yet they all fit together perfectly. According to Jain, every one of those circular parts is larger that we think it should be. To focus on one thing, every car engine’s pistons – every one of the millions of pistons created every year by companies around the world – requires more metal to produce than we’d expect. And somehow, in all that time, no one has ever noticed. Or if they’ve noticed, every single person who ever noticed it has never mentioned it!

It’s ludicrous.

Jain also claims that the value of e is wrong, and comes up with a cranky new formula for computing it. Of course, the problem with e is the same as the problem wiht π: in Jain’s world, it’s really based on φ.

In Jain’s world, everything is based on φ. And there’s a huge, elaborate conspiracy to keep it secret. Any Jain will share the secret with you, showing you how everything you think you know is wrong. You just need to buy his books ($77 for a hard-copy, or $44 for an ebook.) Or you could pay for him to travel to you and give you a seminar. But he doesn’t list a price for that – you need to send him mail to inquire.

July 12, 2015

Doug NatelsonNano "Grand Challenges" for the next decade

Last month the White House Office of Science and Technology Policy issued a call for suggestions for "nanotechnology-inspired grand challenges".  The term "grand challenge" is popular, both within the federal agencies and among science/technology coordinating and policy-making groups.  When done well, a list of grand challenges can cleanly, clearly articulate big, overarching goals that a community has identified as aspirational milestones toward which to strive.  When done poorly, a list of grand challenges can read like naive fantasy, with the added issue that pointing this out can lead to being labeled "negative" or "lacking in vision".  To make up a poor example: "In the next ten years we should develop a computing device smaller than a grain of rice, yet more powerful than an IBM Blue Gene supercomputer, able to operate based on power acquired from the ambient environment, and costing less than $5 to manufacture."    Yeah, not too realistic.

It's worth thinking hard about these, though, and trying to contribute good ones.  The deadline for this call is this coming Thursday, so get yours in while you can.  I put one in that I will discuss later in the week.