Planet Musings

April 30, 2016

David Hogg: fiber positioners for spectrographs

In a short research day, I had a very useful conversation about robot fiber positioners for multi-object spectrographs with Peter Mao (Caltech), who is working on the Prime Focus Spectrograph for Subaru. He gave me a sense of the cost scale, the human effort scale, and the technical precision of such systems. This is critical information for the Letter of Intent that I and various others are putting in for the use of the SDSS hardware after the end of the current survey, SDSS-IV. We would like to go really big with the two APOGEE spectrographs, but if we want to do really large numbers of stars (think: millions or even tens of millions) we need to have robots place the fibers.

Clifford Johnson: Wild Thing

The wildflower patch continues to produce surprises. You never know exactly what's going to come up, and in what quantities. I've been fascinated by this particular flower, for example, which seems to be constructed out of several smaller flowers! What a wonder, and of course, there's just one example of its parent plant in the entire patch, so once it is gone, it's gone.

-cvj

The post Wild Thing appeared first on Asymptotia.

n-Category Café: Relative Endomorphisms

Let \((M, \otimes)\) be a monoidal category and let \(C\) be a left module category over \(M\), with action map also denoted by \(\otimes\). If \(m \in M\) is a monoid and \(c \in C\) is an object, then we can talk about an action of \(m\) on \(c\): it’s just a map

\[ \alpha : m \otimes c \to c \]

satisfying the usual associativity and unit axioms. (The fact that all we need is an action of \(M\) on \(C\) to define an action of \(m\) on \(c\) is a cute instance of the microcosm principle.)

This is a very general definition of monoid acting on an object which includes, as special cases (at least if enough colimits exist),

  • actions of monoids in \(\text{Set}\) on objects in ordinary categories,
  • actions of monoids in \(\text{Vect}\) (that is, algebras) on objects in \(\text{Vect}\)-enriched categories,
  • actions of monads (letting \(M = \text{End}(C)\)), and
  • actions of operads (letting \(C\) be a symmetric monoidal category and \(M\) be the monoidal category of symmetric sequences under the composition product).

This definition can be used, among other things, to straightforwardly motivate the definition of a monad (as I did here): actions of a monoidal category \(M\) on a category \(C\) correspond to monoidal functors \(M \to \text{End}(C)\), so every action in the above sense is equivalent to an action of a monad, namely the image of the monoid \(m\) under such a monoidal functor. In other words, monads on \(C\) are the universal monoids which act on objects \(c \in C\) in the above sense.

Corresponding to this notion of action is a notion of endomorphism object. Say that the relative endomorphism object \(\text{End}_M(c)\), if it exists, is the universal monoid in \(M\) acting on \(c\): that is, it’s a monoid acting on \(c\), and the action of any other monoid on \(c\) uniquely factors through it.

This is again a very general definition which includes, as special cases (again if enough colimits exist),

  • the endomorphism monoid in \(\text{Set}\) of an object in an ordinary category,
  • the endomorphism algebra of an object in a \(\text{Vect}\)-enriched category,
  • the endomorphism monad of an object in an ordinary category, and
  • the endomorphism operad of an object in a symmetric monoidal category.

If the action of \(M\) on \(C\) has a compatible enrichment \([-, -] : C^{op} \times C \to M\) in the sense that we have natural isomorphisms

\[ \text{Hom}_C(m \otimes c_1, c_2) \cong \text{Hom}_M(m, [c_1, c_2]) \]

then \(\text{End}_M(c)\) is just the endomorphism monoid \([c, c]\), and in fact the above discussion could have been done in the context of enrichments only, but in the examples I have in mind the actions are easier to notice than the enrichments. (Has anyone ever told you that symmetric monoidal categories are canonically enriched over symmetric sequences? Nobody told me, anyway.)

Here’s another example where the action is easier to notice than the enrichment. If \(D, C\) are two categories, then the monoidal category \(\text{End}(C) = [C, C]\) has a natural left action on the category \([D, C]\) of functors \(D \to C\). If \(G : D \to C\) is a functor, then the relative endomorphism object \(\text{End}_{\text{End}(C)}(G)\), if it exists, turns out to be the codensity monad of \(G\)!

This actually follows from the construction of an enrichment: the category \([D, C]\) of functors \(D \to C\) is (if enough limits exist) enriched over \(\text{End}(C)\) in a way compatible with the natural left action. This enrichment takes the following form (by a straightforward verification of universal properties): if \(G_1, G_2 \in [D, C]\) are two functors \(D \to C\), then their hom object

\[ [G_1, G_2] = \text{Ran}_{G_1}(G_2) \in \text{End}(C) \]

is, if it exists, the right Kan extension of \(G_2\) along \(G_1\). When \(G_1 = G_2\) this recovers the definition of the codensity monad of a functor \(G : D \to C\) as the right Kan extension of \(G\) along itself, and neatly explains why it’s a monad: it’s an endomorphism object.
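For readers who prefer a formula, the hom object above can be written pointwise using the standard end formula for right Kan extensions. This is a sketch under the assumption that \(C\) has the relevant powers and ends; the notation \(X^S\) for the power of an object \(X\) by a set \(S\) is introduced here for illustration and is not from the post.

```latex
\[
  [G_1, G_2](c) \;=\; \text{Ran}_{G_1}(G_2)(c)
  \;\cong\; \int_{d \in D} G_2(d)^{\text{Hom}_C(c,\, G_1(d))},
\]
\[
  \text{and in particular}\quad
  \text{End}_{\text{End}(C)}(G)(c) \;\cong\; \int_{d \in D} G(d)^{\text{Hom}_C(c,\, G(d))}.
\]
```

When \(C = \text{Set}\), the second line is the set of natural transformations from \(\text{Hom}(c, G(-))\) to \(G\), which is the usual description of the codensity monad.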

Question: Has anyone seen this definition of relative endomorphisms before?

It seems pretty natural, but I tried guessing what it would be called on the nLab and failed. It also seems that “relative endomorphisms” is used to mean something else in operad theory.

April 29, 2016

Doug Natelson: Technical help question: Quantum Design magnet power supplies

I'd like to ask my readers who own Quantum Design PPMS or MPMS instruments for help regarding a technical glitch.  My aging PPMS superconducting magnet power supply (the kind QD calls the H-plate version) has developed a problem.  For high fields (say above 7 T) the power supply fails to properly put the magnet in persistent mode and throws up an error in the control software.  After talking with QD, it seems like options are limited.  They no longer service this model of power supply, and therefore one option would be to buy a new one.  However, I have a sense that other people have dealt with this issue before, and I would feel dumb buying a new supply if the answer was that this is a known issue involving a $0.30 diode or something.  Without a schematic it's difficult to do diagnostics ourselves.  Has anyone out there seen this issue, and do you know how to correct it?

April 27, 2016

Backreaction: If you fall into a black hole

If you fall into a black hole, you’ll die. That much is pretty sure. But what happens before that?

The gravitational pull of a black hole depends on its mass. At a fixed distance from the center, it isn’t any stronger or weaker than that of a star with the same mass. The difference is that, since a black hole doesn’t have a surface, the gravitational pull can continue to increase as you approach the center.

The gravitational pull itself isn’t the problem; the problem is the change in the pull, the tidal force. It will stretch any extended object in a process with the technical name “spaghettification.” That’s what will eventually kill you. Whether this happens before or after you cross the horizon depends, again, on the mass of the black hole. The larger the mass, the smaller the space-time curvature at the horizon, and the smaller the tidal force.

Leaving aside lots of hot gas and swirling particles, you have a good chance of surviving the crossing of the horizon of a supermassive black hole, like the one in the center of our galaxy. You would, however, probably be torn apart before crossing the horizon of a solar-mass black hole.
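To put rough numbers on the mass dependence, here is a small sketch using the Newtonian estimate for the difference in pull across a body of length L at radius r, namely 2GML/r³, evaluated at the Schwarzschild radius 2GM/c². The Newtonian formula, the 2 m body length, and the figure of 4 million solar masses for the supermassive case are assumptions made for illustration, not numbers from the post.

```python
# Newtonian estimate of the tidal acceleration across a body at the horizon.
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8          # speed of light, m/s
M_SUN = 1.989e30     # solar mass, kg


def schwarzschild_radius(mass_kg):
    """Horizon radius r_s = 2GM/c^2 of a non-rotating black hole."""
    return 2.0 * G * mass_kg / C**2


def tidal_acceleration(mass_kg, radius_m, body_length_m=2.0):
    """Difference in pull between head and feet: 2 G M L / r^3."""
    return 2.0 * G * mass_kg * body_length_m / radius_m**3


def tidal_at_horizon(mass_kg, body_length_m=2.0):
    """Tidal acceleration evaluated at the Schwarzschild radius."""
    return tidal_acceleration(mass_kg, schwarzschild_radius(mass_kg), body_length_m)


if __name__ == "__main__":
    cases = [
        ("solar-mass", M_SUN),
        ("supermassive (assumed 4e6 solar masses)", 4.0e6 * M_SUN),
    ]
    for label, mass in cases:
        print(f"{label}: r_s = {schwarzschild_radius(mass):.3e} m, "
              f"tidal stretch = {tidal_at_horizon(mass):.3e} m/s^2")
```

In this estimate the tidal acceleration at the horizon scales as 1/M²: it comes out of order 10^10 m/s² for one solar mass and of order 10^-3 m/s² for the assumed supermassive case.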

It takes you a finite time to reach the horizon of a black hole. For an outside observer, however, you seem to be moving slower and slower and will never quite reach the black hole, due to the (technically infinitely large) gravitational redshift. If you take into account that black holes evaporate, it doesn’t quite take forever, and your friends will eventually see you vanishing. It might just take a few hundred billion years.

In an article that recently appeared on “Quick And Dirty Tips” (featured by SciAm), Everyday Einstein Sabrina Stierwalt explains:
“As you approach a black hole, you do not notice a change in time as you experience it, but from an outsider’s perspective, time appears to slow down and eventually crawl to a stop for you [...] So who is right? This discrepancy, and whose reality is ultimately correct, is a highly contested area of current physics research.”
No, it isn’t. The two observers have different descriptions of the process of falling into a black hole because they both use different time coordinates. There is no contradiction between the conclusions they draw. The outside observer’s story is an infinitely stretched version of the infalling observer’s story, covering only the part before horizon crossing. Nobody contests this.

I suspect this confusion was caused by the idea of black hole complementarity, which is indeed a highly contested area of current physics research. According to black hole complementarity the information that falls into a black hole both goes in and comes out. This is in contradiction with quantum mechanics which forbids making exact copies of a state. The idea of black hole complementarity is that nobody can ever make a measurement to document the forbidden copying and hence, it isn’t a real inconsistency. Making such measurements is typically impossible because the infalling observer only has a limited amount of time before hitting the singularity.

Black hole complementarity is actually a pretty philosophical idea.

Now, the black hole firewall issue points out that black hole complementarity is inconsistent. Even if you can’t measure that a copy has been made, pushing the infalling information in the outgoing radiation changes the vacuum state in the horizon vicinity to a state which is no longer empty: that’s the firewall.

Be that as it may, even in black hole complementarity the infalling observer still falls in, and crosses the horizon at a finite time.

The real question that drives much current research is how the information comes out of the black hole before it has completely evaporated. It’s a topic which has been discussed for more than 40 years now, and there is little sign that theorists will agree on a solution. And why would they? Leaving aside fluid analogies, there is no experimental evidence for what happens with black hole information, and there is hence no reason for theorists to converge on any one option.

The theory assessment in this research area is purely non-empirical, to use an expression by philosopher Richard Dawid. It’s why I think if we ever want to see progress on the foundations of physics we have to think very carefully about the non-empirical criteria that we use.

Anyway, the lesson here is: Everyday Einstein’s Quick and Dirty Tips is not a recommended travel guide for black holes.

Secret Blogging Seminar: Springer’s copyright agreement is, according to Springer, compatible with posting your article on the arXiv under the CC-BY license

It is far from clear to me that Springer’s copyright agreement (quoted below) is compatible with posting your paper on the arXiv under the CC-BY license. Happily, a Springer representative has just confirmed for me that this is allowed:

Let me first say that the cc-by-0 license is no problem at all as it allows for other publications without restrictions. Second, our copyright statement of course only talks about the version published in one of our journals, with our copyright line (or the copyright line of a partner society if applicable, or the author’s copyright if Open Access is chosen) on it.

At least if you are publishing in a Springer journal, and more generally, I would strongly encourage you to post your papers to the arXiv under the more permissive CC-BY-0 license, rather than the minimal license the arXiv requires.

As a question to any legally-minded readers: does copyright law genuinely distinguish between “the version published in one of our journals, with our copyright line”, and the “author-formatted post-peer-review” version which is substantially identical, barring the journal’s formatting and copyright line?


April 26, 2016

David Hogg: a Gaussian process for galaxy SEDs

Today Boris Leistedt and Michael Troxel (Manchester) came to Simons to hack on a proposal to change the latter years of DES observing strategy. Their argument is that a small amount of u-band imaging (currently DES does none) could have a huge impact on photometric redshifts (particularly bias), which, in turn, could have a huge impact on the accuracy of the convergence mapping and large-scale structure constraints. They spent the day doing complete end-to-end simulations of observing, photometry, data analysis, and parameter estimation. I shouldn't really blog this, because it isn't my research, but it is very impressive!

On the side, Leistedt and I checked in on our project to build a generative model of galaxy photometry, in which the full family of possible spectral energy distributions would be latent variables. Leistedt had a great breakthrough: If the SEDs are drawn from a Gaussian process, then the observables are also drawn from a Gaussian process, because projection onto redshifted bandpasses is a linear operation! He has code that implements this and some toy problems that seem to work, so I am cautiously optimistic.
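The linearity argument can be sketched in a few lines. If the SED values on a wavelength grid are f ~ N(0, K) and the band fluxes are F = A f for a fixed matrix A of bandpass weights, then F ~ N(0, A K Aᵀ). Everything below (the squared-exponential kernel, the top-hat bandpasses, the grid, and all function names) is a hypothetical toy, not Leistedt's code; a real version would also shift the bandpasses by redshift before projecting.

```python
import math


def sq_exp_kernel(x, amplitude=1.0, length=0.5):
    """Squared-exponential covariance matrix on the grid x (an assumed kernel)."""
    return [[amplitude**2 * math.exp(-0.5 * ((a - b) / length) ** 2) for b in x]
            for a in x]


def tophat_bandpass(x, lo, hi):
    """Toy bandpass: equal weights inside [lo, hi], normalised to sum to 1."""
    w = [1.0 if lo <= xi <= hi else 0.0 for xi in x]
    total = sum(w)
    return [wi / total for wi in w]


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def flux_covariance(projection, sed_cov):
    """Covariance of F = A f when f ~ N(0, K): A K A^T."""
    return matmul(matmul(projection, sed_cov), transpose(projection))


# Wavelength-like grid and three toy bands (all values are made up).
grid = [i * 0.1 for i in range(31)]
bands = [(0.0, 1.0), (0.8, 2.0), (2.0, 3.0)]
A = [tophat_bandpass(grid, lo, hi) for lo, hi in bands]
K = sq_exp_kernel(grid)
flux_cov = flux_covariance(A, K)

if __name__ == "__main__":
    for row in flux_cov:
        print(" ".join(f"{v:.4f}" for v in row))
```

The point of the sketch is that the band fluxes inherit a Gaussian distribution whose covariance is computed once from the kernel and the bandpasses, with no sampling of SEDs required.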

David Hogg: convergence maps and alien technologies

At group meeting today, Michael Troxel (Manchester) showed up, as he and Boris Leistedt are working on proposals to modify DES observing strategy. Troxel showed us his convergence maps from the first look or engineering data from DES. Incredible! They look high in signal-to-noise and correlate very well with measured large-scale structure along the line of sight. It appears that this kind of mass density (or convergence) mapping is really mature.

After that, Dun Wang summarized a talk from last week by Jason Wright (PSU); this led to a conversation about the alien megastructures and Tabby's star. We discussed projects we might do on this. I asked what we would need to observe in order to be really convinced that this is alien technology. Since this question is very hard to answer, it is not clear that the “alien megastructure” explanation is really a scientific explanation at all. Oh so many good April Fools' projects.

Alexey Petrov: 30 years of Chernobyl disaster

30 years ago, on 26 April 1986, the biggest nuclear accident happened at the Chernobyl nuclear power station.


The picture above is of my 8th grade class (I am in the front row) on a trip from Leningrad to Kiev. We wanted to make sure that we’d spend May 1st (Labor Day in the Soviet Union) in Kiev! We took that picture in Gomel, which is about 80 miles away from Chernobyl, where our train made a regular stop. We were instructed to bury some pieces of clothing and shoes after coming back to Leningrad due to the excess of radioactive dust on them…


April 25, 2016

Sean Carroll: Youthful Brilliance

A couple of weeks ago I visited the Texas A&M Physics and Engineering Festival. It was a busy trip — I gave a physics colloquium and a philosophy colloquium as well as a public talk — but the highlight for me was an hourlong chat with the Davidson Young Scholars, who had traveled from across the country to attend the festival.

The Davidson Young Scholars program is an innovative effort to help nurture kids who are at the very top of intellectual achievement in their age group. Every age and ability group poses special challenges to educators, and deserves attention and curricula that are adjusted for their individual needs. That includes the most high-achieving ones, who easily become bored and distracted when plopped down in an average classroom. Many of them end up being home-schooled, simply because school systems aren’t equipped to handle them. So the DYS program offers special services, including most importantly a chance to meet other students like themselves, and occasionally go out into the world and get the kind of stimulation that is otherwise hard to find.


These kids were awesome. I chatted just very briefly, telling them a little about what I do and what it means to be a theoretical physicist, and then we had a free-flowing discussion. At some point I mentioned “wormholes” and it was all over. These folks love wormholes and time travel, and many of them had theories of their own, which they were eager to come to the board and explain to all of us. It was a rollicking, stimulating, delightful experience.

You can see from the board that I ended up talking about Einstein’s equation. Not that I was going to go through all of the mathematical details or provide a careful derivation, but I figured that was something they wouldn’t typically be exposed to by either their schoolwork or popular science, and it would be fun to give them a glimpse of what lies ahead if they study physics. Everyone’s life is improved by a bit of exposure to Einstein’s equation.

The kids are all right. If we old people don’t ruin them, the world will be in good hands.

Doug Natelson: Oxide interfaces for fun and profit

The so-called III-V semiconductors, compounds that combine a group III element (Al, Ga, In) and a group V element (N, As, P, Sb), are mainstays of (opto)electronic devices and condensed matter physics.  They have never taken over for Si in logic and memory like some thought they might, for a number of materials science and economic reasons.  (To paraphrase an old line, "GaAs is the material of the future [for logic] and always will be.")  However, they are tremendously useful, in part because they are (now) fortuitously easy to grow - many of the compounds prefer the diamond-like "zinc blende" structure, and it is possible to prepare atomically sharp, flat, abrupt interfaces between materials with quite different semiconducting properties (very different band gaps and energetic alignments relative to each other).  Fundamentally, though, the palette is limited - these materials are very conventional semiconductors, without exhibiting other potentially exciting properties or competing phases like ferroelectricity, magnetism, superconductivity, etc.

Enter oxides.  Various complex oxides can exhibit all of these properties, and that has led to a concerted effort to develop materials growth techniques to create high quality oxide thin films, with an eye toward creating the same kind of atomically sharp heterointerfaces as in III-Vs.  A foundational paper is this one by Ohtomo and Hwang, where they used pulsed laser deposition to produce a heterojunction between LaAlO3, an insulating transparent oxide, and SrTiO3, another insulating transparent oxide (though one known to be almost a ferroelectric).  Despite the fact that both of those parent constituents are band insulators, the interface between the two was found to play host to a two-dimensional gas of electrons with remarkable properties.  The wikipedia article linked above is pretty good, so you should read it if you're interested.   

When you think about it, this is really remarkable.  You take an insulator, and another insulator, and yet the interface between them acts like a metal.  Where did the charge carriers come from?  (It's complicated - charge transfer from LAO to STO, but the free surface of the LAO and its chemical termination is hugely important.)  What is happening right at that interface?  (It's complicated.  There can be some lattice distortion from the growth process. There can be oxygen vacancies and other kinds of defects.  Below about 105 K the STO substrate distorts "ferroelastically", further complicating matters.)   Do the charge carriers live more on one side of the interface than the other, as in III-V interfaces, where the (conduction) band offset between the two layers can act like a potential barrier, and the same charge transfer that spills electrons onto one side leads to a self-consistent electrostatic potential that holds the charge layer right against that interface?  (Yes.)

Even just looking at the LAO/STO system, there is a ton of exciting work being performed.  Directly relevant to the meeting I just attended, Jeremy Levy's group at Pitt has been at the forefront of creating nanoscale electronic structures at the LAO/STO interface and examining their properties.  It turns out (one of these fortunate things!) that you can use a conductive atomic force microscope tip to do (reversible) electrochemistry at the free LAO surface, and basically draw conductive structures with nm resolution at the buried LAO/STO interface right below.   This is a very powerful technique, and it's enabled the study of the basic science of electronic transport at this interface at the nanoscale.

Beyond LAO/STO, over the same period there has been great progress in complex oxide materials growth by groups at a number of universities and at national labs.  I will refrain from trying to list them since I don't know them all and don't want to offend with the sin of inadvertent omission.  It is now possible to prepare a dizzying array of material types (ferromagnetic insulators like GdTiO3; antiferromagnetic insulators like SmTiO3; Mott insulators like LaTiO3; nickelates; superconducting cuprates; etc.) and complicated multilayers and superlattices of these systems.   It's far too early to say where this is all going, but historically the ability to grow new material systems of high quality with excellent precision tends to pay big dividends in the long term, even if they're not the benefits originally envisioned.

April 24, 2016

n-Category Café: Polygonal Decompositions of Surfaces

If you tell me you’re going to take a compact smooth 2-dimensional manifold and subdivide it into polygons, I know what you mean. You mean something like this picture by Norton Starr:

or this picture by Greg Egan:

But what’s the usual term for this concept, and the precise definition? I’m writing a paper that uses this concept, and I don’t want to spend my time proving basic stuff. I want to just refer to something.

I’m worried that CW decompositions of surfaces might include some things I don’t like, though I’m not sure.

Maybe I want PLCW decompositions, which seem to come in at least two versions: the old version discussed in C. Hog-Angeloni, W. Metzler, and A. Sieradski’s book Two-dimensional Homotopy and Combinatorial Group Theory, and a new version due to Kirillov. But I don’t even know if these two versions are the same!

One big question is whether one wants polygons to include ‘bigons’ or even ‘unigons’. For what I’m doing right now, I don’t much care. But I want to know what I’m including!

Another question is whether we can glue one edge of a polygon to another edge of that same polygon.

Surely there’s some standard, widely used notion here…

David Hogg: misleading paper about spiral structure

Adrian Price-Whelan and I met at Columbia to discuss various things. We looked at the new paper of Reid et al on distances, which appeared on twitter this month as an argument that the Milky Way has spiral structure. Although this paper is not really dishonest (it explains what it did, though with a few lacunae), it is misleading and wrong in various ways. The most important: It is misleading because it is being used as evidence for spiral structure (its figure 5 is being tweeted around!). But it also shows (in its figure 6) that even if there were no evidence at all for spiral structure in the data, their analysis would find a spiral pattern in the posterior pdf and distance estimators! It is wrong because it (claims to) multiply together posteriors (rather than likelihoods). That is, it violates the rules of probability that I tried to set out clearly here. I try not to use the word "wrong" when talking about other people's work; I don't mean to be harsh! The team on this paper includes some of the best observational astrophysicists in the world. I just mean that if you want to do probabilistic data analysis, you should obey the rules, and clearly state what you can and cannot conclude from the data.
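A toy Gaussian example shows why multiplying posteriors goes wrong. Assume a single parameter mu with prior N(0, prior_sigma²) and independent measurements with Gaussian noise; the correct combination is one prior times the product of the likelihoods, whereas the product of per-datum posteriors counts the prior once per datum. The setup, the numbers, and the function names are made up for illustration and are not the analysis in the Reid et al paper.

```python
def combine_correctly(data, noise_sigma, prior_sigma):
    """Posterior for mu: one prior times the product of the likelihoods."""
    precision = 1.0 / prior_sigma**2 + len(data) / noise_sigma**2
    mean = (sum(data) / noise_sigma**2) / precision
    return mean, precision ** -0.5


def multiply_posteriors(data, noise_sigma, prior_sigma):
    """Product of per-datum posteriors: the prior is counted len(data) times."""
    precision = len(data) * (1.0 / prior_sigma**2 + 1.0 / noise_sigma**2)
    mean = (sum(data) / noise_sigma**2) / precision
    return mean, precision ** -0.5


data = [4.8, 5.3, 5.1, 4.9, 5.4]       # made-up measurements of mu
noise_sigma, prior_sigma = 1.0, 1.0    # prior is N(0, prior_sigma^2)

right = combine_correctly(data, noise_sigma, prior_sigma)
wrong = multiply_posteriors(data, noise_sigma, prior_sigma)

if __name__ == "__main__":
    print(f"likelihoods times one prior: mean = {right[0]:.2f}, sigma = {right[1]:.3f}")
    print(f"product of posteriors:       mean = {wrong[0]:.2f}, sigma = {wrong[1]:.3f}")
```

With these numbers the correct posterior mean is about 4.25, while the product of posteriors gives about 2.55 with a smaller quoted uncertainty: the answer is dragged toward the prior and looks more confident than it should.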

At lunch, Jeno Sokolowski (Columbia) spoke about accreting white dwarfs in orbit around red giant stars. I realized during her talk that we can potentially generate a catalog of enormous numbers of these from our work (with Anna Ho) on LAMOST.

April 23, 2016

Terence Tao: IMU Graduate Breakout Fellowships – Call for Nominations

The International Mathematical Union (with the assistance of the Friends of the International Mathematical Union and The World Academy of Sciences, and supported by Ian Agol, Simon Donaldson, Maxim Kontsevich, Jacob Lurie, Richard Taylor, and myself) has just launched the Graduate Breakout Fellowships, which will offer highly qualified students from developing countries a full scholarship to study for a PhD in mathematics at an institution that is also located in a developing country.  Nominations for this fellowship (which should be from a sponsoring mathematician, preferably a mentor of the nominee) have just opened (with an application deadline of June 22); details on the nomination process and eligibility requirements can be found at this page.


April 22, 2016

Terence Tao: A quick application of the closed graph theorem

In functional analysis, it is common to endow various (infinite-dimensional) vector spaces with a variety of topologies. For instance, a normed vector space can be given the strong topology as well as the weak topology; if the vector space has a predual, it also has a weak-* topology. Similarly, spaces of operators have a number of useful topologies on them, including the operator norm topology, strong operator topology, and the weak operator topology. For function spaces, one can use topologies associated to various modes of convergence, such as uniform convergence, pointwise convergence, locally uniform convergence, or convergence in the sense of distributions. (A small minority of such modes are not topologisable, though, the most common of which is pointwise almost everywhere convergence; see Exercise 8 of this previous post).

Some of these topologies are much stronger than others (in that they contain many more open sets, or equivalently that they have many fewer convergent sequences and nets). However, even the weakest topologies used in analysis (e.g. convergence in distributions) tend to be Hausdorff, since this at least ensures the uniqueness of limits of sequences and nets, which is a fundamentally useful feature for analysis. On the other hand, some Hausdorff topologies used are “better” than others in that many more analysis tools are available for those topologies. In particular, topologies that come from Banach space norms are particularly valued, as such topologies (and their attendant norm and metric structures) grant access to many convenient additional results such as the Baire category theorem, the uniform boundedness principle, the open mapping theorem, and the closed graph theorem.

Of course, most topologies placed on a vector space will not come from Banach space norms. For instance, if one takes the space {C_0({\bf R})} of continuous functions on {{\bf R}} that converge to zero at infinity, the topology of uniform convergence comes from a Banach space norm on this space (namely, the uniform norm {\| \|_{L^\infty}}), but the topology of pointwise convergence does not; and indeed all the other usual modes of convergence one could use here (e.g. {L^1} convergence, locally uniform convergence, convergence in measure, etc.) do not arise from Banach space norms.

I recently realised (while teaching a graduate class in real analysis) that the closed graph theorem provides a quick explanation for why Banach space topologies are so rare:

Proposition 1 Let {V = (V, {\mathcal F})} be a Hausdorff topological vector space. Then, up to equivalence of norms, there is at most one norm {\| \|} one can place on {V} so that {(V,\| \|)} is a Banach space whose topology is at least as strong as {{\mathcal F}}. In particular, there is at most one topology stronger than {{\mathcal F}} that comes from a Banach space norm.

Proof: Suppose one had two norms {\| \|_1, \| \|_2} on {V} such that {(V, \| \|_1)} and {(V, \| \|_2)} were both Banach spaces with topologies stronger than {{\mathcal F}}. Now consider the graph of the identity function {\hbox{id}: V \rightarrow V} from the Banach space {(V, \| \|_1)} to the Banach space {(V, \| \|_2)}. This graph is closed; indeed, if {(x_n,x_n)} is a sequence in this graph that converged in the product topology to {(x,y)}, then {x_n} converges to {x} in {\| \|_1} norm and hence in {{\mathcal F}}, and similarly {x_n} converges to {y} in {\| \|_2} norm and hence in {{\mathcal F}}. But limits are unique in the Hausdorff topology {{\mathcal F}}, so {x=y}. Applying the closed graph theorem (see also previous discussions on this theorem), we see that the identity map is continuous from {(V, \| \|_1)} to {(V, \| \|_2)}; similarly for the inverse. Thus the norms {\| \|_1, \| \|_2} are equivalent as claimed. \Box

By using various generalisations of the closed graph theorem, one can generalise the above proposition to Fréchet spaces, or even to F-spaces. The proposition can fail if one drops the requirement that the norms be stronger than a specified Hausdorff topology; indeed, if {V} is infinite dimensional, one can use a Hamel basis of {V} to construct a linear bijection on {V} that is unbounded with respect to a given Banach space norm {\| \|}, and which can then be used to give an inequivalent Banach space structure on {V}.

One can interpret Proposition 1 as follows: once one equips a vector space with some “weak” (but still Hausdorff) topology, there is a canonical choice of “strong” topology one can place on that space that is stronger than the “weak” topology but arises from a Banach space structure (or at least a Fréchet or F-space structure), provided that at least one such structure exists. In the case of function spaces, one can usually use the topology of convergence in distribution as the “weak” Hausdorff topology for this purpose, since this topology is weaker than almost all of the other topologies used in analysis. This helps justify the common practice of describing a Banach or Fréchet function space just by giving the set of functions that belong to that space (e.g. {{\mathcal S}({\bf R}^n)} is the space of Schwartz functions on {{\bf R}^n}) without bothering to specify the precise topology to serve as the “strong” topology, since it is usually understood that one is using the canonical such topology (e.g. the Fréchet space structure on {{\mathcal S}({\bf R}^n)} given by the usual Schwartz space seminorms).

Of course, there are still some topological vector spaces which have no “strong topology” arising from a Banach space at all. Consider for instance the space {c_c({\bf N})} of finitely supported sequences. A weak, but still Hausdorff, topology to place on this space is the topology of pointwise convergence. But there is no norm {\| \|} stronger than this topology that makes this space a Banach space. For, if there were, then letting {e_1,e_2,e_3,\dots} be the standard basis of {c_c({\bf N})}, the series {\sum_{n=1}^\infty 2^{-n} e_n / \| e_n \|} would have to converge in {\| \|}, and hence pointwise, to an element of {c_c({\bf N})}, but the only available pointwise limit for this series lies outside of {c_c({\bf N})}. But I do not know if there is an easily checkable criterion to test whether a given vector space (equipped with a Hausdorff “weak” topology) can be equipped with a stronger Banach space (or Fréchet space or {F}-space) topology.
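The convergence claim in the {c_c({\bf N})} example can be spelled out with a routine Cauchy estimate. The following is a sketch that uses only the triangle inequality and the completeness of the hypothetical norm {\| \|}:

```latex
\[
  \Bigl\| \sum_{n=M+1}^{N} 2^{-n} \frac{e_n}{\| e_n \|} \Bigr\|
  \;\le\; \sum_{n=M+1}^{N} 2^{-n}
  \;<\; 2^{-M}
  \qquad (N > M \ge 1).
\]
```

So the partial sums are Cauchy in {\| \|} and would converge, by completeness, to some {x \in c_c({\bf N})}. Since the norm topology is assumed to be stronger than pointwise convergence, the {k}-th coordinate of {x} would have to equal {2^{-k} / \| e_k \|}, which is nonzero for every {k}, so {x} could not be finitely supported.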

Filed under: 245B - Real analysis, expository, math.FA Tagged: Banach spaces, closed graph theorem, strong topology, weak topology

Doug NatelsonThe Pittsburgh Quantum Institute: PQI2016 - Quantum Challenges

For the last 2.5 days I've been at the PQI2016:  Quantum Challenges symposium.  It's been a very fun meeting, bringing together talks spanning physical chemistry, 2d materials, semiconductor and oxide structures, magnetic systems, plasmonics, cold atoms, and quantum information.  Since the talks are all going to end up streamable online from the PQI website, I'll highlight just a couple of things that I learned rather than trying to summarize everything.

  • If you can make a material such that the dielectric permittivity \( \epsilon \equiv \kappa \epsilon_{0} \) is zero over some frequency range, you end up with a very odd situation.   The phase velocity of EM waves at that frequency would go to infinity, and the in-medium wavelength at that frequency would therefore become infinite.  Everything in that medium (at that frequency) would be in the near-field of everything else.  See here for a paper about what this means for transmission of EM waves through such a region, and here for a review.  
  • Screening of charge and therefore carrier-carrier electrostatic interactions in 2d materials like transition metal dichalcogenides varies in a complicated way with distance.  At short range,  screening is pretty effective (logarithmic with distance, basically the result you'd get if you worried about the interaction potential from an infinitely long charged rod), and at longer distances the field lines leak out into empty space, so the potential falls like \(1/\epsilon_{0}r\).  This has a big effect on the binding of electrons and holes into excitons in these materials.
  • There are a bunch of people working on unconventional transistor designs, including devices based on band-to-band tunneling between band-offset 2d materials.
  • In a discussion about growth and shapes of magnetic domains in a particular system, I learned about the Wulff construction, and this great paper by Conyers Herring on why crystals take the shapes that they do.  
  • After a public talk by Michel Devoret, I think I finally have some sense of the fundamental differences between the Yale group's approach to quantum computing and the John Martinis/Google group's approach.  This deserves a longer post later.  
  • Oxide interfaces continue to show interesting and surprising properties - again, I hope to say more later.
  • On a more science-outreach note, I learned about an app called Periscope (basically part of twitter) that allows people to do video broadcasting from their phones.  Hat tip to Julia Majors (aka Feynwoman) who pointed this out to me and that it's becoming a platform for a lot of science education work.
I'll update this post later with links to the talks when those become available.

BackreactionDear Dr B: Why is Lorentz-invariance in conflict with discreteness?

Can we build up space-time from discrete entities?
“Could you elaborate (even) more on […] the exact tension between Lorentz invariance and attempts for discretisation?



Dear Noa:

Discretization is a common procedure to deal with infinities. Since quantum mechanics relates large energies to short (wave) lengths, introducing a shortest possible distance corresponds to cutting off momentum integrals. This can remove infinities that come in at large momenta (or, as the physicists say “in the UV”).

Such hard cut-off procedures were quite common in the early days of quantum field theory. They have since been replaced with more sophisticated regulation procedures, but these don’t work for quantum gravity. Hence it lies at hand to use discretization to get rid of the infinities that plague quantum gravity.

Lorentz-invariance is the symmetry of Special Relativity; it tells us how observables transform from one reference frame to another. Certain types of observables, called “scalars,” don’t change at all. In general, observables do change, but they do so under a well-defined procedure, namely the application of Lorentz-transformations. We call these “covariant.” Or at least we should. Most often invariance is conflated with covariance in the literature.

(To be precise, Lorentz-covariance isn’t the full symmetry of Special Relativity because there are also translations in space and time that should maintain the laws of nature. If you add these, you get Poincaré-invariance. But the translations aren’t so relevant for our purposes.)

Lorentz-transformations acting on distances and times lead to the phenomena of Lorentz-contraction and time-dilatation. That means observers at relative velocities to each other measure different lengths and time-intervals. As long as there aren’t any interactions, this has no consequences. But once you have objects that can interact, relativistic contraction has measurable consequences.

Heavy ions for example, which are collided in facilities like RHIC or the LHC, are accelerated to almost the speed of light, which results in a significant length contraction in beam direction, and a corresponding increase in the density. This relativistic squeeze has to be taken into account to correctly compute observables. It isn’t merely an apparent distortion, it’s a real effect.

Now consider you have a regular cubic lattice which is at rest relative to you. Alice comes by in a space-ship at high velocity, what does she see? She doesn’t see a cubic lattice – she sees a lattice that is squeezed into one direction due to Lorentz-contraction. Which of you is right? You’re both right. It’s just that the lattice isn’t invariant under the Lorentz-transformation, and neither are any interactions with it.

The lattice can therefore be used to define a preferred frame, that is a particular reference frame which isn’t like any other frame, violating observer independence. The easiest way to do this would be to use the frame in which the spacing is regular, ie your restframe. If you compute any observables that take into account interactions with the lattice, the result will now explicitly depend on the motion relative to the lattice. Condensed matter systems are thus generally not Lorentz-invariant.

A Lorentz-contraction can convert any distance, no matter how large, into another distance, no matter how short. Similarly, it can blue-shift long wavelengths to short wavelengths, and hence can make small momenta arbitrarily large. This however runs into conflict with the idea of cutting off momentum integrals. For this reason approaches to quantum gravity that rely on discretization or analogies to condensed matter systems are difficult to reconcile with Lorentz-invariance.
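
The statement that a boost can shrink any length to any shorter one, and blue-shift any wavelength, can be made concrete with the standard special-relativity formulas. The following is a minimal sketch, not part of the original argument; the function names are hypothetical, speeds are given as the fraction beta = v/c, and the Doppler formula assumes a head-on approach.

```python
import math

def lorentz_gamma(beta):
    """Lorentz factor for speed beta = v/c, with 0 <= beta < 1."""
    return 1.0 / math.sqrt(1.0 - beta * beta)

def contracted_length(rest_length, beta):
    """Length of a rod along the boost direction, seen from a frame moving at beta."""
    return rest_length / lorentz_gamma(beta)

def boost_for_contraction(rest_length, target_length):
    """Speed beta at which rest_length is measured as the (shorter) target_length."""
    ratio = target_length / rest_length
    return math.sqrt(1.0 - ratio * ratio)

def blueshifted_wavelength(wavelength, beta):
    """Wavelength seen by an observer approaching the source head-on at beta."""
    return wavelength * math.sqrt((1.0 - beta) / (1.0 + beta))

# Any target length shorter than the rest length is reached by some beta < 1.
beta = boost_for_contraction(1.0, 1e-3)
print(beta, contracted_length(1.0, beta))
print(blueshifted_wavelength(1.0, 0.6))
```

Because `boost_for_contraction` returns a value below 1 for every positive target length, no finite cut-off length survives all boosts, which is the tension described above.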

So what, you may say, let’s just throw out Lorentz-invariance then. Let us just take a tiny lattice spacing so that we won’t see the effects. Unfortunately, it isn’t that easy. Violations of Lorentz-invariance, even if tiny, spill over into all kinds of observables even at low energies.

A good example is vacuum Cherenkov radiation, that is the spontaneous emission of a photon by an electron. This effect is normally – ie when Lorentz-invariance is respected – forbidden due to energy-momentum conservation. It can only take place in a medium which has components that can recoil. But Lorentz-invariance violation would allow electrons to radiate off photons even in empty space. No such effect has been seen, and this leads to very strong bounds on Lorentz-invariance violation.

And this isn’t the only bound. There are literally dozens of particle interactions that have been checked for Lorentz-invariance violating contributions with absolutely no evidence showing up. Hence, we know that Lorentz-invariance, if not exact, is respected by nature to extremely high precision. And this is very hard to achieve in a model that relies on a discretization.

Having said that, I must point out that not every quantity of dimension length actually transforms as a distance. Thus, the existence of a fundamental length scale is not a priori in conflict with Lorentz-invariance. The best example is maybe the Planck length itself. It has dimension length, but it’s defined from constants of nature that are themselves frame-independent. It has units of a length, but it doesn’t transform as a distance. For the same reason string theory is perfectly compatible with Lorentz-invariance even though it contains a fundamental length scale.

The tension between discreteness and Lorentz-invariance appears always if you have objects that transform like distances or like areas or like spatial volumes. The Causal Set approach therefore is an exception to the problems with discreteness (to my knowledge the only exception). The reason is that Causal Sets are a randomly distributed collection of (unconnected!) points with a four-density that is constant on the average. The random distribution prevents the problems with regular lattices. And since points and four-volumes are both Lorentz-invariant, no preferred frame is introduced.

It is remarkable just how difficult Lorentz-invariance makes it to reconcile general relativity with quantum field theory. The fact that no violations of Lorentz-invariance have been found and the insight that discreteness therefore seems an ill-fated approach has significantly contributed to the conviction of string theorists that they are working on the only right approach. Needless to say there are some people who would disagree, such as probably Carlo Rovelli and Garrett Lisi.

Either way, the absence of Lorentz-invariance violations is one of the prime examples that I draw upon to demonstrate that it is possible to constrain theory development in quantum gravity with existing data. Everyone who still works on discrete approaches must now make really sure to demonstrate there is no conflict with observation.

Thanks for an interesting question!

John BaezBleaching of the Great Barrier Reef

The chatter of gossip distracts us from the really big story, the Anthropocene: the new geological era we are bringing about. Here’s something that should be dominating the headlines: Most of the Great Barrier Reef, the world’s largest coral reef system, now looks like a ghostly graveyard.

Most corals are colonies of tiny genetically identical animals called polyps. Over centuries, their skeletons build up reefs, which are havens for many kinds of sea life. Some polyps catch their own food using stingers. But most get their food by symbiosis! They cooperate with single-celled organisms called zooxanthellae. Zooxanthellae get energy from the sun’s light. They actually live inside the polyps, and provide them with food. Most of the color of a coral reef comes from these zooxanthellae.

When a polyp is stressed, the zooxanthellae living inside it may decide to leave. This can happen when the sea water gets too hot. Without its zooxanthellae, the polyp is transparent and the coral’s white skeleton is revealed—as you see here. We say the coral is bleached.

After they bleach, the polyps begin to starve. If conditions return to normal fast enough, the zooxanthellae may come back. If they don’t, the coral will die.

The Great Barrier Reef, off the northeast coast of Australia, contains over 2,900 reefs and 900 islands. It’s huge: 2,300 kilometers long, with an area of about 340,000 square kilometers. It can be seen from outer space!

With global warming, this reef has been starting to bleach. Parts of it bleached in 1998 and again in 2002. But this year, with a big El Niño pushing world temperatures to new record highs, is the worst.

Scientists have been flying over the Great Barrier Reef to study the damage, and divers have looked at some of the reefs in detail. Of the 522 reefs surveyed in the northern sector, over 80% are severely bleached and less than 1% are not bleached at all. The damage is less further south where the water is cooler—but most of the reefs are in the north:

The top expert on coral reefs in Australia, Terry Hughes, wrote:

I showed the results of aerial surveys of bleaching on the Great Barrier Reef to my students. And then we wept.

Imagine devoting your life to studying and trying to protect coral reefs, and then seeing this.

Some of the bleached reefs may recover. But as oceans continue to warm, the prospects look bleak. The last big El Niño was in 1998. With a lot of hard followup work, scientists showed that in the end, 16% of the world’s corals died in that event.

This year is quite a bit hotter.

So, global warming is not a problem for the future: it’s a problem now. It’s not good enough to cut carbon emissions eventually. We’ve got to get serious now.

I need to recommit myself to this. For example, I need to stop flying around to conferences. I’ve cut back, but I need to do much better. Future generations, living in the damaged world we’re creating, will not have much sympathy for our excuses.

Clifford JohnsonAll Thumbs…

I'm not going to lie. If you're not in the mood, thumbnailing can be the most utterly tedious thing:

Yet, as the key precursor to getting more detailed page layout right, and [...]

The post All Thumbs… appeared first on Asymptotia.

John BaezStatistical Laws of Darwinian Evolution

guest post by Matteo Smerlak

Biologists like Steven J. Gould like to emphasize that evolution is unpredictable. They have a point: there is absolutely no way an alien visiting the Earth 400 million years ago could have said:

Hey, I know what’s gonna happen here. Some descendants of those ugly fish will grow wings and start flying in the air. Others will walk the surface of the Earth for a few million years, but they’ll get bored and they’ll eventually go back to the oceans; when they do, they’ll be able to chat across thousands of kilometers using ultrasound. Yet others will grow arms, legs, fur, they’ll climb trees and invent BBQ, and, sooner or later, they’ll start wondering “why all this?”.

Nor can we tell if, a week from now, the flu virus will mutate, become highly pathogenic and forever remove the furry creatures from the surface of the Earth.

Evolution isn’t gravity—we can’t tell in which directions things will fall down.

One reason we can’t predict the outcomes of evolution is that genomes evolve in a super-high dimensional combinatorial space, with a ginormous number of possible turns at every step. Another is that living organisms interact with one another in a massively non-linear way, with feedback loops, tipping points and all that jazz.

Life’s a mess, if you want my physicist’s opinion.

But that doesn’t mean that nothing can be predicted. Think of statistics. Nobody can predict who I’ll vote for in the next election, but it’s easy to tell what the distribution of votes in the country will be like. Thus, for continuous variables which arise as sums of large numbers of independent components, the central limit theorem tells us that the distribution will always be approximately normal. Or take extreme events: the max of N independent random variables is distributed according to a member of a one-parameter family of so-called “extreme value distributions”: this is the content of the famous Fisher–Tippett–Gnedenko theorem.

So this is the problem I want to think about in this blog post: is evolution ruled by statistical laws? Or, in physics terms: does it exhibit some form of universality?

Fitness distributions are the thing

One lesson from statistical physics is that, to uncover universality, you need to focus on relevant variables. In the case of evolution, it was Darwin’s main contribution to figure out the main relevant variable: the average number of viable offspring, aka fitness, of an organism. Other features—physical strength, metabolic efficiency, you name it—matter only insofar as they are correlated with fitness. If we further assume that fitness is (approximately) heritable, meaning that descendants have the same fitness as their ancestors, we get a simple yet powerful dynamical principle called natural selection: in a given population, the lineage with the highest fitness eventually dominates, i.e. its fraction goes to one over time. This principle is very general: it applies to genes and species, but also to non-living entities such as algorithms, firms or language. The general relevance of natural selection as an evolutionary force is sometimes referred to as “Universal Darwinism”.

The general idea of natural selection is pictured below (reproduced from this paper):

It’s not hard to write down an equation which expresses natural selection in general terms. Consider an infinite population in which each lineage grows with some rate x. (This rate is called the log-fitness or Malthusian fitness to contrast it with the number of viable offspring w=e^{x\Delta t} with \Delta t the lifetime of a generation. It’s more convenient to use x than w in what follows, so we’ll just call x “fitness”). Then the distribution of fitness at time t satisfies the equation

\displaystyle{ \frac{\partial p_t(x)}{\partial t} =\left(x-\int d y\, y\, p_t(y)\right)p_t(x) }

whose explicit solution in terms of the initial fitness distribution p_0(x):

\displaystyle{ p_t(x)=\frac{e^{x t}p_0(x)}{\int d y\, e^{y t}p_0(y)} }

is called the Cramér transform of p_0(x) in large deviations theory. That is, viewed as a flow in the space of probability distributions, natural selection is nothing but a time-dependent exponential tilt. (These equations and the results below can be generalized to include the effect of mutations, which are critical to maintain variation in the population, but we’ll skip this here to focus on pure natural selection. See my paper referenced below for more information.)

An immediate consequence of these equations is that the mean fitness \mu_t=\int dx\, x\, p_t(x) grows monotonically in time, with a rate of growth given by the variance \sigma_t^2=\int dx\, (x-\mu_t)^2\, p_t(x):

\displaystyle{ \frac{d\mu_t}{dt}=\sigma_t^2\geq 0 }
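
The exponential-tilt solution and the identity d\mu_t/dt = \sigma_t^2 can be checked numerically on a grid. The snippet below is only a sketch under stated assumptions: the initial distribution (uniform fitness on [0, 1]), the grid size, and the helper names `tilt` and `mean_var` are illustrative choices, not taken from the paper.

```python
import numpy as np

def tilt(x, p0, t):
    """Fitness distribution at time t: e^{x t} p0(x), renormalized on the grid x."""
    # Subtract the largest exponent first so that exp() cannot overflow.
    w = p0 * np.exp(x * t - np.max(x * t))
    dx = x[1] - x[0]
    return w / (w.sum() * dx)

def mean_var(x, p):
    """Mean and variance of a density p sampled on the uniform grid x."""
    dx = x[1] - x[0]
    mu = (x * p).sum() * dx
    var = ((x - mu) ** 2 * p).sum() * dx
    return mu, var

# Assumed initial condition: fitness uniformly distributed on [0, 1].
x = np.linspace(0.0, 1.0, 20001)
p0 = np.ones_like(x)

t, h = 5.0, 1e-4
mu_minus, _ = mean_var(x, tilt(x, p0, t - h))
mu_plus, _ = mean_var(x, tilt(x, p0, t + h))
_, var = mean_var(x, tilt(x, p0, t))

dmu_dt = (mu_plus - mu_minus) / (2 * h)  # central finite difference in t
print(dmu_dt, var)  # the two numbers should agree closely
```

The agreement holds for any initial distribution one puts on the grid, which is the sense in which the identity is general but says nothing about how \sigma_t^2 itself evolves.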

The great geneticist Ronald Fisher (yes, the one in the extreme value theorem!) was very impressed with this relationship. He thought it amounted to a biological version of the second law of thermodynamics, writing in his 1930 monograph

Professor Eddington has recently remarked that “The law that entropy always increases—the second law of thermodynamics—holds, I think, the supreme position among the laws of nature”. It is not a little instructive that so similar a law should hold the supreme position among the biological sciences.

Unfortunately, this excitement hasn’t been shared by the biological community, notably because Fisher’s “fundamental theorem of natural selection” isn’t predictive: the mean fitness \mu_t grows according to the fitness variance \sigma_t^2, but what determines the evolution of \sigma_t^2? I can’t use the identity above to predict the speed of evolution in any sense. Geneticists say it’s “dynamically insufficient”.

Two limit theorems

But the situation isn’t as bad as it looks. The evolution of p_t(x) may be decomposed into the evolution of its mean \mu_t, of its variance \sigma_t^2, and of its shape or type

\overline{p}_t(x)=\sigma_t p_t(\sigma_t x+\mu_t).

(We also call \overline{p}_t(x) the “standardized fitness distribution”.) With Ahmed Youssef we showed that:

• If p_0(x) is supported on the whole real line and decays at infinity as

-\ln\int_x^{\infty}p_0(y)d y\underset{x\to\infty}{\sim} x^{\alpha}

for some \alpha > 1, then \mu_t\sim t^{\overline{\alpha}-1}, \sigma_t^2\sim t^{\overline{\alpha}-2} and \overline{p}_t(x) converges to the standard normal distribution as t\to\infty. Here \overline{\alpha} is the conjugate exponent to \alpha, i.e. 1/\overline{\alpha}+1/\alpha=1.

• If p_0(x) has a finite right-end point x_+ with

p_0(x)\underset{x\to x_+}{\sim} (x_+-x)^\beta

for some \beta\geq0, then x_+-\mu_t\sim t^{-1}, \sigma_t^2\sim t^{-2} and \overline{p}_t(x) converges to the flipped gamma distribution

\displaystyle{ p^*_\beta(x)= \frac{(1+\beta)^{(1+\beta)/2}}{\Gamma(1+\beta)}\, \Theta[(1+\beta)^{1/2}-x]\, e^{-(1+\beta)^{1/2}[(1+\beta)^{1/2}-x]}\Big[(1+\beta)^{1/2}-x\Big]^\beta }

Here and below the symbol \sim means “asymptotically equivalent up to a positive multiplicative constant”; \Theta(x) is the Heaviside step function. Note that p^*_\beta(x) becomes Gaussian in the limit \beta\to\infty, i.e. the attractors of cases 1 and 2 form a continuous line in the space of probability distributions; the other extreme case, \beta\to0, corresponds to a flipped exponential distribution.
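
As a sanity check on the flipped gamma attractor, one can verify numerically that it is a standardized distribution: unit mass, zero mean and unit variance. This is a rough sketch, assuming the density is supported on x < (1+\beta)^{1/2} (where the bracket raised to the power \beta is positive); the function names, grid size and cut-off `u_max` are illustrative.

```python
import math
import numpy as np

def flipped_gamma(x, beta):
    """Density p*_beta(x), taken to vanish for x >= sqrt(1 + beta)."""
    a = math.sqrt(1.0 + beta)
    u = a - x                                   # distance below the right end point
    out = np.zeros_like(x)
    ok = u > 0
    norm = a ** (1.0 + beta) / math.gamma(1.0 + beta)
    out[ok] = norm * np.exp(-a * u[ok]) * u[ok] ** beta
    return out

def moments(beta, n=400_000, u_max=80.0):
    """Mass, mean and variance of p*_beta from a midpoint-rule sum."""
    a = math.sqrt(1.0 + beta)
    du = u_max / n
    u = (np.arange(n) + 0.5) * du               # midpoints in u = a - x
    x = a - u
    p = flipped_gamma(x, beta)
    total = p.sum() * du
    mean = (x * p).sum() * du
    var = ((x - mean) ** 2 * p).sum() * du
    return total, mean, var

for beta in (0.0, 1.0, 4.0):
    print(beta, moments(beta))
```

For each \beta the three numbers should come out close to 1, 0 and 1, as expected for a standardized fitness distribution.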

The one-parameter family of attractors p_\beta^*(x) is plotted below:

These results achieve two things. First, they resolve the dynamical insufficiency of Fisher’s fundamental theorem by giving estimates of the speed of evolution in terms of the tail behavior of the initial fitness distribution. Second, they show that natural selection is indeed subject to a form of universality, whereby the relevant statistical structure turns out to be finite dimensional, with only a handful of “conserved quantities” (the \alpha and \beta exponents) controlling the late-time behavior of natural selection. This amounts to a large reduction in complexity and, concomitantly, an enhancement of predictive power.

(For the mathematically-oriented reader, the proof of the theorems above involves two steps: first, translate the selection equation into an equation for (cumulant) generating functions; second, use a suitable Tauberian theorem—the Kasahara theorem—to relate the behavior of generating functions at large values of their arguments to the tail behavior of p_0(x). Details in our paper.)

It’s useful to consider the convergence of fitness distributions to the attractors p_\beta^*(x) for 0\leq\beta\leq \infty in the skewness-kurtosis plane, i.e. in terms of the third and fourth cumulants of p_t(x).

The red curve is the family of attractors, with the normal at the bottom right and the flipped exponential at the top left, and the dots correspond to numerical simulations performed with the classical Wright–Fisher model and with a simple genetic algorithm solving a linear programming problem. The attractors attract!

Conclusion and a question

Statistics is useful because limit theorems (the central limit theorem, the extreme value theorem) exist. Without them, we wouldn’t be able to make any population-level prediction. Same with statistical physics: it is only because matter consists of large numbers of atoms, and limit theorems hold (the H-theorem, the second law), that macroscopic physics is possible in the first place. I believe the same perspective is useful in evolutionary dynamics: it’s true that we can’t predict how many wings birds will have in ten million years, but we can tell what shape fitness distributions should have if natural selection is true.

I’ll close with an open question for you, the reader. In the central limit theorem as well as in the second law of thermodynamics, convergence is driven by a Lyapunov function, namely entropy. (In the case of the central limit theorem, it’s a relatively recent result by Artstein et al.: the entropy of the normalized sum of n i.i.d. random variables, when it’s finite, is a monotonically increasing function of n.) In the case of natural selection for unbounded fitness, it’s clear that entropy will also be eventually monotonically increasing—the normal is the distribution with largest entropy at fixed variance and mean.

Yet it turns out that, in our case, entropy isn’t monotonic at all times; in fact, the closer the initial distribution p_0(x) is to the normal distribution, the later the entropy of the standardized fitness distribution starts to increase. Or, equivalently, the closer the initial distribution p_0(x) is to the normal, the later its relative entropy with respect to the normal starts to decrease. Why is this? And what’s the actual Lyapunov function for this process (i.e., what functional of the standardized fitness distribution is monotonic at all times under natural selection)?

In the plots above the blue, orange and green lines correspond respectively to

\displaystyle{ p_0(x)\propto e^{-x^2/2-x^4}, \quad p_0(x)\propto e^{-x^2/2-.01x^4}, \quad p_0(x)\propto e^{-x^2/2-.001x^4} }


• S. J. Gould, Wonderful Life: The Burgess Shale and the Nature of History, W. W. Norton & Co., New York, 1989.

• M. Smerlak and A. Youssef, Limiting fitness distributions in evolutionary dynamics, 2015.

• R. A. Fisher, The Genetical Theory of Natural Selection, Oxford University Press, Oxford, 1930.

• S. Artstein, K. Ball, F. Barthe and A. Naor, Solution of Shannon’s problem on the monotonicity of entropy, J. Am. Math. Soc. 17 (2004), 975–982.

April 21, 2016

Sean CarrollBeing: Human

Anticipation is growing — in my own mind, if nowhere else — for the release of The Big Picture, which will be out on May 10. I’ve finally been able to hold the physical book in my hand, which is always a highlight of the book-writing process. And yes, there will be an audio book, which should come out at the same time. I spent several days in March recording it, which taught me a valuable lesson: write shorter books.

There will also be something of a short book tour, hitting NYC, DC, Boston, Seattle, and the Bay Area, likely with a few other venues to be added later this summer. For details see my calendar page.

In many ways, the book is a celebration of naturalism in general, and poetic naturalism in particular. So to get you in the mood, here is a lovely short video from the Mothlight Creative, which celebrates naturalism in a more visual and visceral way. “I want to shiver with awe and wonder at the universe.”

Being: Human from Mothlight Creative on Vimeo.

Clifford JohnsonGoodbye, and Thank You

May 27th 2011, at the Forum in Los Angeles. What a wonderful show. So generous - numerous encores and special guests well into the night. Thank you for the music, Prince:


(Amy. Tina. Jason.)

-cvj

The post Goodbye, and Thank You appeared first on Asymptotia.

Scott AaronsonMe interviewed by John Horgan (the author of “The End of Science”)

You can read it here.

It’s long (~12,000 words).  Rather than listing what this interview covers, it would be easier to list what it doesn’t cover.  (My favorite soda flavors?)

If you read this blog, much of what I say there will be old hat, but some of it will be new.  I predict that you’ll enjoy the interview iff you enjoy the blog.  Comments welcome.

n-Category Café Type Theory and Philosophy at Kent

I haven’t been around here much lately, but I would like to announce this workshop I’m running on 9-10 June, Type Theory and Philosophy. Following some of the links there will show, I hope, the scope of what may be possible.

One link is to the latest draft of an article I’m writing, Expressing ‘The Structure of’ in Homotopy Type Theory, which has evolved a little over the year since I posted The Structure of A.

Robert HellingThe Quantum in Quantum Computing

I am sure, by now, all of you have seen Canada's prime minister  "explain" quantum computers at Perimeter. It's really great that politicians care about these things and he managed to say what is the standard explanation for the speed up of quantum computers compared to their classical cousins: It is because you can have superpositions of initial states and therefore "perform many operations in parallel".

Except of course, that this is bullshit. This is not the reason for the speed up, you can do the same with a classical computer, at least with a  probabilistic one: You can also as step one perform a random process (throw a coin, turn a Roulette wheel, whatever) to determine the initial state you start your computer with. Then looking at it from the outside, the state of the classical computer is mixed and the further time evolution also "does all the computations in parallel". That just follows from the formalism of (classical) statistical mechanics.

Of course, that does not help much since the outcome is likely also probabilistic. But it has the same parallelism. And as the state space of a qubit is all of a Bloch sphere, the state space of a classical bit (allowing mixed states) is also an interval allowing a continuum of intermediate states.

The difference between quantum and classical is elsewhere. And it has to do with non-commuting operators (as those are essential for quantum properties) and those allow for entanglement.

To be more specific, let us consider one of the most famous quantum algorithms, Grover's database lookup. There the problem (at least in its original form) is to figure out which of $N$ possible "boxes" contains the hidden coin. Classically, you cannot do better than opening one after the other (or possibly in a random pattern), which takes $O(N)$ steps (on average).

For the quantum version, you first have to say how to encode the problem. The lore is, that you start with an $N$-dimensional Hilbert space with a basis $|1\rangle\cdots|N\rangle$. The secret is that one of these basis vectors is picked. Let's call it $|\omega\rangle$ and it is given to you in terms of a projection operator $P=|\omega\rangle\langle\omega|$.

Furthermore, you have at your disposal a way to create the flat superposition $|s\rangle = \frac1{\sqrt N}\sum_{i=1}^N |i\rangle$ and a number operator $K$ that acts like $K|k\rangle= k|k\rangle$, i.e. is diagonal in the above basis and is able to distinguish the basis elements in terms of its eigenvalues.

Then, what you are supposed to do is the following: You form two unitary operators $U_\omega = 1 - 2P$  (this multiplies $|\omega\rangle$ by -1 while being the identity on the orthogonal subspace, i.e. is a reflection on the plane orthogonal to $|\omega\rangle$) and $U_s = 2|s\rangle\langle s| - 1$ which reflects the vectors orthogonal to $|s\rangle$.

It is not hard to see that both $U_s$ and $U_\omega$ map the two dimensional plane spanned by $|s\rangle$ and $|\omega\rangle$ into itself. They are both reflections and thus their product is a rotation by twice the angle between the two planes which is given in terms of the scalar product $\langle s|\omega\rangle =1/\sqrt{N}$ as $\phi =\sin^{-1}\langle s|\omega\rangle$.

But obviously, using a rotation by $\cos^{-1}\langle s|\omega\rangle$, one can rotate $|s\rangle$ onto $|\omega\rangle$. So all we have to do is to apply the product $(U_sU_\omega)^k$ where $k$ is the ratio between these two angles which is $O(\sqrt{N})$. (No need to worry that this is not an integer, the error is $O(1/N)$ and has no influence). Then you have turned your initial state $|s\rangle$ into $|\omega\rangle$ and by measuring the observable $K$ above you know which box contained the coin.

Since this took only $O(\sqrt{N})$ steps this is a quadratic speed up compared to the classical case.
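
The rotation picture can be checked with a small dense-matrix simulation. This is just an illustrative sketch: the function name `grover`, the choice $N=64$ and the marked index 17 are arbitrary assumptions, basis states are labelled from 0 rather than 1, and a real quantum computer would of course not store $N\times N$ matrices.

```python
import numpy as np

def grover(N, omega):
    """Apply (U_s U_omega)^k to |s> and return k and the final probabilities."""
    s = np.full(N, 1.0 / np.sqrt(N))             # flat superposition |s>
    P = np.zeros((N, N))
    P[omega, omega] = 1.0                        # projector |omega><omega|
    U_omega = np.eye(N) - 2.0 * P                # flips the sign of |omega>
    U_s = 2.0 * np.outer(s, s) - np.eye(N)       # reflection about |s>

    overlap = 1.0 / np.sqrt(N)                   # <s|omega>
    # Ratio of the two angles from the text: total rotation needed,
    # divided by the rotation (twice phi) achieved per step.
    k = int(round(np.arccos(overlap) / (2.0 * np.arcsin(overlap))))

    state = s.copy()
    for _ in range(k):
        state = U_s @ (U_omega @ state)
    return k, state ** 2                         # amplitudes are real here

k, probs = grover(64, omega=17)
print(k, int(np.argmax(probs)), probs[17])
```

For $N=64$ the number of iterations is of order $\sqrt{N}$, and almost all of the probability ends up on the marked basis state; the small remainder is the rounding error from $k$ not being an integer.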

So how did we get this? As I said, it's not the superposition. Classically we could prepare the probabilistic state that opens each box with probability $1/N$. But we have to expect that we have to do that $O(N)$ times, so this is essentially as fast as systematically opening one box after the other.

To have a better unified classical-quantum language, let us say that we have a state space spanned by $N$ pure states $1,\ldots,N$. What we can do in the quantum case is to turn an initial state which had probability $1/N$ to be in each of these pure states into one that is deterministically in the sought after state.

Classically, this is impossible since no time evolution can turn a mixed state into a pure state. One way to see this is that the entropy of the probabilistic state is $\log(N)$ while it is 0 for the sought after state. If you like, classically we only have the observables given by the C*-algebra generated by $K$, i.e. we can only observe which box we are dealing with. Both $P$ and $U_\omega$ are also in this classical algebra (they are diagonal in the special basis) and the strict classical analogue would be that we are given a rank one projector in that algebra and we have to figure out which one.

But quantum mechanically, we have more, we also have $U_s$ which does not commute with $K$ and is thus not in the classical algebra. The trick really is that in this bigger quantum algebra generated by both $K$ and $U_s$, we can form a pure state that becomes the probabilistic state when restricted to the classical algebra. And as a pure state, we can come up with a time evolution that turns it into the pure state $|\omega\rangle$.

So, this is really where the non-commutativity and thus the quantumness comes in. And we shouldn't really expect Trudeau to be able to explain this in a two sentence statement.

PS: The actual speed up in the end comes of course from the fact that probabilities are amplitudes squared and the normalization in $|s\rangle$ is $1/\sqrt{N}$ which makes the angle to be rotated by proportional to $1/\sqrt{N}$.

Robert HellingOne more resuscitation

This blog has been silent for almost two years for a number of reasons. First, I myself stopped reading blogs on a daily basis as in open Google Reader right after the arXiv and checking what's new. I had already stopped doing that due to time constraints before Reader was shut down by Google and I must say I don't miss anything. My focus shifted much more to Twitter and Facebook and from there, I am directed to the occasional blog post, but as I said, I don't check them systematically anymore. And I assume others do the same.

But from time to time I run into things that I would like to discuss on a blog. Where (as my old readers probably know) I am mainly interested in discussions. I don't write here to educate (others) but only myself. I write about something I found interesting and would like to have further input on.

Plus, this should be more permanent than a Facebook post (which is gone once scrolled out of the bottom of the screen) and more than the occasional 160 character remark on Twitter.

Assuming that others have adapted their reading habits in a similar way to the year 2016, I have set up If This Then That to announce new posts to FB and Twitter so others might have a chance to find them.

David Hoggobey typographic rules of Bringhurst

After Andy Casey brought our in-refereeing paper up to AASTeX6, I found myself very unhappy! The version 6 paper formatting makes many changes, and it is not clear how these changes were related to either community issues with AASTeX5 or else standards of typography and typesetting. I figured out ways to punk the formatting back to something that comes close to the recommendations in Bringhurst's The Elements of Typographic Style (my bible); I spent quite a bit of time on that today. Once I figure out how to raise issues on AASTeX6, I will do that, and also release a user-configurable patch package. A piece of unsolicited typographic advice for everyone out there: Obey what Bringhurst advises, unless you have very good reasons to do otherwise!

April 20, 2016

n-Category Café Coalgebraic Geometry

Hi everyone! As some of you may remember, some time ago I was invited to post on the Café, but regrettably I never got around to doing so until now. Mainly I thought that the posts I wanted to write would be old hat to Café veterans, and also I wasn’t used to the interface.

Recently I decided I could at least try occasionally linking to posts I’ve written over at Annoying Precision and seeing how that goes. So, I’ve written two posts on how to start thinking about cocommutative coalgebras as “distributions” on spaces of some sort:

For experts, there’s a background fact I’m dancing around but not stating explicitly, which is that over a field, the category of cocommutative coalgebras is equivalent to the opposite of the category of profinite commutative algebras, which we can interpret as a category of formal schemes. But these posts were already getting too long; I’m trying to say fewer things about each topic I write about so I can write about more topics.

Scott AaronsonGrading Trudeau on quantum computing

Update (4/19): Inspired by Trudeau’s performance (which they clocked at 35 seconds), Maclean’s magazine asked seven quantum computing researchers—me, Krysta Svore, Aephraim Steinberg, Barry Sanders, Davide Venturelli, Martin Laforest, and Murray Thom—to also explain quantum computing in 35 seconds or fewer.  You can see all the results here (here’s the audio from my entry).

The emails started hitting me like … a hail of maple syrup from the icy north.  Had I seen the news?  Justin Trudeau, the dreamy young Prime Minister of Canada, visited the Perimeter Institute for Theoretical Physics in Waterloo, one of my favorite old haunts.  At a news conference at PI, as Trudeau stood in front of a math-filled blackboard, a reporter said to him: “I was going to ask you to explain quantum computing, but — when do you expect Canada’s ISIL mission to begin again, and are we not doing anything in the interim?”

Rather than answering immediately about ISIL, Trudeau took the opportunity to explain quantum computing:

“Okay, very simply, normal computers work, uh, by [laughter, applause] … no no no, don’t interrupt me.  When you walk out of here, you will know more … no, some of you will know far less about quantum computing, but most of you … normal computers work, either there’s power going through a wire, or not.  It’s 1, or a 0, they’re binary systems.  Uh, what quantum states allow for is much more complex information to be encoded into a single bit.  Regular computer bit is either a 1 or a 0, on or off.  A quantum state can be much more complex than that, because as we know [speeding up dramatically] things can be both particle and wave at the same times and the uncertainty around quantum states [laughter] allows us to encode more information into a much smaller computer.  So, that’s what exciting about quantum computing and that’s… [huge applause] don’t get me going on this or we’ll be here all day, trust me.”

What marks does Trudeau get for this?  On the one hand, the widespread praise for this reply surely says more about how low the usual standards for politicians are, and about Trudeau’s fine comic delivery, than about anything intrinsic to what he said.  Trudeau doesn’t really assert much here: basically, he just says that normal computers work using 1’s and 0’s, and that quantum computers are more complicated than that in some hard-to-explain way.  He gestures toward the uncertainty principle and wave/particle duality, but he doesn’t say anything about the aspects of QM most directly relevant to quantum computing—superposition or interference or the exponential size of Hilbert space—nor does he mention what quantum computers would or wouldn’t be used for.

On the other hand, I’d grade Trudeau’s explanation as substantially more accurate than what you’d get from a typical popular article.  For pay close attention to what the Prime Minister never says: he never says that a qubit would be “both 0 and 1 at the same time,” or any equivalent formulation.  (He does say that quantum states would let us “encode more information into a much smaller computer,” but while Holevo’s Theorem says that’s false for a common interpretation of “information,” it’s true for other reasonable interpretations.)  The humorous speeding up as he mentions particle/wave duality and the uncertainty principle clearly suggests that he knows it’s more subtle than just “0 and 1 at the same time,” and he also knows that he doesn’t really get it and that the journalists in the audience don’t either.  When I’m grading exams, I always give generous partial credit for honest admissions of ignorance.  B+.

Anyway, I’d be curious to know who at PI prepped Trudeau for this, and what they said.  Those with inside info, feel free to share in the comments (anonymously if you want!).

(One could also compare against Obama’s 2008 answer about bubblesort, which was just a mention of a keyword by comparison.)

Update: See also a Motherboard article where Romain Alléaume, Amr Helmy, Michele Mosca, and Aephraim Steinberg rate Trudeau’s answer, giving it 7/10, no score, 9/10, and 7/10 respectively.

April 19, 2016

Chad OrzelPhysics Blogging Round-Up: ARPES, Optics, Band Gaps, Radiation Pressure, Home Science, and Catastrophe

It’s been a while since I last rounded up physics posts from Forbes, so there’s a good bunch of stuff on this list:

How Do Physicists Know What Electrons Are Doing Inside Matter?: An explanation of Angle-Resolved Photo-Electron Spectroscopy (ARPES), one of the major experimental techniques in condensed matter. I’m trying to figure out a way to list “got 1,800 people to read a blog post about ARPES” as one of my professional accomplishments on my CV.

The Optics Of Superman’s X-Ray Vision: Spinning off a post of Rhett’s, a look at why humanoid eyes just aren’t set up to work with x-rays.

Why Do Solids Have Band Gaps?: A conceptual way to see why there are some energies that electrons simply can not have inside a periodic structure.

How Tropical Birds Use Quantum Physics: Blue feathers on many birds aren’t blue because of pigment, but thanks to the same physics that gives solids band gaps.

Why Do We Teach Old Physics? Because It Works: We had another round of people lamenting the emphasis on “old” topics in introductory courses; here’s my defense of the standard curricular order.

How Hard Does The Sun Push On the Earth? In which one of The Pip’s silly superhero books gets me thinking about radiation pressure forces.

How To Use A Laser Pointer To Measure Tiny Things: In which I use a green laser to settle the question of who in Chateau Steelypips has the thickest hair.

Don’t Just Talk About Science With Your Kids, DO Science With Your Kids: A simple home experiment, and a pitch for the importance of doing simple experiments at home.

How Quantum Physics Starts With Your Toaster: A blog version of my half-hour fake class on the “ultraviolet catastrophe” and why Planck needed the quantum hypothesis to solve black-body radiation.

Both blogs are likely to be on a sort of hiatus for the next little bit. I’m giving a talk at Mount Holyoke tonight, which will get me home really late, then Thursday and Friday I’m going to NYC for a space conference. Then on Saturday, we’re flying to Florida with the kids and my parents, and going on a Disney cruise in the Caribbean for all of next week. Which will provide a badly needed opportunity to kick back by the pool, because oh, God, so busy…

April 17, 2016

Doug NatelsonSci-fi time, part 2: Really big lasers

I had a whole post written about laser weapons, and then the announcement came out about trying to build laser-launched interstellar probes, so I figured I should revise and talk about that as well.

Now that the future is here, and space-faring rockets can land upright on autonomous ships, it's clearly time to look at other formerly science fiction technologies.  Last August I wrote a post looking at whether laser pistols really make practical physics sense as weapons.  The short answer:  Not really, at least not with present power densities.

What about laser cannons?  The US military has been looking at bigger, high power lasers for things like anti-aircraft and ship defense applications.  Given that Navy ships would not have to worry so much about portability and size, and that in principle nuclear-powered ships should have plenty of electrical generating capacity, do big lasers make more sense here?  It's not entirely clear.  Supposedly the operating costs of the laser systems are less than $1/shot, though that's also not a transparent analysis.

Let's look first at the competition.  The US Navy has been using the Phalanx gun system for ship defense, a high speed 20mm cannon that can spew out 75 rounds per second, each about 100 g and traveling at around 1100 m/s.   That's an effective output power, in kinetic energy alone, of 4.5 MW (!).  Even ignoring explosive munitions, each projectile carries 60 kJ of kinetic energy.  The laser weapons being tested are typically 150 kW.  To transfer the same amount of energy to the target as a single kinetic slug from the Phalanx would require keeping the beam focused on the target (assuming complete absorption) for about 0.4 sec, which is a pretty long time if the target is an inbound antiship missile traveling at supersonic speeds.   Clearly, as with hand-held weapons, kinetic projectiles are pretty serious in terms of power and delivered energy on target, and beating that with lasers is not simple.
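The numbers in that paragraph are easy to check. Here is a short sketch of the arithmetic, using only the figures quoted above (75 rounds per second, 100 g, 1100 m/s, a 150 kW laser) and the same assumption of complete absorption; the variable names are mine.

```python
# Figures quoted in the text above.
rounds_per_second = 75
mass_kg = 0.100
speed_m_per_s = 1100.0
laser_power_w = 150e3

# Kinetic energy of one projectile: (1/2) m v^2.
energy_per_round_j = 0.5 * mass_kg * speed_m_per_s ** 2

# Kinetic power of the gun firing continuously.
kinetic_power_w = rounds_per_second * energy_per_round_j

# Time the laser must stay on target to deliver one projectile's worth
# of energy, assuming complete absorption.
dwell_time_s = energy_per_round_j / laser_power_w

print(f"energy per round: {energy_per_round_j / 1e3:.1f} kJ")
print(f"kinetic power:    {kinetic_power_w / 1e6:.2f} MW")
print(f"laser dwell time: {dwell_time_s:.2f} s")
```

This gives roughly 60 kJ per round, 4.5 MW of kinetic power, and a dwell time of about 0.4 s, matching the estimates in the text.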

The other big news story recently about big lasers was the announcement by Yuri Milner and Stephen Hawking of the Starshot project, an attempt to launch many extremely small and light probes toward Alpha Centauri using ground-based lasers for propulsion.  One striking feature of the plan is the idea of using a ground-based optical phased array laser system with about 100 GW of power (!) to boost the probes up to about 0.2 c in a few minutes.  As far as I can tell, the reason for the very high power and quick boost is to avoid problems with pointing the lasers for long periods of time as the earth rotates and the probes become increasingly distant.  Needless to say, pulling this off is an enormous technical challenge.  That power would be about equivalent to 50 large city-serving powerplants.   I really wonder if it would be easier to drop the power by a factor of 1000, increase the boost time by a factor of 1000, and use a 100 MW nuclear reactor in solar orbit (i.e. at the earth-sun L1 or L2 point) to avoid the earth rotation or earth orbital velocity constraint.  That level of reactor power is comparable to what is used in naval ships, and I have a feeling like the pain of working out in space may be easier to overcome than the challenge of building a 100 GW laser array.  Still, exciting times that anyone is even entertaining the idea of trying this.
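To put rough numbers on that trade-off, here is a small sketch. The 100 GW, the 50 power plants and the factor of 1000 come from the text; the 3-minute baseline boost is purely my assumption standing in for "a few minutes", so the resulting boost time is illustrative only.

```python
laser_power_w = 100e9        # from the text: about 100 GW
plants_equivalent = 50       # from the text: about 50 large power plants
scale = 1000                 # from the text: drop power, stretch time by 1000
assumed_boost_min = 3.0      # ASSUMPTION: stand-in for "a few minutes"

# Implied output of one "large city-serving powerplant".
power_per_plant_gw = laser_power_w / plants_equivalent / 1e9

# Reduced laser power and stretched boost time after rescaling.
reduced_power_mw = laser_power_w / scale / 1e6
stretched_boost_hours = assumed_boost_min * scale / 60

# Total energy delivered is the same in both scenarios (power times time).
energy_j = laser_power_w * assumed_boost_min * 60

print(power_per_plant_gw, reduced_power_mw, stretched_boost_hours, energy_j)
```

Under that assumed baseline, the low-power version would have to keep the beam on the probes for about two days instead of a few minutes.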

Chad Orzel213-228/366: Kid-Light Photo Dump

As promised in the last catch-up post, a set of pictures less devoted to cute-kid shots.

213/366: Zone Defense

The superhero-themed “zones” of our house, as defined by the Pip.

The Pip has been making one of his preschool teachers draw superheroes for him. At some point, he cut these out with scissors (or possibly made Kate cut them out), and hung them up in different rooms. So the living room is now the Captain America Zone, the kitchen is the Flash Zone, and the library is the Spider-Man Zone. I’m not sure what exactly that means, but it’s a thing.

214/366: Red In Tooth And Claw

A hawk with the grackle it killed in my parents’ yard.

This raptor was in the side yard at my parents’ house during our Easter visit. It’s not the greatest picture, because I had the 24mm lens on so as to be able to get good wide shots of the kids, so this is cropped down from a much larger image. I think this is a Cooper’s Hawk, because the head and beak looked a little more blue than the Sharp-Shinned Hawk pictures in the bird books I looked at, but I could be wrong.

215/366: Chilly Birds

Birds at the feeder in the snow.

One of the notable events of the stretch where I was bad about photo posting was the most significant snowfall we got this winter. Which of course happened in April. Anyway, these birds were not amused.

216/366: Soaring

Turkey vulture from Thacher Park.

Speaking of birds, we took the kids up to Thacher Park, where the kids were super excited to be on the edge of a high cliff. The Pip insisted on skipping and dancing all over, which wasn’t nerve-wracking at all, nope, no way. While we were on the clifftop, there was a flock of eight or so turkey vultures soaring and swooping just off the edge, and I got some really good pictures despite only having the 24mm and 50mm lenses with me.

217/366: Trompe L’Oeil

My room at the Col. Blackinton Inn near Wheaton College.

A big part of my lack of posting has been travel-related. In this stretch, I made a run down to give a talk at Wheaton College south of Boston (not the one with the big ugly controversy over the firing of a professor who said nice things about Muslims). They put me up at the Col. Blackinton Inn, which had nice paintings on the walls and ceiling.

218/366: A View, Who Knew?

View from the empty field where Sawyer Library used to be.

On the way back from Wheaton, I got on the road early and made a detour over the Mohawk Trail to stop in Williamstown for lunch. Where I went and admired the empty field where the main library stood when I was a student– turns out, there’s a nice view from that spot, which I never knew because there was a giant ugly brick cube there.

219/366: One Of My Favorite Places

The rugby pitch on Cole Field.

Same mountain, different angle.

220/366: Another One

The Science Quad at Williams, looking at the Physics building.

As I tend to do, I swung through the Science Quad and visited a bit with the physics faculty.

221/366: Yet Another Favorite Place

Whitney Point Lake.

222/366: Spring!

A softball game at Union.

As noted in the previous post, SteelyKid is going to be playing rec softball this spring, and is pretty fired up. Accordingly, we went over to watch a game at Union, because it really, truly is Spring, and we do sometimes get nice weather.

223/366: Light

The empty bird feeder at Chateau Steelypips glowing in the morning light.

One of the things I struggle with, photo-wise, is trying to capture cool light effects like the early-morning sun when we’re getting the kids off to school. This one worked better than most of my attempts, as the sunlight coming around the house makes the empty bird feeder look almost like a lamp.

224/366: Failed Attempt

Set-up for an experiment that didn’t work.

At one point in this stretch, I was working on an experiment for a blog post that didn’t really pan out. Here’s the set-up for the video I was trying to make, before I had a problem with the camera that scuttled the whole thing.

225/366: Microscope

SteelyKid’s digital microscope looking at hair on a Lego frame.

Here’s the set-up from an experiment that really did lead to a blog post.

226/366: Height

My height at different times of day.

Another experiment that turned into a blog post.

227/366: Fake Class

The audience at the fake class I did for prospective students.

Yet another of the many things I had to do during this stretch: The nice folks in Admissions asked if I would give a half-hour “mock class” for a Saturday program they were doing for accepted students. So I put together a short bit on the “ultraviolet catastrophe” and how Planck’s quantum hypothesis avoids that problem. This is a cell-phone snap of the audience filing in– there were even more people than this by the end, probably pushing the limits of the fire code. It was a fun time.

228/366: Waterfall

One of the waterfalls at John Boyd Thacher State Park.

Finally, here’s the “featured image” from up at the top of the post, showing one of the seasonal waterfalls at Thacher Park. This is, I think, the only actual appearance of the kids in this set of photos– you can see them on the left side, where SteelyKid is climbing a rock. This is the Indian Ladder Trail, which goes down the cliff face, under a couple of waterfalls, and back up on the other side. The kids were super fired up for this, and watching The Pip do the Monty Python knight gallop along a narrow path above a steep drop only took a couple of years off our lives. Fortunately, the three times he tripped over rocks and faceplanted all took place in the park at the top…

And that brings us just about up to the present.

BackreactionDark matter might connect galaxies through wormholes

Tl;dr: A new paper shows that one of the most popular types of dark matter – the axion – could make wormholes possible if strong electromagnetic fields, like those found around supermassive black holes, are present. It remains unclear how such wormholes would be formed and whether they would be stable.
Wormhole dress.
Source: Shenova.

Wouldn’t you sometimes like to vanish into a hole and crawl out in another galaxy? It might not be as impossible as it seems. General relativity has long been known to allow for “wormholes” that are short connections between seemingly very distant places. Unfortunately, these wormholes are unstable and cannot be traversed unless filled by “exotic matter,” which must have negative energy density to keep the hole from closing. And no matter that we have ever seen has this property.

The universe, however, contains a lot of matter that we have never seen, which might give you hope. We observe this “dark matter” only through its gravitational pull, but this is enough to tell that it behaves pretty much like regular matter. Dark matter too is thus not exotic enough to help with stabilizing wormholes. Or so we thought.

In a recent paper, Konstantinos Dimopoulos from the “Consortium for Fundamental Physics” at Lancaster University points out that dark matter might be able to mimic the behavior of exotic matter when caught in strong electromagnetic fields:
    Active galaxies may harbour wormholes if dark matter is axionic
    By Konstantinos Dimopoulos
    arXiv:1603.04671 [astro-ph.HE]
Axions are one of the most popular candidates for dark matter. The particles themselves are very light, but they form a condensate in the early universe that should still be around today, giving rise to the observed dark matter distribution. Like all other dark matter candidates, axions have been searched for but so far not been detected.

In his paper, Dimopoulos points out that, due to their peculiar coupling to electromagnetic fields, axions can acquire an apparent mass which makes a negative contribution to their energy. This effect isn’t so unusual – it is similar to the way that fermions obtain masses by coupling to the Higgs or that scalar fields can obtain effective masses by coupling to electromagnetic fields. In other words, it’s not totally unheard of.

Dimopoulos then estimates how strong an electromagnetic field is necessary to turn axions into exotic matter and finds that around supermassive black holes the conditions would just be right. Hence, he concludes, axionic dark matter might keep wormholes open and traversable.

In his present work, Dimopoulos has however not done a fully relativistic computation. He considers the axions in the background of the black hole, but not the coupled solution of axions plus black hole. The analysis so far also does not check whether the wormhole would indeed be stable, or if it would instead blow off the matter that is supposed to stabilize it. And finally, it leaves open the question how the wormhole would form. It is one thing to discuss configurations that are mathematically possible, but it’s another thing entirely to demonstrate that they can actually come into being in our universe.

So it’s an interesting idea, but it will take a little more to convince me that this is possible.

And in case you warmed up to the idea of getting out of this galaxy, let me remind you that the closest supermassive black hole is still 26,000 light years away.

Note added: As mentioned by a commenter (see below) the argument in the paper might be incorrect. I asked the author for comment, but no reply so far.
Another note: The author says he has revised and replaced the paper, and that the conclusions are not affected.

April 16, 2016

April 14, 2016

Chad Orzel198-212/366: Kid-Centric Photo Dump

A bunch of stuff happened that knocked me out of the habit of editing and posting photos– computer issues, travel, catching up on work missed because of travel, and a couple of bouts with a stomach bug the kids brought home. I have been taking pictures, though, and will make an attempt to catch up. Given the huge delay, though, I’m going to drop the pretense of doing one photo a day, and do the right number, but grouped more thematically.

This span included Easter, which meant a lot of family time, which means photos of the kids doing stuff. So here’s a big group of those.

The Sillyheads Together:

198/366: Swinging

Swings are fun.

This is on the playground at the JCC; SteelyKid is dressed as a ninja because we were there for the Purim carnival (which the kids decided they wanted nothing to do with…).

199/366: Builders

Power tools plus Lincoln Logs equals fun.

My parents have a mix of old and new toys at the house for the kids to play with, which is why The Pip is using a power drill to build a house with Lincoln Logs. SteelyKid is assembling the plane that the drill actually goes with.

200/366: Climbing

Climbing on the Adams building playground.

I like the dramatic lighting in this one.

201/366: Baking

Helping Grandma make a cake for Easter.

Wouldn’t be Easter without a bunny-shaped cake.

202/366: King of the Hill

Teaming up against Grandpa.

It’s a little hard to see the “hill” in this, but they were very determined to keep Grandpa from reaching the top of it.

SteelyKid Solo

203/366: Move BACK!

Watching tv from the dog’s bed.

This is SteelyKid’s preferred location for watching television at my parents’, and she needs to be told to back up about six times a day.

204/366: Flying Turtle



SteelyKid riding her physics demonstration on the bike path.

205/366: Slugger

SteelyKid hitting a softball.

SteelyKid’s going to be playing rec softball this spring (starting in a couple of weeks), so my parents got her some gear for Easter. She’s got a really good eye, and can get a bat on just about any pitch.

206/366: Catch

Catch with Grandpa.

The Pip solo

207/366: Boo!

Boo, Haman!

I went to see the Purim costume parade at the JCC day care, where they did a dramatic reading of the Purim story. Apparently it’s traditional to boo and shake noisemakers whenever the bad king in the story gets mentioned, which The Pip really enjoyed.

208/366: Headfirst

This is totally safe, I’m sure.

Our Little Dude can be pretty daring.

209/366: Building Is Hard

Legos demand intense concentration.

The kids both got Lego sets for Easter, and The Pip worked very hard to assemble his little Spider-Man set.

210/366: Ironic Duck



The Pip has developed quite the array of skeptical facial expressions, probably because of my habit of deliberately getting things wrong. These are kind of hard to get on camera, but I like his look here.

211/366: Pipside Down

Wave your legs in the air like you just don’t care.

Sometimes, you just need to sit on your head.

School Pictures

The photo shelf in my parents’ living room, with school pictures of the kids.

Also the “featured image” up top, this is the photo shelf in my parents’ living room, with both the first school pictures the kids got taken, and the most recent of their school pictures. The Pip is very pleased with his Batman hoodie.

And that’s about half of the pictures I need to get caught up, in terms of numbers, so the next photo-dump post will be devoted to non-kid pictures.

Clifford JohnsonGreat Big Exchange


Here's a fun Great Big Story (CNN) video piece about the Science and Entertainment Exchange (and a bit about my work on Agent Carter). Click here for the piece.

(Yeah, the headline. Seems you can't have a story about science connecting with the rest of the culture without the word "nerd" being used somewhere...)

-cvj

The post Great Big Exchange appeared first on Asymptotia.

April 12, 2016

John BaezDiamonds and Triamonds

The structure of a diamond crystal is fascinating. But there’s an equally fascinating form of carbon, called the triamond, that’s theoretically possible but never yet seen in nature. Here it is:

In the triamond, each carbon atom is bonded to three others at 120° angles, with one double bond and two single bonds. Its bonds lie in a plane, so we get a plane for each atom.

But here’s the tricky part: for any two neighboring atoms, these planes are different. In fact, if we draw the bond planes for all the atoms in the triamond, they come in four kinds, parallel to the faces of a regular tetrahedron!

If we discount the difference between single and double bonds, the triamond is highly symmetrical. There’s a symmetry carrying any atom and any of its bonds to any other atom and any of its bonds. However, the triamond has an inherent handedness, or chirality. It comes in two mirror-image forms.

A rather surprising thing about the triamond is that the smallest rings of atoms are 10-sided. Each atom lies in 15 of these 10-sided rings.

Some chemists have argued that the triamond should be ‘metastable’ at room temperature and pressure: that is, it should last for a while but eventually turn to graphite. Diamonds are also considered metastable, though I’ve never seen anyone pull an old diamond ring from their jewelry cabinet and discover to their shock that it’s turned to graphite. The big difference is that diamonds are formed naturally under high pressure—while triamonds, it seems, are not.

Nonetheless, the mathematics behind the triamond does find its way into nature. A while back I told you about a minimal surface called the ‘gyroid’, which is found in many places:

The physics of butterfly wings.

It turns out that the pattern of a gyroid is closely connected to the triamond! So, if you’re looking for a triamond-like pattern in nature, certain butterfly wings are your best bet:

• Matthias Weber, The gyroids: algorithmic geometry III, The Inner Frame, 23 October 2015.

Instead of trying to explain it here, I’ll refer you to the wonderful pictures at Weber’s blog.

Building the triamond

I want to tell you a way to build the triamond. I saw it here:

• Toshikazu Sunada, Crystals that nature might miss creating, Notices of the American Mathematical Society 55 (2008), 208–215.

This is the paper that got people excited about the triamond, though it was discovered much earlier by the crystallographer Fritz Laves back in 1932, and Coxeter named it the Laves graph.

To build the triamond, we can start with this graph:

It’s called $\mathrm{K}_4$, since it’s the complete graph on four vertices, meaning there’s one edge between each pair of vertices. The vertices correspond to four different kinds of atoms in the triamond: let’s call them red, green, yellow and blue. The edges of this graph have arrows on them, labelled with certain vectors

$e_1, e_2, e_3, f_1, f_2, f_3 \in \mathbb{R}^3$

Let’s not worry yet about what these vectors are. What really matters is this: to move from any atom in the triamond to any of its neighbors, you move along the vector labeling the edge between them… or its negative, if you’re moving against the arrow.

For example, suppose you’re at any red atom. It has 3 nearest neighbors, which are blue, green and yellow. To move to the blue neighbor you add $f_1$ to your position. To move to the green one you subtract $e_2$, since you’re moving against the arrow on the edge connecting red and green. Similarly, to go to the yellow neighbor you subtract the vector $f_3$ from your position.

Thus, any path along the bonds of the triamond determines a path in the graph \mathrm{K}_4.

Conversely, if you pick an atom of some color in the triamond, any path in \mathrm{K}_4 starting from the vertex of that color determines a path in the triamond! However, going around a loop in \mathrm{K}_4 may not get you back to the atom you started with in the triamond.

Mathematicians summarize these facts by saying the triamond is a ‘covering space’ of the graph \mathrm{K}_4.

Now let’s see if you can figure out those vectors.

Puzzle 1. Find vectors e_1, e_2, e_3, f_1, f_2, f_3 \in \mathbb{R}^3 such that:

A) All these vectors have the same length.

B) The three vectors coming out of any vertex lie in a plane at 120° angles to each other:

For example, f_1, -e_2 and -f_3 lie in a plane at 120° angles to each other. We put in two minus signs because two arrows are pointing into the red vertex.

C) The four planes we get this way, one for each vertex, are parallel to the faces of a regular tetrahedron.

If you want, you can even add another constraint:

D) All the components of the vectors e_1, e_2, e_3, f_1, f_2, f_3 are integers.

Diamonds and hyperdiamonds

That’s the triamond. Compare the diamond:

Here each atom of carbon is connected to four others. This pattern is found not just in carbon but also other elements in the same column of the periodic table: silicon, germanium, and tin. They all like to hook up with four neighbors.

The pattern of atoms in a diamond is called the diamond cubic. It’s elegant but a bit tricky. Look at it carefully!

To build it, we start by putting an atom at each corner of a cube. Then we put an atom in the middle of each face of the cube. If we stopped there, we would have a face-centered cubic. But there are also four more carbons inside the cube—one at the center of each tetrahedron we’ve created.

If you look really carefully, you can see that the full pattern consists of two interpenetrating face-centered cubic lattices, one offset relative to the other along the cube’s main diagonal.

The face-centered cubic is the 3-dimensional version of a pattern that exists in any dimension: the Dn lattice. To build this, take an n-dimensional checkerboard and alternately color the hypercubes red and black. Then, put a point in the center of each black hypercube!

You can also get the Dn lattice by taking all n-tuples of integers that sum to an even integer. Requiring that they sum to something even is a way to pick out the black hypercubes.

The diamond is also an example of a pattern that exists in any dimension! I’ll call this the hyperdiamond, but mathematicians call it Dn+, because it’s the union of two copies of the Dn lattice. To build it, first take all n-tuples of integers that sum to an even integer. Then take all those points shifted by the vector (1/2, …, 1/2).

In any dimension, the volume of the unit cell of the hyperdiamond is 1, so mathematicians say it’s unimodular. But only in even dimensions is the sum or difference of any two points in the hyperdiamond again a point in the hyperdiamond. Mathematicians call a discrete set of points with this property a lattice.

If even dimensions are better than odd ones, how about dimensions that are multiples of 4? Then the hyperdiamond is better still: it’s an integral lattice, meaning that the dot product of any two vectors in the lattice is again an integer.

And in dimensions that are multiples of 8, the hyperdiamond is even better. It’s even, meaning that the dot product of any vector with itself is even.
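These dimension conditions can be checked mechanically. Here is a minimal Python sketch, assuming the description of the hyperdiamond given above (the Dn lattice together with its copy shifted by s = (1/2, …, 1/2)). The function names are made up for illustration. It relies on the fact that dot products among points of Dn, and between a point of Dn and s, are always integers, so everything reduces to the shift vector: 2s must lie in Dn for sums to stay in the hyperdiamond, and s·s = n/4 decides whether the lattice is integral and whether it is even.

```python
from fractions import Fraction

def in_Dn(v):
    """True if v is a tuple of integers summing to an even integer."""
    return all(x.denominator == 1 for x in v) and sum(v) % 2 == 0

def in_hyperdiamond(v):
    """True if v lies in Dn or in Dn shifted by (1/2, ..., 1/2)."""
    half = Fraction(1, 2)
    return in_Dn(v) or in_Dn([x - half for x in v])

def classify(n):
    """Return (is_lattice, is_integral, is_even) for the n-dimensional hyperdiamond."""
    s = [Fraction(1, 2)] * n
    # Closed under addition exactly when 2s = (1, ..., 1) is again in the set.
    is_lattice = in_hyperdiamond([2 * x for x in s])
    norm = sum(x * x for x in s)  # s.s = n/4
    is_integral = is_lattice and norm.denominator == 1
    is_even = is_integral and norm % 2 == 0
    return is_lattice, is_integral, is_even

for n in range(1, 17):
    print(n, classify(n))
```

The loop prints one line per dimension, so the pattern of period 2, 4 and 8 can be read off directly.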

In fact, even unimodular lattices are only possible in Euclidean space when the dimension is a multiple of 8. In 8 dimensions, the only even unimodular lattice is the 8-dimensional hyperdiamond, which is usually called the E8 lattice. The E8 lattice is one of my favorite entities, and I’ve written a lot about it in this series:

Integral octonions.

To me, the glittering beauty of diamonds is just a tiny hint of the overwhelming beauty of E8.

But let’s go back down to 3 dimensions. I’d like to describe the diamond rather explicitly, so we can see how a slight change produces the triamond.

It will be less stressful if we double the size of our diamond. So, let’s start with a face-centered cubic consisting of points whose coordinates are even integers summing to a multiple of 4. That consists of these points:

(0,0,0)   (2,2,0)   (2,0,2)   (0,2,2)

and all points obtained from these by adding multiples of 4 to any of the coordinates. To get the diamond, we take all these together with another face-centered cubic that’s been shifted by (1,1,1). That consists of these points:

(1,1,1)   (3,3,1)   (3,1,3)   (1,3,3)

and all points obtained by adding multiples of 4 to any of the coordinates.

The triamond is similar! Now we start with these points

(0,0,0)   (1,2,3)   (2,3,1)   (3,1,2)

and all the points obtained from these by adding multiples of 4 to any of the coordinates. To get the triamond, we take all these together with another copy of these points that’s been shifted by (2,2,2). That other copy consists of these points:

(2,2,2)   (3,0,1)   (0,1,3)   (1,3,0)

and all points obtained by adding multiples of 4 to any of the coordinates.
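A quick way to convince yourself that these coordinates do what they should is to count nearest neighbors by brute force. The following Python sketch is only an illustration: the names `DIAMOND`, `TRIAMOND` and `nearest_neighbor_vectors` are invented here, and the code simply uses the points listed above, repeated with period 4 in each coordinate.

```python
from itertools import product

# The 8 points in one period of each pattern, as listed above.
DIAMOND = [(0, 0, 0), (2, 2, 0), (2, 0, 2), (0, 2, 2),
           (1, 1, 1), (3, 3, 1), (3, 1, 3), (1, 3, 3)]
TRIAMOND = [(0, 0, 0), (1, 2, 3), (2, 3, 1), (3, 1, 2),
            (2, 2, 2), (3, 0, 1), (0, 1, 3), (1, 3, 0)]

def norm2(v):
    """Squared length of an integer vector."""
    return sum(c * c for c in v)

def nearest_neighbor_vectors(cell, p):
    """Vectors from the atom p to its nearest neighbors in the periodic pattern."""
    candidates = []
    for q in cell:
        # Shifting by -4, 0 or +4 in each coordinate reaches the closest copy of q.
        for shift in product((-4, 0, 4), repeat=3):
            v = tuple(q[i] + shift[i] - p[i] for i in range(3))
            if any(v):  # skip the atom itself
                candidates.append(v)
    shortest = min(norm2(v) for v in candidates)
    return [v for v in candidates if norm2(v) == shortest]

for name, cell in (("diamond", DIAMOND), ("triamond", TRIAMOND)):
    for p in cell:
        vecs = nearest_neighbor_vectors(cell, p)
        total = tuple(sum(v[i] for v in vecs) for i in range(3))
        print(name, p, len(vecs), "neighbors; bond vectors sum to", total)
```

With the coordinates as given, each diamond atom should report 4 neighbors and each triamond atom 3. Three vectors of equal length that sum to zero automatically lie in a plane at 120° angles to each other, which is the triamond's bonding pattern.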

Unlike the diamond, the triamond has an inherent handedness, or chirality. You’ll note how we used the point (1,2,3) and took cyclic permutations of its coordinates to get more points. If we’d started with (3,2,1) we would have gotten the other, mirror-image version of the triamond.

Covering spaces

I mentioned that the triamond is a ‘covering space’ of the graph \mathrm{K}_4. More precisely, there’s a graph T whose vertices are the atoms of the triamond, and whose edges are the bonds of the triamond. There’s a map of graphs

p: T \to \mathrm{K}_4

This automatically means that every path in T is mapped to a path in \mathrm{K}_4. But what makes T a covering space of \mathrm{K}_4 is that any path in \mathrm{K}_4 comes from a path in T, which is unique after we choose its starting point.

If you’re a high-powered mathematician you might wonder if T is the universal covering space of \mathrm{K}_4. It’s not, but it’s the universal abelian covering space.

What does this mean? Any path in \mathrm{K}_4 gives a sequence of vectors e_1, e_2, e_3, f_1, f_2, f_3 and their negatives. If we pick a starting point in the triamond, this sequence describes a unique path in the triamond. When does this path get you back where you started? The answer, I believe, is this: if and only if you can take your sequence, rewrite it using the commutative law, and cancel like terms to get zero. This is related to how adding vectors in \mathbb{R}^3 is a commutative operation.

For example, there’s a loop in \mathrm{K}_4 that goes “red, blue, green, red”. This gives the sequence of vectors

f_1, -e_3, e_2

We can turn this into an expression

f_1 - e_3 + e_2

However, we can’t simplify this to zero using just the commutative law and cancelling like terms. So, if we start at some red atom in the triamond and take the unique path that goes “red, blue, green, red”, we do not get back where we started!

Note that in this simplification process, we’re not allowed to use what the vectors “really are”. It’s a purely formal manipulation.
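The formal manipulation is easy to mechanize. Here is a small Python sketch of the bookkeeping, assuming the criterion stated above (which I only said I believe to be right). A path in \mathrm{K}_4 is recorded as a list of signed edge labels, and the helper names are invented for illustration. The labels are treated as bare symbols, never as actual vectors.

```python
from collections import Counter

def formal_sum(steps):
    """Add signed edge labels formally, using commutativity and cancelling like terms."""
    total = Counter()
    for sign, label in steps:
        total[label] += sign
    # Keep only the labels whose coefficients did not cancel.
    return {label: c for label, c in total.items() if c != 0}

def lifts_to_closed_loop(steps):
    """True if the formal sum cancels to zero."""
    return not formal_sum(steps)

# The loop "red, blue, green, red" from the text: f_1, -e_3, e_2.
triangle = [(+1, "f_1"), (-1, "e_3"), (+1, "e_2")]
print(formal_sum(triangle), lifts_to_closed_loop(triangle))

# Going along an edge and straight back always cancels.
back_and_forth = [(+1, "f_1"), (-1, "f_1")]
print(formal_sum(back_and_forth), lifts_to_closed_loop(back_and_forth))
```

A candidate answer to a puzzle about loops in the triamond can be tested the same way, once you have read the arrows and labels off the picture of \mathrm{K}_4.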

Puzzle 2. Describe a loop of length 10 in the triamond using this method. Check that you can simplify the corresponding expression to zero using the rules I described.

A similar story works for the diamond, but starting with a different graph:

The graph formed by a diamond’s atoms and the edges between them is the universal abelian cover of this little graph! This graph has 2 vertices because there are 2 kinds of atom in the diamond. It has 4 edges because each atom has 4 nearest neighbors.

Puzzle 3. What vectors should we use to label the edges of this graph, so that the vectors coming out of any vertex describe how to move from that kind of atom in the diamond to its 4 nearest neighbors?

There’s also a similar story for graphene, which is a hexagonal array of carbon atoms in a plane:

Puzzle 4. What graph with edges labelled by vectors in \mathbb{R}^2 should we use to describe graphene?

I don’t know much about how this universal abelian cover trick generalizes to higher dimensions, though it’s easy to handle the case of a cubical lattice in any dimension.

Puzzle 5. I described higher-dimensional analogues of diamonds: are there higher-dimensional triamonds?


The Wikipedia article is good:

• Wikipedia, Laves graph.

They say this graph has many names: the K4 crystal, the (10,3)-a network, the srs net, the diamond twin, and of course the triamond. The name triamond is not very logical: while each carbon has 3 neighbors in the triamond, each carbon has not 2 but 4 neighbors in the diamond. So, perhaps the diamond should be called the ‘quadriamond’. In fact, the word ‘diamond’ has nothing to do with the prefix ‘di-’ meaning ‘two’. It’s more closely related to the word ‘adamant’. Still, I like the word ‘triamond’.

This paper describes various attempts to find the Laves graph in chemistry:

• Stephen T. Hyde, Michael O’Keeffe, and Davide M. Proserpio, A short history of an elusive yet ubiquitous structure in chemistry, materials, and mathematics, Angew. Chem. Int. Ed. 47 (2008), 7996–8000.

This paper does some calculations arguing that the triamond is a metastable form of carbon:

• Masahiro Itoh et al, New metallic carbon crystal, Phys. Rev. Lett. 102 (2009), 055703.

Abstract. Recently, mathematical analysis clarified that sp2 hybridized carbon should have a three-dimensional crystal structure (\mathrm{K}_4) which can be regarded as a twin of the sp3 diamond crystal. In this study, various physical properties of the \mathrm{K}_4 carbon crystal, especially for the electronic properties, were evaluated by first principles calculations. Although the \mathrm{K}_4 crystal is in a metastable state, a possible pressure induced structural phase transition from graphite to \mathrm{K}_4 was suggested. Twisted π states across the Fermi level result in metallic properties in a new carbon crystal.

The picture of the \mathrm{K}_4 crystal was placed on Wikicommons by someone named ‘Workbit’, under a Creative Commons Attribution-Share Alike 4.0 International license. The picture of the tetrahedron was made using Robert Webb’s Stella software and placed on Wikicommons. The pictures of graphs come from Sunada’s paper, though I modified the picture of \mathrm{K}_4. The moving image of the diamond cubic was created by H.K.D.H. Bhadeshia and put into the public domain on Wikicommons. The picture of graphene was drawn by Dr. Thomas Szkopek and put into the public domain on Wikicommons.

Doug Natelson"Joulies": the coffee equivalent of whiskey stones, done right

Once upon a time I wrote a post about whiskey stones, rocks that you cool down and then place into your drink to chill your Scotch without dilution, and why they are rather lousy at controlling your drink's temperature.  The short version:  Ice is so effective, per mass, at cooling your drink because its melting is a phase transition.  Add heat to a mixture of ice and water, and the mixture sits there at zero degrees Celsius, sucking up energy (the "latent heat") as the solid ice is converted into liquid water.  Conversely, a rock just gets warmer.

Now look at Joulies, designed to keep your hot beverage of choice at about 60 degrees Celsius.  Note:  I've never used these, so I don't know how well-made they are, but the science behind them is right.  They're stainless steel and contain a material that happens to have a melting phase transition right at 60 C and a pretty large latent heat - more on that below.  If you put them into coffee that's hotter than this, the coffee will transfer heat to the Joulies until their interior warms up to the transition, and then the temperature of the coffee+Joulies will sit fixed at 60 C as the filling partially melts.   Then, if you leave the coffee sitting there and it loses heat to the environment through evaporation, conduction, convection, and radiation, the Joulies will transfer heat back to the coffee as their interior solidifies, again doing their level best to keep the (Joulies+coffee) at 60 C as long as there is a liquid/solid mixture within the Joulies.  This is how you regulate the temperature of your beverage.  (Note that we can estimate the total latent heat of the filling of the Joulies - you'd want it to be enough that cooling 375 ml of coffee from 100 C to 60 C would not completely melt the filling.  At 4.18 J/(g C) for the specific heat of water (close enough), the total latent heat of the Joulies filling should be more than 375 g  \( \times \) 40 degrees C  \( \times \) 4.18 J/(g C) = 62700 J.)
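For anyone who wants to play with the numbers, here is a minimal Python sketch of that back-of-the-envelope estimate.  The function name and parameters are made up for illustration, and it assumes, as above, that coffee behaves like water and that 1 ml weighs about 1 g.

```python
def required_latent_heat_j(mass_g, start_c, hold_c, specific_heat_j_per_g_c=4.18):
    """Heat in joules the filling must absorb to cool the drink from start_c to hold_c."""
    return mass_g * (start_c - hold_c) * specific_heat_j_per_g_c

# 375 g of coffee cooled from 100 C to 60 C: roughly 62700 J, as estimated above.
print(required_latent_heat_j(375, 100, 60))

# A smaller cup poured at a lower temperature needs much less (assumed numbers).
print(required_latent_heat_j(250, 85, 60))
```

Dividing the result by the latent heat per gram of whatever the filling is (which I don't know) would give the minimum mass of filling needed.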

Unsurprisingly, the same company offers a version filled with a different material, one that melts a bit below 0 C, for cooling your cold beverages.  Basically they function like an ice cube, but with the melting liquid contained within a thin stainless steel shell so that it doesn't dilute your drink.

Random undergrad anecdote:  As a senior in college I was part of an undergrad senior design team in a class where the theme was satellites and spacecraft.  We designed a probe to land on Venus, and a big part of our design was a temperature-regulating reservoir of a material with a big latent heat of melting and a melting point at something like 100 C, to keep the interior of the probe comparatively cool for as long as possible.  Clearly we should've been early investors in Joulies.

April 11, 2016

Terence TaoConcatenation theorems for anti-Gowers-uniform functions and Host-Kra characteristic factors; polynomial patterns in primes

Tamar Ziegler and I have just uploaded to the arXiv two related papers: “Concatenation theorems for anti-Gowers-uniform functions and Host-Kra characteristic factors” and “polynomial patterns in primes”, with the former developing a “quantitative Bessel inequality” for local Gowers norms that is crucial in the latter.

We use the term “concatenation theorem” to denote results in which structural control of a function in two or more “directions” can be “concatenated” into structural control in a joint direction. A trivial example of such a concatenation theorem is the following: if a function {f: {\bf Z} \times {\bf Z} \rightarrow {\bf R}} is constant in the first variable (thus {x \mapsto f(x,y)} is constant for each {y}), and also constant in the second variable (thus {y \mapsto f(x,y)} is constant for each {x}), then it is constant in the joint variable {(x,y)}. A slightly less trivial example: if a function {f: {\bf Z} \times {\bf Z} \rightarrow {\bf R}} is affine-linear in the first variable (thus, for each {y}, there exist {\alpha(y), \beta(y)} such that {f(x,y) = \alpha(y) x + \beta(y)} for all {x}) and affine-linear in the second variable (thus, for each {x}, there exist {\gamma(x), \delta(x)} such that {f(x,y) = \gamma(x)y + \delta(x)} for all {y}) then {f} is a quadratic polynomial in {x,y}; in fact it must take the form

\displaystyle f(x,y) = \epsilon xy + \zeta x + \eta y + \theta \ \ \ \ \ (1)


for some real numbers {\epsilon, \zeta, \eta, \theta}. (This can be seen for instance by using the affine linearity in {y} to show that the coefficients {\alpha(y), \beta(y)} are also affine linear.)

The same phenomenon extends to higher degree polynomials. Given a function {f: G \rightarrow K} from one additive group {G} to another, we say that {f} is of degree less than {d} along a subgroup {H} of {G} if all the {d}-fold iterated differences of {f} along directions in {H} vanish, that is to say

\displaystyle \partial_{h_1} \dots \partial_{h_d} f(x) = 0

for all {x \in G} and {h_1,\dots,h_d \in H}, where {\partial_h} is the difference operator

\displaystyle \partial_h f(x) := f(x+h) - f(x).

(We adopt the convention that the only {f} of degree less than {0} is the zero function.)

We then have the following simple proposition:

Proposition 1 (Concatenation of polynomiality) Let {f: G \rightarrow K} be of degree less than {d_1} along one subgroup {H_1} of {G}, and of degree less than {d_2} along another subgroup {H_2} of {G}, for some {d_1,d_2 \geq 1}. Then {f} is of degree less than {d_1+d_2-1} along the subgroup {H_1+H_2} of {G}.

Note the previous example was basically the case when {G = {\bf Z} \times {\bf Z}}, {H_1 = {\bf Z} \times \{0\}}, {H_2 = \{0\} \times {\bf Z}}, {K = {\bf R}}, and {d_1=d_2=2}.
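As a sanity check on this special case, one can verify the vanishing of iterated differences numerically. The following Python sketch is only an illustration on a small box of points and directions, not a proof; the helper names and the particular coefficients {3, 5, -2, 7} in the test function are arbitrary choices.

```python
from itertools import product

def difference(f, h):
    """Return the function p -> f(p + h) - f(p) on Z x Z."""
    return lambda p: f((p[0] + h[0], p[1] + h[1])) - f(p)

def iterated_difference(f, directions):
    """Apply the difference operator once for each direction in turn."""
    for h in directions:
        f = difference(f, h)
    return f

def vanishes(f, pool, order, box=range(-2, 3)):
    """Check that all order-fold differences along directions in pool vanish on a box."""
    points = list(product(box, repeat=2))
    return all(
        iterated_difference(f, hs)(p) == 0
        for hs in product(pool, repeat=order)
        for p in points
    )

def f(p):
    """An example of the form (1): epsilon*x*y + zeta*x + eta*y + theta."""
    x, y = p
    return 3 * x * y + 5 * x - 2 * y + 7

H1 = [(a, 0) for a in range(-1, 2)]                      # sample of Z x {0}
H2 = [(0, b) for b in range(-1, 2)]                      # sample of {0} x Z
H12 = [(a, b) for a in range(-1, 2) for b in range(-1, 2)]  # sample of H1 + H2

print(vanishes(f, H1, 2), vanishes(f, H2, 2))    # degree less than 2 along each axis
print(vanishes(f, H12, 3), vanishes(f, H12, 2))  # less than 3, but not less than 2, jointly
```

The last check fails at order 2 because the mixed difference in the directions {(1,0)} and {(0,1)} picks out the coefficient of {xy}.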

Proof: The claim is trivial for {d_1=1} or {d_2=1} (in which {f} is constant along {H_1} or {H_2} respectively), so suppose inductively {d_1,d_2 \geq 2} and the claim has already been proven for smaller values of {d_1+d_2}.

We take a derivative in a direction {h_1 \in H_1} to obtain

\displaystyle T^{-h_1} f = f + \partial_{h_1} f

where {T^{-h_1} f(x) = f(x+h_1)} is the shift of {f} by {-h_1}. Then we take a further shift by a direction {h_2 \in H_2} to obtain

\displaystyle T^{-h_1-h_2} f = T^{-h_2} f + T^{-h_2} \partial_{h_1} f = f + \partial_{h_2} f + T^{-h_2} \partial_{h_1} f

leading to the cocycle equation

\displaystyle \partial_{h_1+h_2} f = \partial_{h_2} f + T^{-h_2} \partial_{h_1} f.

Since {f} has degree less than {d_1} along {H_1} and degree less than {d_2} along {H_2}, {\partial_{h_1} f} has degree less than {d_1-1} along {H_1} and less than {d_2} along {H_2}, so is of degree less than {d_1+d_2-2} along {H_1+H_2} by the induction hypothesis. Similarly {\partial_{h_2} f} is also of degree less than {d_1+d_2-2} along {H_1+H_2}. Combining this with the cocycle equation we see that {\partial_{h_1+h_2}f} is of degree less than {d_1+d_2-2} along {H_1+H_2} for any {h_1+h_2 \in H_1+H_2}, and hence {f} is of degree less than {d_1+d_2-1} along {H_1+H_2}, as required. \Box

While this proposition is simple, it already illustrates some basic principles regarding how one would go about proving a concatenation theorem:

  • (i) One should perform induction on the degrees {d_1,d_2} involved, and take advantage of the recursive nature of degree (in this case, the fact that a function is of degree less than {d} along some subgroup {H} of directions iff all of its first derivatives along {H} are of degree less than {d-1}).
  • (ii) Structure is preserved by operations such as addition, shifting, and taking derivatives. In particular, if a function {f} is of degree less than {d} along some subgroup {H}, then any derivative {\partial_k f} of {f} is also of degree less than {d} along {H}, even if {k} does not belong to {H}.

Here is another simple example of a concatenation theorem. Suppose an at most countable additive group {G} acts by measure-preserving shifts {T: g \mapsto T^g} on some probability space {(X, {\mathcal X}, \mu)}; we call the pair {(X,T)} (or more precisely {(X, {\mathcal X}, \mu, T)}) a {G}-system. We say that a function {f \in L^\infty(X)} is a generalised eigenfunction of degree less than {d} along some subgroup {H} of {G} and some {d \geq 1} if one has

\displaystyle T^h f = \lambda_h f

almost everywhere for all {h \in H}, and some functions {\lambda_h \in L^\infty(X)} of degree less than {d-1} along {H}, with the convention that a function has degree less than {0} if and only if it is equal to {1}. Thus for instance, a function {f} is a generalised eigenfunction of degree less than {1} along {H} if it is constant on almost every {H}-ergodic component of {X}, and is a generalised eigenfunction of degree less than {2} along {H} if it is an eigenfunction of the shift action on almost every {H}-ergodic component of {X}. A basic example of a higher order eigenfunction is the function {f(x,y) := e^{2\pi i y}} on the skew shift {({\bf R}/{\bf Z})^2} with {{\bf Z}} action given by the generator {T(x,y) := (x+\alpha,y+x)} for some irrational {\alpha}. One can check that {T^h f = \lambda_h f} for every integer {h}, where {\lambda_h: x \mapsto e^{2\pi i \binom{h}{2} \alpha} e^{2\pi i h x}} is a generalised eigenfunction of degree less than {2} along {{\bf Z}}, so {f} is a generalised eigenfunction of degree less than {3} along {{\bf Z}}.

We then have

Proposition 2 (Concatenation of higher order eigenfunctions) Let {(X,T)} be a {G}-system, and let {f \in L^\infty(X)} be a generalised eigenfunction of degree less than {d_1} along one subgroup {H_1} of {G}, and a generalised eigenfunction of degree less than {d_2} along another subgroup {H_2} of {G}, for some {d_1,d_2 \geq 1}. Then {f} is a generalised eigenfunction of degree less than {d_1+d_2-1} along the subgroup {H_1+H_2} of {G}.

The argument is almost identical to that of the previous proposition and is left as an exercise to the reader. The key point is the point (ii) identified earlier: the space of generalised eigenfunctions of degree less than {d} along {H} is preserved by multiplication and shifts, as well as the operation of “taking derivatives” {f \mapsto \lambda_k} even along directions {k} that do not lie in {H}. (To prove this latter claim, one should restrict to the region where {f} is non-zero, and then divide {T^k f} by {f} to locate {\lambda_k}.)

A typical example of this proposition in action is as follows: consider the {{\bf Z}^2}-system given by the {3}-torus {({\bf R}/{\bf Z})^3} with generating shifts

\displaystyle T^{(1,0)}(x,y,z) := (x+\alpha,y,z+y)

\displaystyle T^{(0,1)}(x,y,z) := (x,y+\alpha,z+x)

for some irrational {\alpha}, which can be checked to give a {{\bf Z}^2} action

\displaystyle T^{(n,m)}(x,y,z) := (x+n\alpha, y+m\alpha, z+ny+mx+nm\alpha).

The function {f(x,y,z) := e^{2\pi i z}} can then be checked to be a generalised eigenfunction of degree less than {2} along {{\bf Z} \times \{0\}}, and also less than {2} along {\{0\} \times {\bf Z}}, and less than {3} along {{\bf Z}^2}. One can view this example as the dynamical systems translation of the example (1) (see this previous post for some more discussion of this sort of correspondence).
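The claim that the two generating shifts commute and combine into the stated formula for {T^{(n,m)}} is a routine computation, but it is easy to test by machine. Here is a minimal Python sketch using exact rational arithmetic modulo {1}; the function names are invented for illustration, and a rational number stands in for {\alpha}, which is harmless here because only the algebraic identity is being tested (irrationality matters for ergodicity, not for this formula).

```python
from fractions import Fraction

def T10(p, a):
    """The generating shift T^{(1,0)} on the 3-torus, with coordinates taken mod 1."""
    x, y, z = p
    return ((x + a) % 1, y, (z + y) % 1)

def T01(p, a):
    """The generating shift T^{(0,1)} on the 3-torus, with coordinates taken mod 1."""
    x, y, z = p
    return (x, (y + a) % 1, (z + x) % 1)

def T(n, m, p, a):
    """The closed formula for T^{(n,m)} stated in the text."""
    x, y, z = p
    return ((x + n * a) % 1, (y + m * a) % 1, (z + n * y + m * x + n * m * a) % 1)

def iterate(step, count, p, a):
    """Apply a generating shift count times (count >= 0)."""
    for _ in range(count):
        p = step(p, a)
    return p

alpha = Fraction(1, 13)  # rational stand-in for the irrational alpha
point = (Fraction(1, 7), Fraction(2, 5), Fraction(3, 11))

for n in range(5):
    for m in range(5):
        one_order = iterate(T01, m, iterate(T10, n, point, alpha), alpha)
        other_order = iterate(T10, n, iterate(T01, m, point, alpha), alpha)
        assert one_order == other_order == T(n, m, point, alpha)
print("generators commute and match the closed formula for 0 <= n, m < 5")
```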

The main results of our concatenation paper are analogues of these propositions concerning a more complicated notion of “polynomial-like” structure that is of importance in additive combinatorics and in ergodic theory. On the ergodic theory side, the notion of structure is captured by the Host-Kra characteristic factors {Z^{<d}_H(X)} of a {G}-system {X} along a subgroup {H}. These factors can be defined in a number of ways. One is by duality, using the Gowers-Host-Kra uniformity seminorms (defined for instance here) {\| \|_{U^d_H(X)}}. Namely, {Z^{<d}_H(X)} is the factor of {X} defined up to equivalence by the requirement that

\displaystyle \|f\|_{U^d_H(X)} = 0 \iff {\bf E}(f | Z^{<d}_H(X) ) = 0.

An equivalent definition is in terms of the dual functions {{\mathcal D}^d_H(f)} of {f} along {H}, which can be defined recursively by setting {{\mathcal D}^0_H(f) = 1} and

\displaystyle {\mathcal D}^d_H(f) = {\bf E}_h T^h f {\mathcal D}^{d-1}( f \overline{T^h f} )

where {{\bf E}_h} denotes the ergodic average along a Følner sequence in {G} (in fact one can also define these concepts in non-amenable abelian settings as per this previous post). The factor {Z^{<d}_H(X)} can then be alternately defined as the factor generated by the dual functions {{\mathcal D}^d_H(f)} for {f \in L^\infty(X)}.

In the case when {G=H={\bf Z}} and {X} is {G}-ergodic, a deep theorem of Host and Kra shows that the factor {Z^{<d}_H(X)} is equivalent to the inverse limit of nilsystems of step less than {d}. A similar statement with {{\bf Z}} replaced by any finitely generated group was established by Griesmer, while the case of an infinite vector space over a finite field was treated in this paper of Bergelson, Ziegler, and myself. The situation is more subtle when {X} is not {G}-ergodic, or when {X} is {G}-ergodic but {H} is a proper subgroup of {G} acting non-ergodically, when one has to start considering measurable families of directional nilsystems; see for instance this paper of Austin for some of the subtleties involved (for instance, higher order group cohomology begins to become relevant!).

One of our main theorems is then

Proposition 3 (Concatenation of characteristic factors) Let {(X,T)} be a {G}-system, and let {f} be measurable with respect to the factor {Z^{<d_1}_{H_1}(X)} and with respect to the factor {Z^{<d_2}_{H_2}(X)} for some {d_1,d_2 \geq 1} and some subgroups {H_1,H_2} of {G}. Then {f} is also measurable with respect to the factor {Z^{<d_1+d_2-1}_{H_1+H_2}(X)}.

We give two proofs of this proposition in the paper; an ergodic-theoretic proof using the Host-Kra theory of “cocycles of type {<d} (along a subgroup {H})”, which can be used to inductively describe the factors {Z^{<d}_H}, and a combinatorial proof based on a combinatorial analogue of this proposition which is harder to state (but which roughly speaking asserts that a function which is nearly orthogonal to all bounded functions of small {U^{d_1}_{H_1}} norm, and also to all bounded functions of small {U^{d_2}_{H_2}} norm, is also nearly orthogonal to all bounded functions of small {U^{d_1+d_2-1}_{H_1+H_2}} norm). The combinatorial proof parallels the proof of Proposition 2. A key point is that dual functions {F := {\mathcal D}^d_H(f)} obey a property analogous to being a generalised eigenfunction, namely that

\displaystyle T^h F = {\bf E}_k \lambda_{h,k} F_k

where {F_k := T^k F} and {\lambda_{h,k} := {\mathcal D}^{d-1}( T^h f \overline{T^k f} )} is a “structured function of order {d-1}” along {H}. (In the language of this previous paper of mine, this is an assertion that dual functions are uniformly almost periodic of order {d}.) Again, the point (ii) above is crucial, and in particular it is key that any structure that {F} has is inherited by the associated functions {\lambda_{h,k}} and {F_k}. This sort of inheritance is quite easy to accomplish in the ergodic setting, as there is a ready-made language of factors to encapsulate the concept of structure, and the shift-invariance and {\sigma}-algebra properties of factors make it easy to show that just about any “natural” operation one performs on a function measurable with respect to a given factor, returns a function that is still measurable in that factor. In the finitary combinatorial setting, though, encoding the fact (ii) becomes a remarkably complicated notational nightmare, requiring a huge amount of “epsilon management” and “second-order epsilon management” (in which one manages not only scalar epsilons, but also function-valued epsilons that depend on other parameters). In order to avoid all this we were forced to utilise a nonstandard analysis framework for the combinatorial theorems, which made the arguments greatly resemble the ergodic arguments in many respects (though the two settings are still not equivalent, see this previous blog post for some comparisons between the two settings). Unfortunately the arguments are still rather complicated.

For combinatorial applications, dual formulations of the concatenation theorem are more useful. A direct dualisation of the theorem yields the following decomposition theorem: a bounded function which is small in {U^{d_1+d_2-1}_{H_1+H_2}} norm can be split into a component that is small in {U^{d_1}_{H_1}} norm, and a component that is small in {U^{d_2}_{H_2}} norm. (One may wish to understand this type of result by first proving the following baby version: any function that has mean zero on every coset of {H_1+H_2}, can be decomposed as the sum of a function that has mean zero on every {H_1} coset, and a function that has mean zero on every {H_2} coset. This is dual to the assertion that a function that is constant on every {H_1} coset and constant on every {H_2} coset, is constant on every {H_1+H_2} coset.) Combining this with some standard “almost orthogonality” arguments (i.e. Cauchy-Schwarz) gives the following Bessel-type inequality: if one has a lot of subgroups {H_1,\dots,H_k} and a bounded function is small in {U^{2d-1}_{H_i+H_j}} norm for most {i,j}, then it is also small in {U^d_{H_i}} norm for most {i}. (Here is a baby version one may wish to warm up on: if a function {f} has small mean on {({\bf Z}/p{\bf Z})^2} for some large prime {p}, then it has small mean on most of the cosets of most of the one-dimensional subgroups of {({\bf Z}/p{\bf Z})^2}.)
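To make the first baby version concrete, here is a small Python sketch in the simplest possible setting, which is an assumption made purely for illustration: {G = ({\bf Z}/n{\bf Z})^2}, {H_1 = {\bf Z}/n{\bf Z} \times \{0\}}, {H_2 = \{0\} \times {\bf Z}/n{\bf Z}}, so that {H_1+H_2} is all of {G} and there is only one coset of it. The decomposition used is {f = (f - {\bf E}(f|H_1)) + {\bf E}(f|H_1)}, where {{\bf E}(f|H_1)} averages {f} over {H_1}-cosets; the function names are invented.

```python
from fractions import Fraction
import random

def mean_zero_function(n, seed=0):
    """A random function f[x][y] on (Z/n)^2 with mean exactly zero."""
    rng = random.Random(seed)
    raw = [[rng.randint(-9, 9) for _ in range(n)] for _ in range(n)]
    mean = Fraction(sum(map(sum, raw)), n * n)
    return [[Fraction(v) - mean for v in row] for row in raw]

def decompose(f):
    """Split f as f1 + f2, with f2 the average of f over H1-cosets."""
    n = len(f)
    # g[y] is the average of f over the H1-coset {(x, y) : x in Z/n}.
    g = [sum(f[x][y] for x in range(n)) / n for y in range(n)]
    f2 = [[g[y] for y in range(n)] for x in range(n)]
    f1 = [[f[x][y] - g[y] for y in range(n)] for x in range(n)]
    return f1, f2

n = 5
f = mean_zero_function(n)
f1, f2 = decompose(f)

# f1 has mean zero on every H1-coset (fixed y, varying x).
print(all(sum(f1[x][y] for x in range(n)) == 0 for y in range(n)))
# f2 has mean zero on every H2-coset (fixed x, varying y).
print(all(sum(f2[x][y] for y in range(n)) == 0 for x in range(n)))
```

The second check works because the average of {{\bf E}(f|H_1)} along any {H_2}-coset is just the global mean of {f}, which vanishes by hypothesis.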

There is also a generalisation of the above Bessel inequality (as well as several of the other results mentioned above) in which the subgroups {H_i} are replaced by more general coset progressions {H_i+P_i} (of bounded rank), so that one has a Bessel inequality controlling “local” Gowers uniformity norms such as {U^d_{P_i}} by “global” Gowers uniformity norms such as {U^{2d-1}_{P_i+P_j}}. This turns out to be particularly useful when attempting to compute polynomial averages such as

\displaystyle \sum_{n \leq N} \sum_{r \leq \sqrt{N}} f(n) g(n+r^2) h(n+2r^2) \ \ \ \ \ (2)


for various functions {f,g,h}. After repeated use of the van der Corput lemma, one can control such averages by expressions such as

\displaystyle \sum_{n \leq N} \sum_{h,m,k \leq \sqrt{N}} f(n) f(n+mh) f(n+mk) f(n+m(h+k))

(actually one ends up with more complicated expressions than this, but let’s use this example for sake of discussion). This can be viewed as an average of various {U^2} Gowers uniformity norms of {f} along arithmetic progressions of the form {\{ mh: h \leq \sqrt{N}\}} for various {m \leq \sqrt{N}}. Using the above Bessel inequality, this can be controlled in turn by an average of various {U^3} Gowers uniformity norms along rank two generalised arithmetic progressions of the form {\{ m_1 h_1 + m_2 h_2: h_1,h_2 \le \sqrt{N}\}} for various {m_1,m_2 \leq \sqrt{N}}. But for generic {m_1,m_2}, this rank two progression is close in a certain technical sense to the “global” interval {\{ n: n \leq N \}} (this is ultimately due to the basic fact that two randomly chosen large integers are likely to be coprime, or at least have a small gcd). As a consequence, one can use the concatenation theorems from our first paper to control expressions such as (2) in terms of global Gowers uniformity norms. This is important in number theoretic applications, when one is interested in computing sums such as

\displaystyle \sum_{n \leq N} \sum_{r \leq \sqrt{N}} \mu(n) \mu(n+r^2) \mu(n+2r^2)


\displaystyle \sum_{n \leq N} \sum_{r \leq \sqrt{N}} \Lambda(n) \Lambda(n+r^2) \Lambda(n+2r^2)

where {\mu} and {\Lambda} are the Möbius and von Mangoldt functions respectively. This is because we are able to control global Gowers uniformity norms of such functions (thanks to results such as the proof of the inverse conjecture for the Gowers norms, the orthogonality of the Möbius function with nilsequences, and asymptotics for linear equations in primes), but much less control is currently available for local Gowers uniformity norms, even with the assistance of the generalised Riemann hypothesis (see this previous blog post for some further discussion).

By combining these tools and strategies with the “transference principle” approach from our previous paper (as improved using the recent “densification” technique of Conlon, Fox, and Zhao, discussed in this previous post), we are able in particular to establish the following result:

Theorem 4 (Polynomial patterns in the primes) Let {P_1,\dots,P_k: {\bf Z} \rightarrow {\bf Z}} be polynomials of degree at most {d}, whose degree {d} coefficients are all distinct, for some {d \geq 1}. Suppose that {P_1,\dots,P_k} is admissible in the sense that for every prime {p}, there are {n,r} such that {n+P_1(r),\dots,n+P_k(r)} are all coprime to {p}. Then there exist infinitely many pairs {n,r} of natural numbers such that {n+P_1(r),\dots,n+P_k(r)} are prime.
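The admissibility hypothesis is a finite check. If {p > k}, then for any fixed {r} the {k} residue classes that {n} must avoid cannot exhaust all {p} classes, so only the primes {p \leq k} need to be examined. Here is a minimal Python sketch of this check, assuming for simplicity that the polynomials have integer coefficients (so that {P_i(r)} mod {p} depends only on {r} mod {p}); the function names are invented for illustration.

```python
def primes_up_to(k):
    """All primes p <= k, by trial division (k is tiny in practice)."""
    return [p for p in range(2, k + 1)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]

def is_admissible(polys):
    """Check admissibility of integer-coefficient polynomials given as callables r -> P(r)."""
    k = len(polys)
    for p in primes_up_to(k):
        # Look for residues n, r mod p making every n + P_i(r) coprime to p.
        found = any(
            all((n + P(r)) % p != 0 for P in polys)
            for n in range(p)
            for r in range(p)
        )
        if not found:
            return False
    return True

# The pattern n, n + r, n + r^3: admissible (take r = 0 and n coprime to p).
print(is_admissible([lambda r: 0, lambda r: r, lambda r: r ** 3]))

# The pattern n + r, n + r + 1: one of two consecutive integers is always even.
print(is_admissible([lambda r: r, lambda r: r + 1]))
```

The second example only illustrates the admissibility condition; it does not satisfy the distinct leading coefficient hypothesis of the theorem in any case.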

Furthermore, we obtain an asymptotic for the number of such pairs {n,r} in the range {n \leq N}, {r \leq N^{1/d}} (actually for minor technical reasons we reduce the range of {r} to be very slightly less than {N^{1/d}}). In fact one could in principle obtain asymptotics for smaller values of {r}, and relax the requirement that the degree {d} coefficients be distinct to the requirement that no two of the {P_i} differ by a constant, provided one had good enough local uniformity results for the Möbius or von Mangoldt functions. For instance, we can obtain an asymptotic for triplets of the form {n, n+r,n+r^d} unconditionally for {d \leq 5}, and conditionally on GRH for all {d}, using known results on primes in short intervals on average.

The {d=1} case of this theorem was obtained in a previous paper of myself and Ben Green (using the aforementioned conjectures on the Gowers uniformity norm and the orthogonality of the Möbius function with nilsequences, both of which are now proven). For higher {d}, an older result of Tamar and myself was able to tackle the case when {P_1(0)=\dots=P_k(0)=0} (though our results there only give lower bounds on the number of pairs {(n,r)}, and no asymptotics). Both of these results generalise my older theorem with Ben Green on the primes containing arbitrarily long arithmetic progressions. The theorem also extends to multidimensional polynomials, in which case there are some additional previous results; see the paper for more details. We also get a technical refinement of our previous result on narrow polynomial progressions in (dense subsets of) the primes by making the progressions just a little bit narrower in the case when the density of the set one is using is small.

This latter Bessel type inequality is particularly useful in combinatorial and number-theoretic applications, as it allows one to convert “global” Gowers uniformity norm (basically, bounds on norms such as {U^{2d-1}_{H_i+H_j}}) to “local” Gowers uniformity norm control.

Filed under: math.CO, math.DS, math.NT, paper Tagged: characteristic factor, concatenation theorems, Gowers uniformity norms, polynomial recurrence, Tamar Ziegler

April 08, 2016

ResonaancesApril Fools' 16: Was LIGO a hack?

This post is an April Fools' joke. LIGO's gravitational waves are for real. At least I hope so ;) 

We have recently had a few scientific embarrassments, where a big discovery announced with great fanfare was subsequently overturned by new evidence. We still remember OPERA's faster than light neutrinos which turned out to be a loose cable, or BICEP's gravitational waves from inflation, which turned out to be galactic dust emission... It seems that another such embarrassment is coming our way: LIGO's recent discovery of gravitational waves emitted in a black hole merger may share a similar fate. There are reasons to believe that the experiment was hacked, and the signal was injected by a prankster.

From the beginning, one reason to be skeptical about LIGO's discovery was that the signal seemed too beautiful to be true. Indeed, the experimental curve looked as if taken out of a textbook on general relativity, with a clearly visible chirp signal from the inspiral phase, followed by a ringdown signal when the merged black hole relaxes to the Kerr state. The reason may be that it *is* taken out of a textbook. This is at least what is strongly suggested by recent developments.

On EvilZone, a well-known hackers' forum, a hacker using the nickname Madhatter was boasting that it was possible to tamper with scientific instruments, including the LHC, the Fermi satellite, and the LIGO interferometer. When challenged, he or she uploaded a piece of code that allows one to access LIGO computers. Apparently, the hacker took advantage of the same backdoor that allows selected members of the LIGO team to inject a fake signal in order to test the analysis chain. This was brought to the attention of the collaboration members, who decided to test the code. To everyone's bewilderment, the effect was to reproduce exactly the same signal in the LIGO apparatus as the one observed in September last year!

Even though the traces of a hack cannot be discovered, there is little doubt now that there was foul play involved. It is not clear what the hacker's motive was: was it just a prank, or maybe an elaborate plan to discredit the scientists? What is even more worrying is that the same thing could happen in other experiments. The rumor is that the ATLAS and CMS collaborations are already checking whether the 750 GeV diphoton resonance signal could also have been injected by a hacker.

April 07, 2016

Backreaction10 Essentials of Quantum Mechanics

Vortices in a Bose-Einstein condensate.
Source: NIST.

Trying to score at next week’s dinner party? Here’s how to intimidate your boss by fluently speaking quantum.

1. Everything is quantum

It’s not like some things are quantum mechanical and other things are not. Everything obeys the same laws of quantum mechanics – it’s just that quantum effects of large objects are very hard to notice. This is why quantum mechanics was a latecomer in theoretical physics: It wasn’t until physicists had to explain why electrons sit on shells around the atomic nucleus that quantum mechanics became necessary to make accurate predictions.

2. Quantization doesn’t necessarily imply discreteness

“Quanta” are discrete chunks, but not everything becomes chunky on short scales. Electromagnetic waves are made of quanta called “photons,” so the waves can be thought of as discretized. And electron shells around the atomic nucleus can only have certain discrete radii. But other particle properties do not become discrete even in a quantum theory. The position of electrons in the conduction band of a metal for example is not discrete – the electron can occupy any place within the band. And the energy values of the photons that make up electromagnetic waves are not discrete either. For this reason, quantizing gravity – should we finally succeed at it – also does not necessarily mean that space and time have to be made discrete.

3. Entanglement is not the same as superposition

A quantum superposition is the ability of a system to be in two different states at the same time, and yet, when measured, one always finds one particular state, never a superposition. Entanglement on the other hand is a correlation between parts of a system – something entirely different. Superpositions are not fundamental: Whether a state is or isn’t a superposition depends on what you want to measure. A state can for example be in a superposition of positions and not in a superposition of momenta – so the whole concept is ambiguous. Entanglement on the other hand is unambiguous: It is an intrinsic property of each system and the so-far best known measure of a system’s quantum-ness. (For more details, read “What is the difference between entanglement and superposition?”)

4. There is no spooky action at a distance

Nowhere in quantum mechanics is information ever transmitted non-locally, so that it jumps over a stretch of space without having to go through all places in between. Entanglement is itself non-local, but it doesn’t do any action – it is a correlation that is not connected to non-local transfer of information or any other observable. It was a great confusion in the early days of quantum mechanics, but we know today that the theory can be made perfectly compatible with Einstein’s theory of Special Relativity in which information cannot be transferred faster than the speed of light.

5. It’s an active research area

It’s not like quantum mechanics is yesterday’s news. True, the theory originated more than a century ago. But many aspects of it became testable only with modern technology. Quantum optics, quantum information, quantum computing, quantum cryptography, quantum thermodynamics, and quantum metrology are all recently formed and presently very active research areas. With the new technology, interest in the foundations of quantum mechanics has also been reignited.

6. Einstein didn’t deny it

Contrary to popular opinion, Einstein was not a quantum mechanics denier. He couldn’t possibly be – the theory was so successful early on that no serious scientist could dismiss it. Einstein instead argued that the theory was incomplete, and believed the inherent randomness of quantum processes must have a deeper explanation. It was not that he thought the randomness was wrong, he just thought that this wasn’t the end of the story. For an excellent clarification of Einstein’s views on quantum mechanics, I recommend George Musser’s article “What Einstein Really Thought about Quantum Mechanics” (paywalled, sorry).

7. It’s all about uncertainty

The central postulate of quantum mechanics is that there are pairs of observables that cannot simultaneously be measured, like for example the position and momentum of a particle. These pairs are called “conjugate variables,” and the impossibility of measuring both their values precisely is what makes all the difference between a quantized and a non-quantized theory. In quantum mechanics, this uncertainty is fundamental, not due to experimental shortcomings.
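For position and momentum, this limit is usually written as the Heisenberg uncertainty relation. In the standard notation (which is not otherwise used in this post), \Delta x and \Delta p are the standard deviations of position and momentum measurements and \hbar is the reduced Planck constant:

```latex
\Delta x \, \Delta p \;\geq\; \frac{\hbar}{2}
```

Because \hbar is tiny on everyday scales, the bound is irrelevant for large objects but not for particles such as electrons.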

8. Quantum effects are not necessarily small...

We do not normally observe quantum effects on long distances because the necessary correlations are very fragile. Treat them carefully enough however, and quantum effects can persist over long distances. Photons have for example been entangled over separations as much as several hundred kilometers. And in Bose-Einstein condensates, up to several million atoms have been brought into one coherent quantum state. Some researchers even believe that dark matter has quantum effects which span through whole galaxies.

9. ...but they dominate the small scales

In quantum mechanics, every particle is also a wave and every wave is also a particle. The effects of quantum mechanics become very pronounced once one observes a particle on distances that are comparable to the associated wavelength. This is why atomic and subatomic physics cannot be understood without quantum mechanics, whereas planetary orbits are entirely unaffected by quantum behavior.
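To put rough numbers on this, the wavelength associated with a particle of mass m moving at speed v is the de Broglie wavelength, h / (m v). The Python sketch below compares an electron at roughly atomic speeds with the Earth on its orbit. The masses and speeds are rounded, illustrative values chosen for this example, and the formula used is the non-relativistic one.

```python
PLANCK_H = 6.62607015e-34  # Planck constant in J*s (exact by SI definition)


def de_broglie_wavelength(mass_kg, speed_m_s):
    """Non-relativistic de Broglie wavelength h / (m * v), in metres."""
    return PLANCK_H / (mass_kg * speed_m_s)


# Electron moving at about 2.2e6 m/s (an illustrative atomic-scale speed).
electron_wavelength = de_broglie_wavelength(9.109e-31, 2.2e6)

# Earth (about 5.97e24 kg) moving at about 3.0e4 m/s around the Sun.
earth_wavelength = de_broglie_wavelength(5.97e24, 3.0e4)

print(f"electron: {electron_wavelength:.2e} m")
print(f"Earth:    {earth_wavelength:.2e} m")
```

The electron's wavelength comes out at a few tenths of a nanometre, comparable to the size of an atom, while the Earth's is more than sixty orders of magnitude below a metre.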

10. Schrödinger’s cat is dead. Or alive. But not both.

It was not well-understood in the early days of quantum mechanics, but the quantum behavior of macroscopic objects decays very rapidly. This “decoherence” is due to constant interactions with the environment which are, in relatively warm and dense places like those necessary for life, impossible to avoid. Bringing large objects into superpositions of two different states is therefore extremely difficult and the superposition fades rapidly.

The heaviest object that has so far been brought into a superposition of locations is a carbon-60 molecule, and it has been proposed to do this experiment also for viruses or even heavier creatures like bacteria. Thus, the paradox that Schrödinger’s cat once raised – the transfer of a quantum superposition (the decaying atom) to a large object (the cat) – has been resolved. We now understand that while small things like atoms can exist in superpositions for extended amounts of time, a large object would settle extremely rapidly in one particular state. That’s why we never see cats that are both dead and alive.

[This post previously appeared on Starts With A Bang.]

April 04, 2016

John BaezComputing the Uncomputable

I love the more mind-blowing results of mathematical logic:

Surprises in logic.

Here’s a new one:

• Joel David Hamkins, Any function can be computable.

Let me try to explain it without assuming you’re an expert on mathematical logic. That may be hard, but I’ll give it a try. We need to start with some background.

First, you need to know that there are many different ‘models’ of arithmetic. If you write down the usual axioms for the natural numbers, the Peano axioms (or ‘PA’ for short), you can then look around for different structures that obey these axioms. These are called ‘models’ of PA.

One of them is what you think the natural numbers are. For you, the natural numbers are just 0, 1, 2, 3, …, with the usual way of adding and multiplying them. This is usually called the ‘standard model’ of PA. The numbers 0, 1, 2, 3, … are called the ‘standard’ natural numbers.

But there are also nonstandard models of arithmetic. These models contain extra numbers besides the standard ones! These are called ‘nonstandard’ natural numbers.

This takes a while to get used to. There are several layers of understanding to pass through.

For starters, you should think of these extra ‘nonstandard’ natural numbers as bigger than all the standard ones. So, imagine a whole bunch of extra numbers tacked on after the standard natural numbers, with the operations of addition and multiplication cleverly defined in such a way that all the usual axioms still hold.

You can’t just tack on finitely many extra numbers and get this to work. But there can be countably many, or uncountably many. There are infinitely many different ways to do this. They are all rather hard to describe.

To get a handle on them, it helps to realize this. Suppose you have a statement S in arithmetic that is neither provable nor disprovable from PA. Then S will hold in some models of arithmetic, while its negation not(S) will hold in some other models.

For example, the Gödel sentence G says “this sentence is not provable in PA”. If Peano arithmetic is consistent, neither G nor not(G) is provable in PA. So G holds in some models, while not(G) holds in others.

Thus, you can intuitively think of different models as “possible worlds”. If you have an undecidable statement, meaning one that you can’t prove or disprove in PA, then it holds in some worlds, while its negation holds in other worlds.

In the case of the Gödel sentence G, most mathematicians think G is “true”. Why the quotes? Truth is a slippery concept in logic—there’s no precise definition of what it means for a sentence in arithmetic to be “true”. All we can precisely define is:

1) whether or not a sentence is provable from some axioms


2) whether or not a sentence holds in some model.

Nonetheless, mathematicians are human, so they have beliefs about what’s true. Many mathematicians believe that G is true: indeed, in popular accounts one often hears that G is “true but unprovable in Peano arithmetic”. So, these mathematicians are inclined to say that any model where G doesn’t hold is nonstandard.

The result

Anyway, what is Joel David Hamkins’ result? It’s this:

There is a Turing machine T with the following property. For any function f from the natural numbers to the natural numbers, there is a model of PA such that in this model, if we give T any standard natural n as input, it halts and outputs f(n).

So, take f to be your favorite uncomputable function. Then there’s a model of arithmetic such that in this model, the Turing machine computes f, at least when you feed the machine standard numbers as inputs.

So, very very roughly, there’s a possible world in which your uncomputable function becomes computable!

But you have to be very careful about how you interpret this result.

The trick

What’s the trick? The proof is beautiful, but it would take real work to improve on Hamkins’ blog article, so please read that. I’ll just say that he makes extensive use of Rosser sentences, which say:

“For any proof of this sentence in theory T, there is a smaller proof of the negation of this sentence.”

Rosser sentences are already mind-blowing, but Hamkins uses an infinite sequence of such sentences and their negations, chosen in a way that depends on the function f, to cleverly craft a model of arithmetic in which the Turing machine T computes this function on standard inputs.

But what’s really going on? How can using a nonstandard model make an uncomputable function become computable for standard natural numbers? Shouldn’t nonstandard models agree with the standard one on this issue? After all, the only difference is that they have extra nonstandard numbers tacked on after all the standard ones! How can that make a Turing machine succeed in computing f on standard natural numbers?

I’m not 100% sure, but I think I know the answer. I hope some logicians will correct me if I’m wrong.

You have to read the result rather carefully:

There is a Turing machine T with the following property. For any function f from the natural numbers to the natural numbers, there is a model of PA such that in this model, if we give T any standard natural n as input, it halts and computes f(n).

When we say the Turing machine halts, we mean it halts after N steps for some natural number N. But this may not be a standard natural number! It’s a natural number in the model we’re talking about.

So, the Turing machine halts… but perhaps only after a nonstandard number of steps.

In short: you can compute the uncomputable, but only if you’re willing to wait long enough. You may need to wait a nonstandard amount of time.

It’s like that old Navy saying:


But the trick becomes more evident if you notice that one single Turing machine T computes different functions from the natural numbers to the natural numbers… in different models. That’s even weirder than computing an uncomputable function.

The only way to build a machine that computes n+1 in one model and n+2 in another is to build a machine that doesn’t halt in a standard amount of time in either model. It only halts after a nonstandard amount of time. In one model, it halts and outputs n+1. In another, it halts and outputs n+2.

A scary possibility

To dig a bit deeper—and this is where it gets a bit scary—we have to admit that the standard model is a somewhat elusive thing. I certainly didn’t define it when I said this:

For you, the natural numbers are just 0, 1, 2, 3, …, with the usual way of adding and multiplying them. This is usually called the standard model of PA. The numbers 0, 1, 2, 3, … are called the ‘standard’ natural numbers.

The point is that “0, 1, 2, 3, …” here is vague. It makes sense if you already know what the standard natural numbers are. But if you don’t already know, those three dots aren’t going to tell you!

You might say the standard natural numbers are those of the form 1 + ··· + 1, where we add 1 to itself some finite number of times. But what does ‘finite number’ mean here? It means a standard natural number! So this is circular.

So, conceivably, the concept of ‘standard’ natural number, and the concept of ‘standard’ model of PA, are more subjective than most mathematicians think. Perhaps some of my ‘standard’ natural numbers are nonstandard for you!

I think most mathematicians would reject this possibility… but not all. Edward Nelson tackled it head-on in his marvelous book Internal Set Theory. He writes:

Perhaps it is fair to say that “finite” does not mean what we have always thought it to mean. What have we always thought it to mean? I used to think that I knew what I had always thought it to mean, but I no longer think so.

If we go down this road, Hamkins’ result takes on a different significance. It says that any subjectivity in the notion of ‘natural number’ may also infect what it means for a Turing machine to halt, and what function a Turing machine computes when it does halt.

BackreactionNew link between quantum computing and black hole may solve information loss problem

[image source: IBT]

If you leave the city limits of Established Knowledge and pass the Fields of Extrapolation, you enter the Forest of Speculations. As you get deeper into the forest, larger and larger trees impinge on the road, strangely deformed, knotted onto themselves, bent over backwards. They eventually grow so close that they block out the sunlight. It must be somewhere here, just before you cross over from speculation to insanity, that Gia Dvali looks for new ideas and drags them into the sunlight.

Dvali’s newest idea is that every black hole is a quantum computer. And not just any quantum computer, but a quantum computer made of a Bose-Einstein condensate that self-tunes to the quantum critical point. In one sweep, he has combined everything that is cool in physics at the moment.

This link between black holes and Bose-Einstein condensates is based on simple premises. Dvali set out to find some stuff that would share properties with black holes, notably the relation between entropy and mass (BH entropy), the decrease in entropy during evaporation (Page time), and the ability to scramble information quickly (scrambling time). What he found was that certain condensates do exactly this.

Consequently he went and conjectured that this is more than a coincidence, and that black holes themselves are condensates – condensates of gravitons, whose quantum criticality allows the fast scrambling. The gravitons equip black holes with quantum hair on horizon scale, and hence provide a solution to the black hole information loss problem by first storing information and then slowly leaking it out.

Bose-Einstein condensates on the other hand contain long-range quantum effects that make them good candidates for quantum computers. The individual q-bits that have been proposed for use in these condensates are normally correlated atoms trapped in optical lattices. Based on his analogy with black holes however, Dvali suggests using a different type of state for information storage, which would optimize the storage capacity.

I had the opportunity to speak with Immanuel Bloch from the Max Planck Institute for Quantum Optics about Dvali’s idea, and I learned that while it seems possible to create a self-tuned condensate to mimic the black hole, addressing the states that Dvali has identified is difficult and, at least presently, not practical. You can read more about this in my recent Aeon essay.

But really, you may ask, what isn’t a quantum computer? Doesn’t anything that changes in time according to the equations of quantum mechanics process information and compute something? Doesn’t every piece of chalk execute the laws of nature and evaluate its own fate, doing a computation that somehow implies something with quantum?

That’s right. But when physicists speak of quantum computers, they mean a particularly powerful collection of entangled states, assemblies that can hold and manipulate much more information than a largely classical state. It’s this property of quantum computers specifically that Dvali claims black holes must also possess. The chalk just won’t do.

If what Dvali says is correct, a real black hole out there in space doesn’t compute anything in particular. It merely stores the information of what fell in and spits it back out again. But a better understanding of how to initialize a state might allow us one day – give it some hundred years – to make use of nature’s ability to distribute information enormously quickly.

The relevant question is of course, can you test that it’s true?

I first heard of Dvali’s idea at a conference I attended last year in July. In his talk, Dvali spoke about possible observational evidence for the quantum hair due to modifications of orbits near the black hole. At least that’s my dim recollection almost a year later. He showed some preliminary results of this, but the paper hasn’t been published and the slides aren’t online. Instead, together with some collaborators, he published a paper arguing that the idea is compatible with the Hawking, Perry, Strominger proposal to solve the black hole information loss problem, which also relies on black hole hair.

In November then, I heard another talk by Stefan Hofmann, who had also worked on some aspects of the idea that black holes are Bose-Einstein condensates. He told the audience that one might see a modification in the gravitational wave signal of black hole merger ringdowns. Which have since indeed been detected. Again though, there is no paper.

So I am tentatively hopeful that we can look for evidence of this idea in the near future, but so far there aren’t any predictions. I have a proposal of my own to add for observational consequences of this approach, which is to look at the scattering cross-section of the graviton condensate with photons in the wavelength regime of the horizon size (i.e. radio waves). I don’t have time to really work on this, but if you’re looking for a one-year project in quantum gravity phenomenology, this one seems interesting.

Dvali’s idea has some loose ends of course. Notably it isn’t clear how the condensate escapes collapse, at least it isn’t clear to me and not clear to anyone I talked to. The general argument is that for the condensate the semi-classical limit is a bad approximation, and thus the singularity theorems are rather meaningless. While that might be, it’s too vague for my comfort. The idea also seems superficially similar to the fuzzball proposal, and it would be good to know the relation or differences.

After these words of caution, let me add that this link between condensed matter, quantum information, and black holes isn’t as crazy as it seems at first. In recent years, a lot of research has piled up that tightens the connections between these fields. Indeed, a recent paper by Brown et al hypothesizes that black holes are not only the most efficient storage devices but indeed the fastest computers.

It’s amazing just how much we have learned from a single solution to Einstein’s field equations, and not even a particularly difficult one. “Black hole physics” really should be a research field in its own right.

John Preskilllittle by little and gate by gate

Washington state was drizzling on me. I was dashing from a shuttle to Building 112 on Microsoft’s campus. Microsoft has headquarters near Seattle. The state’s fir trees refreshed me. The campus’s vastness awed me. The conversations planned for the day enthused me. The drizzle dampened me.

Building 112 houses QuArC, one of Microsoft’s research teams. “QuArC” stands for “Quantum Architectures and Computation.” Team members develop quantum algorithms and codes. QuArC members write, as their leader Dr. Krysta Svore says, “software for computers that don’t exist.”

Small quantum computers exist. Large ones have eluded us like gold at the end of a Washington rainbow. Large quantum computers could revolutionize cybersecurity, materials engineering, and fundamental physics. Quantum computers are growing, in labs across the world. When they mature, the computers will need software.

Software consists of instructions. Computers follow instructions as we do. Suppose you want to find and read the poem “anyone lived in a pretty how town,” by 20th-century American poet e e cummings. You follow steps—for example:

1) Wake up your computer.
2) Type your password.
3) Hit “Enter.”
4) Kick yourself for entering the wrong password.
5) Type the right password.
6) Hit “Enter.”
7) Open a web browser.
8) Navigate to Google.
9) Type “anyone lived in a pretty how town e e cummings” into the search bar.
10) Hit “Enter.”
11) Click the Academy of American Poets’ link.
12) Exclaim, “Really? April is National Poetry Month?”
13) Read about National Poetry Month for four-and-a-half minutes.
14) Remember that you intended to look up a poem.
15) Return to the Academy of American Poets’ “anyone lived” webpage.
16) Read the poem.

We break tasks into chunks executed sequentially. So do software writers. Microsoft researchers break up tasks intended for quantum computers to perform.

Your computer completes tasks by sending electrons through circuits. Quantum computers will have circuits. A circuit contains wires, which carry information. The wires run through circuit components called gates. Gates manipulate the information in the wires. A gate can, for instance, add the number carried by this wire to the number carried by that wire.

Running a circuit amounts to completing a task, like hunting a poem. Computer engineers break each circuit into wires and gates, as we broke poem-hunting into steps 1-16.¹
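As a small illustration of wires and gates, here is a Python sketch of a classical (not quantum) circuit that adds two 2-bit numbers. Each variable plays the role of a wire carrying one bit, and each function plays the role of a gate or a small block of gates. The function names are made up for this example.

```python
def xor_gate(a, b):
    """Output 1 when exactly one input wire carries 1."""
    return a ^ b


def and_gate(a, b):
    """Output 1 only when both input wires carry 1."""
    return a & b


def or_gate(a, b):
    """Output 1 when at least one input wire carries 1."""
    return a | b


def half_adder(a, b):
    """Add two one-bit wires; return (sum_bit, carry_bit)."""
    return xor_gate(a, b), and_gate(a, b)


def full_adder(a, b, carry_in):
    """Add two one-bit wires plus an incoming carry; return (sum_bit, carry_out)."""
    partial_sum, carry_1 = half_adder(a, b)
    total_sum, carry_2 = half_adder(partial_sum, carry_in)
    return total_sum, or_gate(carry_1, carry_2)


def add_two_bit(a1, a0, b1, b0):
    """Add the 2-bit numbers (a1 a0) and (b1 b0); return (carry, sum1, sum0)."""
    sum0, carry0 = half_adder(a0, b0)
    sum1, carry1 = full_adder(a1, b1, carry0)
    return carry1, sum1, sum0


# 3 + 1 = 4, i.e. binary 11 + 01 = 100.
result = add_two_bit(1, 1, 0, 1)
print(result)  # (1, 0, 0)
```

The whole addition decomposes into five primitive-gate evaluations wired together, which is the same kind of breakdown as the sixteen poem-hunting steps.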

Circuits hearten me, because decomposing tasks heartens me. Suppose I demanded that you read a textbook in a week, or create a seminar in a day, or crack a cybersecurity system. You’d gape like a visitor to Washington who’s realized that she’s forgotten her umbrella.


Suppose I demanded instead that you read five pages, or create one PowerPoint slide, or design one element of a quantum circuit. You might gape. But you’d have more hope.² Life looks more manageable when broken into circuit elements.

Circuit decomposition (and life decomposition) brings to mind “anyone lived in a pretty how town.” The poem concerns two characters who revel in everyday events. Laughter, rain, and stars mark their time. The more the characters attune to nature’s rhythm, the more vibrantly they live:³

          little by little and was by was

          all by all and deep by deep
          and more by more they dream their sleep

Those lines play in my mind when a seminar looms, or a trip to Washington coincident with a paper deadline, or a quantum circuit I’ve no idea how to parse. Break down the task, I tell myself. Inch by inch, we advance. Little by little and drop by drop, step by step and gate by gate.

IBM circuit

Not what e e cummings imagined when composing “anyone lived in a pretty how town”

Unless you’re dashing through raindrops to gate designers at Microsoft. I don’t recommend inching through Washington’s rain. But I would have dashed in a drought. What sees us through everyday struggles—the inching of science—if not enthusiasm? We tackle circuits and struggles because, beyond the drizzle, lie ideas and conversations that energize us to run.


e e cummings

With thanks to QuArC members for their time and hospitality.

¹ One might object that Steps 4 and 14 don’t belong in the instructions. But software involves error correction.

² Of course you can design a quantum-circuit element. Anyone can quantum.

³ Even after the characters die.

April 01, 2016

Jordan EllenbergNew bounds on curve tangencies and orthogonalities (with Solymosi and Zahl)

New paper up on the arXiv, with Jozsef Solymosi and Josh Zahl.  Suppose you have n plane curves of bounded degree.  There ought to be about n^2 intersections between them.  But there are intersections and there are intersections!  Generically, an intersection between two curves is a node.  But maybe the curves are mutually tangent at a point — that’s a more intense kind of singularity called a tacnode.  You might think, well, OK, a tacnode is just some singularity of bounded multiplicity, so maybe there could still be a constant multiple of n^2 mutual tangencies.

No!  In fact, we show there are O(n^{3/2}).  (Megyesi and Szabo had previously given an upper bound of the form n^{2-delta} in the case where the curves are all conics.)

Is n^{3/2} best possible?  Good question.  The best known lower bound is given by a configuration of n circles with about n^{4/3} mutual tangencies.

Here’s the main idea.  If a curve C starts life in A^2, you can lift it to a curve C’ in A^3 by sending each point (x,y) to (x,y,z) where z is the slope of C at (x,y); of course, if multiple branches of the curve go through (x,y), you are going to have multiple points in C’ over (x,y).  So C’ is isomorphic to C at the smooth points of C, but something’s happening at the singularities of C; basically, you’ve blown up!  And when you blow up a tacnode, you get a regular node — the two branches of C through (x,y) have the same slope there, so they remain in contact even in C’.
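The lifting step can be tried numerically. The Python sketch below computes the lift (x, y, slope) at a smooth point of a curve F = 0 using the implicit slope -F_x / F_y, and compares a tangency with a transversal crossing. The particular curves and the helper names are choices made for this example only, and the sketch assumes the tangent line is not vertical.

```python
def lift(gradient, point):
    """Lift a smooth point (x, y) of the curve F = 0 to (x, y, slope).

    gradient(x, y) returns (F_x, F_y); the slope of the curve is -F_x / F_y,
    so this assumes F_y is nonzero at the point (no vertical tangent).
    """
    x, y = point
    f_x, f_y = gradient(x, y)
    return (x, y, -f_x / f_y)


def upper_circle_gradient(x, y):
    """Gradient of F = x^2 + (y - 1)^2 - 1 (unit circle centred at (0, 1))."""
    return (2 * x, 2 * (y - 1))


def lower_circle_gradient(x, y):
    """Gradient of F = x^2 + (y + 1)^2 - 1 (unit circle centred at (0, -1))."""
    return (2 * x, 2 * (y + 1))


def big_circle_gradient(x, y):
    """Gradient of F = x^2 + y^2 - 2."""
    return (2 * x, 2 * y)


def diagonal_line_gradient(x, y):
    """Gradient of F = y - x (the line y = x)."""
    return (-1, 1)


# The two unit circles are tangent at the origin: their lifts coincide.
tangent_lifts = (lift(upper_circle_gradient, (0, 0)),
                 lift(lower_circle_gradient, (0, 0)))

# The circle x^2 + y^2 = 2 and the line y = x cross transversally at (1, 1):
# the slopes differ, so the lifted curves no longer meet over that point.
crossing_lifts = (lift(big_circle_gradient, (1, 1)),
                  lift(diagonal_line_gradient, (1, 1)))

print(tangent_lifts[0] == tangent_lifts[1])
print(crossing_lifts[0] == crossing_lifts[1])
```

So tangencies in the plane survive as incidences in A^3, while ordinary crossings are separated, which is why counting tangencies turns into a 3-space incidence problem.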

Now you have a bunch of bounded degree curves in A^3 which have an unexpectedly large amount of intersection; at this point you’re right in the mainstream of incidence geometry, where incidences between points and curves in 3-space are exactly the kind of thing people are now pretty good at bounding.  And bound them we do.

Interesting to let one’s mind wander over this stuff.  Say you have n curves of bounded degree.  So yes, there are roughly n^2 intersection points — generically, these will be distinct nodes, but you can ask how non-generic can the intersection be?  You have a partition of const*n^2 coming from the multiplicity of intersection points, and you can ask what that partition is allowed to look like.  For instance, how much of the “mass” can come  from points where the multiplicity of intersection is at least r?  Things like that.


March 31, 2016

n-Category Café Foundations of Mathematics

Roux Cody recently posted an interesting article complaining about FOM — the foundations of mathematics mailing list:

Cody argued that type theory and especially homotopy type theory don’t get a fair hearing on this list, which focuses on traditional set-theoretic foundations.

This will come as no surprise to people who have posted about category-theoretic foundations on this list. But the discussion became more interesting when Harvey Friedman, the person Cody was implicitly complaining about, joined in. Friedman is a famous logician who posts frequently on Foundations of Mathematics. He explained his “sieve” — his procedure for deciding what topics are worth studying further — and why this sieve has so far filtered out homotopy type theory.

This made me think — and not for the first time — about why different communities with different attitudes toward “foundations” have trouble understanding each other. They argue, but the arguments aren’t productive, because they talk past each other.

After much discussion, Mario Carneiro wrote:

So I avoid any notion of “standard foundations of mathematics”, and prefer a multi-foundational approach, where the “true” theorems are those that make sense in any (and hence all) foundational systems.

I replied:

One nice thing your comment clarifies is that different people have very different attitudes toward foundations, which need to be discussed before true communication about the details can occur.

Harvey Friedman seems to believe that foundations should be “simple” (a concept that begs further explication) and sufficiently powerful to formalize all, or perhaps most, mathematics. For example, he wrote:

When I go to the IAS and ask in person where the most elementary place is that there is a real problem in formalizing mathematics using the standard f.o.m….

as part of an explanation of his “sieve”.

Most people interested in categorical foundations have a completely different attitude. This is why nothing they do will ever pass through Friedman’s sieve.

First, they — I might even say “we” — want an approach to mathematics that formalizes how modern mathematicians actually think, with a minimum of arbitrary “encoding”. For us, it is unsatisfactory to be told that $\sqrt{-3}$ is a set. In ZFC it is encoded as a set, and it’s very good that this is possible, but mathematicians don’t usually think of complex numbers as sets, and if you repeatedly raised your hand and asked what are the members of various complex numbers, you’d be laughed out of a seminar.

Second, when I speak of “how modern mathematicians think”, this is not a monolithic thing: there are different historical strata that need to be considered. There seems to be a huge divide between people like Lawvere, who were content with topoi, and people who are deeply involved in the homotopification of mathematics, for whom anything short of an $(\infty,1)$-topos seems insanely restrictive. Homotopy type theory is mainly appealing to the latter people.

Anyone who has not been keeping up with modern mathematics will not understand what I mean by “homotopification”. Yuri Manin explained it very nicely in an interview:

But fundamental psychological changes also occur. Nowadays these changes take the form of complicated theories and theorems, through which it turns out that the place of old forms and structures, for example, the natural numbers, is taken by some geometric, right-brain objects. Instead of sets, clouds of discrete elements, we envisage some sorts of vague spaces, which can be very severely deformed, mapped one to another, and all the while the specific space is not important, but only the space up to deformation. If we really want to return to discrete objects, we see continuous components, the pieces whose form or even dimension does not matter. Earlier, all these spaces were thought of as Cantor sets with topology, their maps were Cantor maps, some of them were homotopies that should have been factored out, and so on.

I am pretty strongly convinced that there is an ongoing reversal in the collective consciousness of mathematicians: the right hemispherical and homotopical picture of the world becomes the basic intuition, and if you want to get a discrete set, then you pass to the set of connected components of a space defined only up to homotopy. That is, the Cantor points become continuous components, or attractors, and so on — almost from the start. Cantor’s problems of the infinite recede to the background: from the very start, our images are so infinite that if you want to make something finite out of them, you must divide them by another infinity.

This will probably sound vague and mysterious to people who haven’t learned homotopy theory and haven’t seen how it’s transforming algebra (e.g. the so-called “brave new algebra”), geometry (“higher gauge theory”) and other subjects. Until the textbooks are written, to truly understand this ongoing revolution one must participate in it.

Again, all this stuff can be formalized in ZFC, but only at the cost of arbitrary encodings that do violence to the essence of the ideas. A homotopy type can be encoded as a topological space, or as a simplicial set, or as a cubical set, or as a globular $\infty$-groupoid… but all of these do some violence to the basic idea, which is — as Manin points out — something much more primitive and visual in nature. A homotopy type is a very flexible blob, not made of points.

Third, because people interested in categorical foundations are interested in formalizing mathematics in a way that fits how mathematicians actually think, and different mathematicians think in different ways at different times, we tend to prefer what you call a “multi-foundational” approach. Personally I don’t think the metaphor of “foundations” is even appropriate for this approach. I prefer a word like “entrance”. A building has one foundation, which holds up everything else. But mathematics doesn’t need anything to hold it up: there is no “gravity” that pulls mathematics down and makes it collapse. What mathematics needs is “entrances”: ways to get in. And it would be very inconvenient to have just one entrance.

Jordan EllenbergDissecting squares into equal-area triangles: idle questions

Love this post from Matt Baker in which he explains the tropical / 2-adic proof (in fact the only proof!) that you can’t dissect a square into an odd number of triangles of equal area.  In fact, his argument proves more, I think — you can’t dissect a square into triangles whose areas are all rational numbers with odd denominator!

  • The space of quadrilaterals in R^2, up to the action of affine linear transformations, is basically just R^2, right?  Because you can move three vertices to (0,0), (0,1), (1,0) and then you’re basically out of linear transformations.   And the property “can be decomposed into n triangles of equal area” is invariant under those transformations.  OK, so — for which choices of the “fourth vertex” do you get a quadrilateral that has a decomposition into an odd number of equal-area triangles? (I think once you’re not a parallelogram you lose the easy decomposition into 2 equal area triangles, so I suppose generically maybe there’s NO equal-area decomposition?)  When do you have a decomposition into triangles whose area has odd denominator?
  • What if you replace the square with the torus R^2 / Z^2; for which n can you decompose the torus into equal-area triangles?  What about a Riemann surface with constant negative curvature?  (Now a “triangle” is understood to be a geodesic triangle.)  If I have this right, there are plenty of examples of such surfaces with equal-area triangulations — for instance, Voight gives lots of examples of Shimura curves corresponding to cocompact arithmetic subgroups which are finite index in triangle groups; I think that lets you decompose the Riemann surface into a union of fundamental domains, each of which is a geodesic triangle of the same area.
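The normalization in the first bullet can be sketched in a few lines of Python. This is only an illustration under my own conventions: the function name `fourth_vertex` and the choice of which three vertices go to (0,0), (1,0), (0,1) are assumptions, not anything from Baker's post. Since affine maps scale all areas by the same factor, "has a dissection into n equal-area triangles" depends only on the point returned.

```python
from fractions import Fraction

def fourth_vertex(a, b, c, d):
    """Apply the affine map sending a -> (0,0), b -> (1,0), d -> (0,1)
    to the quadrilateral a, b, c, d and return the image of c."""
    ux, uy = b[0] - a[0], b[1] - a[1]   # edge a -> b
    vx, vy = d[0] - a[0], d[1] - a[1]   # edge a -> d
    px, py = c[0] - a[0], c[1] - a[1]   # diagonal a -> c
    det = ux * vy - uy * vx
    if det == 0:
        raise ValueError("a, b, d are collinear, so no such affine map exists")
    # Solve (px, py) = s * (ux, uy) + t * (vx, vy) by Cramer's rule.
    s = Fraction(px * vy - py * vx, det)
    t = Fraction(ux * py - uy * px, det)
    return (s, t)

# Any parallelogram normalizes to the unit square: fourth vertex (1, 1).
print(fourth_vertex((0, 0), (2, 1), (3, 4), (1, 3)))
# A non-parallelogram lands somewhere else in the plane.
print(fourth_vertex((0, 0), (2, 0), (3, 3), (0, 1)))
```

With this normalization the square is the single point (1, 1), and the question becomes which other points (s, t) admit odd equal-area dissections.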

John BaezShock Breakout

Here you can see the brilliant flash of a supernova as its core blasts through its surface. This is an animated cartoon made by NASA based on observations of a red supergiant star that exploded in 2011. It has been sped up by a factor of 240. You can see a graph of brightness showing the actual timescale at lower right.

When a star like this runs out of fuel for nuclear fusion, its core cools. That makes the pressure drop—so the core collapses under the force of gravity.

When the core of a supernova collapses, the infalling matter can reach almost a quarter the speed of light. So when it hits the center, this matter becomes very hot! Indeed, the temperature can reach 100 billion kelvin. That’s 6000 times the temperature of our Sun’s core!

For a supernova less than 25 solar masses, the collapse stops only when the core is compressed into a neutron star. As this happens, lots of electrons and protons become neutrons and neutrinos. Most of the resulting energy is instantly carried away by a ten-second burst of neutrinos. This burst can have an energy of 10^46 joules.

It’s hard to comprehend this. It’s what you’d get if you suddenly converted the mass of 18,000 Earths into energy! Astronomers use a specially huge unit with such energies: the foe, which stands for ten to the fifty-one ergs.

That’s 10^44 joules. So, a supernova can release 100 foe in neutrinos. By comparison, only 1 or 2 foe come out as light.
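These figures are easy to check. Here is a rough sketch in Python. The Earth's mass and the speed of light are standard rounded values that I am supplying (they are not quoted in the post), and the variable names are just for illustration.

```python
# Unit conversion: 1 erg = 1e-7 joules, and 1 foe = 1e51 ergs.
ERG_IN_JOULES = 1e-7
FOE_IN_ERGS = 1e51
foe_in_joules = FOE_IN_ERGS * ERG_IN_JOULES

# Energy of the neutrino burst quoted in the post.
neutrino_burst_joules = 1e46

# Assumed standard values, rounded (not from the post).
SPEED_OF_LIGHT = 2.998e8        # metres per second
EARTH_MASS_KG = 5.972e24        # kilograms

# Rest energy of one Earth, E = m c^2.
earth_rest_energy = EARTH_MASS_KG * SPEED_OF_LIGHT ** 2

burst_in_foe = neutrino_burst_joules / foe_in_joules
burst_in_earths = neutrino_burst_joules / earth_rest_energy

print(f"1 foe = {foe_in_joules:.1e} J")
print(f"neutrino burst = {burst_in_foe:.0f} foe")
print(f"neutrino burst = rest energy of about {burst_in_earths:.0f} Earths")
```

With these inputs the burst comes out at 100 foe and between 18,000 and 19,000 Earth masses, in line with the numbers above.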

Why? Neutrinos can effortlessly breeze through matter. Light cannot! So it takes longer to actually see things happen at the star’s surface—especially since a red supergiant is large. This one was about 500 times the radius of our Sun.

So what happened? A shock wave rushed upward through the star. First it broke through the star’s surface in the form of finger-like plasma jets, which you can see in the animation.

20 minutes later, the full fury of the shock wave reached the surface—and the doomed star exploded in a blinding flash! This is called the shock breakout.

Then the star expanded as a blue-hot ball of plasma.

Here’s how the star’s luminosity changed with time, measured in multiples of the Sun’s luminosity:

Note that while the shock breakout seems very bright, it’s ultimately dwarfed by the luminosity of the expanding ball of plasma. So, KSN2011d was actually one of the first two supernovae for which the shock breakout was seen! For details, read this:

• P. M. Garnavich, B. E. Tucker, A. Rest, E. J. Shaya, R. P. Olling, D. Kasen and A. Villar, Shock breakout and early light curves of Type II-P supernovae observed with Kepler.

A Type II supernova is one that shows hydrogen in its spectral lines: these are commonly formed by the collapse of a star that has run out of fuel in its core, but retains hydrogen in its outer layers. A Type II-P is one that shows a plateau in its light curve: the P is for ‘plateau’. These are more common than the Type II-L, which show a more rapid (‘linear’) decay in their luminosity:

March 30, 2016

Jordan EllenbergBugbiter

Dream:  I meet two 9-year-old boys with identical long curly hair.  They’re in a band, the band is called Bugbiter.  They explain to me that most of their songs are about products, and they share their songs via videos they post on Amazon.

I share this dream with you mostly because I think Bugbiter is actually a legitimately good band name.

March 29, 2016

Sean CarrollScience Career Stories

The Story Collider is a wonderful institution with a simple mission: getting scientists to share stories with a broad audience. Literal, old-fashioned storytelling: standing up in front of a group of people and spinning a tale, typically with a scientific slant but always about real human life. It was founded in 2010 by Ben Lillie and Brian Wecht; I got to know Ben way back when he was a postdoc at Argonne and the University of Chicago, before he switched from academia to the less well-trodden paths of communication and the wrangling of non-profit organizations.

By now the Story Collider has accumulated quite a large number of great tales from scientists young and old, and I encourage you to catch a live show or crawl through their archives. I was able to participate in one about a year ago, where I shared the stage with a number of fascinating scientific storytellers. One of them was one of my mentors and favorite physicists, Alan Guth. Of course he has an advantage at this game in comparison to most other scientists, as he gets to tell the story of how he came up with one of the most influential ideas in modern cosmology: the inflationary universe.

It’s a great story, both for the science and for the personal aspect: Alan was near the end of his third postdoc at the time, and his academic prospects were far from clear. You just need that one brilliant idea to pop up at the right time.

But everyone’s path is different. Here, from a different event, is my young Caltech colleague Chiara Mingarelli, who explains how she ended up studying gravitational waves at the center of the universe.

Finally, it is my blog, so here is the story I told. I basically talked about myself, but I used my (occasionally humorous) interactions with Stephen Hawking as a hook. Never be afraid to hitch a ride on the coattails of someone immensely more successful, I always say.

March 26, 2016

Noncommutative GeometryAn indirect consequence of the famous Lucas congruence...

So, in the course of function field arithmetic, one runs into the binomial coefficients (like one does most everywhere in mathematics); or rather the coefficients modulo a prime p. The primary result about binomial coefficients modulo p is of course the congruence of Lucas. In function field arithmetic one seems to be unable to avoid the group obtained by permuting p-adic (or q-adic) coefficients

Scott AaronsonA postdoc post

I apologize that this announcement is late in this year’s hiring season, but here goes.  I’m seeking postdocs in computational complexity and/or quantum information science to join me at UT Austin starting in Fall of 2016.  As I mentioned before, there’s a wonderful CS theory group at UT that you can work with and benefit from, including Adam Klivans, David Zuckerman, Anna Gal, Vijaya Ramachandran, Brent Waters, Eric Price, Greg Plaxton, and of course my wife Dana Moshkovitz, who will be joining UT as well.  If you’re interested, please email me a CV and a short cover letter, and ask your PhD adviser and one or two others to email me recommendation letters.  The postdoc would be for two years by default.

    Update (March 26): If you want to be considered for next year, please get your application to me by March 31st.

    Another Update: I’m very honored, along with fourteen others, to have received a 2016 US National Security Science and Engineering Faculty Fellowship (NSSEFF), which supports unclassified basic research related in some way to DoD interests. My project is called “Paths to Quantum Supremacy.” Now that my Waterman award has basically been spent down, this is where much of the funding for quantum computing initiatives at UT Austin will come from for the next five years.

March 24, 2016

John PreskillRemember to take it slow

“Spiros, can you explain to me this whole business about time being an illusion?”

These were William Shatner’s words to me, minutes after I walked into the green room at Silicon Valley’s Comic Con. The iconic Star Trek actor, best known for his portrayal of James Tiberius Kirk, captain of the starship Enterprise, was chatting with Andy Weir, author of The Martian, when I showed up at the door. I was obviously in the wrong room. I had been looking for the room reserved for science panelists, but had been sent up an elevator to the celebrity green room instead (a special room reserved for VIPs during their appearance at the convention). Realizing quickly that something was off, I did what anyone else would do in my position. I sat down. To my right was Mr. Weir and to my left was Mr. Shatner and his agent, Mr. Gary Hasson. For the first few minutes I was invisible, listening in casually as Mr. Weir revealed juicy details about his upcoming novel. And then, it happened. Mr. Shatner turned to me and asked: “And who are you?” Keep calm young man. You can outrun him if you have to. You are as entitled to the free croissants as any of them. “I am Spiros,” I replied. “And what do you do, Spiros?” he continued. “I am a quantum physicist at Caltech.” Drop the mic. Boom. Now I will see myself out before security…


“Spiros, can you explain to me this whole business about time being an illusion?”

Huh, I wonder if he means the… “You know, how there is no past, present or future in quantum mechanics,” Mr. Shatner continued. “Well, yes,” I responded, “that is called the arrow of time, an emergent direction in the time parameter found in the equation describing evolution in quantum physics. By the way, that time parameter itself is also emergent.” And then things got out of hand. “Wait a minute, are you telling me that not just the arrow of time, but time itself as a concept is an illusion?” asked Mr. Shatner with genuine excitement. “Yes. For starters, the arrow of time itself is a consequence of an emergent asymmetry between events that are all equally likely at the microscopic level. Think about flipping a fair coin one hundred times, for example. The probability of getting all heads is astronomically small. Zero point zero zero zero… with thirty zeroes before the one. Same is true if I ask you how likely it is that you flip fifty heads and then fifty tails,” I said and waited. “OK… still following,” Mr. Shatner assured me, so I continued, “but, say that you have trouble keeping track of all the different positions of the heads and tails; all you care about is counting how many times you flipped heads and how many times you flipped tails. What is the probability that you would count one hundred heads?” I asked. Mr. Shatner thought for a second, and so did Mr. Weir, before they answered almost in unison, “Well, it is still astronomically small. Just like before.” Yes! Holy cow, Batman, this is actually happening. I am having a conversation about physics with captain Kirk and the mastermind behind this year’s Golden Globe winner for Best Motion Picture: Musical or Comedy! This makes no sense! And I am not talking about the movie award – The Martian was hilarious.


“Exactly,” I replied. “But what about flipping the coin and counting fifty heads and fifty tails?” I asked. I could see that their wheels were spinning. What was I getting at? How was this different from before? “Does it have to be the first fifty heads, or can it be any which way, as long as it is fifty?” asked Mr. Weir. Bingo. “Any which way. We can only keep track of the number of them, not their position,” I reminded him. “Well, there are many more ways then to get fifty heads,” noted Mr. Shatner. “Yes there are,” I agreed and continued, “In fact, there are about one hundred billion billion billion combinations that all give fifty heads and fifty tails. In other words, roughly one in thirteen times you flip a coin a hundred times, you will count exactly fifty heads and fifty tails. Think about this for a second. The probability of counting exactly fifty heads the first time you flip a coin a hundred times is twenty-nine orders of magnitude larger than counting one hundred heads. Remember that any particular configuration of heads and tails is equally – astronomically – unlikely. But if you zoom out, then magic happens and an emergent asymmetry appears. A really huge asymmetry, at that.” They were hooked. It was time for the grand finale. “So, which events then are more likely for us to experience in the next second, if all of them are equally likely at some fundamental level?” I asked. Mr. Shatner responded first: “The ones that have billions of microscopic configurations that all look the same when you zoom out. Like the fifty heads thing.” Then, Mr. Weir, turning to Mr. Shatner added, “That’s the arrow of time following the direction of entropy as it increases.” I nodded (maybe a little too eagerly) and looked at my phone to see that it was close to noon. It would take me about five minutes to walk to Room 2 of the San Jose convention center, where Mr. Weir was to headline a panel titled “Let’s Go to Mars!” There was no way I was missing that panel. 
I knew that by now there would be a very long line of eager attendees waiting to hear Mr. Weir and Mr. Adam Savage (of Mythbusters fame) talk about Mars exploration. With some luck, I could walk there with Mr. Weir and sneak in without being noticed by the door police. I told Mr. Weir that it was time for us to go downstairs. He got up, I got up and…
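For anyone who wants to check the coin-flip numbers from that conversation, here is a quick sketch in Python using exact integer counts. The variable names are mine, chosen for illustration.

```python
from math import comb

n = 100
total = 2 ** n                    # number of equally likely head/tail sequences

# Exactly one sequence is all heads.
p_all_heads = 1 / total

# Number of sequences with exactly fifty heads, in any positions.
ways_50_50 = comb(n, n // 2)
p_50_50 = ways_50_50 / total

print(f"P(100 heads)            = {p_all_heads:.2e}")
print(f"sequences with 50 heads = {ways_50_50:.2e}")
print(f"P(exactly 50 heads)     = {p_50_50:.4f}")
print(f"ratio of the two        = {p_50_50 / p_all_heads:.2e}")
```

The all-heads probability is about 8 in 10^31, the number of fifty-fifty sequences is about 10^29, and the chance of counting exactly fifty heads is about 8 percent. Every individual sequence is equally unlikely, and the asymmetry comes entirely from how many sequences share the same count.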

“Spiros, where do you think you are going? Come here, sit right next to me. You promised to explain how time works. You can’t leave me hanging now!” Mr. Shatner was adamant.

I looked to Mr. Hasson and Mr. Weir, who were caught in the middle of this. “I… I can come back and we can talk more after Andy’s panel… My panel isn’t until 2 o’clock,” I pleaded. Mr. Shatner did not think so. Science could not wait another second. He was actually interested in what I had to say, so I turned to Mr. Weir apologetically and he nodded with understanding and a “good luck, kid” kind of smile. Mr. Hasson seemed pleased with my choice and made some room for me to sit next to the captain.


“Now, where were we? Ah yes, you were going to explain to me how time itself is an illusion. Something about time in quantum evolution being emergent. What do you mean?” asked Mr. Shatner, cutting right to the chase. It was time for me to go all in: “Well, you see, there is this equation in quantum mechanics – Erwin Schrödinger came up with it – that tells us how the state of the universe at the quantum level changes with time. But where does time come from? Is it a fundamental concept, or is there something out there without which time itself cannot exist?” I waited for a second, as Mr. Shatner contemplated my question. He was stumped. What could possibly be more fundamental than time? Hmm… “Change,” I said. “Without change, there is no time and, thus, no quantum evolution. And without quantum evolution there is no classical evolution, no arrow of time. So everything hinges on the ability of the quantum state of the visible universe to change.” I paused to make sure he was following, then continued, “But if there is change, then where does it come from? Wherever it comes from, unless we end up with a timeless, unchanging and featureless entity, we will always be on the hook for explaining why it is changing, how it is changing and why it looks the way it does and not some other way,” I said and waited a second to let this sink in. “Spiros, if you are right, then how the heck can you get something out of nothing? If the whole thing is static, how come we are not frozen in time?” Mr. Shatner asked pointedly. “We are not the whole thing,” I said, maybe a bit too abruptly. “What do you mean we are not the whole thing? What else is there?” questioned Mr. Shatner. At this point I could see a large smile forming on Mr. Hasson’s face. His old friend, Bill Shatner, was having fun. A different kind of fun. A different kind of Comic Con. 
Sure, Bill still had to sit at a table in the main Exhibit Hall to greet thousands of fans, sign their favorite pictures of him and, for a premium, stand next to them for a picture that they would frame and display in their homes for decades to come. “Spiros, do you have a card?” interjected Mr. Hasson. Hmm, how do I say that this is not a thing among scientists… “I ran out. Sorry, everyone wants one these days, so… Here, I can type my email and number in your phone. Would that work?” I said, stretching the truth slightly. “That would be great, thanks,” replied Mr. Hasson.


With Mr. Stan Lee at the Silicon Valley Comic Con. At 93, Mr. Lee spent the whole weekend with fans, not once showing up at the green room to take a break. So I hunted him down with help from Mr. Hasson.

“Hey, stop distracting him! We are so close to the good stuff!” blasted Mr. Shatner. “Go on, now, Spiros. How does anything ever change?” asked Mr. Shatner with some urgency in his voice. “Dynamic equilibrium,” I replied. “Like a chemical reaction that is in equilibrium. You look from afar and see nothing happening. No bubbles, nothing. But zoom in a little and you see products and reactants dissolving and recombining like crazy, but always in perfect balance. The whole remains static, while the parts experience dramatic change.” I let this simmer for a moment. “We are not the whole. We are just a part of the whole. We are too big to see the quantum evolution as it happens in all its glory. But we are also too small to remain unchanged. Our visible universe is in dynamic equilibrium with a clock universe with which we are maximally entangled. We change only because the state of the clock universe changes randomly and we have no control over it, but to change along with it so that the whole remains unchanged,” I concluded, hoping that he would be convinced by a theory that had not seen the light of day until that fateful afternoon. He was not convinced yet. “Wait a minute, why would that clock universe change in the first place?” he asked suspiciously. “It doesn’t have to,” I replied, anticipating this excellent question, and went on, “It could remain in the same state for a million years. But we wouldn’t know it, because the state of our visible universe would have to remain in the same state also for a million years. We wouldn’t be able to tell that a million years passed between every microsecond of change, just like a person under anesthesia can’t tell that they are undergoing surgery for hours, only to wake up thinking it was just a moment earlier that they were counting down to zero.” He fell silent for a moment and then a big smile appeared on his face. “Spiros, you have an accent,” he said, as if stating the obvious. 
“Can I offer you a piece of advice?” he asked, in a calm voice. I nodded. “One day you will be in front of a large crowd talking about this stuff. When you are up there, make sure you talk slow so people can keep up. When you get excited, you start speaking faster and faster. Take breaks in-between,” he offered. I smiled and thanked him for the advice. By then, it was almost one o’clock and Mr. Weir’s panel was about to end. I needed to go down there for real this time and meet up with my co-panelists, Shaun Maguire and Laetitia Garriott de Cayeux, since our panel was coming up next. I got up and as I was leaving the room, I heard from behind,

“Remember to take it slow, Spiros. When you are back, you will tell me all about how space is also an illusion.”

Aye aye captain!

March 22, 2016

Jordan EllenbergMath bracket 2016

It’s that time again — March Math Madness, where we fill out an NCAA men’s tournament bracket with the best math department winning every game.  As always, this bracket was filled out by a highly trained team consisting of myself and a group of procrastinating grad students, making decisions by voice vote, and if you disapprove of one of our choices, I’m sure it’s somebody else’s fault.  This is Berkeley’s first championship after falling to Harvard in 2012; meanwhile, Michigan sees its second final in three years but falls short again…

Update: In the 34th percentile at ESPN after one day of play — thanks, Yale!

Update:  Down to the 5th percentile and only Duke and UVa are left out of my final 8 picks.  Not gonna be the math bracket’s finest year.

March 20, 2016

Chad Orzel193-197/366: March Meeting

I didn’t take the DSLR to March Meeting with me, but I did throw a point-and-shoot in my bag. A few of these are still just cell-phone snapshots, because I didn’t have the bag with me all the time.

193/366: Stadium View

When I checked into the hotel, they told me I had a “stadium view” room on the sixth floor. I was in a hurry to get to a social event, so I didn’t really look that night, but they were right:

Football and baseball stadiums in downtown Baltimore, from my hotel room.

194/366: Convention hall

The primary purpose of the trip was, of course, to attend the March Meeting, and that meant going to lots of talks. Along with a lot of other people:

The meeting area of the Baltimore Convention Center.

The Convention Center in Baltimore isn’t really set up in a way that lets you get good crowd shots, but this gives you a little sense of it. Not only people bustling around on their way to and from talks, but also a lot of groups collaborating on laptops at tables.

195/366: Eromitlab

Just to prove that I was, in fact, in Baltimore, here’s a shot that includes (to the right) the famous Bromo-Seltzer Tower:

Buildings in downtown Baltimore, near the convention center.

This is probably the structure that is most distinctly Baltimorean. And was prominently featured in the sniper plotline of the late, great, Homicide: Life on the Street.

196/366: Posters

No physics meeting would be complete without a poster session, so:

Crowd shot at one of the March Meeting poster sessions.

Honestly, I kind of hate these. Mostly because, as a large and self-conscious guy, whenever I’m in a big crowd I feel like I’m about to awkwardly trample somebody. The March Meeting made these easy to avoid, though, by scheduling them at the same time as sessions of talks, and also by moving the time for the poster session around from one day to the next (2-5 on Tuesday, 11-2 on Wednesday, 1-4 on Thursday), so I actually completely missed one that a bunch of Union students were presenting in. Not sure who thought that was a good idea.

197/366: America!

Really, the physics-conference part of these trips isn’t very photogenic, so here’s another shot of Baltimore:

Camden Yards from the outside deck at the Convention Center.

Okay, fine, it’s a little cliche to do the baseball-stadium-and-American-flag thing, but I like the way it came out. And it’s my blog. So there.

Chad OrzelPhysics Blogging Round-Up: Mostly March Meeting

I was at the APS March Meeting last week, because I needed to give a talk reporting on the Schrödinger Sessions. But as long as I was going to be there anyway, I figured I should check out the huge range of talks on areas of physics that aren’t my normal thing– in fact, I deliberately avoided going to DAMOP-sponsored sessions.

This also affected my blogging, so the last few weeks’ worth of posts at Forbes have mostly been on March Meeting-related areas:

How Cold Atoms Might Help Physicists Understand Superconductors: A post about the connection between ultra-cold atomic physics and condensed matter, prompted by a visit to Illinois and the impending March Meeting.

Why Physicists Want Their Best Theory To Fail: Another few-sigma result from the LHC got a bit of attention, prompting some thoughts on why everyone is so anxious for the Standard Model to break.

Why Isn’t The Biggest Conference In Physics More Popular? Money. Dig down far enough, and the answer is always money.

Soot And Diamonds: Progress And Perspective In The Practice Of Physics: A remark by Jim Kakalios at dinner spins off into some thoughts about the factors that drive the choice of systems to study in physics.

Physics Will Never Be Over: Most of the stuff I went to at March Meeting was quantum, but the last day was all powered by classical physics, proving we’re not done with Newton’s Laws yet.

As always, traffic to the blog passeth all understanding. Of these five, the one I’m happiest with is “Soot And Diamonds,” which is last in terms of readership, by a factor of two. Go figure. But I’m not unhappy with any of these, even though two were written in airports and a third banged out over breakfast. If I were ever to bang together an ebook collection of physics blog posts, these last couple of weeks would be well represented…

March 19, 2016

ResonaancesDiphoton update

Today at the Moriond conference ATLAS and CMS updated their diphoton resonance searches. There's been a rumor of an ATLAS analysis with looser cuts on the photons where the significance of the 750 GeV excess grows to a whopping 4.7 sigma. Rumor had it that this analysis would be made public today, so the expectations were high. However, the loose-cuts analysis was not approved in time by the collaboration, and the fireworks display was cancelled. In any case, there was some good news today, and some useful info for model builders was provided.

Let's start with ATLAS. For the 13 TeV results, they now have two analyses: one called spin-0 and one called spin-2. Naively, the cuts in the latter are optimized not for a spin-2 resonance but rather for a high-mass resonance (where there's currently no significant excess), so the spin-2 label should not be treated too seriously in this case. Both analyses show a similar excess at 750 GeV: 3.9 and 3.6 sigma respectively for a wide resonance. Moreover, ATLAS provides additional information about the diphoton events, such as the angular distribution of the photons, the number of accompanying jets, the amount of missing energy, etc. This may be very useful for theorists entertaining less trivial models, for example when the 750 GeV resonance is produced from a decay of a heavier parent particle. Finally, ATLAS shows a re-analysis of the diphoton events collected at the LHC's 8 TeV center-of-mass energy. The former run-1 analysis was a bit sloppy in the interesting mass range; for example, no limits at all were given for a 750 GeV scalar hypothesis. Now the run-1 data have been cleaned up and analyzed using the same methods as in run-2. Excitingly, there's a 2 sigma excess in the spin-0 analysis in run-1, roughly compatible with what one would expect given the observed run-2 excess! No significant excess is seen for the spin-2 analysis, and the tension between the run-1 and run-2 data is quite severe in this case. Unfortunately, ATLAS does not quote the combined significance and the best fit cross section for the 750 GeV resonance.

For CMS, the big news is that the amount of 13 TeV data at their disposal has increased by 20%. Using MacGyver skills, they managed to make sense of the chunk of data collected when the CMS magnet was off due to a technical problem. Apparently it was worth it, as new diphoton events have been found in the 750 GeV ballpark. Thanks to that, and a better calibration, the significance of the diphoton excess in run-2 actually increases to 2.9 sigma! Furthermore, much like ATLAS, CMS updated their run-1 diphoton analyses and combined them with the run-2 ones. Again, the combination increases the significance of the 750 GeV excess. The combined significance quoted by CMS is 3.4 sigma, similar for spin-0 and spin-2 analyses. Unlike in ATLAS, the best fit is for a narrow resonance, which is the preferred option from the theoretical point of view.

In summary, the diphoton excess survived the first test. After adding more data and improving the analysis techniques, the significance slightly increases rather than decreases, as expected for a real particle. The signal is now a bit more solid: both experiments have a similar amount of diphoton data and they both claim a similar significance of the 750 GeV bump. It may be a good moment to rename the ATLAS diphoton excess as the LHC diphoton excess :) So far, the story of 2012 is repeating itself: the initial hints of a new resonance solidify into a consistent picture. Are we going to have another huge discovery this summer?

March 18, 2016

Matt StrasslerThe Two-Photon Excess at LHC Brightens Slightly

Back in December 2015, there was some excitement when the experiments ATLAS and CMS at the Large Hadron Collider [LHC] — especially ATLAS — reported signs of an unexpectedly large number of proton-proton collisions in which

  • two highly energetic photons [particles of light] were produced, and
  • the two photons could possibly have been produced in a decay of an unknown particle, whose mass would be about six times the mass of the Higgs particle (which ATLAS and CMS discovered in 2012).

This suggested the possibility of an unknown particle of some type with rest mass of 750 GeV/c².  However, the excess could just be a statistical fluke, of no scientific importance and destined to vanish with more data.

The outlook for that bump on a plot at 750 GeV has gotten a tad brighter… because not only do we have ATLAS’s plot, we now have increasing evidence for a similar bump on CMS’s plot. This is thanks largely to some hard work on the part of the CMS experimenters.  Some significant improvements at CMS,

  1. improved understanding of their photon energy measurements in their 2015 data,
  2. ability to use 2015 collisions taken when their giant magnet wasn’t working — fortunately, the one type of particle whose identity and energy can be measured without a magnet is… a photon!
  3. combination of the 2015 data with their 2012 data,

have increased the significance of their observed excess by a moderate amount. Here’s the scorecard.*

  • CMS 2015 data (Dec.): excess is 2.6σ local, < 1σ global
  • CMS 2015 data (improved, Mar.): 2.9σ local, < 1σ global
  • CMS 2015+2012 data: 3.4σ local, 1.6σ global
  • ATLAS 2015 data (Dec. and Mar.): 3.6σ local, 2.0σ global to get a narrow bump [and 3.9σ local, 2.3σ global to get a somewhat wider bump, but notice this difference is quite insignificant, so narrow and wider are pretty much equally ok.]
  • ATLAS 2015+2012 data: not reported, but clearly goes up a bit more, by perhaps half a sigma?

You can read a few more details at Resonaances.

*Significance is measured in σ (“standard deviations”) and for confidence in potentially revolutionary results we typically want to see local significance approaching 5σ and global approaching 3σ in both experiments. (The “local” significance tells you how unlikely it is to see a random bump of a certain size at a particular location in the plot, while the “global” significance tells you how unlikely it is to see such a bump anywhere in the plot … obviously smaller because of the look-elsewhere effect.)
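
To make the local-versus-global distinction concrete, here is a minimal Python sketch. It assumes a simple "trials factor" picture, in which the plot behaves like some number of independent places where a bump could appear. That is only a rough approximation (the experiments estimate the look-elsewhere effect with their own dedicated methods), and the function names are hypothetical, chosen just for this illustration.

```python
from math import expm1, log1p
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def p_value(sigma: float) -> float:
    """One-sided tail probability of a standard normal beyond `sigma`."""
    return _STD_NORMAL.cdf(-sigma)


def sigma_from_p(p: float) -> float:
    """Inverse of p_value: the significance matching a one-sided p-value."""
    return -_STD_NORMAL.inv_cdf(p)


def global_sigma(local_sigma: float, n_trials: float) -> float:
    """Global significance under the assumed trials-factor approximation.

    p_global = 1 - (1 - p_local) ** n_trials, computed in a numerically
    stable way for small p_local.
    """
    p_local = p_value(local_sigma)
    p_global = -expm1(n_trials * log1p(-p_local))
    return sigma_from_p(p_global)


def effective_trials(local_sigma: float, quoted_global_sigma: float) -> float:
    """Trials factor implied by a quoted (local, global) significance pair."""
    return log1p(-p_value(quoted_global_sigma)) / log1p(-p_value(local_sigma))


if __name__ == "__main__":
    # (local, global) pairs taken from the scorecard above.
    scorecard = {
        "CMS 2015+2012": (3.4, 1.6),
        "ATLAS 2015 narrow": (3.6, 2.0),
        "ATLAS 2015 wide": (3.9, 2.3),
    }
    for label, (local, quoted_global) in scorecard.items():
        n = effective_trials(local, quoted_global)
        print(f"{label}: local p = {p_value(local):.2e}, "
              f"implied trials factor = {n:.0f}")
```

In this picture, a local bump that looks rare at one fixed mass becomes much less surprising once you allow it to show up at any of a hundred or so places, which is why the global numbers in the scorecard are so much smaller than the local ones.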

This is good news, but it doesn’t really reflect a qualitative change in the situation. It leaves us slightly more optimistic (which is much better than the alternative!) but, as noted in December, we still won’t actually know anything until we have either (a) more data to firm up the evidence for these bumps, or (b) a discovery of a completely independent clue, perhaps in existing data. Efforts for (b) are underway, and of course (a) will get going when the LHC starts again… soon!  Next news on this probably not til June at the earliest… unless we’re very lucky!

Filed under: LHC News, Particle Physics Tagged: atlas, cms, diphoton, LHC, photons

March 17, 2016

Jordan EllenbergThose who leave and those who stay

Just finished the third of Ferrante’s Neapolitan novels.  Greco, the narrator, is constantly yearning for a quiet space, away from competition.  The sense is that you can only make art in such a quiet space.  But it seems there’s no interaction between people without one striving to fuck, thwart, or destroy the other.   So maybe no quiet space exists, though Greco again and again almost seems to find it.  Ferrante puts the football down in front of her, Ferrante pulls it away.  And you’re surprised every time.

March 15, 2016

Resonaances750 GeV: the bigger picture

This Thursday the ATLAS and CMS experiments will present updated analyses of the 750 GeV diphoton excess. CMS will extend their data set by the diphoton events collected in the periods when the detector was running without the magnetic field (which is not essential for this particular study), so the amount of available data will slightly increase. We will then enter Phase-II of the excitement protocol, hopefully followed this summer by another 4th-of-July-style discovery. To close Phase-I, here's a long-promised post about the bigger picture. There are at least 750 distinct models that can accommodate the diphoton signal observed by ATLAS and CMS. However, finding a larger framework for physics beyond the Standard Model in which these phenomenological models can be embedded is a trickier question. Here is a bunch of speculations.

Whenever a new fluctuation is spotted at the LHC one cannot avoid mentioning supersymmetry. However, the 750 GeV resonance cannot be naturally interpreted in this framework, not least because it cannot be identified as a superpartner of any known particle. The problem is that explaining the observed signal strength requires introducing new particles with large couplings, and the complete theory typically enters into a strong coupling regime at the energy scale of a few TeV. This is not the usual SUSY paradigm, with weakly coupled physics at the TeV scale followed by a desert up to the grand unification scale. Thus, even if the final answer may still turn out to be supersymmetric, it will not be the kind of SUSY we've been expecting all along. Weakly coupled supersymmetric explanations are still possible in somewhat more complicated scenarios with new very light sub-GeV particles and cascade decays, see e.g. this NMSSM model.

Each time you see a diphoton peak you want to cry Higgs, since this is how the 125 GeV Higgs boson was first spotted. Many theories predict an extended Higgs sector with multiple heavy scalar particles, but again such a framework is not the most natural one for interpreting the 750 GeV resonance. There are two main reasons. One is that different Higgs scalars typically mix, but the mixing angle in this case is severely constrained by Higgs precision studies and non-observation of 750 GeV diboson resonances in other channels. The other is that, for a 750 GeV Higgs scalar, the branching fraction into the diphoton final state is typically tiny (e.g., ~10^-7 for a Standard-Model-Higgs-like scalar) and complicated model gymnastics are needed to enhance it. The possibility that the 750 GeV resonance is a heavy Higgs boson is by no means excluded, but I would be surprised if this were the case.

It is more tempting to interpret the diphoton resonance as a bound state of new strong interactions with a confinement scale in the TeV range. We know that the Quantum Chromodynamics (QCD) theory, which describes the strong interactions of the Standard Model quarks, gives rise to many scalar mesons and higher-spin resonances at low energies. Such behavior is characteristic of a large class of similar theories. Furthermore, if the new strong sector contains mediator particles that carry color and electromagnetic charges, the production in gluon fusion and decay into photons is possible for the composite states, see e.g. here. The problem is that, much as for QCD, one would expect not one but an entire battalion of resonances. One needs to understand how the remaining resonances predicted by typical strongly interacting models could have avoided detection so far.

One way this could happen is if the 750 GeV resonance is a scalar that, for symmetry reasons, is much lighter than most of the particles in the strong sector. Here again our QCD may offer us a clue, as it contains pseudo-scalar particles, the so-called pions, which are a factor of 10 lighter than the typical mass scale of other resonances. In QCD, pions are Goldstone bosons of the chiral symmetry spontaneously broken by the vacuum quark condensate. In other words, the smallness of the pion mass is protected by a symmetry, and general theorems worked out in the 60s ensure the quantum stability of such an arrangement. A similar mechanism can be easily implemented in other strongly interacting theories, and it is possible to realize the 750 GeV resonance as a new kind of pion, see e.g. here. Even the mechanism for decaying into photons -- via chiral anomalies -- can be borrowed directly from QCD. However, the symmetry protecting the 750 GeV scalar could also be completely different than the ones we have seen so far. One example is the dilaton, which is a Goldstone boson of a spontaneously broken conformal symmetry, see e.g. here. This is a theoretically interesting possibility, since approximate conformal symmetry often arises as a feature of strongly interacting theories. All in all, the 750 GeV particle may well be a pion or dilaton harbinger of new strong interactions at a TeV scale. One can then further speculate that the Higgs boson also originates from that sector, but that is a separate story that may or may not be true.
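
For readers who like to see the QCD statement written down: the symmetry protection of the pion mass is usually summarized by the Gell-Mann-Oakes-Renner relation. It is not part of the argument above and is quoted here only as a reminder, in one common normalization (f_π ≈ 92 MeV; sign and normalization conventions vary between references):

```latex
m_\pi^2 \, f_\pi^2 \;\simeq\; -\,(m_u + m_d)\,\langle \bar{q} q \rangle
```

The pion mass squared is proportional to the light quark masses, which are what explicitly break the chiral symmetry, so the pion becomes massless in the limit where the symmetry is exact. A new-physics "pion" at 750 GeV would presumably be kept light relative to its strong sector by an analogous relation, with the details depending on the model.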

Another larger framework worth mentioning here is that of extra dimensions. In the modern view, theories with a new 4th dimension of space are merely an effective description of the strongly interacting sectors discussed above. For example, the famous Randall-Sundrum model, with the Standard Model living in a section of a 5D AdS5 space, is a weakly coupled dual description of strongly coupled theories with a conformal symmetry and a large N gauge symmetry. These models thus offer a calculable way to embed the 750 GeV resonance in a strongly interacting theory. For example, the dilaton can be effectively described in the Randall-Sundrum model as the radion - a scalar particle corresponding to fluctuations of the size of the 5th dimension, see e.g. here. Moreover, the Randall-Sundrum framework provides a simple way to realize the 750 GeV particle as a spin-2 resonance. Indeed, the model always contains massive Kaluza-Klein excitations of the graviton, whose couplings to matter can be much stronger than those of the massless graviton. This possibility has been relatively less explored so far, see e.g. here, but that may change next week...

Clearly, it is impossible to say anything conclusive at this point. More data in multiple decay channels is absolutely necessary for a more concrete picture to emerge. For me personally, a confirmation of the 750 GeV excess would be a strong hint for new strong interactions at a few TeV scale. And if this is indeed the case, one may seriously think that our 40-year-long brooding about the hierarchy problem has not been completely misguided...