Planet Musings

July 01, 2015

Scott Aaronson - Quantum query complexity: the other shoe drops

Two weeks ago I blogged about a breakthrough in query complexity: namely, the refutation by Ambainis et al. of a whole slew of conjectures that had stood for decades (and that I mostly believed, and that had helped draw me into theoretical computer science as a teenager) about the largest possible gaps between various complexity measures for total Boolean functions. Specifically, Ambainis et al. built on a recent example of Göös, Pitassi, and Watson to construct bizarre Boolean functions f with, among other things, near-quadratic gaps between D(f) and R0(f) (where D is deterministic query complexity and R0 is zero-error randomized query complexity), near-1.5th-power gaps between R0(f) and R(f) (where R is bounded-error randomized query complexity), and near-4th-power gaps between D(f) and Q(f) (where Q is bounded-error quantum query complexity). See my previous post for more about the definitions of these concepts and the significance of the results (and note also that Mukhopadhyay and Sanyal independently obtained weaker results).

Because my mental world was in such upheaval, in that earlier post I took pains to point out one thing that Ambainis et al. hadn’t done: namely, they still hadn’t shown any super-quadratic separation between R(f) and Q(f), for any total Boolean function f. (Recall that a total Boolean function, f:{0,1}^n→{0,1}, is one that’s defined for all 2^n possible input strings x∈{0,1}^n. Meanwhile, a partial Boolean function is one where there’s some promise on x: for example, that x encodes a periodic sequence. When you phrase them in the query complexity model, Shor’s algorithm and other quantum algorithms achieving exponential speedups work only for partial functions, not for total ones. Indeed, a famous result of Beals et al. from 1998 says that D(f)=O(Q(f)^6) for all total functions f.)

So, clinging to a slender reed of sanity, I said it “remains at least a plausible conjecture” that, if you insist on a fair comparison—i.e., bounded-error quantum versus bounded-error randomized—then the biggest speedup quantum algorithms can ever give you over classical ones, for total Boolean functions, is the square-root speedup that Grover’s algorithm easily achieves for the n-bit OR function.
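The quadratic Grover gap for OR can be illustrated with back-of-the-envelope numbers. The (π/4)√n figure below is the standard Grover iteration count up to rounding; this is just arithmetic, not a simulation:

```python
import math

def classical_or_queries(n):
    # Any classical algorithm (even randomized, up to constants) needs
    # Theta(n) queries in the worst case to evaluate OR on n bits.
    return n

def grover_or_queries(n):
    # Grover's algorithm evaluates OR with about (pi/4) * sqrt(n)
    # quantum queries, with bounded error.
    return math.ceil((math.pi / 4) * math.sqrt(n))

for n in (100, 10_000, 1_000_000):
    print(n, classical_or_queries(n), grover_or_queries(n))
```

For a million bits, that is about 786 quantum queries against a million classical ones: a big saving, but never more than quadratic.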

Today, I can proudly report that my PhD student, Shalev Ben-David, has refuted that conjecture as well.  Building on the Göös et al. and Ambainis et al. work, but adding a new twist to it, Shalev has constructed a total Boolean function f such that R(f) grows roughly like Q(f)^2.5 (yes, that’s Q(f) to the 2.5th power). Furthermore, if a conjecture that Ambainis and I made in our recent “Forrelation” paper is correct—namely, that a problem called “k-fold Forrelation” has randomized query complexity roughly Ω(n^(1-1/k))—then one would get nearly a cubic gap between R(f) and Q(f).

The reason I found this question so interesting is that it seemed obvious to me that, to produce a super-quadratic separation between R and Q, one would need a fundamentally new kind of quantum algorithm: one that was unlike Simon’s and Shor’s algorithms in that it worked for total functions, but also unlike Grover’s algorithm in that it didn’t hit some impassable barrier at the square root of the classical running time.

Flummoxing my expectations once again, Shalev produced the super-quadratic separation, but not by designing any new quantum algorithm. Instead, he cleverly engineered a Boolean function for which you can use a combination of Grover’s algorithm and the Forrelation algorithm (or any other quantum algorithm that gives a huge speedup for some partial Boolean function—Forrelation is just the maximal example), to get an overall speedup that’s a little more than quadratic, while still keeping your Boolean function total. I’ll let you read Shalev’s short paper for the details, but briefly, it once again uses the Göös et al. / Ambainis et al. trick of defining a Boolean function that equals 1 if and only if the input string contains some hidden substructure, and the hidden substructure also contains a pointer to a “certificate” that lets you quickly verify that the hidden substructure was indeed there. You can use a super-fast algorithm—let’s say, a quantum algorithm designed for partial functions—to find the hidden substructure assuming it’s there. If you don’t find it, you can simply output 0. But if you do find it (or think you found it), then you can use the certificate, together with Grover’s algorithm, to confirm that you weren’t somehow misled, and that the substructure really was there. This checking step ensures that the function remains total.
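The control flow of this construction can be mocked up classically. In the sketch below, the two callables merely stand in for the quantum subroutines (which of course cannot be simulated this cheaply); the point is only how the verification step restores totality:

```python
def evaluate_total(x, find_structure, verify_certificate):
    """Control flow of the 'trust-but-verify' construction.

    find_structure: a fast search (quantum, in the real construction)
    that is only guaranteed correct when the hidden structure exists.
    verify_certificate: a Grover-style check that is always sound.
    """
    candidate = find_structure(x)
    if candidate is None:
        return 0                      # nothing found: safe to output 0
    # The check below makes the function total: even if find_structure
    # was fooled by a promise-violating input, we only output 1 after
    # an unconditional verification of the certificate.
    return 1 if verify_certificate(x, candidate) else 0

# Toy instantiation: the "structure" is the position of the substring "11".
find = lambda x: x.find("11") if "11" in x else None
check = lambda x, i: x[i:i + 2] == "11"
print(evaluate_total("0110", find, check))  # -> 1
print(evaluate_total("0100", find, check))  # -> 0
```

In the real construction the savings come from find_structure being a quantum algorithm for a partial function; the Grover check costs only the square root of the certificate size, so the overall speedup can exceed quadratic.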

Are there further separations to be found this way? Almost certainly! Indeed, Shalev, Robin Kothari, and I have already found some more things (as well as different/simpler proofs of known separations), though nothing quite as exciting as the above.

Update (July 1): Ronald de Wolf points out in the comments that this “trust-but-verify” trick, for designing total Boolean functions with unexpectedly low quantum query complexities, was also used in a recent paper by himself and Ambainis (while Ashley Montanaro points out that a similar trick was used even earlier, in a different context, by Le Gall).  What’s surprising, you might say, is that it took as long as it did for people to realize how many applications this trick has.

John Baez - Trends in Reaction Network Theory (Part 2)

Here in Copenhagen we’ll soon be having a bunch of interesting talks on chemical reaction networks:

Workshop on Mathematical Trends in Reaction Network Theory, 1-3 July 2015, Department of Mathematical Sciences, University of Copenhagen. Organized by Elisenda Feliu and Carsten Wiuf.

Looking through the abstracts, here are a couple that strike me.

First of all, Gheorghe Craciun claims to have proved the biggest open conjecture in this field: the Global Attractor Conjecture!

• Gheorghe Craciun, Toric differential inclusions and a proof of the global attractor conjecture.

This famous old conjecture says that for a certain class of chemical reactions, the ones coming from ‘complex balanced reaction networks’, the chemicals will approach equilibrium no matter what their initial concentrations are. Here’s what Craciun says:

Abstract. In a groundbreaking 1972 paper Fritz Horn and Roy Jackson showed that a complex balanced mass-action system must have a unique locally stable equilibrium within any compatibility class. In 1974 Horn conjectured that this equilibrium is a global attractor, i.e., all solutions in the same compatibility class must converge to this equilibrium. Later, this claim was called the Global Attractor Conjecture, and it was shown that it has remarkable implications for the dynamics of large classes of polynomial and power-law dynamical systems, even if they are not derived from mass-action kinetics. Several special cases of this conjecture have been proved during the last decade. We describe a proof of the conjecture in full generality. In particular, it will follow that all detailed balanced mass action systems and all deficiency zero mass-action systems have the global attractor property. We will also discuss some implications for biochemical mechanisms that implement noise filtering and cellular homeostasis.

Manoj Gopalkrishnan wrote a great post explaining the concept of complex balanced reaction network here on Azimuth, so if you want to understand the conjecture you could start there.

Even better, Manoj is talking here about a way to do statistical inference with chemistry! His talk is called ‘Statistical inference with a chemical soup’:

Abstract. The goal is to design an “intelligent chemical soup” that can do statistical inference. This may have niche technological applications in medicine and biological research, as well as provide fundamental insight into the workings of biochemical reaction pathways. As a first step towards our goal, we describe a scheme that exploits the remarkable mathematical similarity between log-linear models in statistics and chemical reaction networks. We present a simple scheme that encodes the information in a log-linear model as a chemical reaction network. Observed data is encoded as initial concentrations, and the equilibria of the corresponding mass-action system yield the maximum likelihood estimators. The simplicity of our scheme suggests that molecular environments, especially within cells, may be particularly well suited to performing statistical computations.

It’s based on this paper:

• Manoj Gopalkrishnan, A scheme for molecular computation of maximum likelihood estimators for log-linear models.
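The dynamical fact the scheme leans on is that a mass-action system relaxes to an equilibrium determined by its rate constants and conserved quantities. A minimal sketch (a two-species toy model, not Gopalkrishnan's actual encoding of a log-linear model):

```python
# Toy mass-action system A <-> B with forward rate k1 and backward
# rate k2. The ODE is da/dt = -k1*a + k2*b, db/dt = +k1*a - k2*b,
# and the equilibrium satisfies detailed balance: k1*a = k2*b.
k1, k2 = 2.0, 1.0
a, b = 0.9, 0.1          # "observed data" enters as initial concentrations
dt = 1e-3
for _ in range(20_000):  # forward Euler out to t = 20
    flux = (k1 * a - k2 * b) * dt
    a, b = a - flux, b + flux

print(round(b / a, 4))   # approaches k1/k2 = 2.0
```

In the actual scheme the network is engineered so that this kind of equilibrium point coincides with the maximum likelihood estimator of the encoded log-linear model.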

I’m not sure, but this idea may exploit existing analogies between the approach to equilibrium in chemistry, the approach to equilibrium in evolutionary game theory, and statistical inference. You may have read Marc Harper’s post about that stuff!

David Doty is giving a broader review of ‘Computation by (not about) chemistry’:

Abstract. The model of chemical reaction networks (CRNs) is extensively used throughout the natural sciences as a descriptive language for existing chemicals. If we instead think of CRNs as a programming language for describing artificially engineered chemicals, what sorts of computations are possible for these chemicals to achieve? The answer depends crucially on several formal choices:

1) Do we treat matter as infinitely divisible (real-valued concentrations) or atomic (integer-valued counts)?

2) How do we represent the input and output of the computation (e.g., Boolean presence or absence of species, positive numbers directly represented by counts/concentrations, positive and negative numbers represented indirectly by the difference between counts/concentrations of a pair of species)?

3) Do we assume mass-action rate laws (reaction rates proportional to reactant counts/concentrations) or do we insist the system works correctly under a broader class of rate laws?

The talk will survey several recent results and techniques. A primary goal of the talk is to convey the “programming perspective”: rather than asking “What does chemistry do?”, we want to understand “What could chemistry do?” as well as “What can chemistry provably not do?”
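As a toy instance of this "programming perspective" with integer-valued counts: the single reaction A + B → C computes the minimum of its two inputs, since it keeps firing until one reactant runs out. A sketch under the stochastic-counts semantics (the firing order is irrelevant here, so a plain loop suffices):

```python
def run_min_crn(a, b):
    """Simulate the CRN with the single reaction A + B -> C on
    integer counts. Each firing consumes one A and one B and
    produces one C, so the final count of C is min(a, b)."""
    c = 0
    while a > 0 and b > 0:  # the reaction is applicable
        a, b, c = a - 1, b - 1, c + 1
    return c

print(run_min_crn(7, 3))  # -> 3
```

Note this particular program works under any rate law, since the answer depends only on which reactions can fire, not on how fast they do.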

I’m really interested in chemical reaction networks that appear in biological systems, and there will be lots of talks about that. For example, Ovidiu Radulescu will talk about ‘Taming the complexity of biochemical networks through model reduction and tropical geometry’. Model reduction is the process of simplifying complicated models while preserving at least some of their good features. Tropical geometry is a cool version of algebraic geometry that uses the real numbers with minimization as addition and addition as multiplication. This number system underlies the principle of least action, or the principle of minimum energy. Here is Radulescu’s abstract:

Abstract. Biochemical networks are used as models of cellular physiology with diverse applications in biology and medicine. In the absence of objective criteria to detect essential features and prune secondary details, networks generated from data are too big and therefore outside the applicability of many mathematical tools for studying their dynamics and behavior under perturbations. However, under circumstances that we can generically denote by multi-scaleness, large biochemical networks can be approximated by smaller and simpler networks. Model reduction is a way to find these simpler models that can be more easily analyzed. We discuss several model reduction methods for biochemical networks with polynomial or rational rate functions and propose as their common denominator the notion of tropical equilibration, meaning finite intersection of tropical varieties in algebraic geometry. Using tropical methods, one can strongly reduce the number of variables and parameters of a biochemical network. For multi-scale networks, these reductions are computed symbolically on orders of magnitude of parameters and variables, and are valid in wide domains of parameter and phase spaces.
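The tropical arithmetic mentioned above fits in a few lines: "addition" is minimization and "multiplication" is ordinary addition, with +∞ as the additive identity. A minimal sketch:

```python
INF = float("inf")  # the tropical additive identity: min(INF, x) == x

def t_add(x, y):
    # Tropical addition = minimization
    return min(x, y)

def t_mul(x, y):
    # Tropical multiplication = ordinary addition
    return x + y

# The classical polynomial x^2 + 3x + 1 becomes, tropically,
# min(x + x, 3 + x, 1): a piecewise-linear function of x.
def t_poly(x):
    return t_add(t_add(t_mul(x, x), t_mul(3, x)), 1)

print(t_poly(0.5))  # min(1.0, 3.5, 1) = 1.0
```

Tropicalizing a polynomial rate function in this way replaces it by its dominant monomial, which is what makes order-of-magnitude model reduction computable.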

I’m talking about the analogy between probabilities and quantum amplitudes, and how this makes chemistry analogous to particle physics. You can see two versions of my talk here, but I’ll be giving the ‘more advanced’ version, which is new:

Probabilities versus amplitudes.

Abstract. Some ideas from quantum theory are just beginning to percolate back to classical probability theory. For example, the master equation for a chemical reaction network describes the interactions of molecules in a stochastic rather than quantum way. If we look at it from the perspective of quantum theory, this formalism turns out to involve creation and annihilation operators, coherent states and other well-known ideas—but with a few big differences.

Anyway, there are a lot more talks, but if I don’t have breakfast and walk over to the math department, I’ll miss those talks!

David Hogg - star-formation history of the Milky Way

I read (or skimmed, really) some classic papers on the star-formation history of the Milky Way, in preparation for re-asking this question with APOGEE data this summer. Papers I skimmed included Prantzos & Silk, which infers the SFH from (mainly) abundance distributions, and Gizis, Reid, and Hawley, which infers it from M-dwarf chromospheric activity. I also wrote myself a list of all the possible ways one might infer the SFH. I realized that not all of the ways I can think of have actually been executed. So now I have a whole set of projects to pitch!

June 30, 2015

Resonaances - Sit down and relaxion

New ideas are rare in particle physics these days. Solutions to the naturalness problem of the Higgs mass are true collector's items. For these reasons, the new mechanism addressing the naturalness problem via cosmological relaxation has stirred a lot of interest in the community. There's already an article explaining the idea in popular terms. Below, I will give you a more technical introduction.

In the Standard Model, the W and Z bosons and fermions get their masses via the Brout-Englert-Higgs mechanism. To this end, the Lagrangian contains a scalar field H with a negative mass squared, V = - m^2 |H|^2. We know that the value of the parameter m is around 90 GeV - the Higgs boson mass divided by the square root of 2. In quantum field theory, the mass of a scalar particle is expected to be near the cut-off scale M of the theory, unless there's a symmetry protecting it from quantum corrections. Thus m much smaller than M, without any reason or symmetry principle, constitutes the naturalness problem. Therefore, the dominant paradigm has been that, around the energy scale of 100 GeV, the Standard Model must be replaced by a new theory in which the parameter m is protected from quantum corrections. We know several mechanisms that could potentially protect the Higgs mass: supersymmetry, Higgs compositeness, the Goldstone mechanism, extra-dimensional gauge symmetry, and conformal symmetry. However, according to experimentalists, none seems to be realized at the weak scale; therefore, we need to accept that nature is fine-tuned (e.g. susy is just around the corner), or to seek solace in religion (e.g. anthropics). Or to find a new solution to the naturalness problem: one that is not fine-tuned and is consistent with experimental data.

Relaxation is a genuinely new solution, even if somewhat contrived. It is based on the following ingredients:
  1.  The Higgs mass term in the potential is V = M^2 |H|^2. That is to say,  the magnitude of the mass term is close to the cut-off of the theory, as suggested by the naturalness arguments. 
  2. The Higgs field is coupled to a new scalar field - the relaxion - whose vacuum expectation value is time-dependent in the early universe, effectively changing the Higgs mass squared during its evolution.
  3. When the mass squared turns negative and electroweak symmetry is broken, a back-reaction mechanism should prevent further time evolution of the relaxion, so that the Higgs mass term is frozen at a seemingly unnatural value.
These 3 ingredients can be realized in a toy model where the Standard Model is coupled to the QCD axion Φ. The crucial interactions are, schematically (up to signs and order-one factors), V ⊃ (M^2 − gΦ)|H|^2 + g M^2 Φ + Λ^4 cos(Φ/f), where g is a small dimensionful coupling, f is the axion decay constant, and Λ is the height of the axion potential.
Then the story goes as follows. The axion Φ starts at a small value where the M^2 term dominates and there's no electroweak symmetry breaking. During inflation its value slowly increases. Once gΦ > M^2, electroweak symmetry breaking is triggered and the Higgs field acquires a vacuum expectation value. The crucial point is that the height of the axion potential Λ depends on the light quark masses, which in turn depend on the Higgs expectation value v. As the relaxion evolves, v increases, and Λ increases proportionally, which provides the desired back-reaction. At some point, the slope of the axion potential is neutralized by the rising Λ, and the Higgs expectation value freezes in. The question is now quantitative: is it possible to arrange the freeze-in to happen at a value v well below the cut-off scale M? It turns out the answer is yes, at the cost of choosing strange (though not technically unnatural) theory parameters. In particular, the dimensionful coupling g between the relaxion and the Higgs has to be less than 10^-20 GeV (for a cut-off scale larger than 10 TeV), inflation has to last for at least 10^40 e-folds, and the Hubble scale during inflation has to be smaller than the QCD scale.
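The freeze-in condition can be made slightly more explicit with a schematic, order-of-magnitude estimate (an illustration based on the description above, with f denoting the axion decay constant; these are not the paper's exact expressions). The relaxion stops rolling when the slope of its linear potential is balanced by the maximal slope of the periodic QCD-generated term:

```latex
\underbrace{g\,M^2}_{\text{slope of the linear term}}
\;\sim\;
\underbrace{\frac{\Lambda(v)^4}{f}}_{\text{maximal slope of the periodic term}} .
```

Since Λ grows with v, this relation is first satisfied at some finite Higgs expectation value, and the extreme smallness of g is what allows that value to sit far below the cut-off M.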

The toy model above ultimately fails. Normally, the QCD axion is introduced so that its expectation value cancels the CP-violating θ-term in the Standard Model Lagrangian. But here it is stabilized at a value determined by its coupling to the Higgs field. Therefore, in the toy model, the axion effectively generates an order-one θ-term, in conflict with the experimental bound θ < 10^-10. Nevertheless, the same mechanism can be implemented in a realistic model. One possibility is to add a new QCD-like interaction with its own axion playing the relaxion role. In addition, one needs new "quarks" charged under the new strong interaction. Their masses have to be sensitive to the electroweak scale v, thus providing a back-reaction on the axion potential that terminates its evolution. In such a model, the quantitative details would be a bit different than in the QCD-axion toy model. However, the "strangeness" of the parameters persists in every model constructed so far. In particular, the very low scale of inflation required by the relaxation mechanism is worrisome. Could it be that the naturalness problem is just swept into the realm of poorly understood physics of inflation? The ultimate verdict thus depends on whether a complete and healthy model incorporating both relaxation and inflation can be constructed.

Certainly TBC.

Thanks to Brian for a great tutorial. 

David Hogg - yet more K2 proposal

How can a 6-page proposal take more than six days to write? Also signed off on Dun Wang's paper on his pixel-level self-calibration of the Kepler Mission. Submit!

Gordon Watts - Education and “Internet Time”

I saw this link on TechCrunch go by, discussing the state of venture capital in the education sector. There is a general feeling, at least in the article, that when dealing with universities, things are not moving at internet speed:

“The challenge is, in general, education is a pretty slow to move category, particularly if you’re trying to sell into schools and universities … In many cases they don’t seem to show the sense of urgency that the corporate world does.” says Steve Murray, a partner with Softbank Capital, and investor in the education technology company, EdCast.

I had to laugh a bit. Duh. MOOCs are a classic example. Massively Open Online Courses – a way to educate large numbers of people with a very small staff. The article refers to the problems with this, actually:

The first generation of massively open online courses have had (well-documented) problems with user retention.

So why have universities been so slow to just jump into the latest and greatest education technology? Can you imagine sending your kid to get a degree from the University of Washington, where they are trying out some new way of education that, frankly, fails at university scale? We are a publicly funded university. We’d be shut down! The press, rightly, would eat us alive. No institution is going to jump before it looks and move its core business over to something that hasn’t been proven.

Another way to look at this, perhaps, is that each university has a brand to maintain. OK, I’m not a business person here, so I probably am not using the word in quite the right way. Nonetheless. My department at the University of Washington, the Physics Department, is constantly looking at the undergraduate curriculum. We are, in some sense, driven by the question “What does it mean to have a degree from the University of Washington Physics Department?” or “What physics should they know?” or another flavor: “They should be able to explain and calculate X by the time they are awarded the degree.” There is a committee in the department that is responsible for adjusting the courses and material covered, and they are constantly proposing changes.

So far only certain technological solutions have an obvious “value proposition.” For example, online homework websites enable students to practice problems without the department having to spend a huge amount of money on people to do the grading. Learning Management Systems, like Canvas, allow us to quickly set up a website for a course that includes just about everything we need as teachers, saving us a bunch of time.

Those examples make teaching cheaper and more efficient. But that isn’t always the case. Research (yes, research!!!) has shown that students learn better when they are actively working on a problem (in groups of peers is even more powerful) – so we can flip the classroom: have them watch lectures on video and, during traditional lecture time, work in groups. To do it right, you need to redesign the room… which costs $$… And the professor now has to spend extra time recording the lectures. So there is innovation – and it is helping students learn better.

I think most of us in education will happily admit to the fact that there are inefficiencies in the education system – but really big ones? The problem with the idea that there are really big inefficiencies is that no one has really shown how to educate people on the scale of a university in a dramatically cheaper way. As soon as that happens, the inefficiencies will become obvious along with the approach to “fix” them. There are things we need to focus on doing better, and there are places that seem like they are big inefficiencies… and MOOCs will have a second generation to address their problems. And all of us will watch the evolution, and some professors will work with the companies to improve their products… but it isn’t going to happen overnight, and it isn’t obvious to me that it will happen at all, at least not for the bulk of students.

Education is labor intensive. In order to learn, the student has to put in serious time. And as long as this remains the case, we will be grappling with costs.

June 29, 2015

n-Category Café - What is a Reedy Category?

I’ve just posted the following preprint, which has apparently quite little to do with homotopy type theory.

The notion of Reedy category is common and useful in homotopy theory; but from a category-theoretic point of view it is odd-looking. This paper suggests a category-theoretic understanding of Reedy categories, which I find more satisfying than any other I’ve seen.

So what is a Reedy category anyway? The idea of this paper is to start instead with the question “what is a Reedy model structure?” For a model category M and a Reedy category C, the functor category M^C has a model structure in which a map A → B is

  • …a weak equivalence iff A_x → B_x is a weak equivalence in M for all x ∈ C.
  • …a cofibration iff the induced map A_x ⊔_{L_x A} L_x B → B_x is a cofibration in M for all x ∈ C.
  • …a fibration iff the induced map A_x → B_x ×_{M_x B} M_x A is a fibration in M for all x ∈ C.

Here L_x and M_x are the latching object and matching object functors, which are defined in terms of the Reedy structure of C. However, at the moment all we care about is that if x has degree n (part of the structure of a Reedy category is an ordinal-valued degree function on its objects), then L_x and M_x are functors M^{C_n} → M, where C_n is the full subcategory of C on the objects of degree less than n. In the prototypical example of Δ^op, where M^C is the category of simplicial objects in M, L_n A is the “object of degenerate n-simplices” whereas M_n A is the “object of simplicial (n−1)-spheres (potential boundaries for n-simplices)”.
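For orientation, the standard definitions (stated here from the usual textbook formulation, not quoted from the paper) present the latching and matching objects as a colimit and a limit over the non-identity maps into and out of x:

```latex
L_x A \;=\; \operatorname*{colim}_{\substack{y \to x \text{ in } C_+ \\ y \neq x}} A_y ,
\qquad
M_x A \;=\; \lim_{\substack{x \to y \text{ in } C_- \\ y \neq x}} A_y ,
```

where C_+ and C_- are the degree-raising and degree-lowering subcategories of the Reedy structure, so both constructions only involve objects of strictly lower degree.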

The fundamental observation which makes the Reedy model structure tick is that if we have a diagram A ∈ M^{C_n}, then to extend it to a diagram defined at x as well, it is necessary and sufficient to give an object A_x and a factorization L_x A → A_x → M_x A of the canonical map L_x A → M_x A (and similarly for morphisms of diagrams). For Δ^op, this means that if we have a partially defined simplicial object with objects of k-simplices for all k < n, then to extend it with n-simplices we have to give an object A_n, a map L_n A → A_n including the degeneracies, and a map A_n → M_n A assigning the boundary of every simplex, such that the composite L_n A → A_n → M_n A assigns the correct boundary to degenerate simplices.

Categorically speaking, this observation can be reformulated as follows. Given a natural transformation α : F → G between parallel functors F, G : M → N, let us define the bigluing category Gl(α) to be the category of quadruples (M, N, φ, γ) such that M ∈ M, N ∈ N, and φ : F M → N and γ : N → G M are a factorization of α_M through N. (I call this “bigluing” because if F is constant at the initial object, then it reduces to the comma category (Id/G), which is sometimes called the gluing construction.) The above observation is then that M^{C_x} ≃ Gl(α), where α : L_x → M_x is the canonical map between functors M^{C_n} → M and C_x is the full subcategory of C on C_n ∪ {x}. Moreover, it is an easy exercise to reformulate the usual construction of the Reedy model structure as a theorem that if M and N are model categories and F and G are left and right Quillen respectively, then Gl(α) inherits a model structure.

Therefore, our answer to the question “what is a Reedy model structure?” is that it is one obtained by repeatedly (perhaps transfinitely) bigluing along a certain kind of transformation between functors M^C → M (where C is a category playing the role of C_n previously). This motivates us to ask: given C, how can we find functors F, G : M^C → M and a map α : F → G such that Gl(α) is of the form M^{C′} for some new category C′?

Of course, we expect C′ to be obtained from C by adding one new object “x”. Thus, it stands to reason that F, G, and α will have to specify, among other things, the morphisms from x to objects in C, and the morphisms to x from objects of C. These two collections of morphisms form diagrams W : C → Set and U : C^op → Set, respectively; and given such U and W we do have canonical functors F and G, namely the U-weighted colimit and the W-weighted limit. Moreover, a natural transformation from the U-weighted colimit to the W-weighted limit can naturally be specified by giving a map W × U → C(−,−) in Set^{C^op × C}. In C′, this map will supply the composition of morphisms through x. (A triple consisting of U, W, and a map W × U → C(−,−) is also known as an object of the Isbell envelope of C.)

It remains only to specify the hom-set C′(x,x) (and the relevant composition maps), and for this there is a “universal choice”: we take C′(x,x) = (W ⊗_C U) ⊔ {id_x}. That is, we throw in composites of morphisms x → y → x, freely subject to the associative law, and also an identity morphism. This C′ has a universal property (it is a “collage” in the bicategory of profunctors) which ensures that the resulting biglued category is indeed equivalent to M^{C′}.

A category with degrees assigned to its objects can be obtained by iterating this construction if and only if every nonidentity morphism between objects of the same degree factors, uniquely up to zigzags, through an object of strictly lesser degree (i.e. the category of such factorizations is connected). What remains is to ensure that the resulting latching and matching functors are left and right Quillen. It turns out that this is equivalent to requiring that morphisms between objects of different degrees also have connected or empty categories of factorizations through objects of strictly lesser degree.

I call a category satisfying these conditions almost-Reedy. This doesn’t look much like the usual definition of Reedy category, but it turns out to be very close to it. If C is almost-Reedy, let C_+ (resp. C_−) be the class of morphisms f : x → y such that deg(x) ≤ deg(y) (resp. deg(y) ≤ deg(x)) and that do not factor through any object of strictly lesser degree than x and y. Then we can show that, just as in a Reedy category, every morphism factors uniquely into a C_−-morphism followed by a C_+-morphism.

The only thing missing from the usual definition of a Reedy category, therefore, is that C_− and C_+ be subcategories, i.e. closed under composition. And indeed, this can fail to be true; but it is all that can go wrong: C is a Reedy category if and only if it is an almost-Reedy category such that C_− and C_+ are closed under composition. (In particular, this means that C_− and C_+ don’t have to be given as data in the definition of a Reedy category; they are recoverable from the degree function. This was also noticed by Riehl and Verity.)

In other words, the notion of Reedy category (very slightly generalized) is essentially inevitable. Moreover, as often happens, once we understand a definition more conceptually, it is easier to generalize further. The same analysis can be repeated in other contexts, yielding the existing notions of generalized Reedy category and enriched Reedy category, as well as new generalizations such as a combined notion of “enriched generalized Reedy category”.

(I should note that some of the ideas in this paper were noticed independently, and somewhat earlier, by Richard Garner. He also pointed out that the bigluing model structure is a special case of the “Grothendieck construction” for model categories.)

This paper is, I think, slightly unusual, for a paper in category theory, in that one of its main results (the unique factorization into a C_−-morphism followed by a C_+-morphism in an almost-Reedy category) depends on a sequence of technical lemmas, and as far as I know there is no particular reason to expect it to be true. This made me worry that I’d made a mistake somewhere in one of the technical lemmas that might bring the whole theorem crashing down. After I finished writing the paper, I thought this made it a good candidate for an experiment in computer formalization of some non-HoTT mathematics.

Verifying all the results of the paper would have required a substantial library of basic category theory, but fortunately the proof in question (including the technical lemmas) is largely elementary, requiring little more than the definition of a category. However, formalizing it nevertheless turned out to be much more time-consuming than I had hoped, and as a result I’m posting this paper quite some months later than I might otherwise have. But the result I was worried about turned out to be correct (here is the Coq code, which unlike the HoTT Coq library requires only a standard Coq v8.4 install), and now I’m much more confident in it. So was it worth it? Would I choose to do it again if I knew how much work it would turn out to be? I’m not sure.

Having this formalization does provide an opportunity for another interesting experiment. As I said, the theorem turned out to be correct; but the process of formalization did uncover a few minor errors, which I corrected before posting the paper. I wonder, would those errors have been caught by a human referee? And you can help answer that question! I’ve posted a version without these corrections, so you can read it yourself and look for the mistakes. The place to look is Theorem 7.16, its generalization Theorem 8.26, and the sequences of lemmas leading up to them (starting with Lemmas 7.12 and 8.15). The corrected version that I linked to up top mentions all the errors at the end, so you can see how many of them you caught — then post your results in the comments! You do, of course, have the advantage over an ordinary referee that I’m telling you there is at least one error to find.

Of course, you can also try to think of an easier proof, or a conceptual reason why this theorem ought to be true. If you find one (or both), I will be both happy (for obvious reasons) and sad (because of all the time I wasted…).

Let me end by mentioning one other thing I particularly enjoyed about this paper: it uses two bits of very pure category theory in its attempt to explain an apparently ad hoc definition from homotopy theory.

The first of these bits is “tight lax colimits of diagrams of profunctors”. It so happens that an object $(U,W,\alpha)$ of the Isbell envelope can also be regarded as a special sort of lax diagram in $Prof$, and the category $C'$ constructed from it is its lax colimit. Moreover, the universal property of this lax colimit — or more precisely, its stronger universal property as a “tight colimit” in the equipment $Prof$ — is precisely what we need in order to conclude that $M^{C'}$ is the desired bigluing category.

The second of these bits is an absolute coequalizer that is not split. The characterization of non-split absolute coequalizers seemed like a fairly esoteric and very pure bit of category theory when I first learned it. I don’t, of course, mean this in any derogatory way; I just didn’t expect to ever need to use it in an application to, say, homotopy theory. But it turned out to be exactly what I needed at one point in this paper, to “enrich” an argument involving a two-step zigzag (whose unenriched version I learned from Riehl-Verity).

Jacques Distler Asymptotic Safety and the Gribov Ambiguity

Recently, an old post of mine about the Asymptotic Safety program for quantizing gravity received a flurry of new comments. Inadvertently, one of the pseudonymous commenters pointed out yet another problem with the program, which deserves a post all its own.

Before launching in, I should say that

  1. Everything I am about to say was known to Iz Singer in 1978. Though, as with the corresponding result for nonabelian gauge theory, the import seems to be largely unappreciated by physicists working on the subject.
  2. I would like to thank Valentin Zakharevich, a very bright young grad student in our Math Department, for a discussion on this subject, which clarified things greatly for me.

Yang-Mills Theory

Let’s start by reviewing Singer’s explication of the Gribov ambiguity.

Say we want to do the path integral for Yang-Mills Theory, with compact semi-simple gauge group $G$. For definiteness, we’ll talk about the Euclidean path integral, and take $M=S^4$. Fix a principal $G$-bundle, $P\to M$. We would like to integrate over all connections, $A$, on $P$, modulo gauge transformations, with a weight given by $e^{-S_{\text{YM}}(A)}$. Let $\mathcal{A}$ be the space of all connections on $P$, $\mathcal{G}$ the (infinite dimensional) group of gauge transformations (automorphisms of $P$ which project to the identity on $M$), and $\mathcal{B}=\mathcal{A}/\mathcal{G}$, the space of gauge equivalence classes of connections.

“Really,” what we would like to do is integrate over $\mathcal{B}$. In practice, what we actually do is fix a gauge and integrate over actual connections (rather than equivalence classes thereof). We could, for instance, choose background field gauge. Pick a fiducial connection, $\overline{A}$, on $P$, and parametrize any other connection $A = \overline{A}+Q$, with $Q$ a $\mathfrak{g}$-valued 1-form on $M$. Background field gauge is

(1) $D_{\overline{A}} * Q = 0$

which picks out a linear subspace $\mathcal{Q}\subset\mathcal{A}$. The hope is that this subspace is transverse to the orbits of $\mathcal{G}$, and intersects each orbit precisely once. If so, then we can do the path integral by integrating1 over $\mathcal{Q}$. That is, $\mathcal{Q}$ is the image of a global section of the principal $\mathcal{G}$-bundle, $\mathcal{A}\to\mathcal{B}$, and integrating over $\mathcal{B}$ is equivalent to integrating over its image, $\mathcal{Q}$.

What Gribov found (in a Coulomb-type gauge) is that $\mathcal{Q}$ intersects a given gauge orbit more than once. Singer explained that this is not some accident of Coulomb gauge. The bundle $\mathcal{A}\to\mathcal{B}$ is nontrivial and no global gauge choice (section) exists.

A small technical point: $\mathcal{G}$ doesn’t act freely on $\mathcal{A}$. Except for the case2 $G=SU(2)$, there are reducible connections, which are fixed by a subgroup of $\mathcal{G}$. Because of the presence of reducible connections, we should interpret $\mathcal{B}$ as a stack. However, to prove the nontriviality, we don’t need to venture into the stacky world; it suffices to consider the irreducible connections, $\mathcal{A}_0\subset\mathcal{A}$, on which $\mathcal{G}$ acts freely. We then have $\mathcal{A}_0\to\mathcal{B}_0$, on whose fibers $\mathcal{G}$ acts freely. If we were able to find a global section of $\mathcal{A}_0\to\mathcal{B}_0$, then we would have established $\mathcal{A}_0\cong\mathcal{B}_0\times\mathcal{G}$. But Singer proves that

  1. $\pi_k(\mathcal{A}_0)=0,\ \forall k>0$. But
  2. $\pi_k(\mathcal{G})\neq 0$ for some $k>0$.

Hence $\mathcal{A}_0\ncong\mathcal{B}_0\times\mathcal{G}$, and no global gauge choice is possible.
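To spell out the last step (my paraphrase of the standard argument, not a quote from Singer):

```latex
% Sketch: if a global gauge slice existed, the bundle would trivialize,
%   \mathcal{A}_0 \;\cong\; \mathcal{B}_0 \times \mathcal{G},
% and homotopy groups respect products, so for every k > 0
\pi_k(\mathcal{A}_0) \;\cong\; \pi_k(\mathcal{B}_0) \times \pi_k(\mathcal{G}) .
% Point 1 says the left-hand side vanishes, which would force
% \pi_k(\mathcal{G}) = 0 for all k > 0, contradicting point 2.
```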

What does this mean for Yang-Mills Theory?

  • If we’re working on the lattice, then $\mathcal{G}=G^N$, where $N$ is the number of lattice sites. We can choose not to fix a gauge and instead divide our answers by $Vol(G)^N$, which is finite. That is what is conventionally done.
  • In perturbation theory, of course, you never see any of this, because you are just working locally on $\mathcal{B}$.
  • If we’re working in the continuum, and we’re trying to do something non-perturbative, then we just have to work harder. Locally on $\mathcal{B}$, we can always choose a gauge (any principal $\mathcal{G}$-bundle is locally-trivial). On different patches of $\mathcal{B}$, we’ll have to choose different gauges, do the path integral on each patch, and then piece together our answers on patch overlaps using partitions of unity. This sounds like a pain, but it’s really no different from what anyone has to do when doing integration on manifolds.


The Asymptotic Safety people want to do the path-integral over metrics and search for a UV fixed point. As above, they work in Euclidean signature, with $M=S^4$. Let $\mathcal{Met}$ be the space of all metrics on $M$, $\mathcal{Diff}$ the group of diffeomorphisms, and $\mathcal{B}=\mathcal{Met}/\mathcal{Diff}$ the space of metrics on $M$ modulo diffeomorphisms.

Pick a (fixed, but arbitrary) fiducial metric, $\overline{g}$, on $S^4$. Any metric, $g$, can be written as $g_{\mu\nu} = \overline{g}_{\mu\nu} + h_{\mu\nu}$. They use background field gauge,

(2) $\overline{\nabla}^\mu h_{\mu\nu}-\tfrac{1}{2}\overline{\nabla}_\nu h^\mu{}_\mu = 0$

where $\overline{\nabla}$ is the Levi-Civita connection for $\overline{g}$, and indices are raised and lowered using $\overline{g}$. As before, (2) defines a subspace $\mathcal{Q}\subset\mathcal{Met}$. If it happens to be true that $\mathcal{Q}$ is everywhere transverse to the orbits of $\mathcal{Diff}$ and meets every $\mathcal{Diff}$ orbit precisely once, then we can imagine doing the path integral over $\mathcal{Q}$ instead of over $\mathcal{B}$.

In addition to the other problems with the asymptotic safety program (the most grievous of which is that the infrared regulator used to define $\Gamma_k(\overline{g})$ is not BRST-invariant, which means that their prescription doesn’t even give the right path-integral measure locally on $\mathcal{Q}$), the program is saddled with the same Gribov problem that we just discussed for gauge theory, namely that there is no global section of $\mathcal{Met}\to\mathcal{B}$, and hence no global choice of gauge, along the lines of (2).

As in the gauge theory case, let $\mathcal{Met}_0$ be the metrics with no isometries3. $\mathcal{Diff}$ acts freely on the fibers of $\mathcal{Met}_0\to\mathcal{B}_0$. Back in his 1978 paper, Singer already noted that

  1. $\pi_k(\mathcal{Met}_0)=0,\ \forall k>0$, but
  2. $\mathcal{Diff}$ has quite complicated homotopy-type.

Of course, none of this matters perturbatively. When $h$ is small, i.e. for $g$ close to $\overline{g}$, (2) is a perfectly good gauge choice. But the claim of the Asymptotic Safety people is that they are doing a non-perturbative computation of the $\beta$-functional, and that $h$ is not assumed to be small. Just as in gauge theory, there is no global gauge choice (whether (2) or otherwise). And that should matter to their analysis.

Note: Since someone will surely ask, let me explain the situation in the Polyakov string. There, the gauge group isn’t $\mathcal{Diff}$, but rather the larger group, $\mathcal{G}=\mathcal{Diff}\ltimes \text{Weyl}$. And we only do a partial gauge-fixing: we don’t demand a metric, but rather only a Weyl equivalence-class of metrics. That is, we demand a section of $\mathcal{Met}/\text{Weyl} \to \mathcal{Met}/\mathcal{G}$. And that can be done: in $d=2$, every metric is diffeomorphic to a Weyl-rescaling of a constant-curvature metric.

1 To get the right measure on $\mathcal{Q}$, we need to use the Faddeev-Popov trick. But, as long as $\mathcal{Q}$ is transverse to the gauge orbits, that’s all fine, and the prescription can be found in any textbook.

2 For more general choice of $M$, we would also have to require $H^2(M,\mathbb{Z})=0$.

3 When $dim(M)>1$, $\mathcal{Met}_0(M)$ is dense in $\mathcal{Met}(M)$. But for $dim(M)=1$, $\mathcal{Met}_0=\emptyset$. In that case, we actually can choose a global section of $\mathcal{Met}(S^1) \to \mathcal{Met}(S^1)/\mathcal{Diff}(S^1)$.

Doug NatelsonHow much information can you cram down an optical fiber?

A new cool result showed up in Science this week, implying that we may be able to increase the information-carrying capacity of fiber optics beyond what had been thought of as (material-dependent) fundamental limits.  To appreciate this, it's good to think a bit about the way optical fiber carries information right now, including the bits of this blog post to you.  (This sort of thing is discussed in the photonics chapter of my book, by the way.)
Information is passed through optical fibers in a way that isn't vastly different from AM radio. A carrier frequency is chosen (corresponding to a free-space wavelength of light of around 1.55 microns, in the near-infrared) that just so happens to correspond to the frequency where the optical absorption of ultrapure SiO2 glass is minimized. Light at that frequency is generated by a diode laser, and the intensity of that light is modulated at high speed (say 10 GHz or 40 GHz), to encode the 1s and 0s of digital information. If you look at the power vs. frequency for the modulated signal, you get something like what is shown in the figure - the central carrier frequency, with sidebands offset by the modulation frequency. The faster the modulation, the farther apart the sidebands. In current practice, a number of carrier frequencies (colors) are used, all close to the minimum in the fiber absorption, and the carriers are offset enough that the sidebands from modulation don't run into each other. Since the glass is very nearly a linear medium, we can generally use superposition nicely and have those different colors all in there without them affecting each other (much).
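To make the sideband arithmetic concrete, here is a toy pure-Python sketch (the function names and the example numbers are my own illustration, not from the post): modulation at rate f_m puts first-order sidebands at f_c ± f_m, so neighboring carriers can only avoid overlap if their spacing exceeds twice the modulation rate.

```python
def sidebands(f_carrier, f_mod):
    """First-order sideband frequencies of an amplitude-modulated carrier.

    cos(2*pi*f_c*t)*cos(2*pi*f_m*t) = 0.5*cos(2*pi*(f_c-f_m)*t)
                                    + 0.5*cos(2*pi*(f_c+f_m)*t),
    so modulating at f_m spreads power to f_c - f_m and f_c + f_m.
    """
    return (f_carrier - f_mod, f_carrier + f_mod)


def channels_clear(carriers, f_mod):
    """True if no channel's upper sideband reaches its neighbor's lower one,
    i.e. if the carrier spacing everywhere exceeds 2*f_mod."""
    carriers = sorted(carriers)
    return all(sidebands(lo, f_mod)[1] < sidebands(hi, f_mod)[0]
               for lo, hi in zip(carriers, carriers[1:]))


# Four channels spaced 50 GHz apart near 193.1 THz (about 1.55 microns):
grid = [193.10e12 + k * 50e9 for k in range(4)]
```

With 10 GHz modulation the sidebands on this grid stay 30 GHz apart, so `channels_clear(grid, 10e9)` holds; with 40 GHz modulation on the same spacing it fails, which is the sense in which faster modulation forces wider channel spacing.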

So, if you want to improve data carrying capacity (including signal-to-noise), what can you do?  You could imagine packing in as many channels as possible, modulated as fast as possible to avoid cross-channel interference, and cranking up the laser power so that the signal size is big.  One problem, though, is that while the ultrapure silica glass is really good stuff, it's not perfectly linear, and it has dispersion:  The propagation speed of different colors is slightly different, and it's affected by the intensity of the different colors.  This tends to limit the total amount of power you can put in without the signals degrading each other (that is, channel A effectively acts like a phase and amplitude noise source for channel B).  What the UCSD researchers have apparently figured out is, if you start with the different channels coherently synced, then the way the channels couple to each other is mathematically nicely determined, and can be de-convolved later on, essentially cutting down on the effective interference.  This could boost total information carrying capacity by quite a bit - very neat. 

n-Category Café Feynman's Fabulous Formula

Guest post by Bruce Bartlett.

There is a beautiful formula at the heart of the Ising model; a formula emblematic of all of quantum field theory. Feynman, the king of diagrammatic expansions, recognized its importance, and distilled it down to the following combinatorial-geometric statement. He didn’t prove it though — but Sherman did.

Feynman’s formula. Let $G$ be a planar finite graph, with each edge $e$ regarded as a formal variable denoted $x_e$. Then the following two polynomials are equal:

$$\sum_{H \subseteq_{\text{even}} G} x(H) \;=\; \prod_{[\vec{\gamma}] \in P(G)} \left(1 - (-1)^{w[\vec{\gamma}]} x[\vec{\gamma}]\right)$$


I will explain this formula and its history below. Then I’ll explain a beautiful generalization of it to arbitrary finite graphs, expressed in a form given by Cimasoni.

What the formula says

The left hand side of Feynman’s formula is a sum over all even subgraphs $H$ of $G$, including the empty subgraph. An even subgraph $H$ is one which has an even number of half-edges emanating from each vertex. For each even subgraph $H$, we multiply the variables $x_e$ of all the edges $e \in H$ together to form $x(H)$. So, the left hand side is a polynomial with integer coefficients in the variables $x_{e_i}$.
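The left hand side is easy to check by brute force on small graphs. Here is a hedged sketch of my own (the post contains no code) that enumerates the even subgraphs of a graph given as a list of edges, counting a self-loop as contributing two half-edges to its vertex:

```python
from itertools import combinations

def even_subgraphs(edges):
    """All subsets of edge indices in which every vertex has even degree.

    Edges are (u, v) pairs; a self-loop (v, v) contributes 2 to the
    degree of v, so it never changes the parity at its vertex.
    """
    found = []
    for r in range(len(edges) + 1):
        for subset in combinations(range(len(edges)), r):
            degree = {}
            for i in subset:
                u, v = edges[i]
                if u == v:
                    degree[u] = degree.get(u, 0) + 2
                else:
                    degree[u] = degree.get(u, 0) + 1
                    degree[v] = degree.get(v, 0) + 1
            if all(d % 2 == 0 for d in degree.values()):
                found.append(subset)
    return found
```

For a one-vertex graph with two loops, all 4 edge subsets are even, matching the polynomial $1+x_1+x_2+x_1x_2$; for a triangle, only the empty subgraph and the whole triangle survive.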

The right hand side is a product over all $[\vec{\gamma}] \in P(G)$, where $P(G)$ is the set of all prime, reduced, unoriented, closed paths in $G$. That’s a bit subtle, so let me define it carefully. Firstly, our graph is not oriented. But, by an oriented edge $\mathbf{e}$, I mean an unoriented edge $e$ equipped with an orientation. An oriented closed path $\vec{\gamma}$ is a word of composable oriented edges $\mathbf{e_1} \cdots \mathbf{e_n}$; we consider $\vec{\gamma}$ up to cyclic ordering of the edges. The oriented closed path $\vec{\gamma}$ is called reduced if it never backtracks, that is, if no oriented edge $\mathbf{e}$ is immediately followed by the oriented edge $\mathbf{e}^{-1}$. The oriented closed path $\vec{\gamma}$ is called prime if, when viewed as a cyclic word, it cannot be expressed as the power $\vec{\delta}^r$ of a given oriented closed path $\vec{\delta}$ for any $r \geq 2$. Note that the oriented closed path $\vec{\gamma}$ is reduced (resp. prime) if and only if $\vec{\gamma}^{-1}$ is. It therefore makes sense to talk about prime reduced unoriented closed paths $[\vec{\gamma}]$, by which we mean simply an equivalence class $[\vec{\gamma}] = [\vec{\gamma}^{-1}]$.
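Both conditions are purely combinatorial, so they can be tested mechanically. A small sketch of my own, representing an oriented edge as a pair `(label, sign)` with sign ±1 and a closed path as a list read cyclically:

```python
def is_reduced(word):
    """A cyclic word of oriented edges is reduced if it never backtracks:
    no oriented edge is immediately followed (cyclically) by its inverse."""
    n = len(word)
    for i in range(n):
        e, s = word[i]
        f, t = word[(i + 1) % n]
        if e == f and s == -t:   # e followed by e^{-1}
            return False
    return True


def is_prime(word):
    """A cyclic word is prime if it is not a proper power delta^r, r >= 2,
    i.e. it has no period d that is a proper divisor of its length."""
    n = len(word)
    for d in range(1, n):
        if n % d == 0 and all(word[i] == word[(i + d) % n] for i in range(n)):
            return False
    return True
```

For instance `[("e1", 1), ("e1", 1)]` (a loop traversed twice) is reduced but not prime, while `[("e1", 1), ("e1", -1)]` backtracks and so is not reduced.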

Suppose $G$ is embedded in the plane, so that each edge forms a smooth curve. Then given an oriented closed path $\vec{\gamma}$, we can compute the winding number $w(\vec{\gamma})$ of the tangent vector along the curve. We need to fix a convention about what happens at vertices, where we pass from the tangent vector $v$ at the target of $\mathbf{e_i}$ to the tangent vector $v'$ at the source of $\mathbf{e_{i+1}}$. We choose to rotate $v$ into $v'$ by the angle less than $\pi$ in absolute value.

Note that $w(-\vec{\gamma}) = -w(\vec{\gamma})$, so that its parity $(-1)^{w[\vec{\gamma}]}$ makes sense for unoriented paths. Finally, by $x[\vec{\gamma}]$ we simply mean the product of all the variables $x_{e_i}$ for $e_i$ along $\vec{\gamma}$.

The product on the right hand side is infinite, since $P(G)$ is infinite in general (we will shortly do some examples). But, we regard the product as a formal power series in the terms $x_{e_1} x_{e_2} \cdots x_{e_n}$, each of which only receives finitely many contributions (there are only finitely many paths of a given length), so the right hand side is a well-defined formal power series.


Let’s do some examples, taken from Sherman. Suppose $G$ is a graph with one vertex $v$ and one edge $e$:


Write $x = x(e)$. There are two even subgraphs — the empty one, and $G$ itself. So the sum over even subgraphs gives $1+x$. There is only a single closed path in $P(G)$, namely $[\mathbf{e}]$, with odd winding number, so the product over paths also gives $1+x$. Hooray!

Now let’s consider a graph with two loops:


There are 4 even subgraphs, and the left hand side of the formula is $1 + x_1 + x_2 + x_1 x_2$. Now let’s count closed paths $[\vec{\gamma}] \in P(G)$. There are infinitely many; here is a table. Let $\mathbf{e_1}$ and $\mathbf{e_2}$ be the counterclockwise oriented versions of $e_1$ and $e_2$.

$$\begin{array}{cc} [\vec{\gamma}] & 1 - (-1)^{w[\vec{\gamma}]} x[\vec{\gamma}] \\ \hline [\mathbf{e_1}] & 1 + x_1 \\ [\mathbf{e_2}] & 1 + x_2 \\ [\mathbf{e_1 e_2}] & 1 + x_1 x_2 \\ [\mathbf{e_1 e_2^{-1}}] & 1 - x_1 x_2 \\ [\mathbf{e_1^2 e_2}] & 1 - x_1^2 x_2 \\ [\mathbf{e_1^2 e_2^{-1}}] & 1 + x_1^2 x_2 \\ [\mathbf{e_1 e_2^2}] & 1 - x_1 x_2^2 \\ [\mathbf{e_1^{-1} e_2^2}] & 1 + x_1 x_2^2 \\ \cdots & \cdots \end{array}$$

If we multiply out the terms, the right hand side gives $$(1 + x_1 + x_2 + x_1 x_2)(1-x_1^2 x_2^2)(1-x_1^4 x_2^2)(1-x_1^2 x_2^4)\cdots$$ In order for this to equal $1 + x_1 + x_2 + x_1 x_2$ we will need some miraculous cancellation in the higher powers to occur! And indeed this is what happens. The minus signs from the winding numbers conspire to cancel the remaining terms. Even in this simple example, the mechanism is not obvious — but it does happen.

Pondering the meaning of the formula

Let’s ponder the formula. Why do I say it is so beautiful?

Well, the left hand side is combinatorial — it has only to do with the abstract graph $G$, having the property that it is embeddable in the plane (this property can be abstractly encoded via Kuratowski’s theorem). The right hand side is geometric — we fix some embedding of $G$ in the plane, and then compute winding numbers of tangent vectors! So, the formula expresses a combinatorial (or topological) property of the graph in terms of geometry.

Ok… but why is this formula emblematic of all of quantum field theory? Well, summing over all loops is what the path integral in quantum mechanics is all about. (See Witten’s IAS lectures on the Dirac index on manifolds, for example.) Note that the quantum mechanics path integral has recently been made rigorous in the work of Baer and Pfaffle, as well as Fine and Sawin.

Also, I think the formula has overtones of the linked-cluster theorem in perturbative quantum field theory, which relates the generating function for all Feynman diagrams (similar to the even subgraphs) to the generating function for connected Feynman diagrams (similar to the closed paths). You can see why Feynman was interested!

History of the formula

One beautiful way of computing the partition function in the Ising model, due to Kac and Ward, is to express it as a square root of a certain determinant. (I hope to explain this next time.) To do this though, they needed a “topological theorem” about planar graphs. Their theorem was actually false in general, as shown by Sherman. It was Feynman who reformulated it in the above form. From Mark Kac’s autobiography (clip):

The two-dimensional case for so-called nearest neighbour interactions was solved by Lars Onsager in 1944. Onsager’s solution, a veritable tour de force of mathematical ingenuity and inventiveness, uncovered a number of surprising features and started a series of investigations which continue to this day. The solution was difficult to understand and George Uhlenbeck urged me to try to simplify it. “Make it human” was the way he put it …. At the Institute [for Advanced Study in Princeton] I met John C. Ward … we succeeded in rederiving Onsager’s result. Our success was in large measure due to knowing the answer; we were, in fact, guided by this knowledge. But our solution turned out to be incomplete… it took several years and the effort of several people before the gap in the derivation was filled. Even Feynman got into the act. He attended two lectures I gave in 1952 at Caltech and came up with the clearest and sharpest formulation of what was needed to fill the gap. The only time I have ever seen Feynman take notes was during the two lectures. Usually, he is miles ahead of the speaker but following combinatorial arguments is difficult for all mortals.

Feynman’s formula for general graphs

Every finite graph can be embedded in some closed oriented surface of high enough genus. So there should be a generalization of the formula to all finite graphs, not just planar ones. But on the right hand side of the formula, how do we compute the winding number of a closed path on a general surface? The answer, in the formulation of Cimasoni, is beautiful: we should sum over spin structures on the surface, each weighted by their Arf invariant!

Generalized Feynman formula. Let $G$ be a finite graph of genus $g$, embedded in a closed oriented surface $\Sigma$ of genus $g$. Then the following two polynomials are equal: $$\sum_{H \subseteq_{\text{even}} G} x(H) = \frac{1}{2^g} \sum_{\lambda \in Spin(\Sigma)} (-1)^{Arf(\lambda)} \prod_{[\vec{\gamma}] \in P(G)} \left(1 - (-1)^{w_\lambda[\vec{\gamma}]} x[\vec{\gamma}]\right)$$

The point is that a spin structure on $\Sigma$ can be represented as a nonzero vector field $\lambda$ on $\Sigma$ minus a finite set of points, with even index around these points. (Of course, a nonzero vector field on the whole of $\Sigma$ won’t exist, except on the torus. That is why we need these points.) So, we can measure the winding number $w_\lambda(\vec{\gamma})$ of a closed path $\vec{\gamma}$ with respect to this background vector field $\lambda$.

The first version of this generalized Feynman formula was obtained by Loebl, in the case where all vertices have degree 2 or 4, and using the notion of Sherman rotation numbers instead of spin structures (see also Loebl and Somberg). In independent work, Cimasoni formulated it differently using the language of spin structures and Arf invariants, and proved it in the slightly more general case of general graphs, though his proof is not a direct one. Also, in unpublished work, Masbaum and Loebl found a direct combinatorial argument (in the style of Sherman’s proof of the planar case) to prove this general, spin-structures version.

Last thoughts

I find the generalized Feynman’s formula to be very beautiful. The left hand side is completely combinatorial / topological, manifestly only depending on $G$. The right hand side picks some embedding of the graph in a surface, and is very geometric, referring to high-brow things such as spin structures and Arf invariants! Who knew that there was such an elegant geometric theorem lurking behind arbitrary finite graphs?

Moreover, it is all part of a beautiful collection of ideas relating the Ising model to the free fermion conformal field theory. (Indeed, the appearance of spin structures and winding numbers is telling us we are dealing with fermions.) Of course, physicists knew this for ages, but it hasn’t been clear to mathematicians exactly what they meant :-)

But in recent times, mathematicians are making this all precise, and beautiful geometry is emerging, like the above formula. There’s even a Fields medal in the mix. It’s all about discrete complex analysis, spinors on Riemann surfaces, the discrete Dirac equation, isomonodromic deformation of flat connections, heat kernels, conformal invariance, Pfaffians, and other amazing things (here is a recent expository talk of mine). I hope to explain some of this story next time.

Clifford JohnsonWarm…

So far, the Summer has not been as brutal in the garden as it was last year. Let's hope that continues. I think that late rain we had last month (or earlier this month?) helped my later planting get a good start too. This snap of a sunflower was taken on a lovely warm evening in the garden the other day, after a (only slightly too) hot day... sunflower_june_2015 -cvj Click to continue reading this post

Chad OrzelOn Scientific Conferences, and Making Them Better

I’ve been doing a bunch of conferencing recently, what with DAMOP a few weeks ago and then Convergence last week. This prompted me to write up a couple of posts about conference-related things, which I posted over at Forbes. These were apparently a pretty bad fit for the folks reading over there, as they’ve gotten very little traffic relative to, well, everything else I’ve posted during that span. Live and learn.

Anyway, I’m fairly happy with how both of those turned out, and on the off chance that they’ll do better with the ScienceBlogs crowd, let me link them here:

What Are Academic Conferences Good For? Starts from the irony of presenting PER research on how lectures are suboptimal for student learning in a conference talk format, then argues that the real purpose of a standard conference talk isn’t information transfer but advertising.

Going To An Academic Conference? Here Are Some Tips: In response to a series of posts at the Owl_Meat blog, some advice for conference organizers, faculty advisors, and students on how to make going to conferences a better experience.

If either of those descriptions sounds relevant to your interests, follow the link and read the rest. And in future, I’ll probably put this sort of material over here to begin with…

June 28, 2015

Tommaso DorigoIn Memory Of David Cline

I was saddened today to hear of the death of David Cline. I do not have much to say here - I am not good with obituaries - but I do remember meeting him at a conference in Albuquerque in 2008, where we chatted on several topics, among them the history of the CDF experiment, a topic on which I had just started to write a book. 

Perhaps the best I can do here as a way to remember Cline, whose contributions to particle physics can and will certainly be better described by many others, is to report a quote from a chapter of the book, which describes a funny episode on the very early days of CDF. I think he did have a sense of humor, so he might not dislike it if I do.


read more

BackreactionI wasn’t born a scientist. And you weren’t either.

There’s a photo which keeps cropping up in my facebook feed and it bothers me. It shows a white girl, maybe three years old, kissing a black boy the same age. The caption says “No one is born racist.” It’s adorable. It’s inspirational. But the problem is, it’s not true.

Children aren’t saints. We’re born mistrusting people who look different from us, and we treat those who look like us better. Toddlers already have this “in-group bias,” research says. Though I have to admit that, as a physicist, I am generally not impressed by what psychologists consider statistically significant, and I acknowledge it is generally hard to distinguish nature from nurture. But that a preference for people of similar appearance should be a result of evolution isn’t so surprising. We are more supportive of those we share genes with, family ahead of all, and looks are a giveaway.

As we grow up, we should become aware that our bias is both unnecessary and unfair, and take measures to prevent it from being institutionalized. But since we are born extra suspicious of anybody not from our own clan, it takes conscious educational effort to act against the preference we give to people “like us.” Racist thoughts do not go away by themselves, though one can work to address them – or at least I hope so. But it starts with recognizing one is biased to begin with. And that’s why this photo bothers me. Denying a problem rarely helps solve it.

On the same romantic reasoning I often read that infants are all little scientists, and it’s only our terrible school education that kills curiosity and prevents adults from still thinking scientifically. That is wrong too. Yes, we are born being curious, and as children we learn a lot by trial and error. Ask my daughter who recently learned to make rainbows with the water sprinkler, mostly without soaking herself. But our brains didn’t develop to serve science, they developed to serve ourselves in the first place.

My daughters for example haven’t yet learned to question authority. What mommy speaks is true, period. When the girls were beginning to walk I told them to never, ever, touch the stove when I’m in the kitchen because it’s hot and it hurts and don’t, just don’t. They took this so seriously that for years they were afraid to come anywhere near the stove at any time. Yes, good for them. But if I had told them rainbows are made by garden fairies they’d have believed this too. And to be honest, the stove isn’t hot all that often in our household. Still today much of my daughters’ reasoning begins with “mommy says.” Sooner or later they will move beyond M-theory, or so I hope, but trust in authorities is a cognitive bias that remains with us through adulthood. I have it. You have it. It doesn’t go away by denying it.

Let me be clear that human cognitive biases aren’t generally a bad thing. Most of them developed because they are, or at least have been, of advantage to us. We are for example more likely to put forward opinions that we believe will be well-received by others. This “social desirability bias” is a side-effect of our need to fit into a group for survival. You don’t tell the tribal chief his tent stinks if you have a dozen fellows with spears in the back. How smart of you. While opportunism might benefit our survival, it rarely benefits knowledge discovery though.

It is because of our cognitive shortcomings that scientists have put into place many checks and methods designed to prevent us from lying to ourselves. Experimental groups, for example, go to great lengths to prevent bias in data analysis. If your experimental data are questionnaire replies then that’s that, but in physics data aren’t normally very self-revealing. They have to be processed suitably and be analyzed with numerical tools to arrive at useful results. Data have to be binned, cuts have to be made, background has to be subtracted.

There are usually many different ways to process the data, and the more ways you try the more likely you are to find one that delivers an interesting result, just by coincidence. It is pretty much impossible to account for trying different methods because one doesn’t know how much these methods are correlated. So to prevent themselves from inadvertently running multiple searches for a signal that isn’t there, many experimental collaborations agree on a method for data analysis before the data is in, then proceed according to plan.
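The effect of un-planned analysis variants can be quantified in the idealized case where the variants are statistically independent (real ones are correlated, which is exactly why the correction is hard to do). A one-line sketch, with names of my own choosing:

```python
def family_wise_error_rate(p_single, n_analyses):
    """Chance that at least one of n independent analyses of pure noise
    crosses a threshold whose per-analysis false-positive rate is p_single.

    Independence is an idealization: correlated analysis variants lie
    somewhere between n effective trials and a single one.
    """
    return 1.0 - (1.0 - p_single) ** n_analyses
```

Twenty independent looks at noise, each with a 5% false-positive rate, already give roughly a 64% chance of an apparent "signal" somewhere, which is the logic behind fixing the analysis method before the data arrive.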

(Of course, if the data are made public, this won’t prevent other people from reanalyzing the same numbers over and over again. And every once in a while they’ll find some signal whose statistical significance they overestimate because they’re not accounting, indeed can’t account, for all the failed trials. Thus all the CMB anomalies.)

In science as in everyday life, though, the major problems are the biases we do not account for. Confirmation bias is probably the most prevalent one. If you search the literature for support of your argument, there it is. If you try to avoid that person who asked a nasty question during your seminar, there it is. If you just know you’re right, there it is.

Even though it often isn’t explicitly taught to students, everyone who has succeeded in making a career in research has learned to work against their own confirmation bias. Failing to list contradicting evidence or shortcomings of one’s own ideas is the easiest way to tell a pseudoscientist. A scientist’s best friend is their inner voice saying: “You are wrong. You are wrong, wrong, W.R.O.N.G.” Try to prove yourself wrong. Then try it again. Try to find someone willing to tell you why you are wrong. Listen. Learn. Look for literature that explains why you are wrong. Then go back to your idea. That’s the way science operates. It’s not the way humans normally operate.

(And lest you want to go meta on me, the title of this post is of course also wrong. We are scientists in some regards but not in others. We like to construct new theories, but we don’t like being proved wrong.)

But there are other cognitive and social biases that affect science which are not as well-known and accounted for as confirmation bias. “Motivated cognition” (aka “wishful thinking”) is one of them. It makes you believe positive outcomes are more likely than they really are. Do you recall them saying the LHC would find evidence for physics beyond the standard model? Oh, they are still saying it will?

Then there is the “sunk cost fallacy”: The more time and effort you’ve spent on SUSY, the less likely you are to call it quits, even though the odds look worse and worse. I had a case of that when I refused to sign up for the Scandinavian Airlines frequent flyer program after realizing that I’d be a gold member now had I done so six years ago.

I already mentioned the social desirability bias that discourages us from speaking unwelcome truths, but there are other social biases that you can see in action in science.

The “false consensus effect” is one of them. We tend to overestimate how much and how many other people agree with us. Certainly nobody can disagree that string theory is the correct theory of quantum gravity. Right. Or, as Joseph Lykken and Maria Spiropulu put it:
“It is not an exaggeration to say that most of the world’s particle physicists believe that supersymmetry must be true.” (Their emphasis.)
The “halo effect” is the reason we pay more attention to literally every piece of crap a Nobel Prize winner utters. The above-mentioned “in-group bias” is what makes us think researchers in our own field are more intelligent than others. It’s the way people end up studying psychology because they were too stupid for physics. The “shared information bias” is the one in which we discuss the same “known problems” over and over and over again and fail to pay attention to new information held only by a few people.

One of the most problematic distortions in science is that we consider a fact more likely the more often we have heard of it, called the “attentional bias” or the “mere exposure effect”. Oh, and then there is the mother of all biases, the “bias blind spot,” the insistence that we certainly are not biased.

Cognitive biases we’ve always had, of course. Science has progressed regardless, so why should we start paying attention now? (Btw, that attitude is itself called the “status quo bias”.) We should pay attention now because shortcomings in argumentation become more relevant the more we rely on logical reasoning detached from experimental guidance. This is a problem which affects some areas of theoretical physics more than any other field of science.

The more prevalent problem, though, is the social biases, whose effects become more pronounced the larger the groups are, the tighter they are connected, and the more information is shared. This is why these biases are so much more relevant today than they were a century, or even two decades, ago.

You can see these problems in pretty much all areas of science. Everybody seems to be thinking and talking about the same things. We’re not able to leave behind research directions that turn out fruitless, we’re bad at integrating new information, we don’t criticize our colleagues’ ideas because we are afraid of becoming “socially undesirable” when we mention the tent’s stink. We disregard ideas off the mainstream because these come from people “not like us.” And we insist our behavior is good scientific conduct, purely based on our unbiased judgement, because we cannot possibly be influenced by social and psychological effects, no matter how well established.

These are behaviors we have developed not because they are stupid, but because they are beneficial in some situations. But in some situations they can become a hurdle to progress. We weren’t born to be objective and rational. Being a good scientist requires constant self-monitoring and learning about the ways we fool ourselves. Denying the problem doesn’t solve it.

What I really wanted to say is that I’ve finally signed up for the SAS frequent flyer program.

Chad OrzelMost Played Songs Meme

This went around a different corner of my social-media universe while I was off in Waterloo, away from my iTunes. I was curious about it, though, so looked at the contents of the “25 Most Played” playlist, and having done that, I might as well post them here (the number in parentheses is the number of times it’s been played according to iTunes):

  1. “Beautiful Wreck,” Shawn Mullins, (280)
  2. “In The Mood,” Glenn Miller And His Orchestra, (279)
  3. “Christmas (Baby Please Come Home),” Darlene Love, (278)
  4. “Almost Saturday Night,” John Fogerty, (277)
  5. “Shake It Up,” The Cars, (275)
  6. “Sunblock,” Emmet Swimming, (275)
  7. “Life, In A Nutshell,” Barenaked Ladies, (273)
  8. “Manhattan,” Visqueen, (273)
  9. “Tell Balgeary, Balgury Is Dead,” Ted Leo & The Pharmacists, (272)
  10. “Sing, Sing, Sing,” Benny Goodman & His Orchestra, (270)
  11. “Big Brown Eyes,” Old 97’s, (266)
  12. “The Whole Of The Moon,” The Waterboys, (266)
  13. “Lost In The Supermarket,” The Afghan Whigs, (264)
  14. “Hey Hey What Can I Do,” Led Zeppelin, (264)
  15. “The World Without Logos,” Hellsing, (259)
  16. “All The Lilacs In Ohio,” John Hiatt, (256)
  17. “Dance The Night Away,” The Mavericks, (255)
  18. “Shoe Box,” Barenaked Ladies, (254)
  19. “Rockaway Beach,” The Ramones, (254)
  20. “Somebody To Shove,” Soul Asylum, (253)
  21. “Catch My Disease,” Ben Lee, (252)
  22. “Bye, Bye,” The Subdudes, (251)
  23. “Open All Night,” Bruce Springsteen, (250)
  24. “Singing In My Sleep,” Semisonic, (250)
  25. “Sister Havana,” Urge Overkill, (250)

That’s a pretty decent list of tracks, right there. These are all on the “FutureBaby” playlist, I believe, which I somewhat regularly use as a source for kid-friendly shuffle play. Which is why there’s no Hold Steady on this, and only one cover song by the Afghan Whigs.

So, there you have it: the music I’ve heard the most (while sitting at this particular computer) over the last several years. Make of that what you will.

June 27, 2015

David Hoggmore K2 proposal

In my research time today, I kept on the K2 proposal. I made the argument that we want K2 to point less well in Campaign 9 than it has in previous Campaigns, because we want the crowded field (which is in the Bulge of the Milky Way) to move significantly relative to the pixel grid. We need that redundancy (or heterogeneity?) for self-calibration. I hope that—if we get this proposal funded—we will also get influence over the spacecraft attitude management!

David HoggK2 flat-field

I spent my research time today working on my proposal for K2 Campaign 9. The proposal is to self-calibrate to get the flat-field, which is critical for crowded-field photometry (even if done via image differencing).

Scott AaronsonCelebrate gay marriage—and its 2065 equivalent

Yesterday was a historic day for the United States, and I was as delighted as everyone else I know.  I’ve supported gay marriage since the mid-1990s, when as a teenager, I read Andrew Hodges’ classic biography of Alan Turing, and burned with white-hot rage at Turing’s treatment.  In the world he was born into—our world, until fairly recently—Turing was “free”: free to prove the unsolvability of the halting problem, free to help save civilization from the Nazis, just not free to pursue the sexual and romantic fulfillment that nearly everyone else took for granted.  I resolved then that, if I was against anything in life, I was against the worldview that had hounded Turing to his death, or anything that even vaguely resembled it.

So I’m proud for my country, and I’m thrilled for my gay friends and colleagues and relatives.  At the same time, seeing my Facebook page light up with an endless sea of rainbow flags and jeers at Antonin Scalia, there’s something that gnaws at me.  To stand up for Alan Turing in 1952 would’ve taken genuine courage.  To support gay rights in the 60s, 70s, 80s, even the 90s, took courage.  But celebrating a social change when you know all your friends will upvote you, more than a decade after the tide of history has made the change unstoppable?  It’s fun, it’s righteous, it’s justified, I’m doing it myself.  But let’s not kid ourselves by calling it courageous.

Do you want to impress me with your moral backbone?  Then go and find a group that almost all of your Facebook friends still consider it okay, even praiseworthy, to despise and mock, for moral failings that either aren’t failings at all or are no worse than the rest of humanity’s.  (I promise: once you start looking, it shouldn’t be hard to find.)  Then take a public stand for that group.

Chad OrzelCourse Report: Intro Mechanics Spring 2015

I’ve been pretty quiet about educational matters of late, for the simple reason that I was too busy teaching to say much. The dust having settled a bit, though, I thought I would put some notes here about what I did this past term, and what worked.

I had two sections of the introductory Newtonian mechanics course in the Spring term; this was off the normal sequence for engineering majors (the engineers mostly take this in the Winter term of their first year), but this year we had yet another larger-than-expected engineering class, and needed to open another section. I picked up both of these, which meant I didn’t need to coordinate with anybody else, and could experiment a bit.

One of the things I tried this term was making more use of the Direct Measurement Videos made by Peter Bohacek at Carleton. These are high-quality videos of various basic physics scenarios, with a frame-by-frame player and length scales put directly on the videos, so you can determine velocities and so forth by measuring positions and counting frames.

We did a few of these as in-class exercises, and I assigned a few others for homework. I was a little disappointed with the results from the homework problems– a depressing number of students didn’t recognize what they were supposed to do, even after I explained in detail in class– but student comments about these were surprisingly positive. I’ll do more with these videos in the future, possibly with more scaffolding in the statement of the homework questions.

I continued to experiment with standards-based-grading, using the same sort of leveled scheme I did the last couple of times I taught introductory E&M. I broke the content up into a series of subjects (Vectors, Kinematics, the Momentum Principle, etc.) and sorted the various skills associated with each into three levels (here’s the full list (PDF)). I coded each assignment (homework, quizzes, exams) in terms of these standards, and then gave a grade of 0, 1, or 2 to each standard. At the end of the term, I averaged these, and weighted the scores so a 2/2 on all Level I standards would give a student a C, 2/2 on all Level II would raise it to a B, and at least some Level III standards would be needed to lift a student into the A range.

The main advantages of this are that it gives students somewhat better feedback on what areas they need to work on, and doesn’t penalize students for bad scores early in the term, provided they figure out what’s going on later. The main disadvantages are that there’s a lot more tedious clerical work on the faculty side to make it all work, and it’s non-standard enough that students are sometimes confused about their grades. To get around the latter problem, I handed out “if the course ended today” grade breakdowns after each of the midterm exams; that seems to have soothed nerves enough that I didn’t get many complaints about the grading on the course comments.

As with the last time I taught this, a few years ago, I did whiteboard discussions in class. The room is set up so students sit in pairs (though there were a few groups of three), and each group has a whiteboard and markers. I use modified versions of the “clicker questions” found at the Colorado PER page— tricky conceptual questions with multiple-choice answers, and generally no numbers. They work on these together with their partner, then we talk about the answers, “polling” the class by having them hold up the whiteboards.

This works pretty well by the various measures I have access to– the class performance on exams was about what I usually expect, and the gains on the standard conceptual test that we use to track things was pretty good (not spectacular, but above the “traditional lecture” range). Student response was generally fairly positive– there were the inevitable complaints that I didn’t work enough example problems in class, but a fair number of comments saying that they found this more engaging and interactive.

I was a bit concerned that there would be problems with the group dynamics; sometimes in the past, I’ve switched groups up after a couple of weeks, so the same students weren’t together for a whole term. This addresses the uncanny ability of confused students to find each other when making self-selected groups on the first day, but sometimes annoys students who have just gotten comfortable with a particular partner and then have to switch to a new one. This term, I didn’t do that because reasons, and just made an effort to keep a closer eye on the pairs of confused students so I could provide them more assistance. That seems to have worked out, but I’m not sure it wasn’t just luck.

So, you know, a reasonably successful term. The negative comments I got were mostly things attributable at least in part to being department chair– I wasn’t able to offer many office hours, and I was slow getting homework graded and returned. Some of the latter was also due to this being the first time I’ve used SBG in the intro mechanics course, so I had to code the problems for the standards and write up new solutions to everything. Next time I do it, that part will go more quickly.

Of course, the next time I do this will be September 2016 at the earliest, as I’m on sabbatical next year. Calloo, callay, and all that.

John BaezHigher-Dimensional Rewriting in Warsaw (Part 2)

Today I’m going to this workshop:

Higher-Dimensional Rewriting and Applications, 28-29 June 2015, Warsaw, Poland.

Many of the talks will be interesting to people who are trying to use category theory as a tool for modelling networks!

For example, though they can’t actually attend, Lucius Meredith and my student Mike Stay hope to use Google Hangouts to present their work on Higher category models of the π-calculus. The π-calculus is a way of modelling networks where messages get sent here and there, e.g. the internet. Check out Mike’s blog post about this:

• Mike Stay, A 2-categorical approach to the pi calculus, The n-Category Café, 26 May 2015.

Krzysztof Bar, Aleks Kissinger and Jamie Vicary will be speaking about Globular, a proof assistant for computations in n-categories:

This talk is a progress report on Globular, an online proof assistant for semistrict higher-dimensional rewriting. We aim to produce a tool which can visualize higher-dimensional categorical diagrams, assist in their construction with a point-and-click interface, perform type checking to prevent incorrect composites, and automatically handle the interchanger data at each dimension. Hosted on the web, it will have a low barrier to use, and allow hyperlinking of formalized proofs directly from research papers. We outline the theoretical basis for the tool, and describe the challenges we have overcome in its design.

Eric Finster will be talking about another computer system for dealing with n-categories, based on the ‘opetopic’ formalism that James Dolan and I invented. And Jason Morton is working on a computer system for computation in compact closed categories! I’ve seen it, and it’s cool, but he can’t attend the workshop, so David Spivak will be speaking on his work with Jason on the theoretical foundations of this software:

We consider the linked problems of (1) finding a normal form for morphism expressions in a closed compact category and (2) the word problem, that is deciding if two morphism expressions are equal up to the axioms of a closed compact category. These are important ingredients for a practical monoidal category computer algebra system. Previous approaches to these problems include rewriting and graph-based methods. Our approach is to re-interpret a morphism expression in terms of an operad, and thereby obtain a single composition which is strictly associative and applied according to the abstract syntax tree. This yields the same final operad morphism regardless of the tree representation of the expression or order of execution, and solves the normal form problem up to automorphism.

Recently Eugenia Cheng has been popularizing category theory, touring to promote her book Cakes, Custard and Category Theory. But she’ll be giving two talks in Warsaw, I believe on distributive laws for Lawvere theories.

As for me, I’ll be promoting my dream of using category theory to understand networks in electrical engineering. I’ll be giving a talk on control theory and a talk on electrical circuits: two sides of the same coin, actually.

• John Baez, Jason Erbele and Nick Woods, Categories in control.

If you’ve seen a previous talk of mine with the same title, don’t despair—this one has new stuff! In particular, it talks about a new paper by Nick Woods and Simon Wadsley.

Abstract. Control theory is the branch of engineering that studies dynamical systems with inputs and outputs, and seeks to stabilize these using feedback. Control theory uses “signal-flow diagrams” to describe processes where real-valued functions of time are added, multiplied by scalars, differentiated and integrated, duplicated and deleted. In fact, these are string diagrams for the symmetric monoidal category of finite-dimensional vector spaces, but where the monoidal structure is direct sum rather than the usual tensor product. Jason Erbele has given a presentation for this symmetric monoidal category, which amounts to saying that it is the PROP for bicommutative bimonoids with some extra structure.

A broader class of signal-flow diagrams also includes “caps” and “cups” to model feedback. This amounts to working with a larger symmetric monoidal category where objects are still finite-dimensional vector spaces but the morphisms are linear relations. Erbele also found a presentation for this larger symmetric monoidal category. It is the PROP for a remarkable thing: roughly speaking, an object with two special commutative dagger-Frobenius structures, such that the multiplication and unit of either one and the comultiplication and counit of the other fit together to form a bimonoid.

• John Baez and Brendan Fong, Circuits, categories and rewrite rules.

Abstract. We describe a category where a morphism is an electrical circuit made of resistors, inductors and capacitors, with marked input and output terminals. In this category we compose morphisms by attaching the outputs of one circuit to the inputs of another. There is a functor called the ‘black box functor’ that takes a circuit, forgets its internal structure, and remembers only its external behavior. Two circuits have the same external behavior if and only if they impose same relation between currents and potentials at their terminals. This is a linear relation, so the black box functor goes from the category of circuits to the category of finite-dimensional vector spaces and linear relations. Constructing this functor makes use of Brendan Fong’s theory of ‘decorated cospans’—and the question of whether two ‘planar’ circuits map to the same relation has an interesting answer in terms of rewrite rules.

The answer to the last question, in the form of a single picture, is this:

How can you change an electrical circuit made out of resistors without changing what it does? 5 ways are shown here:

  1. You can remove a loop of wire with a resistor on it. It doesn’t do anything.
  2. You can remove a wire with a resistor on it if one end is unattached. Again, it doesn’t do anything.

  3. You can take two resistors in series—one after the other—and replace them with a single resistor. But this new resistor must have a resistance that’s the sum of the old two.

  4. You can take two resistors in parallel and replace them with a single resistor. But this resistor must have a conductivity that’s the sum of the old two. (Conductivity is the reciprocal of resistance.)

  5. Finally, the really cool part: the Y-Δ transform. You can replace a Y made of 3 resistors by a triangle of resistors. But their resistances must be related by the equations shown here.
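The equations for rule 5 are in the picture, but the standard Y-Δ relations are easy to state and to sanity-check numerically. Here is a small sketch (the notation R1, R2, R3 for the Y legs is mine, not from the post) that verifies, with exact rational arithmetic, that the transformed triangle presents the same resistance between each pair of terminals when the third is left open:

```python
from fractions import Fraction

def delta_from_y(r1, r2, r3):
    """Standard Y -> Delta transform: returns the triangle edges (r12, r23, r31)
    computed from the three Y legs."""
    p = r1 * r2 + r2 * r3 + r3 * r1
    return p / r3, p / r1, p / r2

def pair_resistance_y(r1, r2, r3, i, j):
    # In the Y, the path between terminals i and j is just the two legs in series.
    legs = {1: r1, 2: r2, 3: r3}
    return legs[i] + legs[j]

def pair_resistance_delta(r12, r23, r31, i, j):
    # In the triangle: the direct edge in parallel with the two-edge detour.
    edges = {frozenset({1, 2}): r12, frozenset({2, 3}): r23, frozenset({3, 1}): r31}
    direct = edges[frozenset({i, j})]
    k = ({1, 2, 3} - {i, j}).pop()
    around = edges[frozenset({i, k})] + edges[frozenset({k, j})]
    return direct * around / (direct + around)

r1, r2, r3 = Fraction(2), Fraction(3), Fraction(5)
r12, r23, r31 = delta_from_y(r1, r2, r3)
for (i, j) in [(1, 2), (2, 3), (3, 1)]:
    assert pair_resistance_y(r1, r2, r3, i, j) == pair_resistance_delta(r12, r23, r31, i, j)
print("Y and Delta agree at all terminal pairs")
```

Matching the pairwise resistances with the third terminal open is only a sanity check rather than a full proof of equivalence, but it is a quick way to catch a wrong formula.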

For circuits drawn on the plane, these are all the rules you need! This was proved here:

• Yves Colin de Verdière, Isidoro Gitler and Dirk Vertigan, Réseaux électriques planaires II.

It’s just the beginning of a cool story, which I haven’t completely blended with the categorical approach to circuits. Doing so clearly calls for 2-categories: those double arrows are 2-morphisms! For more, see:

• Joshua Alman, Carl Lian and Brandon Tran, Circular planar electrical networks I: The electrical poset EPn.

June 26, 2015

Clifford JohnsonNaddy

Yesterday, an interesting thing happened while I was out in my neighbourhood walking my son for a good hour or more (covered, in a stroller - I was hoping he'd get some sleep), visiting various shops, running errands. Before describing it, I offer two bits of background information as (possibly relevant?) context. (1) I am black. (2) I live in a neighbourhood where there are very few people of my skin colour as residents. Ok, here's the thing: * * * I'm approaching two young (mid-to-late teens?) African-American guys, sitting at a bus stop, chatting and laughing good-naturedly. As I begin to pass them, nodding a hello as I push the stroller along, one of them stops me. [...]

Scott AaronsonA query complexity breakthrough

Update (June 26): See this just-released paper, which independently obtains a couple of the same results as the Ambainis et al. paper, but in a different way (using the original Göös et al. function, rather than modifications of it).

Lots of people have accused me of overusing the word “breakthrough” on this blog. So I ask them: what word should I use when a paper comes out that solves not one, not two, but three of the open problems I’ve cared about most for literally half of my life, since I was 17 years old?

Yesterday morning, Andris Ambainis, Kaspars Balodis, Aleksandrs Belovs, Troy Lee, Miklos Santha, and Juris Smotrovs posted a preprint to ECCC in which they give:

(1) A total Boolean function f with roughly a fourth-power separation between its deterministic and bounded-error quantum query complexities (i.e., with D(f)~Q(f)4). This refutes the conjecture, which people have been making since Beals et al.’s seminal work in 1998, that the biggest possible gap is quadratic.

(2) A total Boolean function f with a quadratic separation between its deterministic and randomized query complexities (with D(f)~R0(f)2). This refutes a conjecture of Saks and Wigderson from 1986, that the best possible gap is R0(f)~D(f)0.753 (from the recursive AND/OR tree), and shows that the known relation D(f)=O(R0(f)2) is close to tight.

(3) The first total Boolean function f with any asymptotic gap between its zero-error and bounded-error randomized query complexities (in particular, with R0(f)~R(f)3/2).

(There are also other new separations—for example, involving exact quantum query complexity and approximate degree as a real polynomial. But the above three are the most spectacular to me.)

In updates to this post (coming soon), I’ll try my best to explain to general readers what D(f), R(f), and so forth are (see here for the classic survey of these measures), and I’ll also discuss how Ambainis et al. designed the strange functions f that achieve the separations (though their paper already does a good job of explaining it). For now, I’ll just write the stuff that’s easier to write.

I’m at the Federated Computing Research Conference in Portland, Oregon right now, where yesterday I gave my STOC talk (click here for the PowerPoint slides) about the largest possible separations between R(f) and Q(f) for partial Boolean functions f. (That paper is also joint work with Andris Ambainis, who has his fingers in many pies, or his queries in many oracles, or something.) Anyway, when I did a practice run of my talk on Monday night, I commented that, of course, for total Boolean functions f (those not involving a promise), the largest known gap between R(f) and Q(f) is quadratic, and is achieved when f is the OR function because of Grover’s algorithm.

Then, Tuesday morning, an hour before I was to give my talk, I saw the Ambainis et al. bombshell, which made that comment obsolete. So, being notoriously bad at keeping my mouth shut, I mentioned to my audience that, while it was great that they came all the way to Portland to learn what was new in theoretical computer science, if they wanted real news in the subfield I was talking about, they could stop listening to me and check their laptops.

(Having said that, I have had a wonderful time at FCRC, and have learned lots of other interesting things—I can do another addendum to the post about FCRC highlights if people want me to.)

Anyway, within the tiny world of query complexity—i.e., the world where I cut my teeth and spent much of my career—the Ambainis et al. paper is sufficiently revolutionary that I feel the need to say what it doesn’t do.

First, the paper does not give a better-than-quadratic gap between R(f) and Q(f) (i.e., between bounded-error randomized and quantum query complexities). The quantum algorithms that compute their functions f are still “just” variants of the old standbys, Grover’s algorithm and amplitude amplification. What’s new is that the authors have found functions where you can get the quadratic, Grover speedup between R(f) and Q(f), while also getting asymptotic gaps between D(f) and R(f), and between R0(f) and R(f). So, putting it together, you get superquadratic gaps between D(f) and Q(f), and between R0(f) and Q(f). But it remains at least a plausible conjecture that R(f)=O(Q(f)2) for all total Boolean functions f—i.e., if you insist on a “fair comparison,” then the largest known quantum speedup for total Boolean functions remains the Grover one.

Second, as far as I can tell (I might be mistaken) (I’m not), the paper doesn’t give new separations involving certificate complexity or block sensitivity (e.g., between D(f) and bs(f)). So for example, it remains open whether D(f)=O(bs(f)2), and whether C(f)=O(bs(f)α) for some α<2. (Update: Avishay Tal, in the comments, informs me that the latter conjecture was falsified by Gilmer, Saks, and Srinivasan in 2013. Wow, I’m really out of it!)

In the end, achieving these separations didn’t require any sophisticated new mathematical machinery—just finding the right functions, something that could’ve been done back in 1998, had anyone been clever enough. So, where did these bizarre functions f come from? Ambainis et al. directly adapted them from a great recent communication complexity paper by Mika Göös, Toniann Pitassi, and Thomas Watson. But the Göös et al. paper itself could’ve been written much earlier. It’s yet another example of something I’ve seen again and again in this business, how there’s no substitute for just playing around with a bunch of examples.

The highest compliment one researcher can pay another is, “I wish I’d found that myself.” And I do, of course, but having missed it, I’m thrilled that at least I get to be alive for it and blog about it. Huge congratulations to the authors!

Addendum: What’s this about?

OK, so let’s say you have a Boolean function f:{0,1}n→{0,1}, mapping n input bits to 1 output bit. Some examples are the OR function, which outputs 1 if any of the n input bits are 1, and the MAJORITY function, which outputs 1 if the majority of them are.

Query complexity is the study of how many input bits you need to read in order to learn the value of the output bit. So for example, in evaluating the OR function, if you found a single input bit that was 1, you could stop right there: you’d know that the output was 1, without even needing to look at the remaining bits. In the worst case, however, if the input consisted of all 0s, you’d have to look at all of them before you could be totally sure the output was 0. So we say that the OR function has a deterministic query complexity of n.

In this game, we don’t care about any other resources used by an algorithm, like memory or running time: just how many bits of the input it looks at! There are many reasons why, but the simplest is that, unlike with memory or running time, for many functions we can actually figure out how many input bits need to be looked at, without needing to solve anything like P vs. NP. (But note that this can already be nontrivial! For algorithms can often cleverly avoid looking at all the bits, for example by looking at some and then deciding which ones to look at next based on which values they see.)

In general, given a deterministic algorithm A and an n-bit input string x, let DA,x (an integer from 0 to n) be the number of bits of x that A examines when you run it. Then let DA be the maximum of DA,x over all n-bit strings x. Then D(f), or the deterministic query complexity of f, is the minimum of DA, over all algorithms A that correctly evaluate f(x) on every input x.

For example, D(OR) and D(MAJORITY) are both n: in the worst case, you need to read everything. For a more interesting example, consider the 3-bit Boolean function

f(x,y,z) = (not(x) and y) or (x and z).

This function has D(f)=2, even though it depends on all 3 of the input bits. (Do you see why?) In general, even if f depends on n input bits, D(f) could be as small as log2n.
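For tiny n, the minimax definition of D(f) can be turned directly into a brute-force computation: a subproblem costs 0 queries if the function is constant on the remaining inputs, and otherwise the best algorithm queries whichever bit minimizes the worse of the two resulting subproblems. A rough sketch (exponential time, fine for n=3):

```python
def D(f, n):
    """Deterministic query complexity of f: list of n bits -> 0/1, by minimax recursion."""
    def rec(fixed):  # fixed: {position: bit} for the bits queried so far
        free = [i for i in range(n) if i not in fixed]
        values = set()
        for m in range(2 ** len(free)):  # all completions of the free bits
            x = dict(fixed)
            for k, i in enumerate(free):
                x[i] = (m >> k) & 1
            values.add(f([x[i] for i in range(n)]))
        if len(values) == 1:  # f is already determined: no more queries needed
            return 0
        # query the bit whose worse answer leaves the easiest subproblem
        return 1 + min(max(rec({**fixed, i: b}) for b in (0, 1)) for i in free)
    return rec({})

OR3 = lambda x: int(any(x))
MAJ3 = lambda x: int(sum(x) >= 2)
g = lambda x: ((1 - x[0]) & x[1]) | (x[0] & x[2])  # the 3-bit example above

print(D(OR3, 3), D(MAJ3, 3), D(g, 3))  # -> 3 3 2
```

(The recursion confirms the answer to the puzzle: for g, query x first, and the function collapses to either y or z, so one more query suffices.)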

The bounded-error randomized query complexity, or R(f), is like D(f), except that now we allow the algorithm to make random choices of which input bit to query, and for each input x, the algorithm only needs to compute f(x) with probability 2/3. (Here the choice of 2/3 is arbitrary; if you wanted the right answer with some larger constant probability, say 99.9%, you could just repeat the algorithm a constant number of times and take a majority vote.) The zero-error randomized query complexity, or R0(f), is the variant where the algorithm is allowed to make random choices, but at the end of the day, needs to output the correct f(x) with probability 1.

To illustrate these concepts, consider the three-bit majority function, MAJ(x,y,z). We have D(MAJ)=3, since if a deterministic algorithm queried one bit and got a 0 and queried a second bit and got a 1 (as can happen), it would have no choice but to query the third bit. But for any possible setting of x, y, and z, if we choose which bits to query randomly, there’s at least a 1/3 chance that the first two queries will return either two 0s or two 1s—at which point we can stop, with no need to query the third bit. Hence R0(MAJ) ≤ (1/3)·2 + (2/3)·3 = 8/3 (in fact it equals 8/3, although we haven’t quite shown that). Meanwhile, R(MAJ), as we defined it, is only 1, since if you just need a 2/3 probability of being correct, you can simply pick x, y, or z at random and output it!
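A quick Monte Carlo check (my own sketch, not from the post) of the randomized strategy, run on a worst-case input with two equal bits and one different, reproduces the 8/3 ≈ 2.667 figure:

```python
import random

def maj_queries(bits, rng):
    """Query two random distinct bits; stop if they agree, else query the third."""
    i, j = rng.sample(range(3), 2)
    return 2 if bits[i] == bits[j] else 3

rng = random.Random(0)
trials = 200_000
avg = sum(maj_queries((0, 0, 1), rng) for _ in range(trials)) / trials
print(avg)  # close to 8/3, about 2.667
```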

The bounded-error quantum query complexity, or Q(f), is the minimum number of queries made by a quantum algorithm for f, which, again, has to output the right answer with probability at least 2/3 for every input x. Here a quantum algorithm makes a “query” by feeding a superposition of basis states, each of the form |i,a,w〉, to a “black box,” which maps each basis state to |i, a XOR xi, w〉, where i is the index of the input bit xi to be queried, a is a 1-qubit “answer register” into which xi is reversibly written, and w is a “workspace” that doesn’t participate in the query. In between two queries, the algorithm can apply any unitary transformation it wants to the superposition of |i,a,w〉’s, as long as it doesn’t depend on x. Finally, some designated qubit is measured to decide whether the algorithm accepts or rejects.

As an example, consider the 2-bit XOR function, XOR(x,y). We have D(XOR)=R0(XOR)=R(XOR)=2, since until you’ve queried both bits, you’ve learned nothing about their XOR. By contrast, Q(XOR)=1, because of the famous Deutsch-Jozsa algorithm.
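To see how a single quantum query suffices, here is a small NumPy simulation (my own sketch, not from the post) of the algorithm in the query model just described, with the state space reduced to the index qubit and the answer qubit (workspace omitted):

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
I2 = np.eye(2)

def oracle(x):
    """The black box |i, a> -> |i, a XOR x_i>, as a 4x4 permutation matrix."""
    U = np.zeros((4, 4))
    for i in (0, 1):
        for a in (0, 1):
            U[2 * i + (a ^ x[i]), 2 * i + a] = 1
    return U

def deutsch_xor(x):
    state = np.zeros(4)
    state[1] = 1.0                      # start in |i=0, a=1>
    state = np.kron(H, H) @ state       # superpose index; set up phase kickback
    state = oracle(x) @ state           # ONE query
    state = np.kron(H, I2) @ state      # interfere the index qubit
    p1 = state[2] ** 2 + state[3] ** 2  # probability of measuring i=1
    return int(round(p1))

print([deutsch_xor([a, b]) for a in (0, 1) for b in (0, 1)])  # [0, 1, 1, 0]
```

The answer register in state (|0⟩−|1⟩)/√2 converts the query into a phase (−1)^{x_i} on the index, and one final Hadamard reads out x_0 XOR x_1 with certainty.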

It’s clear that

0 ≤ Q(f) ≤ R(f) ≤ R0(f) ≤ D(f) ≤ n,

since a quantum algorithm can simulate a randomized one and a randomized one can simulate a deterministic one.

A central question for the field, since these measures were studied in the 1980s or so, has been how far apart these measures can get from each other. If you allow partial Boolean functions—meaning that only some n-bit strings, not all of them, are “valid inputs” for which the algorithm needs to return a definite answer—then it’s easy to get enormous separations between any two of the measures (indeed, even bigger than exponential), as for example in my recent paper with Andris.

For total functions, by contrast, it’s been known for a long time that these measures can differ by at most polynomial factors:

D(f) = O(R(f)^3) (Nisan)

D(f) = O(R0(f)^2) (folklore, I think)

R0(f) = O(R(f)^2 log(n)) (Midrijanis)

D(f) = O(Q(f)^6) (Beals et al. 1998)

OK, so what were the largest known gaps? For D versus R0 (as well as D versus R), the largest known gap since 1986 has come from the “recursive AND/OR tree”: that is, an OR of two ANDs of two ORs of two ANDs of … forming a complete binary tree of depth d, with the n=2^d input variables comprising the leaves. For this function, we have D(f)=n, whereas Saks and Wigderson showed that R0(f)=Θ(n^0.753) (and later, Santha showed that R(f)=Θ(n^0.753) as well).

For D versus Q, the largest gap has been for the OR function: we have D(OR)=n (as mentioned earlier), but Q(OR)=Θ(√n) because of Grover’s algorithm. Finally, for R0 versus R, no asymptotic gap has been known for any total function. (This is a problem that I clearly remember working on back in 2000, when I was an undergrad. I even wrote a computer program, the Boolean Function Wizard, partly to search for separations between R0 and R. Alas, while I did find one or two functions with separations, I was unable to conclude anything from them about asymptotics.)
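Grover's square-root speedup for OR is easy to watch in simulation. Here is an amplitude-level sketch (my own, not from the post) of searching for a single 1 among N bits: after about (π/4)√N oracle-plus-diffusion rounds, nearly all the probability sits on the marked index:

```python
import numpy as np

def grover_search(x):
    """Amplitude-level Grover iteration on bit string x (assumes exactly one 1)."""
    N = len(x)
    marked = np.array(x, dtype=float)
    amp = np.full(N, 1 / np.sqrt(N))          # uniform superposition
    for _ in range(int(np.floor(np.pi / 4 * np.sqrt(N)))):
        amp *= 1 - 2 * marked                 # oracle: flip sign of marked amplitude
        amp = 2 * amp.mean() - amp            # diffusion: invert about the mean
    return int(np.argmax(amp ** 2))           # most likely measurement outcome

x = [0] * 64
x[37] = 1
print(grover_search(x))  # finds index 37 with ~6 queries instead of ~64
```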

So, how did Ambainis et al. achieve bigger gaps for each of these? I’ll try to have an explanation written by the time my flight from Portland to Boston has landed tonight. But if you can’t wait for that, or you prefer it straight from the horse’s mouth, read their paper!

Addendum 2: The Actual Functions

As I mentioned before, the starting point for everything Ambainis et al. do is a certain Boolean function g recently constructed by Göös, Pitassi, and Watson (henceforth GPW), for different purposes than the ones that concern Ambainis et al. We think of the inputs to g as divided into nm “cells,” which are arranged in a rectangular grid with m columns and n rows. Each cell contains a bit that’s either 0 or 1 (its “label”), as well as a pointer to another cell (consisting of ~log2(nm) bits). The pointer can also be “null” (i.e., can point nowhere). We’ll imagine that a query of a cell gives you everything: the label and all the bits of the pointer. This could increase the query complexity of an algorithm, but only by a log(n) factor, which we won’t worry about.

Let X be a setting of all the labels and pointers in the grid. Then the question we ask about X is the following:

    Does there exist a “marked column”: that is, a column where all n of the labels are 1, and which has exactly one non-null pointer, which begins a chain of pointers of length m-1, which visits exactly one “0” cell in each column other than the marked column, and then terminates at a null pointer?

If such a marked column exists, then we set g(X)=1; otherwise we set g(X)=0. Crucially, notice that if a marked column exists, then it’s unique, since the chain of pointers “zeroes out” all m-1 of the other columns, and prevents them from being marked.
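In code, the GPW function is easy to state even if it's hard to evaluate with few queries. Here is a hedged Python sketch of g (my own reading of the definition above; the representation of cells and pointers is invented for illustration):

```python
def g(grid):
    """grid[c][r] = (label, ptr): label in {0, 1}, ptr = (col, row) or None.
    Returns 1 iff some column is 'marked' in the GPW sense."""
    m = len(grid)
    for c in range(m):
        column = grid[c]
        if any(label != 1 for label, _ in column):
            continue                       # a marked column is all 1s...
        starts = [ptr for _, ptr in column if ptr is not None]
        if len(starts) != 1:
            continue                       # ...with exactly one non-null pointer
        visited, cell, ok = set(), starts[0], True
        while cell is not None:            # follow the chain of pointers
            cc, rr = cell
            label, cell = grid[cc][rr]
            if cc == c or cc in visited or label != 0:
                ok = False                 # must hit one fresh '0' cell per column
                break
            visited.add(cc)
        if ok and len(visited) == m - 1:   # chain certifies every other column
            return 1
    return 0
```

For instance, with m = 2 columns of n = 2 cells each, `g([[(1, (1, 0)), (1, None)], [(0, None), (1, None)]])` returns 1: column 0 is all 1s and its single pointer reaches a ‘0’ cell in column 1, whose pointer is null.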

This g already leads to a new query complexity separation, one that refutes a strengthened form of the Saks-Wigderson conjecture. For it’s not hard to see that D(g)=Ω(mn): indeed, any deterministic algorithm must query almost all of the cells. A variant of this is proved in the paper, but the basic idea is that an adversary can answer all queries with giant fields of ‘1’ labels and null pointers—until a given column is almost completed, at which point the adversary fills in the last cell with a ‘0’ label and a pointer to the last ‘0’ cell that it filled in. The algorithm just can’t catch a break; it will need to fill in m-1 columns before it knows where the marked one is (if a marked column exists at all).

By contrast, it’s possible to show that, if n=m, then R(g) is about O(n^{4/3}). I had an argument for R(g)=O((n+m)log(m)) in an earlier version of this post, but the argument was wrong; I thank Alexander Belov for catching the error. I’ll post the R(g)=O(n^{4/3}) argument once I understand it.

To get the other separations—for example, total Boolean functions for which D~R0^2, D~Q^4, R0~Q^3, R0~R^{3/2}, and R~approxdeg^4—Ambainis et al. need to add various “enhancements” to the basic GPW function g defined above. There are three enhancements, which can either be added individually or combined, depending on one’s needs.

1. Instead of just a single marked column, we can define g(X) to be 1 if and only if there are k marked columns, which point to each other in a cycle, and which also point to a trail of m-k ‘0’ cells, showing that none of the other columns contain all ‘1’ cells. This can help a bounded-error randomized algorithm—which can quickly find one of the all-1 columns using random sampling—while not much helping a zero-error randomized algorithm.

2. Instead of a linear chain of pointers showing that all the non-marked columns contain a ‘0’ cell, for g(X) to be 1 we can demand a complete binary tree of pointers, originating at a marked column and fanning out to all the unmarked columns in only log(m) layers. This can substantially help a quantum algorithm, which can’t follow a pointer trail any faster than a classical algorithm can; but which, given a complete binary tree, can “fan out” and run Grover’s algorithm on all the leaves in only the square root of the number of queries that would be needed classically. Meanwhile, however, putting the pointers in a tree doesn’t much help deterministic or randomized algorithms.

3. In addition to pointers “fanning out” from a marked column to all of the unmarked columns, we can demand that in every unmarked column, some ‘0’ cell contains a back-pointer, which leads back to a marked column. These back-pointers can help a randomized or quantum algorithm find a marked column faster, while not much helping a deterministic algorithm.

Unless I’m mistaken, the situation is this:

With no enhancements, you can get D~R^2 and something like D~R0^{3/2} (although I still don’t understand how you get the latter with no enhancements; the paper mentions it without proof, but Andris has kindly supplied a proof here).

With only the cycle enhancement, you can get R0~R^{3/2}.

With only the binary tree enhancement, you can get R~approxdeg^4.

With only the back-pointer enhancement, you can get D~R0^2.

With the cycle enhancement and the binary-tree enhancement, you can get R0~Q^3.

With the back-pointer enhancement and the binary-tree enhancement, you can get D~Q^4.

It’s an interesting question whether there are separations that require both the cycle enhancement and the back-pointer enhancement; Ambainis et al. don’t give any examples.

And here’s another interesting question not mentioned in the paper. Using the binary-tree enhancement, Ambainis et al. achieve a fourth-power separation between bounded-error randomized query complexity and approximate degree as a real polynomial—i.e., quadratically better than any separation that was known before. Their proof of this involves cleverly constructing a low-degree polynomial by summing a bunch of low-degree polynomials derived from quantum algorithms (one for each possible marked column). As a result, their final, summed polynomial does not itself correspond to a quantum algorithm, meaning that they don’t get a fourth-power separation between R and Q (which would’ve been even more spectacular than what they do get). On the other hand, purely from the existence of a function with R~approxdeg^4, we can deduce that that function has either

(i) a super-quadratic gap between R and Q (refuting my conjecture that the Grover speedup is the best possible quantum speedup for total Boolean functions), or
(ii) a quadratic gap between quantum query complexity and approximate degree—substantially improving over the gap found by Ambainis in 2003.

I conjecture that the truth is (ii); it would be great to have a proof or disproof of this.

John PreskillHolography and the MERA

The AdS/MERA correspondence has been making the rounds of the blogosphere with nice posts by Scott Aaronson and Sean Carroll, so let’s take a look at the topic here at Quantum Frontiers.

The question of how to formulate a quantum theory of gravity is a long-standing open problem in theoretical physics. Somewhat recently, an idea that has gained a lot of traction (and that Spiros has blogged about before) is emergence. This is the idea that space and time may emerge from some more fine-grained quantum objects and their interactions. If we could understand how classical spacetime emerges from an underlying quantum system, then it’s not too much of a stretch to hope that this understanding would give us insight into the full quantum nature of spacetime.

One type of emergence is exhibited in holography, which is the idea that certain (D+1)-dimensional systems with gravity are exactly equivalent to D-dimensional quantum theories without gravity. (Note that we’re calling time a dimension here. For example, you would say that on a day-to-day basis we experience D = 4 dimensions.) In this case, that extra +1 dimension and the concomitant gravitational dynamics are emergent phenomena.

A nice aspect of holography is that it is explicitly realized by the AdS/CFT correspondence. This correspondence proposes that a particular class of spacetimes—ones that asymptotically look like anti-de Sitter space, or AdS—are equivalent to states of a particular type of quantum system—a conformal field theory, or CFT. A convenient visualization is to draw the AdS spacetime as a cylinder, where time marches forward as you move up the cylinder and different slices of the cylinder correspond to snapshots of space at different instants of time. Conveniently, in this picture you can think of the corresponding CFT as living on the boundary of the cylinder, which, you should note, has one less dimension than the “bulk” inside the cylinder.


Even within this nice picture of holography that we get from the AdS/CFT correspondence, there is a question of exactly how CFT (i.e., boundary) quantities map onto quantities in the AdS bulk. This is where a certain tool from quantum information theory called tensor networks has recently shown a lot of promise.

A tensor network is a way to efficiently represent certain states of a quantum system. Moreover, they have nice graphical representations which look something like this:


Beni discussed one type of tensor network in his post on holographic codes. In this post, let’s discuss the tensor network shown above, which is known as the Multiscale Entanglement Renormalization Ansatz, or MERA.

The MERA was initially developed by Guifre Vidal and Glen Evenbly as an efficient approximation to the ground state of a CFT. Roughly speaking, in the picture of a MERA above, one starts with a simple state at the centre, and as you move outward through the network, the MERA tells you how to build up a CFT state which lives on the legs at the boundary. The MERA caught the eye of Brian Swingle, who noticed that it looks an awful lot like a discretization of a slice of the AdS cylinder shown above. As such, it wasn’t a preposterously big leap to suggest a possible “AdS/MERA correspondence.” Namely, perhaps it’s more than a simple coincidence that a MERA both encodes a CFT state and resembles a slice of AdS. Perhaps the MERA gives us the tools that are required to construct a map between the boundary and the bulk!

So, how seriously should one take the possibility of an AdS/MERA correspondence? That’s the question that my colleagues and I addressed in a recent paper. Essentially, there are several properties that a consistent holographic theory should satisfy in both the bulk and the boundary. We asked whether these properties are still simultaneously satisfied in a correspondence where the bulk and boundary are related by a MERA.

What we found was that you invariably run into inconsistencies between bulk and boundary physics, at least in the simplest construals of what an AdS/MERA correspondence might be. This doesn’t mean that there is no hope for an AdS/MERA correspondence. Rather, it says that the simplest approach will not work. For a good correspondence, you would need to augment the MERA with some additional structure, or perhaps consider different tensor networks altogether. For instance, the holographic code features a tensor network which hints at a possible bulk/boundary correspondence, and the consistency conditions that we proposed are a good list of checks for Beni and company as they work out the extent to which the code can describe holographic CFTs. Indeed, a good way to summarize how our work fits into the picture of quantum gravity alongside holography and tensor networks is by saying that it’s nice to have good signposts on the road when you don’t have a map.

Mark Chu-CarrollTruth in Type Theory

Now, we’re getting to the heart of type theory: judgements. Judgements are the basic logical statements that we want to be able to prove about computations in type theory. There’s a basic set of judgements that we want to be able to make.

I keep harping on this, but it’s the heart of type theory: type theory is all about understanding logical statements as specifications of computations. Or, more precisely, in computer science terms, they’re about understanding true logical statements as halting computations.

In this post, we’ll see the ultimate definition of truth in type theory: every logical proposition is a set, and the proposition is true if the set has any members. A non-halting computation is a false statement – because you can never get it to resolve an expression to a canonical value.

So remember as we go through this: judgements are based on the idea of logical statements as specifications of computations. So when we talk about a predicate P, we’re using its interpretation as a specification of a computation. When we look at an expression 3+5, we understand it not as a piece of notation that describes the number 8, but as a description of a computation that adds 3 to 5. “3+5” is not the same computation as “2*4” or “2+2+2+2”, but as we’ll see, they’re equal because they evaluate to the same thing – that is, they each perform a computation that results in the same canonical value – the number 8.
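A rough programming analogy (my own, not MarkCC's): expressions are syntax trees describing computations, and two distinct trees count as equal members of a set exactly when they evaluate to the same canonical value:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Add:
    left: object
    right: object

@dataclass(frozen=True)
class Mul:
    left: object
    right: object

def evaluate(e):
    """Run the computation an expression describes, down to a canonical value."""
    if isinstance(e, int):
        return e
    if isinstance(e, Add):
        return evaluate(e.left) + evaluate(e.right)
    if isinstance(e, Mul):
        return evaluate(e.left) * evaluate(e.right)
    raise TypeError(f"not an expression: {e!r}")

def equal_in_N(a, b):
    """'a == b in N': a and b evaluate to the same canonical member of N."""
    return evaluate(a) == evaluate(b)

print(Add(3, 5) == Mul(2, 4))            # False: different computations
print(equal_in_N(Add(3, 5), Mul(2, 4)))  # True: same canonical value, 8
```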

In this post, we’re going to focus on a collection of canonical atomic judgement types:

A \text{set}
This judgement says that A is a set.
A = B
A and B are equal sets.
a \in A
a is an element of the set A.
a_1 == a_2 \in A
a_1 and a_2 are equal members of the set A.
A \text{prop}
A is a proposition.
A \text{true}
The proposition A is true.

The definition of the meanings of judgements is, necessarily, mutually recursive, so we’ll have to go through all of them before any of them is complete.

An object A is a Set
When I say that A is a set in type theory, that means that:

  • I know the rules for how to form the canonical expressions for the set;
  • I’ve got an equivalence relation that says when two canonical members of the set are equal.
Two Sets are Equal

When I say that A and B are equal sets, that means:

  • A and B are both sets.
  • If a is a canonical member of A, then a is also a canonical member of B, and vice versa.
  • If a and b are canonical members of A, and they’re also equal in A, then a and b are also equal canonical members of B (and vice versa).

The only tricky thing about the definition of this judgement is the fact that it defines equality as a property of a set, not of the elements of the set. By this definition, it’s possible for two expressions to be members of two different sets, and to be equal in one, but not equal in the other. (We’ll see in a bit that this possibility gets eliminated, but this stage of the definition leaves it open!)

An object is a member of a set
When I say a \in A, that means that if I evaluate a, the result will be a canonical member of A.
Two members of a set are equal
If a \in A and b \in A, then a == b \in A means that when you evaluate a, you’ll get a canonical expression a'; and when you evaluate b, you’ll get a canonical expression b'. For a == b to be true, we must have a' == b': that is, the canonical expressions resulting from their evaluation must also be equal.

This nails down the problem back in set equivalence: since membership equivalence is defined in terms of evaluation as canonical values, and every expression evaluates to exactly one canonical expression (that’s the definition of canonical!), then if two objects are equal in a set, they’re equal in all sets that they’re members of.

An object A is a proposition
Here’s where type theory really starts to depart from the kind of math that we’re used to. In type theory, a proposition is a set. That’s it: to be a proposition, A has to be a set.
The proposition A is true
And the real meat of everything so far: if we have a proposition A, and A is true, what that means is that A has at least one element. If a proposition is a non-empty set, then it’s true. If it’s empty, it’s false.

Truth in type theory really comes down to membership in a set. This is, subtly, different from the predicate logic that we’re familiar with. In predicate logic, a quantified proposition can, ultimately, be reduced to a set of values, but it’s entirely reasonable for that set to be empty. I can, for example, write a logical statement that “All dogs with IQs greater than 180 will turn themselves into cats.” A set-based interpretation of that is the collection of objects for which it’s true. There aren’t any, because there aren’t any dogs with IQs greater than 180. It’s empty. But logically, in terms of predicate logic, I can still say that it’s “true”. In type theory, I can’t: to be true, the set has to have at least one value, and it doesn’t.
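Here is a toy rendering of that difference (my own analogy, not part of the post): model the proposition as the set of its witnesses, so truth is non-emptiness, and the high-IQ-dogs statement comes out false even though its predicate-logic universal reading is vacuously true:

```python
dogs = [{"name": "Rex", "iq": 55}, {"name": "Lassie", "iq": 90}]

def turns_into_cat(dog):
    return False  # no dog has ever turned itself into a cat

# Type-theoretic reading: the proposition IS the set of its witnesses.
witnesses = {d["name"] for d in dogs if d["iq"] > 180}
type_theory_true = len(witnesses) > 0      # False: the set is empty

# Predicate-logic reading: a universal over an empty domain is vacuously true.
predicate_logic_true = all(turns_into_cat(d) for d in dogs if d["iq"] > 180)

print(type_theory_true, predicate_logic_true)  # False True
```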

In the next post, we’ll take these atomic judgements, and start to expand them into the idea of hypothetical judgements. In the type-theory sense, that means statements require some other set of prior judgements before they can be judged. In perhaps more familiar terms, we’ll be looking at type theory’s equivalent of the contexts in classical sequent calculus – those funny little \Gammas that show up in all of the sequent rules.

Jordan EllenbergAlexandra Florea on the average central value of hyperelliptic L-functions

Alexandra Florea, a student of Soundararajan, has a nice new paper up, which I heard about in a talk by Michael Rubinstein.  She computes the average of

L(1/2, \chi_f)

as f ranges over squarefree polynomials of large degree.  If this were the value at 1 instead of the value at 1/2, this would be asking for the average number of points on the Jacobian of a hyperelliptic curve, and I could at least have some idea of where to start (probably with this paper of Erman and Wood.)  And I guess you could probably get a good grasp on moments by imitating Granville-Soundararajan?

But I came here to talk about Florea’s result.  What’s cool about it is that it has a main term that matches existing conjectures in the number field case, but there is a second main term, whose size is about the cube root of the main term, before you get to fluctuations!

The only similar case I know is Roberts’ conjecture, now a theorem of Bhargava-Shankar-Tsimerman and Thorne-Taniguchi, which finds a similar secondary main term in the asymptotic for counting cubic fields.  And when I say similar I really mean similar — e.g. in both cases the coefficient of the secondary term is some messy thing involving zeta functions evaluated at third-integers.

My student Yongqiang Zhao found a lovely geometric interpretation for the secondary term in the Roberts conjecture.  Is there some way to see what Florea’s secondary term “means” geometrically?  Of course I’m stymied here by the fact that I don’t really know how to think about her counting problem geometrically in the first place.


June 25, 2015

Clifford JohnsonSpeed Dating for Science!

Last night was amusing. I was at the YouTubeLA space with 6 other scientists from various fields, engaging with an audience of writers and other creators for YouTube, TV, film, etc. It was an event hosted by the Science and Entertainment Exchange and Youtube/Google, and the idea was that we each had seven minutes to present in seven successive rooms with different audiences in each, so changing rooms each seven minutes. Of course, early on during the planning conference call for the event, one of the scientists asked why it was not more efficient to simply have one large [...] Click to continue reading this post

Chad OrzelMy Week in Waterloo

I spent the last few days in Ontario, attending the Convergence meeting at the Perimeter Institute. This brought a bunch of Perimeter alumni and other big names together for a series of talks and discussions about the current state and future course of physics.

My role at this was basically to impersonate a journalist, and I had a MEDIA credential to prove it. I did a series of posts at Forbes about different aspects of the meeting:

The Laser Cavity was Flooded: a revisiting of the idea of True Lab Stories, which was a loose series of funny disaster tales from the early days of ScienceBlogs.

Converging on the Structure of Physics: Talks from the first day fitting a loose theme of looking for underlying structure.

All Known Physics in One Meeting: The second day of talk covered an impressive range, from subatomic particles to cosmological distances.

Making Lampposts to Look for New Physics: Tying the closing panel discussion to an earlier metaphor about searching “where the light is” for exotic phenomena.

As I said, I’ll have some more follow-up next week, picking up and running with a few asides or themes that came up at the meeting. For the moment, though, I’m pretty wiped out, having put in almost 780 miles of driving over the last four days, and staying up late hanging out with science writers and theoretical physicists. So this summary post will have to hold you…

Tommaso DorigoEarly-Stage Researcher Positions To Open Soon

The Marie-Curie network I am coordinating, AMVA4NewPhysics, is going to start very soon, and with its start several things are going to happen. One you should not be concerned with is the arrival of the first tranche of the 2.4M euros that the European Research Council has granted us. Something more interesting to you, if you have a degree in Physics or Statistics, is the fact that the network will soon start hiring ten skilled post-lauream researchers across Europe, with the aim of providing them with an exceptional plan of advanced training in particle physics, data analysis, statistics, machine learning, and more.


David Hoggradical self-calibration

At group meeting, Fadely showed us plots that show that he can do what I call “radical” self-calibration with realistic (simulated) data from fields of stars. This is the kind of calibration where we figure out the flat-field and PSF simultaneously by insisting that the images we have could have been generated by point sources convolved with some pixel-convolved PSF. He also showed how the results degrade as our knowledge of the PSF gets wrong. We can withstand percent-ish problems with our PSF model, but we can't withstand tens-of-percent. That's interesting, and useful. I feel like we are pretty safe for our HST WFC3 calibration project though: We know the PSF very well and have a great first guess at the flat too.

At the same meeting, we bitched about the Astronomers' Telegram, looked at an outburst from a black-hole source, argued about mapping the sky with Fermi GBM, and looked at K2 data on a Sanchis-Ojeida planet. Oh and right after group meeting, Malz demonstrated to me conclusively that our Bayesian hierarchical inference of the redshift distribution—given probabilistic photometric redshifts—will work!

Scott Aaronson“Can Quantum Computing Reveal the True Meaning of Quantum Mechanics?”

I now have a 3500-word post on that question up at NOVA’s “Nature of Reality” blog.  If you’ve been reading Shtetl-Optimized religiously for the past decade (why?), there won’t be much new to you there, but if not, well, I hope you like it!  Comments are welcome, either here or there.  Thanks so much to Kate Becker at NOVA for commissioning this piece, and for her help editing it.

BackreactionDoes faster-than-light travel lead to a grandfather paradox?

Whatever you do, don’t f*ck with mom.
Fast track to wisdom: Not necessarily.

I stopped going to church around the same time I started reading science fiction. Because who really needs god if you can instead believe in alien civilizations, wormholes, and cell rejuvenation? Oh, yes, I wanted to leave behind this planet for a better place. But my space travel enthusiasm suffered significantly once I moved from the library’s fiction aisle to popular science, and learned that the speed of light is the absolute limit. For all we know. And ever since I have of course wondered just how well we know this.

Fact is, we’ve never seen anything move faster than the speed of light (except for illusions of motion), and it is both theoretically understood and experimentally confirmed that we cannot accelerate anything to become faster than light. That doesn’t sound good as far as our chances of visiting the aliens are concerned, but it isn’t the main problem. It could just be that we haven’t looked in the right places or not tried hard enough. No, the main problem is that it is very hard to make sense of faster-than-light travel at all within the context of our existing theories. And if you can’t make sense of it, how can you build it?

Special relativity doesn’t forbid motion faster than light. It just tells you that you’d need an infinite amount of energy to accelerate something which is slower than light (“subluminal”) to become faster than light (“superluminal”). Ok, the infinite energy need won’t fly with the environmentalists, I know. But if you have a particle that always moves faster than light, its existence isn’t prohibited in principle. These particles are called “tachyons,” have never been observed, and are believed to not exist for two reasons. First, they have the awkward property of accelerating when they lose energy, which lets them induce instabilities that have to be fixed somehow. (In quantum field theory one can deal with tachyonic fields, and they play an important role, but they don’t actually transmit any information faster than light. So these are not so relevant to our purposes.) Second, tachyons seem to lead to causality problems.

The causality problems with superluminal travel come about as follows. Special relativity is based on the axiom that all observers have the same laws of physics, and these are converted from one observer to another by a well-defined procedure called Lorentz-transformation. This transformation from one observer to the other maintains lightcones, because the speed of light doesn’t change. The locations of objects relative to an observer can change when the observer changes velocity. But two observers at the same location with different velocities who look at an object inside the lightcone will agree on whether it is in the past or in the future.

Not so however with objects outside the lightcone. For these, what is in the future for one observer can be in the past of another observer. This means then that a particle that for one observer moves faster than light – ie to a point outside the lightcone – actually moves backwards in time for another observer! And since in special relativity all observers have equal rights, neither of them is wrong. So once you accept superluminal travel, you are forced to also accept travel back in time.
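The arithmetic behind this is just the Lorentz transformation of the time coordinate, t' = γ(t − vx) in units where c = 1. A quick numerical check (my numbers, chosen purely for illustration) shows the ordering flip for a superluminal signal but not for a subluminal one:

```python
import math

def boosted_time(t, x, v):
    """Time of event (t, x) as seen by an observer moving at speed v (units c = 1)."""
    gamma = 1 / math.sqrt(1 - v ** 2)
    return gamma * (t - v * x)

# A signal leaves (t=0, x=0) at speed 2c and arrives at (t=1, x=2).
print(boosted_time(1.0, 2.0, 0.75))   # negative: arrival precedes departure
# A subluminal trip to (t=1, x=0.5) stays in the future for this observer:
print(boosted_time(1.0, 0.5, 0.75))   # positive
```

The sign of t − vx can flip for some |v| < 1 exactly when x/t > 1, i.e. when the two events are spacelike separated.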

At least that’s what the popular science books said. It’s nonsense of course because what does it mean for a particle to move backwards in time anyway? Nothing really. If you’d see a particle move faster than light to the left, you could as well say it moved backwards in time to the right. The particle doesn’t move in any particular direction on a curve in space-time because the particles’ curves have no orientation. Superluminal particle travel is logically perfectly possible as long as it leads to a consistent story that unfolds in time, and there is nothing preventing such a story.

Take as an example the below image showing the worldline of a particle that is produced, scatters twice to change direction, travels superluminally, and goes back in time to meet itself. You could interpret the very same arrangement as saying you have produced a pair of particles, one of which scatters and then annihilates again.

No, there is no problem with the travel of superluminal particles in principle. The problems start once we think of macroscopic objects, like spaceships. We attach to their curves an arrow of time, pointing into the direction in which the travelers age. And it’s here where the trouble starts. Now special relativity indeed tells you that somebody who travels faster than light will move backwards in time for another observer, because a change of reference frame will not reverse the travelers’ arrow of time. This is what creates the grandfather paradox, in which you can travel back in time to kill your own grandfather, resulting in you never being born. Here, requiring consistency would necessitate that it is somehow impossible for you to kill your grandfather, and it is hard to see how this would be ensured by the laws of physics.

While it’s hard to see what conspiracy would prevent you from killing your grandpa, it is fairly easy to see that closing the loop backwards in time is prevented by the known laws of physics. We age because entropy increases. It increases in some direction that we can, for lack of a better word, call “forward” in time. This entropy increase is ultimately correlated with decoherence and thus probably also with the restframe of the microwave background, but for our purposes it doesn’t matter so much exactly in which direction it increases, just that it increases in some direction.

Now whenever you have a closed curve that is oriented in the direction in which the traveler presumably experiences the passage of time, the arrow of time on the curve must necessarily run against the increase of entropy somewhere. Any propulsion system able to do this would have to decrease entropy against the universe’s tendency to increase it. And that’s what ultimately prevents time travel. In the image below I have drawn the same worldline as above with an intrinsic arrow of time (the direction in which passengers age), and how it is necessarily incompatible with any existing arrow of time along one of the curves, which is thus forbidden.

There is no propulsion system that would be able to produce the necessary fine-tuning to decrease entropy along the route. But even if such a propulsion system existed, it would just mean that time in the spaceship now runs backwards. In other words, the passengers wouldn’t actually experience moving backwards in time, but instead moving forwards in time in the opposite direction. This would force us to buy into an instance of grandfather pair creation, later followed by grandchild pair annihilation. It doesn’t seem very plausible, and it violates energy conservation, but besides this it’s at least a consistent story.

I briefly elaborated on this in a paper I wrote some years ago, as a side note (see page 6). But just last month there was a longer paper on the arxiv, by Nemiroff and Russell, that studied the problems with superluminal travel in a very concrete scenario. In their example, a spaceship leaves Earth, visits an exoplanet that moves with some velocity relative to Earth, and then returns. The velocity of the spaceship at both launches is the same relative to the planet from which the ship launches, which means it’s a different velocity on the return trip.

The authors then calculate explicitly at which velocity the curves start going back in time. They arrive at the conclusion that the necessity of a consistent time evolution for the Earth observer would then require interpreting the closed loop in time as a pair creation event, followed by a later pair annihilation, much as I argued above. Note that singling out the Earth observer as the one demanding consistency with their arrow of time is in this case what introduces a preferred frame relative to which “forward in time” is defined.
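The kinematics behind “curves going back in time” is just the Lorentz transformation. Here is a minimal Python sketch (my own illustrative numbers, not the authors’ actual calculation): for a signal of speed u in some frame, an observer boosted by velocity v measures an elapsed time proportional to (1 − uv/c²), which flips sign exactly when uv exceeds c².

```python
# Minimal sketch (my own numbers, not Nemiroff & Russell's calculation).
# Work in units where c = 1.  A signal of speed u covers dx = u*dt in
# frame S; a Lorentz boost by velocity v gives the elapsed time
#   dt' = gamma * (dt - v*dx) = gamma * dt * (1 - u*v),
# which turns negative exactly when u*v > 1, i.e. u*v > c^2.
import math

def elapsed_time_in_moving_frame(u, v, dt=1.0):
    """Emission-to-arrival time of a speed-u signal, as seen from a frame
    moving at speed v (|v| < 1).  Negative means 'arrives before it left'."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * dt * (1.0 - u * v)

# A subluminal signal takes positive time in every allowed frame.
assert elapsed_time_in_moving_frame(u=0.5, v=0.9) > 0

# A u = 2c signal looks ordinary to slow observers...
assert elapsed_time_in_moving_frame(u=2.0, v=0.1) > 0
# ...but runs backwards in time once u*v > c^2, here for v > 0.5.
assert elapsed_time_in_moving_frame(u=2.0, v=0.9) < 0
```

Consistency with one chosen arrow of time, as for the Earth observer above, is what turns the sign flip into a pair-creation reinterpretation rather than a paradox.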

The relevant point to take away from this is that superluminal travel in and of itself is not inconsistent. Leaving aside the stability problems with superluminal particles, they do not lead to causal paradoxes. What leads to causal paradoxes is allowing travel against the arrow of time which we, for better or worse, experience. This means that superluminal travel is possible in principle, even though travel backwards in time is not.

That travel faster than light is not prevented by the existing laws of nature doesn’t mean, of course, that it’s achievable. There is also still the minor problem that nobody has the faintest clue how to do it... Maybe it’s easier to wait for the aliens to come visit us.

June 24, 2015

Sean Carroll: Algebra of the Infrared

In my senior year of college, when I was beginning to think seriously about graduate school, a magical article appeared in the New York Times magazine. Called “A Theory of Everything,” by KC Cole, it conveyed the immense excitement that had built in the theoretical physics community behind an idea that had suddenly exploded in popularity after burbling beneath the surface for a number of years: a little thing called “superstring theory.” The human-interest hook for the story was simple — work on string theory was being led by a brilliant 36-year-old genius, a guy named Ed Witten. It was enough to cement Princeton as the place I most wanted to go to for graduate school. (In the end, they didn’t let me in.)

Nearly thirty years later, Witten is still going strong. As evidence, check out this paper that recently appeared on the arxiv, with co-authors Davide Gaiotto and Greg Moore:

Algebra of the Infrared: String Field Theoretic Structures in Massive N=(2,2) Field Theory In Two Dimensions
Davide Gaiotto, Gregory W. Moore, Edward Witten

We introduce a “web-based formalism” for describing the category of half-supersymmetric boundary conditions in 1+1 dimensional massive field theories with N=(2,2) supersymmetry and unbroken U(1)R symmetry. We show that the category can be completely constructed from data available in the far infrared, namely, the vacua, the central charges of soliton sectors, and the spaces of soliton states on ℝ, together with certain “interaction and boundary emission amplitudes”. These amplitudes are shown to satisfy a system of algebraic constraints related to the theory of A∞ and L∞ algebras. The web-based formalism also gives a method of finding the BPS states for the theory on a half-line and on an interval. We investigate half-supersymmetric interfaces between theories and show that they have, in a certain sense, an associative “operator product.” We derive a categorification of wall-crossing formulae. The example of Landau-Ginzburg theories is described in depth drawing on ideas from Morse theory, and its interpretation in terms of supersymmetric quantum mechanics. In this context we show that the web-based category is equivalent to a version of the Fukaya-Seidel A∞-category associated to a holomorphic Lefschetz fibration, and we describe unusual local operators that appear in massive Landau-Ginzburg theories. We indicate potential applications to the theory of surface defects in theories of class S and to the gauge-theoretic approach to knot homology.

I cannot, in good conscience, say that I understand very much about this new paper. It’s a kind of mathematical/formal field theory that is pretty far outside my bailiwick. (This is why scientists roll their eyes when a movie “physicist” is able to invent a unified field theory, build a time machine, and construct nanobots that can cure cancer. Specialization is real, folks!)

But there are two things about the paper that I nevertheless can’t help but remarking on. One is that it’s 429 pages long. I mean, damn. That’s a book, not a paper. Scuttlebutt informs me that the authors had to negotiate specially with the arxiv administrators just to upload the beast. Most amusingly, they knew perfectly well that a 400+ page work might come across as a little intimidating, so they wrote a summary paper!

An Introduction To The Web-Based Formalism
Davide Gaiotto, Gregory W. Moore, Edward Witten

This paper summarizes our rather lengthy paper, “Algebra of the Infrared: String Field Theoretic Structures in Massive N=(2,2) Field Theory In Two Dimensions,” and is meant to be an informal, yet detailed, introduction and summary of that larger work.

This short, user-friendly introduction is a mere 45 pages — still longer than 95% of the papers in this field. After a one-paragraph introduction, the first words of the lighthearted summary paper are “Let X be a Kähler manifold, and W : X → C a holomorphic Morse function.” So maybe it’s not that informal.

The second remarkable thing is — hey look, there’s my name! Both of the papers cite one of my old works from when I was a grad student, with Simeon Hellerman and Mark Trodden. (A related paper was written near the same time by Gary Gibbons and Paul Townsend.)

Domain Wall Junctions are 1/4-BPS States
Sean M. Carroll, Simeon Hellerman, Mark Trodden

We study N=1 SUSY theories in four dimensions with multiple discrete vacua, which admit solitonic solutions describing segments of domain walls meeting at one-dimensional junctions. We show that there exist solutions preserving one quarter of the underlying supersymmetry — a single Hermitian supercharge. We derive a BPS bound for the masses of these solutions and construct a solution explicitly in a special case. The relevance to the confining phase of N=1 SUSY Yang-Mills and the M-theory/SYM relationship is discussed.

Simeon, who was a graduate student at UCSB at the time and is now faculty at the Kavli IPMU in Japan, was the driving force behind this paper. Mark and I had recently written a paper on different ways that topological defects could intersect and join together. Simeon, who is an expert in supersymmetry, noticed that there was a natural way to make something like that happen in supersymmetric theories: in particular, domain walls (sheets that stretch through space, separating different possible vacuum states) could intersect at “junctions.” Even better, domain-wall junction configurations would break some of the supersymmetry but not all of it. Setups like that are known as BPS states, and are highly valued and useful to supersymmetry aficionados. In general, solutions to quantum field theories are very difficult to find and characterize with any precision, but the BPS property lets you invoke some of the magic of supersymmetry to prove results that would otherwise be intractable.

Admittedly, the above paragraph is likely to be just as opaque to the person on the street as the Gaiotto/Moore/Witten paper is to me. The point is that we were able to study the behavior of domain walls and how they come together using some simple but elegant techniques in field theory. Think of drawing some configuration of walls as a network of lines in a plane. (All of the configurations we studied were invariant along some “vertical” direction in space, as well as static in time, so all the action happens in a two-dimensional plane.) Then we were able to investigate the set of all possible ways such walls could come together to form allowed solutions. Here’s an example, using walls that separate four different possible vacuum states:


As far as I understand it (remember — not that far!), this is a very baby version of what Gaiotto, Moore, and Witten have done. Like us, they look at a large-distance limit, worrying about how defects come together rather than the detailed profiles of the individual configurations. That’s the “infrared” in their title. Unlike us, they go way farther, down a road known as “categorification” of the solutions. In particular, they use a famous property of BPS states: you can multiply them together to get other BPS states. That’s the “algebra” of their title. To mathematicians, algebras aren’t just ways of “solving for x” in equations that tortured you in high school; they are mathematical structures describing sets of vectors that can be multiplied by each other to produce other vectors. (Complex numbers are an algebra; so are ordinary three-dimensional vectors, using the cross product operation.)

At this point you’re allowed to ask: Why should I care? At least, why should I imagine putting in the work to read a 429-page opus about this stuff? For that matter, why did these smart guys put in the work to write such an opus?

It’s a reasonable question, but there’s also a reasonable answer. In theoretical physics there are a number of puzzles and unanswered questions that we are faced with, from “Why is the mass of the Higgs 125 GeV?” to “How does information escape from black holes?” Really these are all different sub-questions of the big one, “How does Nature work?” By construction, we don’t know the answer to these questions — if we did, we’d move on to other ones. But we don’t even know the right way to go about getting the answers. When Einstein started thinking about fitting gravity into the framework of special relativity, Riemannian geometry was absolutely the last thing on his mind. It’s hard to tell what paths you’ll have to walk down to get to the final answer.

So there are different techniques. Some people will try a direct approach: if you want to know how information comes out of a black hole, think as hard as you can about what happens when black holes radiate. If you want to know why the Higgs mass is what it is, think as hard as you can about the Higgs field and other possible fields we haven’t yet found.

But there’s also a more patient, foundational approach. Quantum field theory is hard; to be honest, we don’t understand it all that well. There’s little question that there’s a lot to be learned by studying the fundamental behavior of quantum field theories in highly idealized contexts, if only to better understand the space of things that can possibly happen with an eye to eventually applying them to the real world. That, I suspect, is the kind of motivation behind a massive undertaking like this. I don’t want to speak for the authors; maybe they just thought the math was cool and had fun learning about these highly unrealistic (but still extremely rich) toy models. But the ultimate goal is to learn some basic wisdom that we will someday put to use in answering that underlying question: How does Nature work?

As I said, it’s not really my bag. I don’t have nearly the patience nor the mathematical aptitude required to make real progress in this kind of way. I’d rather try to work out on general principles what could have happened near the Big Bang, or how our classical world emerges out of the quantum wave function.

But, let a thousand flowers bloom! Gaiotto, Moore, and Witten certainly know what they’re doing, and hardly need to look for my approval. It’s one strategy among many, and as a community we’re smart enough to probe in a number of different directions. Hopefully this approach will revolutionize our understanding of quantum field theory — and at my retirement party everyone will be asking me why I didn’t stick to working on domain-wall junctions.

Doug Natelson: What is quantum coherence?

Often when people write about the "weirdness" of quantum mechanics, they talk about the difference between the interesting, often counter-intuitive properties of matter at the microscopic level (single electrons or single atoms) and the response of matter at the macroscopic level.  That is, they point out how on the one hand we can have quantum interference physics where electrons (or atoms or small molecules) seem to act like waves that are, in some sense, in multiple places at once; but on the other hand we can't seem to make a baseball act like this, or have a cat act like it's in a superposition of being both alive and dead.  Somehow, as system size (whatever that means) increases, matter acts more like classical physics would suggest, and quantum effects (except in very particular situations) become negligibly small.  How does that work, exactly?   

Rather than comparing the properties of one atom vs. 10^25 atoms, we can gain some insights by thinking about one electron "by itself" vs. one electron in a more complicated environment.  We learn in high school chemistry that we need quantum mechanics to understand how electrons arrange themselves in single atoms. The 1s orbital of a hydrogen atom is a puffy spherical shape; the 2p orbitals look like two-lobed blobs that just touch at the position of the proton; the higher d and f orbitals look even more complicated.  Later on, if you actually take quantum mechanics, you learn that these shapes are basically standing waves - the spatial state of the electron is described by a (complex, in the sense of complex numbers) wavefunction \(\psi(\mathbf{r})\) that obeys the Schroedinger equation, and if you have the electron feeling the spherically symmetric \(1/r\) attractive potential from the proton, then there are certain discrete allowed shapes for \(\psi(\mathbf{r})\).  These funny shapes are the result of "self interference", in the same way that the allowed vibrational modes of a drumhead are the result of self-interfering (and thus standing) waves of the drumhead.

In quantum mechanics, we also learn that, if you were able to do some measurement that tries to locate the electron (e.g., you decide to shoot gamma rays at the atom to do some scattering experiment to deduce where the electron is), and you looked at a big ensemble of such identically prepared atoms, each measurement would give you a different result for the location.  However, if you asked, what is the probability of finding the electron in some small region around a location \(\mathbf{r}\), the answer is \(|\psi(\mathbf{r})|^2\).  The wavefunction gives you the complex amplitude for finding the particle in a location, and the probability of that outcome of a measurement is proportional to the magnitude squared of that amplitude.  The complex nature of the quantum amplitudes, combined with the idea that you have to square amplitudes to get probabilities, is where quantum interference effects originate.   

This is all well and good, but when you worry about the electrons flowing in your house wiring, or even your computer or mobile device, you basically never worry about these quantum interference effects.  Why not?

The answer is rooted in the idea of quantum coherence, in this case of the spatial state of the electron.  Think of the electron as a wave with some wavelength and some particular phase - some arrangement of peaks and troughs that passes through zero at spatially periodic locations (say at x = 0, 1, 2, 3.... nanometers in some coordinate system).   If an electron propagates along in vacuum, this just continues ad infinitum.

If an electron scatters off some static obstacle, that can reset where the zeros are (say, now at x = 0.2, 1.2, 2.2, .... nm after the scattering).  A given static obstacle would always shift those zeros the same way.   Interference between waves (summing the complex wave amplitudes and squaring to find the probabilities) with a well-defined phase difference is what gives the fringes seen in the famous two-slit experiment linked above.

If an electron scatters off some dynamic obstacle (this could be another electron, or some other degree of freedom whose state can be, in turn, altered by the electron), then the phase of the electron wave can be shifted in a more complicated way.  For example, maybe the scatterer ends up in state S1, and that corresponds to the electron wave having zeros at x=0.2, 1.2, 2.2, .....; maybe the scatterer ends up in state S2, and that goes with the electron wave having zeros at x=0.3, 1.3, 2.3, ....  If the electron loses energy to the scatterer, then the spacing between the zeros can change (x=0.2, 1.3, 2.4, ....).  If we don't keep track of the quantum state of the scatterer as well, and we only look at the electron, it looks like the electron's phase is no longer well-defined after the scattering event.  That means if we try to do an interference measurement with that electron, the interference effects are comparatively suppressed.
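A toy calculation (illustrative amplitudes of my own, not a real experiment) makes this concrete: with a definite phase difference, the two-path probability shows full fringes, while averaging over a randomized phase gives the incoherent sum with no fringes at all.

```python
# Toy illustration (made-up amplitudes, not a real experiment): two paths
# with complex amplitudes.  A static scatterer shifts the phase the same
# way every shot, so fringes survive; a dynamic scatterer randomizes the
# phase shot-to-shot, and the shot-averaged probability loses its fringes.
import cmath
import math
import random

def two_path_probability(phase_difference):
    """|a1 + a2|^2 for two equal-amplitude paths (each |a|^2 = 1/2)."""
    a1 = 1 / math.sqrt(2)
    a2 = cmath.exp(1j * phase_difference) / math.sqrt(2)
    return abs(a1 + a2) ** 2

# Static obstacle: definite phase difference, full fringe contrast.
# |a1 + a2|^2 = 1 + cos(phase): 2 at a bright fringe, 0 at a dark one.
assert abs(two_path_probability(0.0) - 2.0) < 1e-12
assert two_path_probability(math.pi) < 1e-12

# Dynamic obstacle we don't keep track of: phase random on each shot.
# The average tends to |a1|^2 + |a2|^2 = 1, the classical sum, fringeless.
random.seed(0)
shots = [two_path_probability(random.uniform(0, 2 * math.pi))
         for _ in range(200_000)]
assert abs(sum(shots) / len(shots) - 1.0) < 0.02
```

This is the sense in which "losing the phase" to an untracked scatterer suppresses interference: the amplitudes still add, but the average over the scatterer's states washes the cross term out.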

In your house wiring, there are many many allowed states for the conduction electrons that are close by in energy, and there are many many dynamical things (other electrons, lattice vibrations) that can scatter the electrons.  The consequence of this is that the phase of the electron's wavefunction only remains well defined for a really short time, like 10^-15 seconds.  Conversely, in a single hydrogen atom, the electron has no states available close in energy, and in the absence of some really invasive probe, doesn't have any dynamical things off which to scatter.

I'll try to write more about this soon, and may come back to make a figure or two to illustrate this post.

June 22, 2015

John Preskill: Hello, my name is QUANTUM MASTER EQUATION

“Why does it have that name?”

I’ve asked in seminars, in lectures, in offices, and at group meetings. I’ve asked about physical conjectures, about theorems, and about mathematical properties.

“I don’t know.” Lecturers have shrugged. “It’s just a name.”

This spring, I asked about master equations. I thought of them as tools used in statistical mechanics, the study of vast numbers of particles. We can’t measure vast numbers of particles, so we can’t learn everything one might want to know about a stat-mech system. The magma beneath Santorini, for example, consists of about 10^24 molecules. Good luck measuring every one.

Imagine, as another example, using a quantum computer to solve a problem. We load information by initializing the computer to a certain state: We orient the computer’s particles in certain directions. We run a program, then read out the output.

Suppose the computer sits on a tabletop, exposed to the air like leftover casserole no one wants to save for tomorrow. Air molecules bounce off the computer, becoming entangled with the hardware. This entanglement, or quantum correlation, alters the computer’s state, just as flies alter a casserole.* To understand the computer’s output—which depends on the state, which depends on the air—we must have a description of the air. But we can’t measure all those air molecules, just as we can’t measure all the molecules in Santorini’s magma.

We can package our knowledge about the computer’s state into a mathematical object, called a density operator, labeled ρ(t). A quantum master equation describes how ρ(t) changes. I had no idea, till this spring, why we call master equations “master equations.” Had someone named “John Master” invented them? Had they inspired the Russell Crowe movie Master and Commander? Or were they named for the Igor who lisps, “Yeth, mathter” in adaptations of Frankenstein?

Jenia Mozgunov, a fellow student and Preskillite, proposed an answer: Using master equations, we can calculate how averages of observable properties change. Imagine describing a laser, a cavity that spews out light. A master equation reveals how the average number of photons (particles of light) in the cavity changes. We want to predict these averages because experimentalists measure them. Because master equations spawn many predictions—many equations—they merit the label “master.”
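Jenia’s point can be illustrated with the simplest cavity example: the photon-loss master equation for the probabilities p_n of finding n photons. The Python sketch below (my own toy numbers, not from the post) shows how this one “master” equation for the whole distribution spawns a derived equation, exponential decay of the average photon number.

```python
# Toy example (my own numbers, not from the post): a leaky cavity losing
# photons at rate kappa.  The master equation for the full distribution
# p_n of having n photons is
#   dp_n/dt = kappa * ((n + 1) * p_{n+1} - n * p_n),
# and from it one derives d<n>/dt = -kappa * <n>: the average photon
# number, which experimentalists measure, decays exponentially.
import math

def evolve_photon_distribution(p, kappa, t, steps=20_000):
    """Euler-integrate the photon-loss master equation for p[0..nmax]."""
    p = list(p)
    dt = t / steps
    nmax = len(p) - 1
    for _ in range(steps):
        new_p = []
        for n in range(nmax + 1):
            gain = kappa * (n + 1) * p[n + 1] if n < nmax else 0.0
            loss = kappa * n * p[n]
            new_p.append(p[n] + dt * (gain - loss))
        p = new_p
    return p

def mean_photon_number(p):
    return sum(n * pn for n, pn in enumerate(p))

# Start with exactly 5 photons and evolve for a time t.
p0 = [0.0] * 11
p0[5] = 1.0
kappa, t = 1.0, 0.7
p_t = evolve_photon_distribution(p0, kappa, t)

assert abs(sum(p_t) - 1.0) < 1e-9          # probability is conserved
assert abs(mean_photon_number(p_t) - 5 * math.exp(-kappa * t)) < 1e-2
```

One continuity-style equation for the probabilities, many derivable equations for averages: that is exactly the hypothesis about the name.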

Jenia’s hypothesis appealed to me, but I wanted certainty. I wanted Truth. I opened my laptop and navigated to Facebook.

“Does anyone know,” I wrote in my status, “why master equations are called ‘master equations’?”

Ian Durham, a physicist at St. Anselm College, cited Tom Moore’s Six Ideas that Shaped Physics. Most physics problems, Ian wrote, involve “some overarching principle.” Example principles include energy conservation and invariance under discrete translations (the system looks the same after you step in some direction). A master equation encapsulates this principle.

Ian’s explanation sounded sensible. But fewer people “liked” his reply on Facebook than “liked” a quip by a college friend: Master equations deserve their name because “[t]hey didn’t complete all the requirements for the doctorate.”

My advisor, John Preskill, dug through two to three books, one set of lecture notes, one German Wikipedia page, one to two articles, and Google Scholar. He concluded that Nordsieck, Lamb, and Uhlenbeck coined “master equation.” According to a 1940 paper of theirs,** “When the probabilities of the elementary processes are known, one can write down a continuity equation for W [a set of probabilities], from which all other equations can be derived and which we will call therefore the ‘master’ equation.”

“Are you sure you were meant to be a physicist,” I asked John, “rather than a historian?”

“Procrastination is a powerful motivator,” he replied.

Lecturers have shrugged at questions about names. Then they’ve paused, pondered, and begun, “I guess because…” Theorems and identities derive their names from symmetries, proof techniques, geometric illustrations, and applications to problems I’d thought unrelated. A name taught me about uses for master equations. Names reveal physics I wouldn’t learn without asking about names. Names aren’t just names. They’re lamps and guides.

Pity about the origin of “master equation,” though. I wish an Igor had invented them.

*Apologies if I’ve spoiled your appetite.

**A. Nordsieck, W. E. Lamb, and G. E. Uhlenbeck, “On the theory of cosmic-ray showers I,” Physica 7, 344-60 (1940), p. 353.

June 21, 2015

Mark Chu-Carroll: Weekend Recipe: 3-cup chicken

This is a traditional chinese dish that my wife grew up eating in Taiwan. For some reason, she never told me about it, until she saw an article with a recipe in the NY Times. Of course, I can’t leave recipes alone; I always put my own spin on it. And the recipe in the article had some (in my opinion) glaring problems. For example, it called for cooking with sesame oil. Sesame oil is a seasoning, not a cooking oil. It’s got a very strong flavor, and it burns at stir-fry temperature, which makes any dish cooked in it taste absolutely awful. You cook in neutral oils with high smoke points, like peanut, canola, or soybean; and then you add a drop of sesame as part of the sauce, so that it’s moderated and doesn’t burn. Anyway, below is my version of the dish.

  • 2 pounds of chicken thighs, cut into bite-sized pieces.
  • About 8 large cloves of garlic, thickly sliced.
  • About a 1-inch section of fresh ginger, cut into disks.
  • 5 whole dried szechuan chili peppers (or more, if you like those lovely things!)
  • A good bunch of thai basil leaves, removed from the stems, but left whole. (About a cup, if it’s packed pretty tight. Don’t skimp – these are the best part of the dish!)
  • 4 scallions, thinly sliced, whites and greens separated.
  • 1/3 cup soy sauce.
  • 1/4 cup mirin
  • 1/2 cup sake
  • 1 tablespoon sugar
  • 1 teaspoon cornstarch, dissolved in water.
  • 1/4 teaspoon sesame oil (just a drop, for flavor).
  • Enough canola oil (or similarly bland, high-smoke-point cooking oil) to cook – a couple of tablespoons at most.
  1. Get your wok smoking hot. Add enough oil to coat the bottom, and swirl it around.
  2. Add in half of the chicken, and cook until it’s nicely browned, then remove it. (It won’t be cooked all the way through yet, don’t worry!)
  3. Repeat with the other half of the chicken.
  4. Make sure there’s enough oil in the bottom of the wok, then toss in the garlic, ginger, chili peppers, and scallion whites. Stir fry them until the garlic starts to get just a little bit golden.
  5. Add the chicken back in, and add the soy, mirin, sake, and sugar. Get it boiling, and keep stirring things around until the chicken is cooked through.
  6. Add the basil and scallions, and keep stirring until the basil wilts, and the whole thing smells of that wonderful thai basil fragrance.
  7. Add the cornstarch and sesame oil, and cook until the sauce starts to thicken.
  8. Remove it from the heat, and serve on a bed of white rice, along with some simple stir-fried vegetables. (I used a batch of beautiful sugar-snap peas, quickly stir fried with just a bit of garlic, and a bit of soy sauce.)

A couple of notes on ingredients:

  • This is a dish where the soy sauce matters. Don’t use cheap generic american soy sauce; that stuff is just saltwater with food coloring. For some things, that’s actually OK. But in this dish, it’s the main flavor of the sauce, so it’s important to use something with a good flavor. Get a good quality chinese soy (I like Pearl River Bridge brand), or a good japanese shoyu.
  • For the sugar, if you’ve got turbinado (or even better, real chinese rock sugar), use that. If not, white sugar is OK.
  • Definitely try to get thai basil. It’s very different from italian basil – the leaves are thinner (which makes them much easier to eat whole, as you do in this dish), and they’ve got a very different flavor – almost like italian basil mixed with a bit of anise and a bit of menthol. It’s one of my favorite herbs, and it’s actually gotten pretty easy to find.
  • Szechuan peppers can be hard to find – you pretty much need to go to an Asian grocery. They’re worth it. They’ve got a very distinctive flavor, and I don’t know of any other dried pepper that works in a sauce like them. You don’t actually eat the peppers – the way you cook them, they actually burn a bit – but they bloom their flavor into the oil that you use to cook the rest of the dish, and that totally changes the sauce.

Tommaso Dorigo: Seeing Jupiter In Daylight

Have you ever seen Venus in full daylight? It's a fun experience. Of course we are accustomed to seeing even a small crescent Moon in daylight - it is large, and although it is the same colour as clouds, it cannot be missed in a clear sky. But Venus is a small dot, and although it can be quite bright after sunset or before dawn, during the day it is just an inconspicuous, tiny white dot which you never see, unless you look exactly in its direction.


Chad Orzel: Father’s Day 2015

Act I:

STEELYKID and THE PIP: Happy Father’s Day, Daddy!

DADDY: Aww, that’s sweet. So, what are you going to make me for breakfast?


DADDY: It’s father’s day, right? So you guys should be cooking breakfast for me.


THE PIP: We can’t cook breakfast for you. We’re not tall enough to bake stuff. And, also, we’re not allowed to cook.

DADDY: Well, I’m your father, so I can give you permission to cook breakfast.

STEELYKID: Yeah, but we don’t know how to cook.

THE PIP: Yeah, so you have to cook pancakes for us!

DADDY: Oh, all right. I guess I can do that. {Heads into kitchen}

{End of Act I}

Act II:

{DADDY is in the kitchen, cooking pancakes. A very loud THUMP comes from the living room.}

DADDY: [The Pip], what are you doing in there?

THE PIP: Nothing, now.

DADDY: What were you doing just now that resulted in a big loud THUMP?

THE PIP: Well, I’m not going to tell you.

DADDY {walking into the living room with a pancake}: I see. Well, do you know what we do to toddlers who make big loud THUMP noises and won’t tell what they did?


DADDY: We tickle them!

{Wild shrieks of laughter. Curtain.}

So, that’s how my morning is going. Hope the other fathers out there are getting backtalk that’s half as cute.

The Pip is ready to fix anything that needs drilling, and SteelyKid wants a katana.

n-Category Café: What's so HoTT about Formalization?

In my last post I promised to follow up by explaining something about the relationship between homotopy type theory (HoTT) and computer formalization. (I’m getting tired of writing “publicity”, so this will probably be my last post for a while in this vein — for which I expect that some readers will be as grateful as I).

As a potential foundation for mathematics, HoTT/UF is a formal system existing at the same level as set theory (ZFC) and first-order logic: it’s a collection of rules for manipulating syntax, into which we can encode most or all of mathematics. No such formal system requires computer formalization, and conversely any such system can be used for computer formalization. For example, the HoTT Book was intentionally written to make the point that HoTT can be done without a computer, while the Mizar project has formalized huge amounts of mathematics in a ZFC-like system.

Why, then, does HoTT/UF seem so closely connected to computer formalization? Why do the overwhelming majority of publications in HoTT/UF come with computer formalizations, when such is still the exception rather than the rule in mathematics as a whole? And why are so many of the people working on HoTT/UF computer scientists or advocates of computer formalization?

To start with, note that the premise of the third question partially answers the first two. If we take it as a given that many homotopy type theorists care about computer formalization, then it’s only natural that they would be formalizing most of their papers, creating a close connection between the two subjects in people’s minds.

Of course, that forces us to ask why so many homotopy type theorists are into computer formalization. I don’t have a complete answer to that question, but here are a few partial ones.

  1. HoTT/UF is built on type theory, and type theory is closely connected to computers, because it is the foundation of typed functional programming languages like Haskell, ML, and Scala (and, to a lesser extent, less-functional typed programming languages like Java, C++, and so on). Thus, computer proof assistants built on type theory are well-suited to formal proofs of the correctness of software, and thus have received a lot of work from the computer science end. Naturally, therefore, when a new kind of type theory like HoTT comes along, the existing type theorists will be interested in it, and will bring along their predilection for formalization.

  2. HoTT/UF is by default constructive, meaning that we don’t need to assert the law of excluded middle or the axiom of choice unless we want to. Of course, most or all formal systems have a constructive version, but with type theories the constructive version is the “most natural one” due to the Curry-Howard correspondence. Moreover, one of the intriguing things about HoTT/UF is that it allows us to prove certain things constructively that in other systems require LEM or AC. Thus, it naturally attracts attention from constructive mathematicians, many of whom are interested in computable mathematics (i.e. when something exists, can we give an algorithm to find it?), which is only a short step away from computer formalization of proofs.

  3. One could, however, try to make similar arguments from the other side. For instance, HoTT/UF is (at least conjecturally) an internal language for higher topos theory and homotopy theory. Thus, one might expect it to attract an equal influx of higher topos theorists and homotopy theorists, who don’t care about computer formalization. Why hasn’t this happened? My best guess is that at present the traditional 1-topos theorists seem to be largely disjoint from the higher topos theorists. The former care about internal languages, but not so much about higher categories, while for the latter it is reversed; thus, there aren’t many of us in the intersection who care about both and appreciate this aspect of HoTT. But I hope that over time this will change.

  4. Another possible reason why the influx from type theory has been greater is that HoTT/UF is less strange-looking to type theorists (it’s just another type theory) than to the average mathematician. In the HoTT Book we tried to make it as accessible as possible, but there are still a lot of tricky things about type theory that one seemingly has to get used to before being able to appreciate the homotopical version.

  5. Another sociological effect is that Vladimir Voevodsky, who introduced the univalence axiom and is a Fields medalist with “charisma”, is also a very vocal and visible advocate of computer formalization. Indeed, his personal programme that he calls “Univalent Foundations” is to formalize all of mathematics using a HoTT-like type theory.

  6. Finally, many of us believe that HoTT is actually the best formal system extant for computer formalization of mathematics. It shares most of the advantages of type theory, such as the above-mentioned close connection to programming, the avoidance of complicated ZF-encodings for even basic concepts like natural numbers, and the production of small, easily verifiable “certificates” of proof correctness. (The advantages of some type theories that HoTT doesn’t yet share, like a computational interpretation, are work in progress.) But it also rectifies certain infelicitous features of previously existing type theories, by specifying what equality of types means (univalence), including extensionality for functions and truth values, providing well-behaved quotient types (HITs), and so on, making it more comfortable for ordinary mathematicians. (I believe that historically, this was what led Voevodsky to type theory and univalence in the first place.)
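The Curry-Howard correspondence mentioned in point 2 can be illustrated even in an ordinary programming language: a proof of a proposition is a program of the corresponding type. Here is a toy sketch in Python (the names are mine; in Haskell or Agda the type checker would actually verify the “proof”):

```python
from typing import Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")

# Under Curry-Howard, the conjunction "A and B" corresponds to the pair
# type (A, B), and a proof of "A and B implies B and A" is simply a
# function that swaps the components of a pair.
def and_comm(pair: Tuple[A, B]) -> Tuple[B, A]:
    a, b = pair
    return (b, a)

print(and_comm((1, "x")))  # → ('x', 1)
```

The point is that constructing the program *is* constructing the proof; there is no separate proof object to check.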

There are probably additional reasons why HoTT/UF attracts more people interested in computer formalization. (If you can think of others, please share them in the comments.) However, there is more to it than this, as one can guess from the fact that even people like me, coming from a background of homotopy theory and higher category theory, tend to formalize a lot of our work on HoTT. Of course there is a bit of a “peer pressure” effect: if all the other homotopy type theorists formalize their papers, then it starts to seem expected in the subject. But that’s far from the only reason; here are some “real” ones.

  1. Computer formalization of synthetic homotopy theory (the “uniquely HoTT” part of HoTT/UF) is “easier”, in certain respects, than most computer formalization of mathematics. In particular, it requires less infrastructure and library support, because it is “closer to the metal” of the underlying formal system than is usual for actually “interesting” mathematics. Thus, formalizing it still feels more like “doing mathematics” than like programming, making it more attractive to a mathematician. You really can open up a proof assistant, load up no pre-written libraries at all, and in fairly short order be doing interesting HoTT. (Of course, this doesn’t mean that there is no value in having libraries and in thinking hard about how best to design those libraries, just that the barrier to entry is lower.)

  2. Precisely because, as mentioned above, type theory is hard to grok for a mathematician, there is a significant benefit to using a proof assistant that will automatically tell you when you make a mistake. In fact, messing around with a proof assistant is one of the best ways to learn type theory! I posted about this almost exactly four years ago.

  3. I think the previous point goes double for homotopy type theory, because it is an unfamiliar new world for almost everyone. The types of HoTT/UF behave kind of like spaces in homotopy theory, but they have their own idiosyncrasies that it takes time to develop an intuition for. Playing around with a proof assistant is a great way to develop that intuition. It’s how I did it.

  4. Moreover, because that intuition is unique and recently developed for all of us, we may be less confident in the correctness of our informal arguments than we would be in classical mathematics. Thus, even an established “homotopy type theorist” may be more likely to want the comfort of a formalization.

  5. Finally, there is an additional benefit to doing mathematics with a proof assistant (as opposed to formalizing mathematics that you’ve already done on paper), which I think is particularly pronounced for type theory and homotopy type theory. Namely, the computer always tells you what you need to do next: you don’t need to work it out for yourself. A central part of type theory is inductive types, and a central part of HoTT is higher inductive types, both of which are characterized by an induction principle (or “eliminator”) which says that in order to prove a statement of the form “for all x:W, P(x)”, it suffices to prove some number of other statements involving the predicate P. The most familiar example is induction on the natural numbers, which says that in order to prove “for all n ∈ ℕ, P(n)” it suffices to prove P(0) and “for all n ∈ ℕ, if P(n) then P(n+1)”. When using proof by induction, you need to isolate P as a predicate on n, specialize to n=0 to check the base case, write down P(n) as the inductive hypothesis, then replace n by n+1 to find what you have to prove in the induction step. The students in an intro to proofs class have trouble with all of these steps, but professional mathematicians have learned to do them automatically. However, for a general inductive or higher inductive type, there might instead be four, six, ten, or more separate statements to prove when applying the induction principle, many of which involve more complicated transformations of P, and it’s common to have to apply several such inductions in a nested way. Thus, when doing HoTT on paper, a substantial amount of time is sometimes spent simply figuring out what has to be proven. But a proof assistant equipped with a unification algorithm can do that for you automatically: you simply say “apply induction for the type W” and it immediately decides what P is and presents you with a list of the remaining goals that have to be proven.
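The induction principle for the natural numbers described in point 5 can be packaged as a single higher-order function, a (non-dependent) recursor: you hand it the base case and the inductive step, and it assembles the result. A minimal Python sketch (the function names here are mine, not standard notation):

```python
def rec_nat(base, step, n):
    """Recursion principle for the naturals: given base (a value for P(0))
    and step (taking k and a value for P(k) to a value for P(k+1)),
    produce a value for P(n)."""
    result = base
    for k in range(n):
        result = step(k, result)
    return result

# Two familiar functions, both defined purely via the recursor:
def add(m, n):
    return rec_nat(m, lambda _k, acc: acc + 1, n)

def factorial(n):
    return rec_nat(1, lambda k, acc: (k + 1) * acc, n)

print(add(3, 4), factorial(5))  # → 7 120
```

In a proof assistant the analogue of `rec_nat` is generated automatically for every inductive type, and unification fills in the predicate P for you.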

To summarize this second list, then, I think it’s fair to say that compared to formalizing traditional mathematics, formalizing HoTT tends to give more benefit at lower cost. However, that cost is still high, especially when you take into account the time spent learning to use a proof assistant, which is often not the most user-friendly of software. This is why I always emphasize that HoTT can perfectly well be done without a computer, and why we wrote the book the way we did.

June 20, 2015

Scott Aaronson: FCRC Highlights

By popular request, here are some highlights from this week’s FCRC conference in Portland, Oregon:

  • The edit distance between two strings means the minimum number of insertions, deletions, and replacements needed to convert one string to the other: for example, SHTETL and SHETLAND have an edit distance of 4.  Edit distance has major, obvious applications to DNA sequence comparison, as well as plagiarism detection and many other things.  There’s a clever dynamic programming algorithm to compute the edit distance between two n-bit strings, but it takes ~n^2 time, which is already too slow for many applications.  Can you do better?  I remember wondering about that 15 years ago, as a beginning grad student taking Richard Karp’s computational biology course.  Now Arturs Backurs and Piotr Indyk have shown that, if you can compute edit distance in O(n^(2-ε)) time for any ε>0, then you can also solve CNF-SAT in 2^(cn) time for some c<1, thereby refuting the Strong Exponential Time Hypothesis.  For more about this important result, see this MIT News article.
  • Olivier Temam gave a superb keynote talk about hardware neural networks.  His main points were these: implementing neural nets with special-purpose hardware was a faddish idea a few decades ago, but was abandoned once people realized that (a) it didn’t work that great, and (b) more to the point, anything you could do with special-purpose hardware, you could do better and more easily with silicon chips, after waiting just a few years for Moore’s Law to catch up.  Today, however, two things have spurred a revival of the idea: first, neural nets (renamed “deep learning,” and done with bigger networks and way more training data) are delivering spectacular, state-of-the-art results; and second, transistors have stopped shrinking, so it now makes more sense to think about the few orders-of-magnitude speed improvement that you can get from special-purpose hardware.  This would mean organizing computers kind-of, sort-of like the brain is organized, with (for example) memory integrated into the connections between the “neurons” (processing elements), rather than on a separate chip that’s connected to the processor by a bus.  On the other hand, Temam also stressed that computer architects shouldn’t slavishly copy the brain: instead, they should simply build the fastest hardware they can to implement the best available machine-learning algorithms, and they should rely on the machine-learning theorists to incorporate whatever broad lessons are to be gleaned from neuroscience (as they’ve done several times in the past).
  • Three separate sets of authors (Koppula, Lewko, and Waters; Canetti, Holmgren, Jain, and Vaikuntanathan; and Bitansky, Garg, Lin, Pass, and Telang) independently wrote papers that showed how to achieve “indistinguishability obfuscation” (i.o.) for Turing machines rather than for circuits.  For those not in the world of theoretical crypto, i.o. is a hot concept that basically means: obfuscating a program in such a way that no adversary can figure out anything about which program you started with, among all the possible programs that compute the same function in roughly the same amount of time.  (On the other hand, the adversary might be able to learn more than she could if merely given a black box for the function.  And that’s why this kind of obfuscation falls short of the “gold standard,” which was shown to be impossible in general in seminal work by Barak et al.)  Recent papers have shown how to achieve the weaker notion of i.o., but they first require converting your program to a Boolean circuit—something that’s absurdly inefficient in practice, and also has the theoretical drawback of producing an obfuscated program whose size grows, not merely with the size of the original, unobfuscated program, but also with the amount of time the original program is supposed to run for. So, the new work gets around that drawback, by cleverly obfuscating a program whose purpose is to compute the “next step function” of the original program, on data that’s itself encrypted. The talk was delivered in “tag team” format, with one representative from each group of authors speaking for 6-7 minutes. Surprisingly, it worked extremely well.
  • Laci Babai gave a truly epic hour-long Knuth Prize lecture, basically trying to summarize all of his work over the past 35 years (and related work by others), in 170 or so slides.  The talk had not a single word of filler: it was just pure beef, result after result, some of them well-known and seminal (e.g., MIP=NEXP, AM[2]=AM[k], AlmostNP=MA, group membership in NP, group non-membership in AM…) and others obscure little gems.  Boaz Barak commented that an entire semester-long course could be taught from the PowerPoint slides. Laci ended the talk by defining the Babai point, and then saying “having made my point, I’m done.”
  • Ambainis (yes, the same Ambainis), Filmus and Le Gall had a paper about the limitations of the techniques used to achieve all matrix multiplication algorithms from Coppersmith-Winograd (O(n^2.3755)) onward, including those of Stothers 2010 (O(n^2.3730)), Vassilevska Williams 2012 (O(n^2.3728642)), and Le Gall 2014 (O(n^2.3728639)).  Their basic conclusion—not surprising, but still nice to establish formally—is that applying more and more massive computer search to the current ideas can’t possibly get you below O(n^2.308); new ideas will be needed to push further.
  • At the STOC business meeting, there was a long discussion about the proposal to turn STOC into a weeklong “theory festival,” with more plenary talks (including from other fields), possibly more parallel sessions, etc. There were lots of interesting arguments, but alas, I was too tired and jetlagged to remember what they were. (Anyone who does remember is welcome to chime in.)
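
The dynamic program mentioned in the edit-distance item above can be sketched in a few lines; this is the classic textbook formulation (not code from the Backurs-Indyk paper), and its nested loops are exactly the ~n^2 behavior their result suggests is essentially optimal:

```python
def edit_distance(s, t):
    """Classic O(len(s)*len(t)) dynamic program for edit distance."""
    m, n = len(s), len(t)
    # prev[j] holds the distance between s[:i-1] and t[:j]
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if s[i - 1] == t[j - 1] else 1
            curr[j] = min(prev[j] + 1,         # deletion
                          curr[j - 1] + 1,     # insertion
                          prev[j - 1] + cost)  # replacement (or match)
        prev = curr
    return prev[n]

# The example from the post:
print(edit_distance("SHTETL", "SHETLAND"))  # → 4
```

Keeping only the previous row brings the memory down to O(n), though the quadratic time remains.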

There are many great things that I haven’t written about—for example, I haven’t even said a word about any of the three best paper talks!—but I’m out of energy right now.  Others are more than welcome to share other FCRC highlights in the comments section.

June 19, 2015

Tommaso Dorigo: ATLAS Pictures Colour Flow Between Quarks

In 1992 I started working on my undergraduate thesis, the search for all-hadronic top quark pairs in CDF data. The CDF experiment was just starting to collect proton-antiproton collision data with the brand-new silicon vertex detector in what was called Run 1a, which ended in 1993 and produced the data on which the first evidence claim of top quarks was based. But I was still working on the Run 0 data: 4 inverse picobarns of collisions – the very first collisions at the unprecedented energy of 1.8 TeV. And I was not alone: many analyses of those data were still in full swing.

read more


Mark Chu-Carroll: Canonical Expressions in Type Theory

Sorry for the gap in posts. I’ve been trying to post more regularly, and was just hitting a rhythm, when my son brought home a particularly vicious bug, and I got sick. I’ve spent the last couple of weeks being really, really sick, and then trying to get caught up at work. I’m mostly recovered, except for some lingering asthma, so I’m trying to get back to that twice-per-week posting schedule.

In the last couple of posts, we looked at Martin-Löf’s theory of expressions. The theory of expressions is purely syntactic: it’s a way of understanding the notation of expressions. Everything that we do in type theory will be written with expressions that follow the syntax, the arity, and the definitional equivalency rules of expression theory.

The next step is to start to understand the semantics of expressions. In type theory, when it comes to semantics, we’re interested in two things: evaluation and judgements. Evaluation is the process by which an expression is reduced to its simplest form. It’s something that we care about, but it’s not really a focus of type theory: type theory largely waves its hands in the air and says “we know how to do this”, and opts for normal-order evaluation. Judgements are where things get interesting.

Judgements are provable statements about expressions and the values that they represent. As software people, when we think about types and type theory, we’re usually thinking about type declarations: type declarations are judgements about the expressions that they apply to. When you write a type declaration in a programming language, what you’re doing is asserting the type theory judgement. When the compiler “type-checks” your program, what it’s doing in type theory terms is checking that your judgements are proven by your program.

For example, we’d like to be able to make the judgement A set – that is, that A is a set. In order to make the judgement that A is a set in type theory, we need to know two things:

  1. How are canonical instances of the set A formed?
  2. Given two canonical instances of A, how can we determine if they’re equal?

To understand those two properties, we need to take a step back. What is a canonical instance of a set?

If we think about how we use predicate logic, we’re always given some basic set of facts as a starting point. In type theory, the corresponding concept is a primitive constant. The primitive constants include base values and primitive functions. For example, if we’re working with lispish expressions, then cons(1, cons(2, nil)) is an expression, and cons, nil, 1 and 2 are primitive constants; cons is the head of the expression, and 1 and cons(2, nil) are the arguments.

A canonical expression is a saturated, closed expression whose head is a primitive constant.

The implications of this can be pretty surprising, because it means that a canonical expression can contain unevaluated arguments! The expression has to be saturated and closed – so its arguments can’t have unbound variables, or be missing parameters. But it can contain unevaluated subexpressions. For example, if we were working with Peano arithmetic in type theory, succ(2+3) is canonical, even though “2+3” hasn’t been evaluated.

In general, in type theory, the way that we evaluate an expression is called normal order evaluation – what programming language people call lazy evaluation: that’s evaluating from the outside in. Given a non-canonical expression, we evaluate from the outside in until we get to a canonical expression, and then we stop. A canonical expression is considered the result of a computation – so we can see succ(2+3) as a result!

A canonical expression is the evaluated form of an expression, but not the fully evaluated form. The fully evaluated form is when the expression and all of its saturated parts are fully evaluated. So in our previous example, the saturated part 2+3 wasn’t evaluated, so it’s not fully evaluated. To get it to be fully evaluated, we’d need to evaluate 2+3, giving us succ(5); then, since succ(5) is saturated, it’s evaluated to 6, which is the fully evaluated form of the expression.
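The distinction between canonical and fully evaluated forms can be mimicked with explicit thunks: a constructor applied to an unevaluated closure is “canonical” even though its argument hasn’t been computed. A small Python sketch (illustrative only, not how a real proof assistant represents terms):

```python
class Succ:
    """A head constructor whose argument is a zero-argument thunk, so
    succ(2+3) can be canonical while 2+3 is still unevaluated."""
    def __init__(self, thunk):
        self.thunk = thunk

def force(x):
    """Fully evaluate a canonical expression down to a plain number."""
    if isinstance(x, Succ):
        return force(x.thunk()) + 1
    return x

expr = Succ(lambda: 2 + 3)     # canonical: head is succ, argument is a thunk
print(isinstance(expr, Succ))  # the head is visible without computing 2+3
print(force(expr))             # fully evaluated form: 6
```

Inspecting the head of `expr` never runs the thunk, which is exactly the sense in which succ(2+3) counts as a result under normal-order evaluation.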

Next post (coming monday!), we’ll use this new understanding of canonical expressions, and start looking at judgements, and what they mean. That’s when type theory starts getting really fun and interesting.

John Baez: On Care For Our Common Home

There’s been a sea change on attitudes toward global warming in the last couple of years, which makes me feel much less need to determine the basic facts of the matter, or convince people of these facts. The challenge is now to do something.

Even the biggest European oil and gas companies are calling for a carbon tax! Their motives, of course, should be suspect. But they have realized it’s hopeless to argue about the basics. They wrote a letter to the United Nations beginning:

Dear Excellencies:

Climate change is a critical challenge for our world. As major companies from the oil & gas sector, we recognize both the importance of the climate challenge and the importance of energy to human life and well-being. We acknowledge that the current trend of greenhouse gas emissions is in excess of what the Intergovernmental Panel on Climate Change (IPCC) says is needed to limit the temperature rise to no more than 2 degrees above pre-industrial levels. The challenge is how to meet greater energy demand with less CO2. We stand ready to play our part.

It seems there are just a few places, mostly former British colonies, where questioning the reality and importance of man-made global warming is a popular stance among politicians. Unfortunately one of these, the United States, is a big carbon emitter. Otherwise we could just ignore these holdouts.

Given all this, it’s not so surprising that Pope Francis has joined the crowd and released a document on environmental issues:

• Pope Francis, Encyclical letter Laudato Si’: on care for our common home.

Still, it is interesting to read this document, because unlike most reports we read on climate change, it addresses the cultural and spiritual dimensions of this problem.

I believe arguments should be judged by their merits, not the fact that they’re made by someone with an impressive title like

His Holiness Francis, Bishop of Rome, Vicar of Jesus Christ, Successor of the Prince of the Apostles, Supreme Pontiff of the Universal Church, Primate of Italy, Archbishop and Metropolitan of the Roman Province, Sovereign of the Vatican City State, Servant of the servants of God.

(Note the hat-tip to Darwin there.)

But in fact Francis has some interesting things to say. And among all the reportage on this issue, it’s hard to find more than quick snippets of the actual 182-page document, which is quite interesting. So, let me quote a bit.

I will try to dodge the explicitly Christian bits, because I really don’t want people arguing about religion on this blog—in fact I won’t allow it. Of course discussing what the Pope says without getting into Christianity is very difficult and perhaps even absurd. But let’s try.

I will also skip the extensive section where he summarizes the science. It’s very readable, and for an audience who doesn’t want numbers and graphs it’s excellent. But I figure the audience of this blog already knows that material.

So, here are some of the passages I found most interesting.

St. Francis of Assisi

He discusses how St. Francis of Assisi has been an example to him, and says:

Francis helps us to see that an integral ecology calls for openness to categories which transcend the language of mathematics and biology, and take us to the heart of what it is to be human. Just as happens when we fall in love with someone, whenever he would gaze at the sun, the moon or the smallest of animals, he burst into song, drawing all other creatures into his praise.


If we approach nature and the environment without this openness to awe and wonder, if we no longer speak the language of fraternity and beauty in our relationship with the world, our attitude will be that of masters, consumers, ruthless exploiters, unable to set limits on their immediate needs. By contrast, if we feel intimately united with all that exists, then sobriety and care will well up spontaneously. The poverty and austerity of Saint Francis were no mere veneer of asceticism, but something much more radical: a refusal to turn reality into an object simply to be used and controlled.

Weak responses

On the responses to ecological problems thus far:

The problem is that we still lack the culture needed to confront this crisis. We lack leadership capable of striking out on new paths and meeting the needs of the present with concern for all and without prejudice towards coming generations. The establishment of a legal framework which can set clear boundaries and ensure the protection of ecosystems has become indispensable, otherwise the new power structures based on the techno-economic paradigm may overwhelm not only our politics but also freedom and justice.

It is remarkable how weak international political responses have been. The failure of global summits on the environment make it plain that our politics are subject to technology and finance. There are too many special interests, and economic interests easily end up trumping the common good and manipulating information so that their own plans will not be affected. The Aparecida Document urges that “the interests of economic groups which irrationally demolish sources of life should not prevail in dealing with natural resources”. The alliance between the economy and technology ends up sidelining anything unrelated to its immediate interests. Consequently the most one can expect is superficial rhetoric, sporadic acts of philanthropy and perfunctory expressions of concern for the environment, whereas any genuine attempt by groups within society to introduce change is viewed as a nuisance based on romantic illusions or an obstacle to be circumvented.

In some countries, there are positive examples of environmental improvement: rivers, polluted for decades, have been cleaned up; native woodlands have been restored; landscapes have been beautified thanks to environmental renewal projects; beautiful buildings have been erected; advances have been made in the production of non-polluting energy and in the improvement of public transportation. These achievements do not solve global problems, but they do show that men and women are still capable of intervening positively. For all our limitations, gestures of generosity, solidarity and care cannot but well up within us, since we were made for love.

At the same time we can note the rise of a false or superficial ecology which bolsters complacency and a cheerful recklessness. As often occurs in periods of deep crisis which require bold decisions, we are tempted to think that what is happening is not entirely clear. Superficially, apart from a few obvious signs of pollution and deterioration, things do not look that serious, and the planet could continue as it is for some time. Such evasiveness serves as a licence to carrying on with our present lifestyles and models of production and consumption. This is the way human beings contrive to feed their self-destructive vices: trying not to see them, trying not to acknowledge them, delaying the important decisions and pretending that nothing will happen.

On the risks:

It is foreseeable that, once certain resources have been depleted, the scene will be set for new wars, albeit under the guise of noble claims.

Everything is connected

He writes:

Everything is connected. Concern for the environment thus needs to be joined to a sincere love for our fellow human beings and an unwavering commitment to resolving the problems of society.

Moreover, when our hearts are authentically open to universal communion, this sense of fraternity excludes nothing and no one. It follows that our indifference or cruelty towards fellow creatures of this world sooner or later affects the treatment we mete out to other human beings. We have only one heart, and the same wretchedness which leads us to mistreat an animal will not be long in showing itself in our relationships with other people. Every act of cruelty towards any creature is “contrary to human dignity”. We can hardly consider ourselves to be fully loving if we disregard any aspect of reality: “Peace, justice and the preservation of creation are three absolutely interconnected themes, which cannot be separated and treated individually without once again falling into reductionism”.

Technology: creativity and power

Technoscience, when well directed, can produce important means of improving the quality of human life, from useful domestic appliances to great transportation systems, bridges, buildings and public spaces. It can also produce art and enable men and women immersed in the material world to “leap” into the world of beauty. Who can deny the beauty of an aircraft or a skyscraper? Valuable works of art and music now make use of new technologies. So, in the beauty intended by the one who uses new technical instruments and in the contemplation of such beauty, a quantum leap occurs, resulting in a fulfilment which is uniquely human.

Yet it must also be recognized that nuclear energy, biotechnology, information technology, knowledge of our DNA, and many other abilities which we have acquired, have given us tremendous power. More precisely, they have given those with the knowledge, and especially the economic resources to use them, an impressive dominance over the whole of humanity and the entire world. Never has humanity had such power over itself, yet nothing ensures that it will be used wisely, particularly when we consider how it is currently being used. We need but think of the nuclear bombs dropped in the middle of the twentieth century, or the array of technology which Nazism, Communism and other totalitarian regimes have employed to kill millions of people, to say nothing of the increasingly deadly arsenal of weapons available for modern warfare. In whose hands does all this power lie, or will it eventually end up? It is extremely risky for a small part of humanity to have it.

The globalization of the technocratic paradigm

The basic problem goes even deeper: it is the way that humanity has taken up technology and its development according to an undifferentiated and one-dimensional paradigm. This paradigm exalts the concept of a subject who, using logical and rational procedures, progressively approaches and gains control over an external object. This subject makes every effort to establish the scientific and experimental method, which in itself is already a technique of possession, mastery and transformation. It is as if the subject were to find itself in the presence of something formless, completely open to manipulation. Men and women have constantly intervened in nature, but for a long time this meant being in tune with and respecting the possibilities offered by the things themselves. It was a matter of receiving what nature itself allowed, as if from its own hand. Now, by contrast, we are the ones to lay our hands on things, attempting to extract everything possible from them while frequently ignoring or forgetting the reality in front of us. Human beings and material objects no longer extend a friendly hand to one another; the relationship has become confrontational. This has made it easy to accept the idea of infinite or unlimited growth, which proves so attractive to economists, financiers and experts in technology. It is based on the lie that there is an infinite supply of the earth’s goods, and this leads to the planet being squeezed dry beyond every limit. It is the false notion that “an infinite quantity of energy and resources are available, that it is possible to renew them quickly, and that the negative effects of the exploitation of the natural order can be easily absorbed”.

The difficulty of changing course

The idea of promoting a different cultural paradigm and employing technology as a mere instrument is nowadays inconceivable. The technological paradigm has become so dominant that it would be difficult to do without its resources and even more difficult to utilize them without being dominated by their internal logic. It has become countercultural to choose a lifestyle whose goals are even partly independent of technology, of its costs and its power to globalize and make us all the same. Technology tends to absorb everything into its ironclad logic, and those who are surrounded with technology “know full well that it moves forward in the final analysis neither for profit nor for the well-being of the human race”, that “in the most radical sense of the term power is its motive – a lordship over all”. As a result, “man seizes hold of the naked elements of both nature and human nature”. Our capacity to make decisions, a more genuine freedom and the space for each one’s alternative creativity are diminished.

The technocratic paradigm also tends to dominate economic and political life. The economy accepts every advance in technology with a view to profit, without concern for its potentially negative impact on human beings. Finance overwhelms the real economy. The lessons of the global financial crisis have not been assimilated, and we are learning all too slowly the lessons of environmental deterioration. Some circles maintain that current economics and technology will solve all environmental problems, and argue, in popular and non-technical terms, that the problems of global hunger and poverty will be resolved simply by market growth. They are less concerned with certain economic theories which today scarcely anybody dares defend, than with their actual operation in the functioning of the economy. They may not affirm such theories with words, but nonetheless support them with their deeds by showing no interest in more balanced levels of production, a better distribution of wealth, concern for the environment and the rights of future generations. Their behaviour shows that for them maximizing profits is enough.

Toward an ecological culture

Ecological culture cannot be reduced to a series of urgent and partial responses to the immediate problems of pollution, environmental decay and the depletion of natural resources. There needs to be a distinctive way of looking at things, a way of thinking, policies, an educational programme, a lifestyle and a spirituality which together generate resistance to the assault of the technocratic paradigm. Otherwise, even the best ecological initiatives can find themselves caught up in the same globalized logic. To seek only a technical remedy to each environmental problem which comes up is to separate what is in reality interconnected and to mask the true and deepest problems of the global system.

Yet we can once more broaden our vision. We have the freedom needed to limit and direct technology; we can put it at the service of another type of progress, one which is healthier, more human, more social, more integral. Liberation from the dominant technocratic paradigm does in fact happen sometimes, for example, when cooperatives of small producers adopt less polluting means of production, and opt for a non-consumerist model of life, recreation and community. Or when technology is directed primarily to resolving people’s concrete problems, truly helping them live with more dignity and less suffering. Or indeed when the desire to create and contemplate beauty manages to overcome reductionism through a kind of salvation which occurs in beauty and in those who behold it. An authentic humanity, calling for a new synthesis, seems to dwell in the midst of our technological culture, almost unnoticed, like a mist seeping gently beneath a closed door. Will the promise last, in spite of everything, with all that is authentic rising up in stubborn resistance?

Integral ecology

Near the end he calls for the development of an ‘integral ecology’. I find it fascinating that this has something in common with ‘network theory’:

Since everything is closely interrelated, and today’s problems call for a vision capable of taking into account every aspect of the global crisis, I suggest that we now consider some elements of an integral ecology, one which clearly respects its human and social dimensions.

Ecology studies the relationship between living organisms and the environment in which they develop. This necessarily entails reflection and debate about the conditions required for the life and survival of society, and the honesty needed to question certain models of development, production and consumption. It cannot be emphasized enough how everything is interconnected. Time and space are not independent of one another, and not even atoms or subatomic particles can be considered in isolation. Just as the different aspects of the planet—physical, chemical and biological—are interrelated, so too living species are part of a network which we will never fully explore and understand. A good part of our genetic code is shared by many living beings. It follows that the fragmentation of knowledge and the isolation of bits of information can actually become a form of ignorance, unless they are integrated into a broader vision of reality.

When we speak of the “environment”, what we really mean is a relationship existing between nature and the society which lives in it. Nature cannot be regarded as something separate from ourselves or as a mere setting in which we live. We are part of nature, included in it and thus in constant interaction with it. Recognizing the reasons why a given area is polluted requires a study of the workings of society, its economy, its behaviour patterns, and the ways it grasps reality. Given the scale of change, it is no longer possible to find a specific, discrete answer for each part of the problem. It is essential to seek comprehensive solutions which consider the interactions within natural systems themselves and with social systems. We are faced not with two separate crises, one environmental and the other social, but rather with one complex crisis which is both social and environmental. Strategies for a solution demand an integrated approach to combating poverty, restoring dignity to the excluded, and at the same time protecting nature.

Due to the number and variety of factors to be taken into account when determining the environmental impact of a concrete undertaking, it is essential to give researchers their due role, to facilitate their interaction, and to ensure broad academic freedom. Ongoing research should also give us a better understanding of how different creatures relate to one another in making up the larger units which today we term “ecosystems”. We take these systems into account not only to determine how best to use them, but also because they have an intrinsic value independent of their usefulness.

Ecological education

He concludes by discussing the need for ‘ecological education’.

Environmental education has broadened its goals. Whereas in the beginning it was mainly centred on scientific information, consciousness-raising and the prevention of environmental risks, it tends now to include a critique of the “myths” of a modernity grounded in a utilitarian mindset (individualism, unlimited progress, competition, consumerism, the unregulated market). It seeks also to restore the various levels of ecological equilibrium, establishing harmony within ourselves, with others, with nature and other living creatures, and with God. Environmental education should facilitate making the leap towards the transcendent which gives ecological ethics its deepest meaning. It needs educators capable of developing an ethics of ecology, and helping people, through effective pedagogy, to grow in solidarity, responsibility and compassionate care.

Even small good practices can encourage new attitudes:

Education in environmental responsibility can encourage ways of acting which directly and significantly affect the world around us, such as avoiding the use of plastic and paper, reducing water consumption, separating refuse, cooking only what can reasonably be consumed, showing care for other living beings, using public transport or car-pooling, planting trees, turning off unnecessary lights, or any number of other practices. All of these reflect a generous and worthy creativity which brings out the best in human beings. Reusing something instead of immediately discarding it, when done for the right reasons, can be an act of love which expresses our own dignity.

We must not think that these efforts are not going to change the world. They benefit society, often unbeknown to us, for they call forth a goodness which, albeit unseen, inevitably tends to spread. Furthermore, such actions can restore our sense of self-esteem; they can enable us to live more fully and to feel that life on earth is worthwhile.

Part of the goal is to be more closely attentive to what we have, not fooled into thinking we’d always be happier with more:

It is a return to that simplicity which allows us to stop and appreciate the small things, to be grateful for the opportunities which life affords us, to be spiritually detached from what we possess, and not to succumb to sadness for what we lack. This implies avoiding the dynamic of dominion and the mere accumulation of pleasures.

Such sobriety, when lived freely and consciously, is liberating. It is not a lesser life or one lived with less intensity. On the contrary, it is a way of living life to the full. In reality, those who enjoy more and live better each moment are those who have given up dipping here and there, always on the look-out for what they do not have. They experience what it means to appreciate each person and each thing, learning familiarity with the simplest things and how to enjoy them. So they are able to shed unsatisfied needs, reducing their obsessiveness and weariness. Even living on little, they can live a lot, above all when they cultivate other pleasures and find satisfaction in fraternal encounters, in service, in developing their gifts, in music and art, in contact with nature, in prayer. Happiness means knowing how to limit some needs which only diminish us, and being open to the many different possibilities which life can offer.

Clifford JohnsonScreen Junkies: Science and Jurassic World

So the episode I mentioned is out! It's a lot of fun, and there's so very much that we talked about that they could not fit into the episode. See below. It is all about Jurassic World - a huge box-office hit. If you have not seen it yet, and don't want specific spoilers, watch out for where I write the word spoilers in capitals, and read no further. If you don't even want my overall take on things without specifics, read only up to where I link to the video. Also, the video has spoilers. I'll embed the video here, and I have some more thoughts that I'll put below. One point I brought up a bit (you can see the beginning of it in my early remarks) is the whole business of the poor portrayal of science and scientists overall in the film, as opposed to in the original Jurassic Park movie. In the original, putting quibbles over scientific feasibility aside (it's not a documentary, remember!), you have the "dangers of science" on one side, but you also have the "wonders of science" on the other. This includes that early scene or two that still delight me (and many scientists I know - and a whole bunch who were partly inspired by the movie to go into science!) of how genuinely moved the two scientist characters (played by Laura Dern and Sam Neill) are to see walking living dinosaurs, the subject of their life's work. Right in front of them. Even if you're not a scientist, you immediately relate to that feeling. It helps root the movie, as does the fact that pretty much all the characters are fleshed [...] Click to continue reading this post

June 18, 2015

Clifford JohnsonCalling Shenanigans

I hadn't realized that I knew some of the journalists who were at the event at which Tim Hunt made his negative-stereotype-strengthening remarks. I trust their opinion and integrity quite a bit, and so I'm glad to hear reports from them about what they witnessed. This includes Deborah Blum, who was speaking in the same session as Hunt that day, and who was at the luncheon. She spoke with Hunt about his views and intentions. Thanks, Deborah, for calling shenanigans on the "I was only joking" defense so often used to hide behind the old "political correctness gone mad" trope. Read her article here, and further here. -cvj (Spoof poster image is by Jen Golbeck) Click to continue reading this post

BackreactionNo, Gravity hasn’t killed Schrödinger’s cat

There is a paper making the rounds which was just published in Nature Physics, but has been on the arXiv for two years:
    Universal decoherence due to gravitational time dilation
    Igor Pikovski, Magdalena Zych, Fabio Costa, Caslav Brukner
    arXiv:1311.1095 [quant-ph]
According to an article in New Scientist, the authors have shown that gravitationally induced decoherence solves the Schrödinger’s cat problem, i.e., explains why we never observe cats that are both dead and alive. Had they achieved this, that would be remarkable indeed, because the problem was solved half a century ago. New Scientist also quotes the first author as saying that the effect discussed in the paper induces a “kind of observer.”

New Scientist further tries to make a connection to quantum gravity, even though everyone involved told the journalist it’s got nothing to do with quantum gravity whatsoever. There is also a Nature News article, which is more careful as far as the connection to quantum gravity, or the absence thereof, is concerned, but still wants you to believe the authors have shown that “completely isolated objects” can “collapse into one state,” which would contradict quantum mechanics. If that could happen, it would be essentially the same as the information loss problem in black hole evaporation.

So what did they actually do in the paper?

It’s a straightforward calculation which shows that if you have a composite system in thermal equilibrium and you push it into a gravitational field, then the degrees of freedom of the center of mass (com) get entangled with the remaining degrees of freedom (those of the system’s particles relative to the center of mass). The reason for this is that the energies of the particles become dependent on their position in the gravitational field by the standard redshift effect. This means that if the system’s particles had quantum properties, then these quantum properties mix together with the com position, basically.
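A schematic way to see where this coupling comes from (my shorthand, not the paper’s exact notation): for a system of mass m with internal energy H_int held at height x in a uniform field g, the internal dynamics is redshifted, so the Hamiltonian picks up a cross term,

```latex
H \;\approx\; mc^{2} \;+\; \frac{p^{2}}{2m} \;+\; mgx \;+\; H_{\mathrm{int}}\left(1 + \frac{gx}{c^{2}}\right)
```

The term $(gx/c^{2})\,H_{\mathrm{int}}$ couples the com position $x$ to the internal degrees of freedom, which is exactly the entanglement the calculation relies on.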

Now, decoherence normally works as follows. If you have a system (the cat) that is in a quantum state, and you bring it into contact with some environment (a heat bath, the cosmic microwave background, any type of measurement apparatus, etc.), then the cat becomes entangled with the environment. Since you don’t know the details of the environment, however, you have to remove (“trace out”) its information to see what the cat is doing, which leaves you with a system that now has a classical probability distribution. One says the system has “decohered” because it has lost its quantum properties (or at least some of them, those that are affected by the interaction with the environment).
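This tracing-out step can be made concrete in a two-qubit toy model (the model and variable names are my own illustration, not anything from the paper): the cat qubit’s off-diagonal density-matrix entries, its coherences, are multiplied by the overlap of the two environment states it got entangled with.

```python
import numpy as np

# Toy decoherence model: a "cat" qubit entangled with a two-state environment,
# |psi> = (|0>|e0> + |1>|e1>)/sqrt(2), where <e0|e1> = overlap.
def reduced_cat_state(overlap):
    e0 = np.array([1.0, 0.0])
    # |e1> is chosen to have the given overlap with |e0>
    e1 = np.array([overlap, np.sqrt(1.0 - overlap**2)])
    psi = (np.kron([1.0, 0.0], e0) + np.kron([0.0, 1.0], e1)) / np.sqrt(2)
    rho = np.outer(psi, psi.conj())       # full 4x4 density matrix
    rho = rho.reshape(2, 2, 2, 2)         # indices: (cat, env, cat', env')
    # partial trace over the environment -> 2x2 reduced matrix for the cat
    return np.einsum('ikjk->ij', rho)

# Orthogonal environment states: off-diagonals vanish (full decoherence)
print(reduced_cat_state(0.0))   # diag(1/2, 1/2), a classical mixture
# Identical environment states: the coherence survives intact
print(reduced_cat_state(1.0))   # all four entries equal 1/2
```

Note that the reduced state for orthogonal environment states is a probability distribution over outcomes, not any single outcome — which is the second point below about the measurement problem.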

Three things are important to notice about this environmentally induced decoherence. First, the effect happens extremely quickly for macroscopic objects, even for the most feeble of interactions with the environment. This is why we never see cats that are both dead and alive, and also why building a functioning quantum computer is so damned hard. Second, while decoherence provides a reason we don’t see quantum superpositions, it doesn’t solve the measurement problem, in the sense that it just results in a probability distribution of possible outcomes. It does not result in any one particular outcome. Third, nothing of that requires an actually conscious observer; that’s an entirely superfluous complication of a quite well understood process.

Back to the new paper then. The authors do not deal with environmentally induced decoherence but with an internal decoherence. There is no environment, there is only a linear gravitational potential; it’s a static external field that doesn’t carry any degrees of freedom. What they show is that if you trace out the particle’s degrees of freedom relative to the com, then the com decoheres. The com motion, essentially, becomes classical. It can no longer be in a superposition once decohered. They calculate the time it takes for this to happen, which depends on the number of particles of the system and its extension.

Why is this effect relevant? Well, if you are trying to measure interference it is relevant, because interference relies on the center of mass moving on two different paths – one going through the left slit, the other through the right one. So the decoherence of the center of mass puts a limit on what you can measure in such interference experiments. Alas, the effect is exceedingly tiny, smaller even than the decoherence induced by the cosmic microwave background. In the paper they estimate the time it takes for 10^23 particles to decohere is about 10^-3 seconds. But the number of particles in composite systems that can presently be made to interfere is more like 10^2 or maybe 10^3. For these systems, the decoherence time is roughly 10^7 seconds – that’s several months. If that was the only decoherence effect for quantum systems, experimentalists would be happy!

Besides this, the center of mass isn’t the only quantum property of a system, because there are many ways you can bring a system into superpositions that don’t affect the com at all. Any rotation around the com, for example, would do. In fact, there are many more degrees of freedom in the system that remain quantum than decohere by the effect discussed in the paper. The system itself doesn’t decohere at all; it’s really just this particular degree of freedom that does. The Nature News feature states that
“But even if physicists could completely isolate a large object in a quantum superposition, according to researchers at the University of Vienna, it would still collapse into one state — on Earth's surface, at least.”
This is just wrong. The object could still have many different states, as long as they share the same center of mass variable. A pure state left in isolation will remain in a pure state.

I think the argument in the paper is basically correct, though I am somewhat confused about the assumption that the thermal distribution doesn’t change if the system is pushed into a gravitational field. One would expect that in this case the temperature also depends on the gradient.

So in summary, it is a nice paper that points out an effect of macroscopic quantum systems in gravitational fields that had not previously been studied. This may become relevant for interferometry of large composite objects at some point. But it is an exceedingly weak effect, and I for sure am very skeptical that it can be measured any time soon. This effect doesn’t teach us anything about Schrödinger’s cat or the measurement problem that we didn’t know already, and it for sure has nothing to do with quantum gravity.

Science journalists work in funny ways. Even though I am quoted in the New Scientist article, the journalist didn’t bother sending me a link. Instead I got the link from Igor Pikovski, one of the authors of the paper, who wrote to me to apologize for the garble that he was quoted with. He would like to pass on the following clarification:
“To clarify a few quotes used in the article: The effect we describe is not related to quantum gravity in any way, but it is an effect where both, quantum theory and gravitational time dilation, are relevant. It is thus an effect based on the interplay between the two. But it follows from physics as we know it.

In the context of decoherence, the 'observer' are just other degrees of freedom to which the system becomes correlated, but has of course nothing to do with any conscious being. In the scenario that we consider, the center of mass becomes correlated with all the internal constituents. This takes place due to time dilation, which correlates any dynamics to the position in the gravitational field and results in decoherence of the center of mass of the composite system.

For current experiments this effect is very weak. Once superposition experiments can be done with very large and complex systems, this effect may become more relevant. In the end, the simple prediction is that it only depends on how much proper time difference is acquired by the interfering amplitudes of the system. If it's exactly zero, no decoherence takes place, as for example in a perfectly horizontal setup or in space (neglecting special relativistic time dilation). The latter was used as an example in the article. But of course there are other means to make sure the proper time difference is minimized. How hard or easy that will be depends on the experimental techniques. Maybe an easier route to experimentally probe this effect is to probe the underlying Hamiltonian. This could be done by placing clocks in superposition, which we discussed in a paper in 2011. The important point is that these predictions follow from physics as we know, without any modification to quantum theory or relativity. It is thus 'regular' decoherence that follows from gravitational time dilation.”

June 17, 2015

Jordan EllenbergSure as roses

I learned when I was writing this piece a few months ago that the New York Times style guide doesn’t permit “fun as hell.”  So I had a problem while writing yesterday’s article about Common Core, and its ongoing replacement by an identical set of standards with a different name.  I wanted to say I was “sure as hell” not going to use the traditional addition algorithm for a problem better served by another method.  So instead I wrote “sure as roses.”  Doesn’t that sound like an actual folksy “sure as hell” substitute?  But actually I made it up.  I think it works, though.  Maybe it’ll catch on.

June 16, 2015

BackreactionThe plight of the postdocs: Academia and mental health

This is the story of a friend of a friend, a man named Francis who took his life at age 34. Francis had been struggling with manic depression through most of his years as a postdoc in theoretical physics.

It is not a secret that short-term contracts and frequent moves are the norm in this area of research, but rarely do we spell out the toll it takes on our mental health. In fact, most of my tenured colleagues who profit from cheap and replaceable postdocs praise the virtue of the nomadic lifestyle which, so we are told, is supposed to broaden our horizon. But the truth is that moving is a necessary, though not sufficient, condition to build your network. It isn’t about broadening your horizon; it’s about making the contacts for which you are later bought in. It’s not optional, it’s a misery you are expected to pretend to enjoy.

I didn’t know Francis personally, and I would never have heard of him if it wasn’t for the acknowledgements in Oliver Roston’s recent paper:

“This paper is dedicated to the memory of my friend, Francis Dolan, who died, tragically, in 2011. It is gratifying that I have been able to honour him with work which substantially overlaps with his research interests and also that some of the inspiration came from a long dialogue with his mentor and collaborator, Hugh Osborn. In addition, I am indebted to Hugh for numerous perceptive comments on various drafts of the manuscript and for bringing to my attention gaps in my knowledge and holes in my logic. Following the appearance of the first version on the arXiv, I would like to thank Yu Nakayama for insightful correspondence.

I am firmly of the conviction that the psychological brutality of the post-doctoral system played a strong underlying role in Francis’ death. I would like to take this opportunity, should anyone be listening, to urge those within academia in roles of leadership to do far more to protect members of the community suffering from mental health problems, particularly during the most vulnerable stages of their careers.”
As a postdoc, Francis lived separated from his partner, and had trouble integrating in a new group. Due to difficulties with the health insurance after an international move, he couldn’t continue his therapy. And even though highly gifted, he must have known that no matter how hard he worked, a secure position in the area of research he loved was a matter of luck.

I found myself in a very similar situation after I moved to the US for my first postdoc. I didn’t fully realize just how good the German health insurance system is until I suddenly was on a scholarship without any insurance at all. When I read the fine print, it became pretty clear that I wouldn’t be able to afford insurance that covered psychotherapy or medical treatment for mental disorders, certainly not when I disclosed a history of chronic depression and various cycles of previous therapy.

With my move, I had left behind literally everybody I knew, including my boyfriend who I had intended to marry. For several months, the only piece of furniture in my apartment was a mattress because thinking any further was too much. I lost 30 pounds in six months, and sometimes went weeks without talking to a human being, other than myself.

The main reason I’m still here is that I’m by nature a loner. When I wasn’t working, I was hiking in the canyons, and that was pretty much all I did for the better part of the first year. Then, when I had just found some sort of equilibrium, I had to move again to take on another position. And then another. And another. It still seems a miracle that somewhere along the line I managed to not only marry the boyfriend I had left behind, but to also produce two wonderful children.

Yes, I was lucky. But Francis wasn’t. And just statistically some of you are in that dark place right now. If so, then you, as I, have heard them talk about people who “managed to get diagnosed” as if depression was a theater performance in which successful actors win a certificate to henceforth stay in bed. You, as I, know damned well that the last thing you want is that anybody who you may have to ask for a letter sees anything but the “hard working” and “very promising” researcher who is “recommended without hesitation.” There isn’t much advice I can give, except that you don’t forget it’s in the nature of the disease to underestimate one’s chances of recovery, and that mental health is worth more than the next paper. Please ask for help if you need it.

Like Oliver, I believe that the conditions under which postdoctoral researchers must presently sell their skills are not conducive to mental health. Postdocs see friends the same age in other professions having families, working independently, getting permanent contracts, pension plans, and houses with tricycles in the yard. Postdoctoral research collects some of the most intelligent and creative people on the planet, but in the present circumstances many are unable to follow their own interests, and get little appreciation for their work, if they get feedback at all. There are lots of reasons why being a postdoc sucks, and most of them we can do little about, like those supervisors who’d rather die than say you did a good job, even once. But what we can do is improve employment conditions and lower the pressure to constantly move.

Even in the richest countries on the planet, like Germany and Sweden, it is very common to park postdocs on scholarships without benefits. These scholarships are tax-free and come, for the employer, at low cost. Since the tax exemption is regulated by law, the scholarships can typically last only one or two years. It’s not that one couldn’t hire postdocs on longer, regular contracts with social and health benefits; it’s just that in current thinking quantity counts more than quality: more postdocs produce more papers, which looks better in the statistics. That’s practiced, among many others, at my own workplace.

There are some fields of research which lend themselves to short projects and in these fields one or two year gigs work just fine. In other fields that isn’t so. What you get from people on short-term contracts is short-term thinking. It isn’t only that this situation is stressful for postdocs, it isn’t good for science either. You might be saving money with these scholarships, but there is always a price to pay.

We will probably never know exactly what Francis went through. But for me just the possibility that the isolation and financial insecurity, which are all too often part of postdoc life, may have contributed to his suffering is sufficient reason to draw attention to this.

The last time I met Francis’ friend Oliver, he was a postdoc too. He now has two children, a beautiful garden, and has left academia for a saner profession. Oliver sends the following message to our readers:
“I think maybe the best thing I can think of is advising never to be ashamed of depression and to make sure you keep talking to your friends and that you get medical help. As for academia, one thing I have discovered is that it is possible to do research as a hobby. It isn't always easy to find the time (and motivation!) but leaving academia needn't be the end of one's research career. So for people wondering whether academia will ultimately take too high a toll on their (mental) health, the decision to leave academia needn't necessarily equate with the decision to stop doing research; it's just that a different balance in one's life has to be found!”

[If you speak German or trust Google translate, the FAZ blogs also wrote about this.]

John BaezWorld Energy Outlook 2015

It’s an exciting and nerve-racking time as global carbon emissions from energy production have begun to drop, at least for a little while:

yet keeping warming below 2°C seems ever more difficult:

The big international climate negotiations to be concluded in Paris in December 2015 bring these issues to the forefront in a dramatic way. Countries are already saying what they plan to do: you can read their Intended Nationally Determined Contributions online!

But it’s hard to get an overall picture of the situation. Here’s a new report that helps:

• International Energy Agency, World Energy Outlook Special Report 2015: Energy and Climate Change.

Since the International Energy Agency seems intelligent to me, I’ll just quote their executive summary. If you’re too busy for even the executive summary, let me summarize the summary:

Given the actions that countries are now planning, we could have an increase of around 2.6 °C over preindustrial temperature by 2100, and more after that.

Executive summary

A major milestone in efforts to combat climate change is fast approaching. The importance of the 21st Conference of the Parties (COP21) – to be held in Paris in December 2015 – rests not only in its specific achievements by way of new contributions, but also in the direction it sets. There are already some encouraging signs with a historic joint announcement by the United States and China on climate change, and climate pledges for COP21 being submitted by a diverse range of countries and in development in many others. The overall test of success for COP21 will be the conviction it conveys that governments are determined to act to the full extent necessary to achieve the goal they have already set to keep the rise in global average temperatures below 2 degrees Celsius (°C), relative to pre-industrial levels.

Energy will be at the core of the discussion. Energy production and use account for two-thirds of the world’s greenhouse-gas (GHG) emissions, meaning that the pledges made at COP21 must bring deep cuts in these emissions, while yet sustaining the growth of the world economy, boosting energy security around the world and bringing modern energy to the billions who lack it today. The agreement reached at COP21 must be comprehensive geographically, which means it must be equitable, reflecting both national responsibilities and prevailing circumstances. The importance of the energy component is why this World Energy Outlook Special Report presents detailed energy and climate analysis for the sector and recommends four key pillars on which COP21 can build success.

Energy and emissions: moving apart?

The use of low-carbon energy sources is expanding rapidly, and there are signs that growth in the global economy and energy-related emissions may be starting to decouple. The global economy grew by around 3% in 2014 but energy-related carbon dioxide (CO2) emissions stayed flat, the first time in at least 40 years that such an outcome has occurred outside economic crisis.

Renewables accounted for nearly half of all new power generation capacity in 2014, led by growth in China, the United States, Japan and Germany, with investment remaining strong (at $270 billion) and costs continuing to fall. The energy intensity of the global economy dropped by 2.3% in 2014, more than double the average rate of fall over the last decade, a result stemming from improved energy efficiency and structural changes in some economies, such as China.

Around 11% of global energy-related CO2 emissions arise in areas that operate a carbon market (where the average price is $7 per tonne of CO2), while 13% of energy-related CO2 emissions arise in markets with fossil-fuel consumption subsidies (an incentive equivalent to $115 per tonne of CO2, on average). There are some encouraging signs on both fronts, with reform in sight for the European Union’s Emissions Trading Scheme and countries including India, Indonesia, Malaysia and Thailand taking the opportunity of lower oil prices to diminish fossil-fuel subsidies, cutting the incentive for wasteful consumption.

The energy contribution to COP21

Nationally determined pledges are the foundation of COP21. Intended Nationally Determined Contributions (INDCs) submitted by countries in advance of COP21 may vary in scope but will contain, implicitly or explicitly, commitments relating to the energy sector. As of 14 May 2015, countries accounting for 34% of energy-related emissions had submitted their new pledges.

A first assessment of the impact of these INDCs and related policy statements (such as by China) on future energy trends is presented in this report in an “INDC Scenario”. This shows, for example, that the United States’ pledge to cut net greenhouse-gas emissions by 26% to 28% by 2025 (relative to 2005 levels) would deliver a major reduction in emissions while the economy grows by more than one-third over current levels. The European Union’s pledge to cut GHG emissions by at least 40% by 2030 (relative to 1990 levels) would see energy-related CO2 emissions decline at nearly twice the rate achieved since 2000, making it one of the world’s least carbon-intensive energy economies. Russia’s energy-related emissions decline slightly from 2013 to 2030 and it meets its 2030 target comfortably, while implementation of Mexico’s pledge would see its energy-related emissions increase slightly while its economy grows much more rapidly. China has yet to submit its INDC, but has stated an intention to achieve a peak in its CO2 emissions around 2030 (if not earlier), an important change in direction, given the pace at which they have grown on average since 2000.

Growth in global energy-related GHG emissions slows but there is no peak by 2030 in the INDC Scenario. The link between global economic output and energy-related GHG emissions weakens significantly, but is not broken: the economy grows by 88% from 2013 to 2030 and energy-related CO2 emissions by 8% (reaching 34.8 gigatonnes). Renewables become the leading source of electricity by 2030, as average annual investment in nonhydro renewables is 80% higher than levels seen since 2000, but inefficient coal-fired power generation capacity declines only slightly.
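As a back-of-envelope check (my own arithmetic, using only the figures quoted above, not part of the report), the implied decline in the CO2 intensity of the global economy over 2013-2030 can be worked out:

```python
# Back-of-envelope check of the INDC Scenario figures quoted above:
# the economy grows by 88% and energy-related CO2 emissions by 8%
# between 2013 and 2030.
gdp_growth = 0.88
co2_growth = 0.08
years = 2030 - 2013  # 17 years

# CO2 intensity of GDP in 2030, relative to 2013
intensity_ratio = (1 + co2_growth) / (1 + gdp_growth)

# Implied average annual rate of intensity decline
annual_decline = 1 - intensity_ratio ** (1 / years)

print(f"2030 CO2 intensity relative to 2013: {intensity_ratio:.2f}")
print(f"Implied average annual intensity decline: {annual_decline:.1%}")
```

So the weakened but unbroken link between output and emissions corresponds to the CO2 intensity of GDP falling by roughly 40% over the period, or a bit over 3% per year on average.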

With INDCs submitted so far, and the planned energy policies in countries that have yet to submit, the world’s estimated remaining carbon budget consistent with a 50% chance of keeping the rise in temperature below 2 °C is consumed by around 2040 – eight months later than is projected in the absence of INDCs. This underlines the need for all countries to submit ambitious INDCs for COP21 and for these INDCs to be recognised as a basis upon which to build stronger future action, including from opportunities for collaborative/co-ordinated action or those enabled by a transfer of resources (such as technology and finance). If stronger action is not forthcoming after 2030, the path in the INDC Scenario would be consistent with an average temperature increase of around 2.6 °C by 2100 and 3.5 °C after 2200.

What does the energy sector need from COP21?

National pledges submitted for COP21 need to form the basis for a “virtuous circle” of rising ambition. From COP21, the energy sector needs political leaders at the highest level to project clarity of purpose and certainty of action, creating a clear expectation of global and national low-carbon development. Four pillars can support that achievement:

1. Peak in emissions – set the conditions which will achieve an early peak in global energy-related emissions.

2. Five-year revision – review contributions regularly, to test the scope to lift the level of ambition.

3. Lock in the vision – translate the established climate goal into a collective long-term emissions goal, with shorter-term commitments that are consistent with the long-term vision.

4. Track the transition – establish an effective process for tracking achievements in the energy sector.

Peak in emissions

The IEA proposes a bridging strategy that could deliver a peak in global energy-related emissions by 2020. A commitment to target such a near-term peak would send a clear message of political determination to stay below the 2 °C climate limit. The peak can be achieved relying solely on proven technologies and policies, without changing the economic and development prospects of any region, and is presented in a “Bridge Scenario”. The technologies and policies reflected in the Bridge Scenario are essential to secure the long-term decarbonisation of the energy sector and their near-term adoption can help keep the door to the 2 °C goal open. For countries that have submitted their INDCs, the proposed strategy identifies possible areas for over-achievement. For those that have yet to make a submission, it sets out a pragmatic baseline for ambition.

The Bridge Scenario depends upon five measures:

• Increasing energy efficiency in the industry, buildings and transport sectors.

• Progressively reducing the use of the least-efficient coal-fired power plants and banning their construction.

• Increasing investment in renewable energy technologies in the power sector from $270 billion in 2014 to $400 billion in 2030.

• Gradual phasing out of fossil-fuel subsidies to end-users by 2030.

• Reducing methane emissions in oil and gas production.

These measures have profound implications for the global energy mix, putting a brake on growth in oil and coal use within the next five years and further boosting renewables. In the Bridge Scenario, coal use peaks before 2020 and then declines while oil demand rises to 2020 and then plateaus. Total energy-related GHG emissions peak around 2020. Both the energy intensity of the global economy and the carbon intensity of power generation improve by 40% by 2030. China decouples its economic expansion from emissions growth by around 2020, much earlier than otherwise expected, mainly through improving the energy efficiency of industrial motors and the buildings sector, including through standards for appliances and lighting. In countries where emissions are already in decline today, the decoupling of economic growth and emissions is significantly accelerated; compared with recent years, the pace of this decoupling is almost 30% faster in the European Union (due to improved energy efficiency) and in the United States (where renewables contribute one-third of the achieved emissions savings in 2030). In other regions, the link between economic growth and emissions growth is weakened significantly, but the relative importance of different measures varies. India utilises energy more efficiently, helping it to reach its energy sector targets and moderate emissions growth, while the reduction of methane releases from oil and gas production and reforming fossil-fuel subsidies (while providing targeted support for the poorest) are key measures in the Middle East and Africa, and a portfolio of options helps reduce emissions in Southeast Asia. While universal access to modern energy is not achieved in the Bridge Scenario, the efforts to reduce energy-related emissions do go hand-in-hand with delivering access to electricity to 1.7 billion people and access to clean cookstoves to 1.6 billion people by 2030.

June 15, 2015

Doug NatelsonBrief news items

In the wake of travel, I wanted to point readers to a few things that might have been missed:
  • Physics Today asks "Has science 'taken a turn towards darkness'?"  I tend to think that the physical sciences and engineering are inherently less problematic (because of the ability of others to try to reproduce results in a controlled environment) than biology/medicine (incredibly complex and therefore difficult or impractical to do controlled experimentation) or the social sciences.  
  • Likewise, Physics Today's Steven Corneliussen also asks, "Could the evolution of theoretical physics harm public trust in science?"  This gets at the extremely worrying (to me) tendency of some high energy/cosmology theorists these days to insist that the inability to test their ideas is really not a big deal, and that we shouldn't be so hung up on the idea of falsifiability.
  • Ice spikes are cool.
  • Anshul Kogar and Ethan Brown have started a new condensed matter blog!  The more the merrier, definitely.
  • My book is available for download right now in kindle form, with hard copies available in the UK in a few days and in the US next month.

June 14, 2015

Jordan EllenbergTranslator’s notes

The Brazilian edition of How Not To Be Wrong, with its beautiful cover, just showed up at my house.  One of the interesting things about leafing through it is reading the translator’s notes, which provide explanations for words and phrases that will be mysterious to Brazilian readers.  E.G.:

  • yeshiva
  • Purim
  • NCAA
  • Affordable Care Act
  • Rube Goldberg
  • home run
  • The Tea Party (identified by the translator as “radical wing of the Republican party”)
  • “likely voters” — translator notes that “in the United States, voting is not obligatory”
  • home run (again!)
  • RBI (charmingly explained as “run battled in”)

I am also proud to have produced, on two separate occasions, a “trocadilho intraduzivel do ingles” (untranslatable English pun).


ResonaancesWeekend plot: minimum BS conjecture

This weekend plot completes my last week's post:

It shows the phase diagram for models of natural electroweak symmetry breaking. These models can be characterized by 2 quantum numbers:

  • B [Baroqueness], describing how complicated the model is relative to the standard model;
  • S [Strangeness], describing the fine-tuning needed to achieve electroweak symmetry breaking with the observed Higgs boson mass. 

To allow for a fair comparison, in all models the cut-off scale is fixed to Λ=10 TeV. The standard model (SM) has, by definition,  B=1, while S≈(Λ/mZ)^2≈10^4.  The principle of naturalness postulates that S should be much smaller, S ≲ 10.  This requires introducing new hypothetical particles and interactions, therefore inevitably increasing B.
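The quoted value of S for the standard model follows directly from the definition above; as a quick sanity check (my own arithmetic, not part of the original post):

```python
# Fine-tuning quantum number S for the standard model, using the
# definition quoted above: S ≈ (Λ / mZ)^2 with cut-off Λ = 10 TeV.
Lambda_TeV = 10.0   # cut-off scale, in TeV
mZ_TeV = 0.0912     # Z boson mass, ~91.2 GeV, in TeV

S_SM = (Lambda_TeV / mZ_TeV) ** 2
print(f"S_SM ≈ {S_SM:.0f}")  # of order 10^4, as stated
```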

The most popular approach to reducing S is by introducing supersymmetry. The minimal supersymmetric standard model (MSSM) does not make fine-tuning better than 10^3 in the bulk of its parameter space. To improve on that, one needs to introduce large A-terms (aMSSM), or R-parity breaking interactions (RPV), or an additional scalar (NMSSM). Another way to decrease S is achieved in models where the Higgs arises as a composite Goldstone boson of new strong interactions. Unfortunately, in all of those models, S cannot be smaller than 10^2 due to phenomenological constraints from colliders. The twin Higgs – the simplest model of neutral naturalness, where new particles beyond the standard model are not charged under the SU(3) color group – can achieve S≈10, at the cost of introducing a whole parallel mirror world.

The parametrization proposed here leads to a striking observation. While one can increase B indefinitely (many examples have been proposed in the literature), for a given S there seems to be a minimum value of B below which no models exist. In fact, the conjecture is that the product B*S is bounded from below:
BS ≳ 10^4. 
One robust prediction of the minimum BS conjecture is the existence of a very complicated (B=10^4) yet to be discovered model with no fine-tuning at all.  The take-home message is that one should always try to minimize BS, even if for fundamental reasons it cannot be avoided completely ;)

ResonaancesNaturalness' last bunker

Last week Symmetry Breaking ran an article entitled "Natural SUSY's last stand". That title is a bit misleading, as it makes you think of General Custer on the eve of the Battle of the Little Bighorn, whereas natural supersymmetry has long been dead, its body torn by vultures. Nevertheless, it is interesting to ask a more general question: are there any natural theories that survived? And if yes, what can we learn about them from the LHC run-2?

For over 30 years naturalness has been the guiding principle in theoretical particle physics.  The standard model by itself has no naturalness problem: it contains 19 free parameters  that are simply not calculable and have to be taken from experiment. The problem arises because we believe the standard model is eventually embedded in a more fundamental  theory where all these parameters, including the Higgs boson mass, are calculable. Once that is done, the calculated Higgs mass will typically be proportional to the heaviest state in that theory as a result of quantum corrections. The exception to this rule is when the fundamental theory possesses a symmetry forbidding the Higgs mass, in which case the mass will be proportional to the scale where the symmetry becomes manifest. Given the Higgs mass is 125  GeV, the concept of naturalness leads to the following prediction: 1) new particles beyond the standard model should appear around the mass scale of 100-300 GeV, and  2) the new theory with the new particles should have a  protection mechanism for the Higgs mass built in.  

There are two main realizations of this idea. In supersymmetry, the protection is provided by opposite-spin partners of the known particles. In particular, the top quark is accompanied by stop quarks, which are spin-0 scalars but otherwise have the same color and electric charge as the top quark. Another protection mechanism can be provided by a spontaneously broken global symmetry, usually realized in the context of new strong interactions from which the Higgs arises as a composite particle. In that case, the protection is provided by same-spin partners: for example, the top quark has a fermionic partner with the same quantum numbers but a different mass.

Both of these ideas are theoretically very attractive but are difficult to realize in practice. First of all, it is hard to understand how these new partner particles could be hiding around the corner without leaving any trace in numerous precision experiments. But even if we were willing to believe in a universal conspiracy, the LHC run-1 was the final nail in the coffin. The point is that both of these scenarios make a very specific prediction: the existence of new particles with color charges around the weak scale. As the LHC is basically a quark and gluon collider, it can produce colored particles in large quantities. For example, for a 1 TeV gluino (the supersymmetric partner of the gluon) some 1000 pairs would already have been produced at the LHC. Thanks to the large production rate, the limits on colored partners are already quite stringent. For example, the LHC limits on the masses of gluinos and massive spin-1 gluon resonances extend well above 1 TeV, while for scalar and fermionic top partners the limits are not far below 1 TeV. This means that a conspiracy theory is not enough: in supersymmetry and composite Higgs one also has to accept a certain degree of fine-tuning, which means we don't even solve the problem that is the very motivation for these theories.
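The quoted pair count is a simple rate-times-luminosity estimate. The cross section and luminosity below are my own assumed ballpark numbers (not figures from the post), chosen to illustrate the arithmetic:

```python
# Back-of-envelope event count for gluino pair production in LHC run-1.
# Assumed ballpark inputs (my own, not from the post): a ~50 fb
# pair-production cross section for a 1 TeV gluino, and ~20 fb^-1
# of integrated luminosity collected per experiment in run-1.
sigma_fb = 50.0            # assumed production cross section [fb]
luminosity_fb_inv = 20.0   # assumed integrated luminosity [fb^-1]

n_pairs = sigma_fb * luminosity_fb_inv
print(f"Expected gluino pairs produced: ~{n_pairs:.0f}")
```

With these inputs the estimate lands at the "some 1000 pairs" scale mentioned above, which is why the resulting mass limits are already so stringent.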

The reasoning above suggests a possible way out.  What if naturalness could be realized without colored partners: without gluinos, stops, or heavy tops. The conspiracy problem would not go away, but at least we could avoid stringent limits from the LHC. It turns out that theories with such a property do exist. They linger away from the mainstream,  but recently they have been gaining popularity under the name of the neutral naturalness.  The reason for that is obvious: such theories may offer a nuclear bunker that will allow naturalness to survive beyond the LHC run-2.

The best known realization of neutral naturalness is the twin Higgs model. It assumes the existence of a mirror world, with mirror gluons, mirror top quarks, a mirror Higgs boson, etc., which is related to the standard model by an approximate parity symmetry. The parity gives rise to an accidental global symmetry that can protect the Higgs boson mass. At the technical level, the protection mechanism is similar to that in composite Higgs models, where standard model particles have partners with the same spins. The crucial difference, however, is that the mirror top quarks and mirror gluons are charged under the mirror color group, not the standard model color. As we don't have a mirror proton collider yet, the mirror partners are not produced in large quantities at the LHC. Therefore, they could well be as light as our top quark without violating any experimental bounds, and in agreement with the requirements of naturalness.

A robust prediction of twin-Higgs-like models is that the Higgs boson couplings to matter deviate from the standard model predictions, as a consequence of mixing with the mirror Higgs. The size of this deviation is of the same order as the fine-tuning in the theory; for example, order 10% deviations are expected when the fine-tuning is 1 in 10. This is perhaps the best motivation for precision Higgs studies: measuring the Higgs couplings with an accuracy better than 10% may invalidate or boost the idea. However, neutral naturalness points us to experimental signals that are often very different from those in the popular models. For example, the mirror color interactions are expected to behave at low energies similarly to our QCD: there should be mirror mesons, baryons, and glueballs. By construction, the Higgs boson must couple to the mirror world, and therefore it offers a portal via which the mirror hadronic junk can be produced and decay, which may lead to truly exotic signatures such as displaced jets. This underlines the importance of searching for exotic Higgs boson decays – very few such studies have been carried out by the LHC experiments so far. Finally, as has long been speculated, dark matter may have something to do with the mirror world. Neutral naturalness provides a reason for the existence of the mirror world and an approximate parity symmetry relating it to the real world. It may be our best shot at understanding why the amounts of ordinary and dark matter in the Universe are equal up to a factor of 5 – something that arises as a complete accident in the usual WIMP dark matter scenario.

There's no doubt that the neutral naturalness is a  desperate attempt to save natural electroweak symmetry breaking from the reality check, or at least postpone the inevitable. Nevertheless, the existence of a mirror world is certainly a logical possibility. The recent resurgence of this scenario has led to identifying new interesting models, and new ways to search for them in  experiment. The persistence of the naturalness principle may thus be turned into a positive force, as it may motivate better searches for hidden particles.  It is possible that the LHC data hold the answer to the naturalness puzzle, but we will have to look deeper to extract it.

June 13, 2015

ResonaancesOn the LHC diboson excess

The ATLAS diboson resonance search showing a 3.4 sigma excess near 2 TeV has stirred some interest. This is understandable: 3 sigma does not grow on trees, and moreover CMS also reported anomalies in related analyses. Therefore it is worth looking at these searches in a bit more detail in order to gauge how excited we should be.

The ATLAS one is actually a dijet search: it focuses on events with two very energetic jets of hadrons. More often than not, W and Z bosons decay to quarks. When a TeV-scale resonance decays to electroweak bosons, the latter, by energy conservation, have to move with large velocities. As a consequence, the 2 quarks from W or Z boson decays will be very collimated and will be seen as a single jet in the detector. Therefore, ATLAS looks for dijet events where 1) the mass of each jet is close to that of the W (80±13 GeV) or Z (91±13 GeV), and 2) the invariant mass of the dijet pair is above 1 TeV. Furthermore, they look into the substructure of the jets, so as to identify the ones that look consistent with W or Z decays. After all this work, most of the events still originate from ordinary QCD production of quarks and gluons, which gives a smooth background falling with the dijet invariant mass. If LHC collisions lead to the production of a new particle that decays to WW, WZ, or ZZ final states, it should show as a bump on top of the QCD background. What ATLAS observes is this:

There is a bump near 2 TeV, which  could indicate the existence of a particle decaying to WW and/or WZ and/or ZZ. One important thing to be aware of is that this search cannot distinguish well between the above 3  diboson states. The difference between W and Z masses is only 10 GeV, and the jet mass windows used in the search for W and Z  partly overlap. In fact, 20% of the events fall into all 3 diboson categories.   For all we know, the excess could be in just one final state, say WZ, and simply feed into the other two due to the overlapping selection criteria.
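The overlap of the two jet mass windows can be made concrete; here is a small sketch using only the window values quoted above:

```python
# Jet mass windows used in the ATLAS search, as quoted above:
# W window 80±13 GeV, Z window 91±13 GeV.
w_window = (80 - 13, 80 + 13)   # (67, 93) GeV
z_window = (91 - 13, 91 + 13)   # (78, 104) GeV

# Mass range that passes BOTH the W and the Z tag
lo = max(w_window[0], z_window[0])
hi = min(w_window[1], z_window[1])
overlap = hi - lo

print(f"W window: {w_window} GeV, Z window: {z_window} GeV")
print(f"Overlap: {lo}-{hi} GeV ({overlap} GeV wide)")
```

A jet whose mass falls in the 15-GeV-wide overlap region satisfies both tags at once, which is why a single event can land in several of the WW/WZ/ZZ categories.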

Given the number of searches that ATLAS and CMS have made, 3 sigma fluctuations of the background should happen a few times in the LHC run-1 just by sheer chance. The interest in the ATLAS excess is however amplified by the fact that diboson searches in CMS also show anomalies (albeit smaller) just below 2 TeV. This can be clearly seen in this plot with limits on the Randall-Sundrum graviton excitation, which is one particular model leading to diboson resonances. As W and Z bosons sometimes decay to, respectively, one and two charged leptons, diboson resonances can be searched for not only via dijets but also in final states with one or two leptons. One can see that, in CMS, the ZZ dilepton search (blue line), the WW/ZZ dijet search (green line), and the WW/WZ one-lepton search (red line) all report a small (between 1 and 2 sigma) excess around 1.8 TeV. To make things even more interesting, the CMS search for WH resonances returns 3 events clustering at 1.8 TeV where the standard model background is very small (see Tommaso's post). Could the ATLAS and CMS events be due to the same exotic physics?

Unfortunately, building a model explaining all the diboson data is not easy. Suffice it to say that the ATLAS excess has been out for a week and there isn't yet any serious ambulance-chasing paper on arXiv. One challenge is the event rate. To fit the excess, the resonance should be produced with a cross section of a couple of tens of femtobarns. This requires the new particle to couple quite strongly to quarks or gluons. At the same time, it should remain a narrow resonance decaying dominantly to dibosons. Furthermore, in concrete models, a sizable coupling to electroweak gauge bosons will get you in trouble with electroweak precision tests.

However, there is yet a bigger problem, which can also be seen in the plot above. Although the excesses in CMS occur roughly at the same mass, they are not compatible when it comes to the cross section. The limits in the single-lepton search are not consistent with the new-particle interpretation of the excess in the dijet and dilepton searches, at least in the context of the Randall-Sundrum graviton model. Moreover, the limits from the CMS one-lepton search are grossly inconsistent with the diboson interpretation of the ATLAS excess! In order to believe that the ATLAS 3 sigma excess is real, one has to move to much more baroque models. One possibility is that the dijets observed by ATLAS do not originate from electroweak bosons, but rather from an exotic particle with a similar mass. Another possibility is that the resonance decays only to a pair of Z bosons and not to W bosons, in which case the CMS limits are weaker; but I'm not sure if there exist consistent models with this property.

My conclusion...  For sure this is something to observe in the early run-2. If this is real, it should clearly show in both experiments already this year.  However, due to the inconsistencies between different search channels and the theoretical challenges, there's little reason to get excited yet.

Thanks to Chris for digging out the CMS plot.

June 12, 2015

BackreactionWhere are we on the road to quantum gravity?

Damned if I know! But I got to ask some questions to Lee Smolin which he kindly replied to, and you can read his answers over at Starts with a Bang. If you’re a string theorist you don’t have to read it of course because we already know you’ll hate it.

But I would be acting out of character if the lack of an answer to the question posed in the title prevented me from going on and distributing opinions, so here we go. On my postdoctoral path through institutions I’ve passed by string theory and loop quantum gravity, and after some closer inspection stayed at a distance from both because I wanted to do physics and not math. I wanted to describe something in the real world and not spend my days proving convergence theorems or doing stability analyses of imaginary things. I wanted to do something meaningful with my life, and I was – still am – deeply disturbed by how detached quantum gravity is from experiment. So detached, in fact, that one has to wonder if it’s science at all.

That’s why I’ve worked for years on quantum gravity phenomenology. The recent developments in string theory to apply the AdS/CFT duality to the description of strongly coupled systems are another way to make this contact to reality, but then we’re no longer talking about quantum gravity.

For me the most interesting theoretical developments in quantum gravity are the ones Lee hasn’t mentioned. There are various emergent gravity scenarios and though I don’t find any of them too convincing, there might be something to the idea that gravity is a statistical effect. And then there is Achim Kempf’s spectral geometry that for all I can see would just fit together very nicely with causal sets. But yeah, there are like two people in the world working on this and they’re flying below the pop sci radar. So you’d probably never have heard of them if it wasn’t for my awesome blog, so listen: Have an eye on Achim Kempf and Raffael Sorkin, they’re both brilliant and their work is totally underappreciated.

Personally, I am not so secretly convinced that the actual reason we haven’t yet figured out which theory of quantum gravity describes our universe is that we haven’t understood quantization. The so-called “problem of time”, the past hypothesis, the measurement problem, the cosmological constant – all this signals to me the problem isn’t gravity, the problem is the quantization prescription itself. And what a strange procedure this is, to take a classical theory and then quantize and second quantize it to obtain something more fundamental. How do we know this procedure isn’t scale dependent? How do we know it works the same at the Planck scale as in our labs? We don’t. Unfortunately, this topic rests at the intersection of quantum gravity and quantum foundations and is dismissed by both sides, unless you count my own small contribution. It’s a research area with only one paper!

Having said that, I found Lee’s answers interesting because I understand better now the optimism behind the quote from his 2001 book, that predicted we’d know the theory of quantum gravity by 2015.

I originally studied mathematics, and it just so happened that the first journal club I ever attended, in '97 or '98, was held by a professor of mathematical physics on the topic of Ashtekar’s variables. I knew some General Relativity and was just taking a class on quantum field theory, and this fit in nicely. It was somewhat over my head but basically the same math and not too difficult to follow. And it all seemed to make much sense! I switched from math to physics and in fact for several years to come I lived under the impression that gravity had been quantized and it wouldn’t take long until somebody calculated exactly what is inside a black hole and how the big bang works. That, however, never happened. And here we are in 2015, still looking to answer the same questions.

I’ll restrain from making a prediction because predicting when we’ll know the theory for quantum gravity is more difficult than finding it in the first place ;o)

John Preskill20 years of qubits: the arXiv data

Editor’s Note: The preceding post on Quantum Frontiers inspired the always curious Paul Ginsparg to do some homework on usage of the word “qubit” in papers posted on the arXiv. Rather than paraphrase Paul’s observations I will quote his email verbatim, so you can experience its Ginspargian style.

fig has total # uses of qubit in arxiv (divided by 10) per month, and
total # docs per month:
an impressive 669394 total in 29587 docs.

the graph starts at 9412 (dec '94), but that is illusory since qubit
only shows up in v2 of hep-th/9412048, posted in 2004.
the actual first was quant-ph/9503016 by bennett/divicenzo/shor et al
(posted 23 Mar '95) where they carefully attribute the term to
schumacher ("PRA, to appear '95") and jozsa/schumacher ("J. Mod Optics
'94"), followed immediately by quant-ph/9503017 by deutsch/jozsa et al
(which no longer finds it necessary to attribute term)

[neither of schumacher's first two articles is on arxiv, but otherwise
probably have on arxiv near 100% coverage of its usage and growth, so
permits a viral epidemic analysis along the lines of kaiser's "drawing
theories apart"  of use of Feynman diagrams in post wwII period].

ever late to the party, the first use by j.preskill was
quant-ph/9602016, posted 21 Feb 1996

#articles by primary subject area as follows (hep-th is surprisingly
low given the firewall connection...):

quant-ph 22096
cond-mat.mes-hall 3350
cond-mat.supr-con 880
cond-mat.str-el 376
cond-mat.mtrl-sci 250
math-ph 244
hep-th 228
physics.atom-ph 224
cond-mat.stat-mech 213
cond-mat.other 200
physics.optics 177
cond-mat.quant-gas 152
physics.gen-ph 120
gr-qc 105
cond-mat 91
cs.CC 85
cs.IT 67
cond-mat.dis-nn 55
cs.LO 49
cs.CR 43
physics.chem-ph 33
cs.ET 25
physics.ins-det 21
math.CO,nlin.CD 20
physics.hist-ph,,math.OC 19
hep-ph 18
cond-mat.soft,cs.DS,math.OA 17
cs.NE,cs.PL,math.QA 13
cs.AR,cs.OH 12
physics.comp-ph 11
math.LO 10
physics.soc-ph,physics.ed-ph,cs.AI 9
math.ST,physics.pop-ph,cs.GT 8
nlin.AO,astro-ph,cs.DC,cs.FL,q-bio.GN 7
nlin.PS,math.FA,cs.NI,math.PR,q-bio.NC,physics.class-ph,math.GM, 6
nlin.SI,math.CT,q-fin.GN,cs.LG,q-bio.BM,cs.DM,math.GT 5
math.DS,physics.atm-clus,q-bio.PE 4
math.RA,math.AG,astro-ph.IM,q-bio.OT 3
math.RT 2
nucl-ex 1
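A couple of summary numbers follow from the totals in Paul's email (my own quick arithmetic, not part of the email):

```python
# Summary statistics from the arXiv "qubit" counts quoted above.
total_uses = 669394     # total uses of "qubit" across the arXiv
total_docs = 29587      # total documents containing the word
quant_ph_docs = 22096   # documents with primary subject area quant-ph

uses_per_doc = total_uses / total_docs
quant_ph_share = quant_ph_docs / total_docs

print(f"Average uses of 'qubit' per document: {uses_per_doc:.1f}")
print(f"Share of documents with primary area quant-ph: {quant_ph_share:.0%}")
```

So a paper that uses the word at all tends to use it heavily (over 20 times on average), and about three-quarters of such papers have quant-ph as their primary area.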

n-Category Café Carnap and the Invariance of Logical Truth

I see Steve Awodey has a paper just out, Carnap and the Invariance of Logical Truth. We briefly discussed this idea in the context of Mautner’s 1946 article back here.

Steve ends the article by portraying homotopy type theory as following in the same tradition, but now where invariance is under homotopy equivalence. I wonder if we’ll see some variant of the model/theory duality he and Forssell found in the case of HoTT.

June 10, 2015

Doug NatelsonMolecular electronics: 40+ years

More than 40 years ago, this paper was published, articulating clearly, from a physical chemistry point of view, the possibility of making a nontrivial electronic device (a rectifier, or diode) out of a single small molecule (a "donor"-bridge-"acceptor" structure, analogous to a pn junction - see this figure, from that paper).  Since then, there has been a great deal of interest in "molecular electronics".  This week I am at this conference in Israel, celebrating both this anniversary and the 70th birthday of Mark Ratner, the tremendous theoretical physical chemist who coauthored that paper and has maintained an infectious level of enthusiasm about this and all related topics.

The progress of the field has been interesting.  In the late '90s through about 2002, there was enormous enthusiasm, with some practitioners making rather wild statements about where things were going.  It turned out that this hype was largely over-the-top - some early measurements proved to be very poorly reproducible and/or incorrectly interpreted; being able to synthesize 10^22 identical "components" in a beaker is great, but if each one has to be bonded with atomic precision to get reproducible responses that's less awesome; getting molecular devices to have genuinely useful electronic properties was harder than it looked, with some fundamental limitations; Hendrik Schoen was a fraud and his actions tainted the field; DARPA killed their Moletronics program, etc.  That's roughly when I entered the field.  Timing is everything.

Even with all these issues, these systems have proven to be a great proving ground for testing our understanding of a fair bit of physics and chemistry - how should we think about charge transport through small quantum systems?  How important are quantum effects, electron-electron interactions, electron-vibrational interactions?   How does dissipation really work at these scales?  Do we really understand how to compute molecular levels/gaps in free space and on surfaces with quantitative accuracy?  Can we properly treat open quantum systems, where particles and energy flow in and out?  What about time-dependent cases, relevant when experiments involve pump/probe optical approaches?  Even though we are (in my opinion) very unlikely to use single- or few-molecule devices in technologies, we are absolutely headed toward molecular-scale (countably few atom) silicon devices, and a lot of this physics is relevant there.  Similarly, the energetic and electronic structure issues involved are critically important to understanding catalysis, surface chemistry, organic photovoltaics, etc.

John PreskillWho named the qubit?

Perhaps because my 40th wedding anniversary is just a few weeks away, I have been thinking about anniversaries lately, which reminded me that we are celebrating the 20th anniversary of a number of milestones in quantum information science. In 1995 Cirac and Zoller proposed, and Wineland’s group first demonstrated, the ion trap quantum computer. Quantum error-correcting codes were invented by Shor and Steane, entanglement concentration and purification were described by Bennett et al., and there were many other fast-breaking developments. It was an exciting year.

But the event that moved me to write a blog post is the 1995 appearance of the word “qubit” in an American Physical Society journal. When I was a boy, two-level quantum systems were called “two-level quantum systems.” Which is a descriptive name, but a mouthful and far from euphonious. Think of all the time I’ve saved in the past 20 years by saying “qubit” instead of “two-level quantum system.” And saying “qubit” not only saves time, it also conveys the powerful insight that a quantum state encodes a novel type of information. (Alas, the spelling was bound to stir controversy, with the estimable David Mermin a passionate holdout for “qbit”. Give it up, David, you lost.)

Ben Schumacher. Thanks for the qubits, Ben!


For the word “qubit” we know whom to thank: Ben Schumacher. He introduced the word in his paper “Quantum Coding” which appeared in the April 1995 issue of Physical Review A. (History is complicated, and in this case the paper was actually submitted in 1993, which allowed another paper by Jozsa and Schumacher to be published earlier even though it was written and submitted later. But I’m celebrating the 20th anniversary of the qubit now, because otherwise how will I justify this blog post?)

In the acknowledgments of the paper, Ben provided some helpful background on the origin of the word:

The term “qubit” was coined in jest during one of the author’s many intriguing and valuable conversations with W. K. Wootters, and became the initial impetus for this work.

I met Ben (and other luminaries of quantum information theory) for the first time at a summer school in Torino, Italy in 1996. After reading his papers my expectations were high, all the more so after Sam Braunstein warned me that I would be impressed: “Ben’s a good talker,” Sam assured me. I was not disappointed.

(I also met Asher Peres at that Torino meeting. When I introduced myself Asher asked, “Isn’t there someone with a similar name in particle theory?” I had no choice but to come clean. I particularly remember that conversation because Asher told me his secret motivation for studying quantum entanglement: it might be important in quantum gravity!)

A few years later Ben spent his sabbatical year at Caltech, which gave me an opportunity to compose a poem for the introduction to Ben’s (characteristically brilliant) talk at our physics colloquium. This poem does homage to that famous 1995 paper in which Ben not only introduced the word “qubit” but also explained how to compress a quantum state to the minimal number of qubits from which the original state can be recovered with a negligible loss of fidelity, thus formulating and proving the quantum version of Shannon’s famous source coding theorem, and laying the foundation for many subsequent developments in quantum information theory.

Sometimes when I recite a poem I can sense the audience’s appreciation. But in this case there were only a few nervous titters. I was going for edgy but might have crossed the line into bizarre. Since then I’ve (usually) tried to be more careful.

(For reading the poem, it helps to know that the quantum state appears to be random when it has been compressed as much as possible.)

On Quantum Compression (in honor of Ben Schumacher)

He rocks.
I remember
He showed me how to fit
A qubit
In a small box.

I wonder how it feels
To be compressed.
And then to pass
A fidelity test.

Or does it feel
At all, and if it does
Would I squeal
Or be just as I was?

If not undone
I’d become as I’d begun
And write a memorandum
On being random.
Had it felt like a belt
Of rum?

And might it be predicted
That I’d become addicted,
Longing for my session
Of compression?

I’d crawl
To Ben again.
And call,
“Put down your pen!
Don’t stall!
Make me small!”

June 09, 2015

n-Category Café Semigroup Puzzles

Suppose you have a semigroup: that is, a set with an associative product. Also suppose that

xyx = x

for all x and all y.

Puzzle 1. Prove that

xyz = xz

for all x, y, and z.

Puzzle 2. Prove that

xx = x

for all x.

The proofs I know are not ‘deep’: they involve nothing more than simple equation-pushing. But the results were surprising to me, because they feel like you’re getting something for nothing.

Regarding Puzzle 2: of course xyx = x gives xx = x if you’re in a monoid, since you can take y = 1. But in a monoid, the law xyx = x is deadly, since you can take x = 1 and conclude that y = 1 for all y. So these puzzles are only interesting for semigroups that aren’t monoids.
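The proofs are indeed simple equation-pushing, but for small carrier sets the claims can also be checked exhaustively. A minimal Python sketch (not from the post; it brute-forces every multiplication table on up to three elements):

```python
from itertools import product

def check(n):
    """Enumerate every binary operation on {0, ..., n-1}; for each
    associative one satisfying xyx = x for all x, y, verify that
    xyz = xz (Puzzle 1) and xx = x (Puzzle 2) also hold."""
    elems = range(n)
    count = 0
    for table in product(elems, repeat=n * n):
        op = lambda a, b: table[a * n + b]
        if any(op(op(a, b), c) != op(a, op(b, c))
               for a in elems for b in elems for c in elems):
            continue  # not associative
        if any(op(op(x, y), x) != x for x in elems for y in elems):
            continue  # hypothesis xyx = x fails
        count += 1
        # Puzzle 1: xyz = xz for all x, y, z
        assert all(op(op(x, y), z) == op(x, z)
                   for x in elems for y in elems for z in elems)
        # Puzzle 2: xx = x for all x
        assert all(op(x, x) == x for x in elems)
    return count

for n in (1, 2, 3):
    print(n, check(n))
```

On these sizes the only tables surviving both filters are the left-zero (op(a, b) = a) and right-zero (op(a, b) = b) semigroups, consistent with the fact that the two identities characterize rectangular bands.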