Planet Musings

November 18, 2017

Jordan Ellenberg “Worst of the worst maps”: a factual mistake in Gill v. Whitford

The oral arguments in Gill v. Whitford, the Wisconsin gerrymandering case, are now a month behind us.  But there’s a factual error in the state’s case, and I don’t want to let it be forgotten.  Thanks to Mira Bernstein for pointing this issue out to me.

Misha Tseytlin, Wisconsin’s solicitor general, was one of two lawyers arguing that the state’s Republican-drawn legislative boundaries should be allowed to stand.  Tseytlin argued that the metrics that flagged Wisconsin’s maps as drastically skewed in the GOP’s favor were unreliable:

And I think the easiest way to see this is to take a look at a chart that plaintiff’s own expert created, and that’s available on Supplemental Appendix 235. This is plain — plaintiff’s expert studied maps from 30 years, and he identified the 17 worst of the worst maps. What is so striking about that list of 17 is that 10 were neutral draws.  There were court-drawn maps, commission-drawn maps, bipartisan drawn maps, including the immediately prior Wisconsin drawn map.

That’s a strong claim, which jumped out at me when I read the transcripts–10 of the 17 very worst maps, according to the metrics, were drawn by neutral parties!  That really makes it sound like whatever those metrics are measuring, it’s not partisan gerrymandering.

But the claim isn’t true.

(To be clear, I believe Tseytlin made a mistake here, not a deliberate misrepresentation.)

The table he’s referring to is on p.55 of this paper by Simon Jackman, described as follows:

Of these, 17 plans are utterly unambiguous with respect to the sign of the efficiency gap estimates recorded over the life of the plan:

Why would the efficiency gap be ambiguous?  Because the basic formula assumes both parties are running candidates in every district.  If a race is uncontested, you have to estimate what the vote shares would have been had candidates of both parties run.  So you have an estimate for the efficiency gap, but also some uncertainty.  The more uncontested races, the more uncertainty.
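For readers who want to see the metric concretely, here is a minimal sketch of the basic efficiency gap computation (my own illustration, not the plaintiffs' actual methodology, and it ignores the uncontested-race imputation just described): a party's “wasted” votes are all its votes in districts it loses plus its surplus above 50% in districts it wins, and the efficiency gap is the difference in wasted votes divided by the total votes cast.

def efficiency_gap(districts):
    """districts: list of (dem_votes, rep_votes) pairs, one contested district each."""
    wasted_dem = wasted_rep = total = 0.0
    for dem, rep in districts:
        votes = dem + rep
        total += votes
        needed = votes / 2.0               # votes needed to win
        if dem > rep:
            wasted_dem += dem - needed     # the winner's surplus votes
            wasted_rep += rep              # all of the loser's votes
        else:
            wasted_rep += rep - needed
            wasted_dem += dem
    return (wasted_dem - wasted_rep) / total   # sign convention here: positive favors the GOP

# A toy map that packs Democrats into one district and cracks them elsewhere:
print(efficiency_gap([(80, 20), (40, 60), (40, 60), (40, 60)]))   # 0.25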

These are not the 17 “worst of the worst maps.”  They are not the ones with the highest efficiency gaps, not the ones most badly gerrymandered by any measure.  They’re the ones in states with so few uncontested races that we can be essentially sure the efficiency gap favored the same party three years running.

Tseytlin’s argument is supposed to make you think that bad efficiency gaps are as likely to come from neutral maps as from partisan ones.  In fact, maps drawn by Democratic legislatures have an average efficiency gap favoring Democrats, maps drawn by Republican legislatures have an average efficiency gap favoring the GOP, and neutral maps fall in between.

That comparison comes from a chart on p.35 of another Jackman paper.  Note the big change after 2010.  It wasn’t always the case that partisan legislators automatically thumbed the scales strongly in their favor when drawing the maps.  But it kind of is now.  Is that because partisanship is worse now?  Or because cheaper, faster computation makes it easier for one-party legislatures to do what they always would have done, if they could?  I can’t say for sure.

Efficiency gap isn’t a perfect measure, and neither side in this case is arguing it should be the single or final arbiter of unconstitutional gerrymandering.  But the idea that efficiency gap flags neutral maps as often as partisan maps is just wrong, and it shouldn’t have been part of the state’s argument before the court.


November 17, 2017

Tommaso Dorigo My Interview On Physics Today

Following the appearance of Kent Staley's review of my book "Anomaly!" in the November 2017 issue of Physics Today, the online site of the magazine offers, starting today, an interview with yours truly. I think the piece is quite readable and I encourage you to give it a look. Here I only quote a couple of passages for the laziest readers.

read more

Doug Natelson Max the Demon and the Entropy of Doom

My readers know I've complained/bemoaned repeatedly about how challenging it can be to explain condensed matter physics on a popular level in an engaging way, even though that's the branch of physics that arguably has the greatest impact on our everyday lives.  Trying to take such concepts and reach an audience of children is an even greater, more ambitious task, and teenagers might be the toughest crowd of all.  A graphic novel or comic format is one visually appealing approach that is a lot less dry and perhaps less threatening than straight prose.  Look at the success of xkcd and Randall Munroe!  The APS has had some reasonable success with their comics about their superhero Spectra.  Prior to that, Larry Gonick had done a very nice job on the survey side with the Cartoon Guide to Physics.  (On the parody side, I highly recommend Science Made Stupid (pdf) by Tom Weller, a key text from my teen years.  I especially liked Weller's description of the scientific method, and his fictional periodic table.)

Max the Demon and the Entropy of Doom is a new entry in the field, by Assa Auerbach and Richard Codor.  Prof. Auerbach is a well-known condensed matter theorist who usually writes more weighty tomes, and Mr. Codor is a professional cartoonist and illustrator.  The book is an entertaining explanation of the laws of thermodynamics, with a particular emphasis on the Second Law, using a humanoid alien, Max (the Demon), as an effective superhero.  

The comic does a good job, with nicely illustrated examples, of getting the point across about entropy as counting how many (microscopic) ways there are to do things.  One of Max's powers is the ability to see and track microstates (like the detailed arrangement and trajectory of every air molecule in this room), when mere mortals can only see macrostates (like the average density and temperature).  It also illustrates what we mean by temperature and heat with nice examples (and a not-at-all-subtle environmental message).  There's history (through the plot device of time travel), action, adventure, and a Bad Guy who is appropriately not nice (and has a connection to history that I was irrationally pleased about guessing before it was revealed).  My kids thought it was good, though my sense is that some aspects were too conceptually detailed for a 12-year-old and others were a bit too cute for a world-weary 15-year-old.  Still, a definite good review from a tough crowd, and efforts like this should be applauded - overall I was very impressed.

n-Category Café Star-autonomous categories are pseudo Frobenius algebras

A little while ago I talked about how multivariable adjunctions naturally form a polycategory: a structure like a multicategory, but in which codomains as well as domains can involve multiple objects. Now I want to talk about some structures we can define inside this polycategory MVar.

What can you define inside a polycategory? Well, to start with, a polycategory has an underlying multicategory, consisting of the arrows with unary target; so anything we can define in a multicategory we can define in a polycategory. And the most basic thing we can define in a multicategory is a monoid object — in fact, there are some senses in which this is the canonical thing we can define in a multicategory.

So what is a monoid object in MVar?

Well, actually it’s more interesting to ask about pseudomonoid objects, using the 2-categorical structure of MVar. In this case what we have is an object A, a (0,1)-variable adjunction i : () \to A (which, recall, is just an object i \in A), and a (2,1)-variable adjunction m : (A,A) \to A, together with coherent associativity and unit isomorphisms. The left adjoint part of m is a functor A \times A \to A, and the associativity and unit isomorphisms then make A into a monoidal category. And to say that this functor extends to a multivariable adjunction is precisely to say that A is a closed monoidal category, i.e. that its tensor product has a right adjoint in each variable:

A(x \otimes y, z) \cong A(y, x \multimap z) \cong A(x, z \;⟜\; y)

Similarly, we can define braided pseudomonoids and symmetric pseudomonoids in any 2-multicategory, and in MVar these specialize to braided and symmetric closed monoidal categories.

Now, what can we define in a polycategory that we can’t define in a multicategory? The most obvious monoid-like thing that involves multiple objects in a codomain is a comonoid. So what is a pseudo-comonoid in MVar?

I think this question is easiest to answer if we use the duality of MVar to turn everything around. So a pseudo-comonoid structure on a category A is the same as a pseudo-monoid structure on A^{op}. In terms of A, that means it’s a monoidal structure that’s co-closed, i.e. the tensor product functor has a left adjoint in each variable:

A(x, y \odot z) \cong A(y \rhd x, z) \cong A(x \lhd z, y).

The obvious next thing to do is to mix a monoid structure with a comonoid structure. In general, there’s more than one way to do that: we could think about bimonoids, Hopf monoids, or Frobenius monoids. However, while all of these can be defined in any symmetric monoidal category (or PROP), in a polycategory, bimonoids and Hopf monoids don’t make sense, because their axioms involve composing along multiple objects at once, whereas in a polycategory we are only allowed to compose along one object at a time.

Frobenius algebras, however, make perfect sense in a polycategory. If you look at the usual definition in a monoidal category, you can see that the axioms only involve composing along one object at a time; when they’re written topologically, that corresponds to the “absence of holes”.

So what is a pseudo Frobenius algebra in MVar? Actually, let’s ask a more general question first: what is a lax Frobenius algebra in MVar? By a lax Frobenius algebra I mean an object with a pseudo-monoid structure and a pseudo-comonoid structure, together with not-necessarily invertible “Frobenius-ator” 2-cells

satisfying some coherence axioms, which can be found for instance in this paper (pages 52-55). This isn’t quite as scary as it looks; there are 20 coherence diagrams listed there, but the first 2 are the associativity pentagons for the pseudomonoid and pseudo-comonoid, while the last 8 are the unit axioms for the pseudomonoid and pseudo-comonoid (of which the 17th and 18th imply the other 6, by an old observation of Kelly). Of the remaining 10 axioms, 6 assert compatibility of the Frobenius-ators with the associativities, while 4 assert their compatibility with the units.

Now, to work out what a lax Frobenius algebra in MVar is, we need to figure out what (2,2)-variable adjunctions (A,A) \to (A,A) those pictures represent. To work out what these functors are, I find it helpful to draw the monoid and comonoid structures with all the possible choices for input/output:

By the mates correspondence, to characterize a 2-cell in MVar it suffices to consider one of the functors involved in the multivariable adjunctions, which means we should pick one of the copies of A to be the “output” and consider all the others as the “input”. I find it easier to pick different copies of A for the two Frobenius-ators. For the first one, let’s pick the second copy of A in the codomain; this gives

In the domain of the 2-cell, on the right, x and y come in and get combined into x \otimes y, and then that gets treated as w and gets combined with u coming in from the lower-left to give u \rhd (x \otimes y). In the codomain of the 2-cell, on the left, first x gets combined with u to give u \rhd x, then that gets multiplied with y to give (u \rhd x) \otimes y. Thus, the first Frobenius-ator is

u \rhd (x \otimes y) \to (u \rhd x) \otimes y.

For the second Frobenius-ator, let’s dually pick the first copy of A in the codomain to be the output:

Thus the second Frobenius-ator is

(x \otimes y) \lhd v \to x \otimes (y \lhd v).

What is this? Well, let’s take mates once with respect to the co-closed monoidal structure to reexpress both Frobenius-ators in terms of \otimes and \odot. The first gives

(u \odot x) \otimes y \to u \odot (u \rhd ((u \odot x) \otimes y)) \to u \odot ((u \rhd (u \odot x)) \otimes y) \to u \odot (x \otimes y).

and the second dually gives

x \otimes (y \odot v) \to ((x \otimes (y \odot v)) \lhd v) \odot v \to (x \otimes ((y \odot v) \lhd v)) \odot v \to (x \otimes y) \odot v.

These two transformations (u \odot x) \otimes y \to u \odot (x \otimes y) and x \otimes (y \odot v) \to (x \otimes y) \odot v have exactly the shape of the “linear distributivity” transformations in a linearly distributive category! (Remember from last time that linearly distributive categories are the “representable” version of polycategories.) The latter are supposed to satisfy their own coherence axioms, which aren’t listed on the nLab, but if you look up the original Cockett-Seely paper and count them there are… 10 axioms… 6 asserting compatibility with associativity of \otimes and \odot, and 4 asserting compatibility with the unit. In other words,

A lax Frobenius algebra in MVar is precisely a linearly distributive category! (In which \otimes is closed and \odot is co-closed.)

Note that this is at least an approximate instance of the microcosm principle. (I have to admit that I have not actually checked that the two groups of coherence axioms coincide under the mates correspondence, but I find it inconceivable that they don’t.)

The next thing to ask is what a pseudo Frobenius algebra is, i.e. what it means for the Frobenius-ators to be invertible. If you’ve come this far (or if you read the title of the post) you can probably guess the answer: a *-autonomous category, i.e. a linearly distributive category in which all objects have duals (in the polycategorical sense I defined in the first post).

First note that in a *-autonomous category, \otimes is always closed and \odot is co-closed, with (x \multimap z) = (x^\ast \odot z) and (u \rhd w) = (u^\ast \otimes w) and so on. With these definitions, the Frobenius-ators become just associativity isomorphisms:

u \rhd (x \otimes y) = u^\ast \otimes (x \otimes y) \cong (u^\ast \otimes x) \otimes y = (u \rhd x) \otimes y.

(x \otimes y) \lhd v = (x \otimes y) \otimes v^\ast \cong x \otimes (y \otimes v^\ast) = x \otimes (y \lhd v).

Thus, a *-autonomous category is a pseudo Frobenius algebra in MVar. Conversely, if A is a pseudo Frobenius algebra in MVar, then letting x = i be the unit object of \otimes, we have

u \rhd y \cong u \rhd (i \otimes y) \cong (u \rhd i) \otimes y

giving an isomorphism

A(y, u \odot v) \cong A(u \rhd y, v) \cong A((u \rhd i) \otimes y, v).

Thus u \rhd i behaves like a dual of u, and with a little more work we can show that it actually is. (I’m totally glossing over the symmetric/non-symmetric distinction here; in the non-symmetric case one has to distinguish between left and right duals, blah blah blah, but it all works.) So

A pseudo Frobenius algebra in MVar is precisely a *-autonomous category!

The fact that there’s a relationship between Frobenius algebras and *-autonomous categories is not new. In this paper, Brian Day and Ross Street showed that pseudo Frobenius algebras in Prof can be identified with “pro-*-autonomous categories”, i.e. promonoidal categories that are *-autonomous in a suitable sense. In this paper Jeff Egger showed that Frobenius algebras in the *-autonomous category Sup of suplattices can be identified with *-autonomous cocomplete posets. And Jeff has told me personally that he also noticed that lax Frobenius algebras correspond to mere linear distributivity. (By the way, the above characterization of *-autonomous categories as closed and co-closed linearly distributive ones such that certain transformations are invertible is due to Cockett and Seely.)

What’s new here is that the pseudo Frobenius algebras in MVar are exactly *-autonomous categories — not pro, not posets, not cocomplete.

There’s more that could be said. For instance, it’s known that Frobenius algebras can be defined in many different ways. One example is that instead of giving an algebra and coalgebra structure related by a Frobenius axiom, we could give just the algebra structure along with a compatible nondegenerate pairing A \otimes A \to I. This is also true for pseudo Frobenius algebras in a polycategory, and in MVar such a pairing (A,A) \to () corresponds to a contravariant self-equivalence (-)^\ast : A \simeq A^{op}, leading to the perhaps-more-common definition of star-autonomous category involving such a self-duality. And so on; but maybe I’ll stop here.

November 16, 2017

Dirac Sea Shore What’s on my mind

Terence Tao An inverse theorem for an inequality of Kneser

I have just uploaded to the arXiv the paper “An inverse theorem for an inequality of Kneser“, submitted to a special issue of the Proceedings of the Steklov Institute of Mathematics in honour of Sergei Konyagin. It concerns an inequality of Kneser discussed previously in this blog, namely that

\displaystyle \mu(A+B) \geq \min(\mu(A)+\mu(B), 1) \ \ \ \ \ (1)

whenever {A,B} are compact non-empty subsets of a compact connected additive group {G} with probability Haar measure {\mu}.  (A later result of Kemperman extended this inequality to the nonabelian case.) This inequality is non-trivial in the regime

\displaystyle \mu(A), \mu(B), 1- \mu(A)-\mu(B) > 0. \ \ \ \ \ (2)

The connectedness of {G} is essential, otherwise one could form counterexamples involving proper subgroups of {G} of positive measure. In the blog post, I indicated how this inequality (together with a more “robust” strengthening of it) could be deduced from submodularity inequalities such as

\displaystyle \mu( (A_1 \cup A_2) + B) + \mu( (A_1 \cap A_2) + B) \leq \mu(A_1+B) + \mu(A_2+B) \ \ \ \ \ (3)

which in turn easily follows from the identity {(A_1 \cup A_2) + B = (A_1+B) \cup (A_2+B)} and the inclusion {(A_1 \cap A_2) + B \subset (A_1 +B) \cap (A_2+B)}, combined with the inclusion-exclusion formula.
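(Spelled out: by inclusion-exclusion,

\displaystyle \mu(A_1+B) + \mu(A_2+B) = \mu( (A_1+B) \cup (A_2+B) ) + \mu( (A_1+B) \cap (A_2+B) ) \geq \mu( (A_1 \cup A_2) + B) + \mu( (A_1 \cap A_2) + B),

where the final inequality uses the identity for the union term and the inclusion for the intersection term.)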

In the non-trivial regime (2), equality can be attained in (1), for instance by taking {G} to be the unit circle {G = {\bf R}/{\bf Z}} and {A,B} to be arcs in that circle (obeying (2)). A bit more generally, if {G} is an arbitrary connected compact abelian group and {\xi: G \rightarrow {\bf R}/{\bf Z}} is a non-trivial character (i.e., a continuous homomorphism), then {\xi} must be surjective (as {{\bf R}/{\bf Z}} has no non-trivial connected subgroups), and one can take {A = \xi^{-1}(I)} and {B = \xi^{-1}(J)} for some arcs {I,J} in that circle (again choosing the measures of these arcs to obey (2)). The main result of this paper is an inverse theorem that asserts that this is the only way in which equality can occur in (1) (assuming (2)); furthermore, if (1) is close to being satisfied with equality and (2) holds, then {A,B} must be close (in measure) to an example of the above form {A = \xi^{-1}(I), B = \xi^{-1}(J)}. Actually, for technical reasons (and for the applications we have in mind), it is important to establish an inverse theorem not just for (1), but for the more robust version mentioned earlier (in which the sumset {A+B} is replaced by the partial sumset {A +_\varepsilon B} consisting of “popular” sums).

Roughly speaking, the idea is as follows. Let us informally call {(A,B)} a critical pair if (2) holds and the inequality (1) (or more precisely, a robust version of this inequality) is almost obeyed with equality. The notion of a critical pair obeys some useful closure properties. Firstly, it is symmetric in {A,B}, and invariant with respect to translation of either {A} or {B}. Furthermore, from the submodularity inequality (3), one can show that if {(A_1,B)} and {(A_2,B)} are critical pairs (with {\mu(A_1 \cap A_2)} and {1 - \mu(A_1 \cup A_2) - \mu(B)} positive), then {(A_1 \cap A_2,B)} and {(A_1 \cup A_2, B)} are also critical pairs. (Note that this is consistent with the claim that critical pairs only occur when {A,B} come from arcs of a circle.) Similarly, from associativity {(A+B)+C = A+(B+C)}, one can show that if {(A,B)} and {(A+B,C)} are critical pairs, then so are {(B,C)} and {(A,B+C)}.

One can combine these closure properties to obtain further ones. For instance, suppose {A,B} is such that {\mu(A+B) < 2 \mu(A)}, so that all translates {A+b}, {b \in B} intersect each other in a set of positive measure. Suppose also that {(A,C)} is a critical pair and {1-\mu(A+B) - \mu(C) > 0}. Then (cheating a little bit), one can show that {(A+B,C)} is also a critical pair, basically because {A+B} is the union of the {A+b}, {b \in B}, the {(A+b,C)} are all critical pairs, and the {A+b} all intersect each other. This argument doesn’t quite work as stated because one has to apply the closure property under union an uncountable number of times, but it turns out that if one works with the robust version of sumsets and uses a random sampling argument to approximate {A+B} by the union of finitely many of the {A+b}, then the argument can be made to work.

Using all of these closure properties, it turns out that one can start with an arbitrary critical pair {(A,B)} and end up with a small set {C} such that {(A,C)} and {(kC,C)} are also critical pairs for all {1 \leq k \leq 10^4} (say), where {kC} is the {k}-fold sumset of {C}. (Intuitively, if {A,B} are thought of as secretly coming from the pullback of arcs {I,J} by some character {\xi}, then {C} should be the pullback of a much shorter arc by the same character.) In particular, {C} exhibits linear growth, in that {\mu(kC) = k\mu(C)} for all {1 \leq k \leq 10^4}. One can now use standard technology from inverse sumset theory to show first that {C} has a very large Fourier coefficient (and thus is biased with respect to some character {\xi}), and secondly that {C} is in fact almost of the form {C = \xi^{-1}(K)} for some arc {K}, from which it is not difficult to conclude similar statements for {A} and {B} and thus finish the proof of the inverse theorem.

In order to make the above argument rigorous, one has to be more precise about what the modifier “almost” means in the definition of a critical pair. I chose to do this in the language of “cheap” nonstandard analysis (aka asymptotic analysis), as discussed in this previous blog post; one could also have used the full-strength version of nonstandard analysis, but this does not seem to convey any substantial advantages. (One can also work in a more traditional “non-asymptotic” framework, but this requires one to keep much more careful account of various small error terms and leads to a messier argument.)

 

[Update, Nov 15: Corrected the attribution of the inequality (1) to Kneser instead of Kemperman.  Thanks to John Griesmer for pointing out the error.]



John Baez Applied Category Theory at UCR (Part 3)

We had a special session on applied category theory here at UCR:

Applied category theory, Fall Western Sectional Meeting of the AMS, 4-5 November 2017, U.C. Riverside.

A bunch of people stayed for a few days afterwards, and we had a lot of great discussions. I wish I could explain everything that happened, but I’m too busy right now. Luckily, even if you couldn’t come here, you can now see slides of almost all the talks… and videos of many!

Click on talk titles to see abstracts. For multi-author talks, the person whose name is in boldface is the one who gave the talk. For videos, go here: I haven’t yet created links to all the videos.

Saturday November 4, 2017

9:00 a.m.
A higher-order temporal logic for dynamical systems (talk slides).
David I. Spivak, MIT.

10:00 a.m.
Algebras of open dynamical systems on the operad of wiring diagrams (talk slides).
Dmitry Vagner, Duke University
David I. Spivak, MIT
Eugene Lerman, University of Illinois at Urbana-Champaign

10:30 a.m.
Abstract dynamical systems (talk slides).
Christina Vasilakopoulou, UCR
David Spivak, MIT
Patrick Schultz, MIT

3:00 p.m.
Decorated cospans (talk slides).
Brendan Fong, MIT

4:00 p.m.
Compositional modelling of open reaction networks (talk slides).
Blake S. Pollard, UCR
John C. Baez, UCR

4:30 p.m.
A bicategory of coarse-grained Markov processes (talk slides).
Kenny Courser, UCR

5:00 p.m.
A bicategorical syntax for pure state qubit quantum mechanics (talk slides).
Daniel M. Cicala, UCR

5:30 p.m.
Open systems in classical mechanics (talk slides).
Adam Yassine, UCR

Sunday November 5, 2017

9:00 a.m.
Controllability and observability: diagrams and duality (talk slides).
Jason Erbele, Victor Valley College

9:30 a.m.
Frobenius monoids, weak bimonoids, and corelations (talk slides).
Brandon Coya, UCR

10:00 a.m.
Compositional design and tasking of networks.
John D. Foley, Metron, Inc.
John C. Baez, UCR
Joseph Moeller, UCR
Blake S. Pollard, UCR

10:30 a.m.
Operads for modeling networks (talk slides).
Joseph Moeller, UCR
John Foley, Metron Inc.
John C. Baez, UCR
Blake S. Pollard, UCR

2:00 p.m.
Reeb graph smoothing via cosheaves (talk slides).
Vin de Silva, Department of Mathematics, Pomona College

3:00 p.m.
Knowledge representation in bicategories of relations (talk slides).
Evan Patterson, Stanford University, Statistics Department

3:30 p.m.
The multiresolution analysis of flow graphs (talk slides).
Steve Huntsman, BAE Systems

4:00 p.m.
Data modeling and integration using the open source tool Algebraic Query Language (AQL) (talk slides).
Peter Y. Gates, Categorical Informatics
Ryan Wisnesky, Categorical Informatics


David Hogg you never really understand a model until you implement it

Eilers (MPIA) and I discussed puzzling results she was getting in which she could fit just about any data (including insanely random data) with the Gaussian Process latent variable model (GPLVM) but with no predictive power on new data. We realized that we were missing a term in the model: We need to constrain the latent variables with a prior (or regularization), otherwise the latent variables can go off to crazy corners of space and the data points have (effectively) nothing to do with one another. Whew! This all justifies a point we have been making for a while, which is that you never really understand a model until you implement it.

November 15, 2017

David Hogg modeling the heck out of the atmosphere

The day started with planning between Bedell (Flatiron), Foreman-Mackey (Flatiron), and me about a possible tri-linear model for stellar spectra. The model is that the star has a spectrum, which is drawn from a subspace in spectral space, and Doppler shifted, and the star is subject to telluric absorption, which is drawn from a subspace in spectral space, and Doppler shifted. The idea is to learn the telluric subspace using all the data ever taken from a spectrograph (HARPS, in this case). But of course the idea behind that is to account for the tellurics by simultaneously fitting them and thereby getting better radial velocities. This was all planning for the arrival of Ben Montet (Chicago), who arrived later in the day for a two-week visit.

At lunch time, Mike Blanton (NYU) gave the CCPP brown-bag talk about SDSS-V. He did a nice job of explaining how you measure the composition of ionized gas by looking at thermal state. And etc!

November 13, 2017

n-Category Café HoTT at JMM

At the 2018 U.S. Joint Mathematics Meetings in San Diego, there will be an AMS special session about homotopy type theory. It’s a continuation of the HoTT MRC that took place this summer, organized by some of the participants to especially showcase the work done during and after the MRC workshop. Following is the announcement from the organizers.

We are pleased to announce the AMS Special Session on Homotopy Type Theory, to be held on January 11, 2018 in San Diego, California, as part of the Joint Mathematics Meetings (to be held January 10 - 13).

Homotopy Type Theory (HoTT) is a new field of study that relates constructive type theory to abstract homotopy theory. Types are regarded as synthetic spaces of arbitrary dimension and type equality as homotopy equivalence. Experience has shown that HoTT is able to represent many mathematical objects of independent interest in a direct and natural way. Its foundations in constructive type theory permit the statement and proof of theorems about these objects within HoTT itself, enabling formalization in proof assistants and providing a constructive foundation for other branches of mathematics.

This Special Session is affiliated with the AMS Mathematics Research Communities (MRC) workshop for early-career researchers in Homotopy Type Theory organized by Dan Christensen, Chris Kapulkin, Dan Licata, Emily Riehl and Mike Shulman, which took place last June.

The Special Session will include talks by MRC participants, as well as by senior researchers in the field, on various aspects of higher-dimensional type theory including categorical semantics, computation, and the formalization of mathematical theories. There will also be a panel discussion featuring distinguished experts from the field.

Further information about the Special Session, including a schedule and abstracts, can be found at: http://jointmathematicsmeetings.org/meetings/national/jmm2018/2197_program_ss14.html. Please note that the early registration deadline is December 20, 2017.

If you have any questions about about the Special Session, please feel free to contact one of the organizers. We look forward to seeing you in San Diego.

Simon Cho (University of Michigan)

Liron Cohen (Cornell University)

Ed Morehouse (Wesleyan University)

David Hogg detailed abundances of pairs; coherent red-giant modes

In the morning I sat in on a meeting of the GALAH team, who are preparing for a data release to precede Gaia DR2. In that meeting, Jeffrey Simpson (USyd) showed me GALAH results on the Oh et al comoving pairs of stars. He finds that pairs from the Oh sample that are confirmed to have the same radial velocity (and are therefore likely to be truly comoving) have similar detailed element abundances, and the ones that aren't, don't. So awesome! But interestingly he doesn't find that the non-confirmed pairs are as different as randomly chosen stars from the sample. That's interesting, and suggests that we should make (or should have made) a carefully constructed null sample for A/B testing etc. Definitely for Gaia DR2!

In the afternoon, I joined the USyd asteroseismology group meeting. We discussed classification of seismic spectra using neural networks (I advised against) or kernel SVM (I advised in favor). We also discussed using very narrow (think: coherent) modes in red-giant stars to find binaries. This is like what my host Simon Murphy (USyd) does for delta-Scuti stars, but we would not have enough data to phase up little chunks of spectrum: We would have to do one huge simultaneous fit. I love that idea, infinitely! I asked them to give me a KIC number.

I gave two talks today, making it six talks (every one very different) in five days! I spoke about the pros and cons of machine learning (or what is portrayed as machine learning on TV) as my final Hunstead Lecture at the University of Sydney. I ended up being very negative on neural networks in comparison to Gaussian processes, at least for astrophysics applications. In my second talk, I spoke about de-noising Gaia data at Macquarie University. I got great crowds and good feedback at both places. It's been an exhausting but absolutely excellent week.

November 12, 2017

Backreaction Away Note

I am overseas the coming week, giving a seminar at Perimeter Institute on Tuesday, a colloq in Toronto on Wednesday, and on Thursday I am scheduled to “make sense of mind-blowing physics” with Natalie Wolchover in New York. The latter event, I am told, has a live webcast starting at 6:30 pm Eastern, so dial in if you fancy seeing my new haircut. (Short again.) Please be warned that things on

David Hogg mixture of factor analyzers; centroiding stars

On this, day four of my Hunstead Lectures, Andy Casey (Monash) came into town, which was absolutely great. We talked about many things, including the mixture-of-factor-analyzers model, which is a good and under-used model in astrophysics. I think (if I remember correctly) that it can be generalized to heteroskedastic and missing data too. We also talked about using machine learning to interpolate models, and future projects with The Cannon.

At lunch I sat with Peter Tuthill (Sydney) and Kieran Larkin (Sydney) who are working on a project design that would permit measurement of the separation between two (nearby) stars to better than one millionth of a pixel. It's a great project; the designs they are thinking about involve making a very large, but very finely featured point-spread function, so that hundreds or thousands of pixels are importantly involved in the positional measurements. We discussed various directions of optimization.

My talk today was about The Cannon and the relationships between methods that are thought of as “machine learning” and the kinds of data analyses that I think will win in the long run.

November 11, 2017

Terence Tao Continuous approximations to arithmetic functions

Basic objects of study in multiplicative number theory are the arithmetic functions: functions {f: {\bf N} \rightarrow {\bf C}} from the natural numbers to the complex numbers. Some fundamental examples of such functions, all of which appear below, include the constant function {1}, the logarithm function {L(n) := \log n}, the divisor function {d_2(n) := \sum_{d|n} 1}, the von Mangoldt function {\Lambda}, the Möbius function {\mu}, and the Kronecker delta function {\delta} (equal to {1} at {n=1} and {0} elsewhere).

Given an arithmetic function {f}, we are often interested in statistics such as the summatory function

\displaystyle \sum_{n \leq x} f(n), \ \ \ \ \ (1)

 

the logarithmically (or harmonically) weighted summatory function

\displaystyle \sum_{n \leq x} \frac{f(n)}{n}, \ \ \ \ \ (2)

 

or the Dirichlet series

\displaystyle {\mathcal D}[f](s) := \sum_n \frac{f(n)}{n^s}.

In the latter case, one typically has to first restrict {s} to those complex numbers whose real part is large enough in order to ensure the series on the right converges; but in many important cases, one can then extend the Dirichlet series to almost all of the complex plane by analytic continuation. One is also interested in correlations involving additive shifts, such as {\sum_{n \leq x} f(n) f(n+h)}, but these are significantly more difficult to study and cannot be easily estimated by the methods of classical multiplicative number theory.

A key operation on arithmetic functions is that of Dirichlet convolution, which when given two arithmetic functions {f,g: {\bf N} \rightarrow {\bf C}}, forms a new arithmetic function {f*g: {\bf N} \rightarrow {\bf C}}, defined by the formula

\displaystyle f*g(n) := \sum_{d|n} f(d) g(\frac{n}{d}).

Thus for instance {1*1 = d_2}, {1 * \Lambda = L}, {1 * \mu = \delta}, and {\delta * f = f} for any arithmetic function {f}. Dirichlet convolution and Dirichlet series are related by the fundamental formula

\displaystyle {\mathcal D}[f * g](s) = {\mathcal D}[f](s) {\mathcal D}[g](s), \ \ \ \ \ (3)

 

at least when the real part of {s} is large enough that all sums involved become absolutely convergent (but in practice one can use analytic continuation to extend this identity to most of the complex plane). There is also the identity

\displaystyle {\mathcal D}[Lf](s) = - \frac{d}{ds} {\mathcal D}[f](s), \ \ \ \ \ (4)

 

at least when the real part of {s} is large enough to justify interchange of differentiation and summation. As a consequence, many Dirichlet series can be expressed in terms of the Riemann zeta function {\zeta = {\mathcal D}[1]}, thus for instance

\displaystyle {\mathcal D}[d_2](s) = \zeta^2(s)

\displaystyle {\mathcal D}[L](s) = - \zeta'(s)

\displaystyle {\mathcal D}[\delta](s) = 1

\displaystyle {\mathcal D}[\mu](s) = \frac{1}{\zeta(s)}

\displaystyle {\mathcal D}[\Lambda](s) = -\frac{\zeta'(s)}{\zeta(s)}.
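As a quick numerical sanity check of the convolution identities above (my own sketch, not part of the post), one can verify {1*1 = d_2} and {1*\mu = \delta} directly for small {n}:

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def dirichlet(f, g, n):
    # Dirichlet convolution: (f*g)(n) = sum over d | n of f(d) g(n/d)
    return sum(f(d) * g(n // d) for d in divisors(n))

def mobius(n):
    # mu(n) = 0 if n has a squared prime factor, else (-1)^(number of prime factors)
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

one = lambda n: 1
for n in range(1, 31):
    assert dirichlet(one, one, n) == len(divisors(n))         # 1*1 = d_2
    assert dirichlet(one, mobius, n) == (1 if n == 1 else 0)  # 1*mu = delta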

Much of the difficulty of multiplicative number theory can be traced back to the discrete nature of the natural numbers {{\bf N}}, which form a rather complicated abelian semigroup with respect to multiplication (in particular the set of generators is the set of prime numbers). One can obtain a simpler analogue of the subject by working instead with the half-infinite interval {{\bf N}_\infty := [1,+\infty)}, which is a much simpler abelian semigroup under multiplication (being a one-dimensional Lie semigroup). (I will think of this as a sort of “completion” of {{\bf N}} at the infinite place {\infty}, hence the terminology.) Accordingly, let us define a continuous arithmetic function to be a locally integrable function {f: {\bf N}_\infty \rightarrow {\bf C}}. The analogue of the summatory function (1) is then an integral

\displaystyle \int_1^x f(t)\ dt,

and similarly the analogue of (2) is

\displaystyle \int_1^x \frac{f(t)}{t}\ dt.

The analogue of the Dirichlet series is the Mellin-type transform

\displaystyle {\mathcal D}_\infty[f](s) := \int_1^\infty \frac{f(t)}{t^s}\ dt,

which will be well-defined at least if the real part of {s} is large enough and if the continuous arithmetic function {f: {\bf N}_\infty \rightarrow {\bf C}} does not grow too quickly, and hopefully will also be defined elsewhere in the complex plane by analytic continuation.

For instance, the continuous analogue of the discrete constant function {1: {\bf N} \rightarrow {\bf C}} would be the constant function {1_\infty: {\bf N}_\infty \rightarrow {\bf C}}, which maps any {t \in [1,+\infty)} to {1}, and which we will denote by {1_\infty} in order to keep it distinct from {1}. The two functions {1_\infty} and {1} have approximately similar statistics; for instance one has

\displaystyle \sum_{n \leq x} 1 = \lfloor x \rfloor \approx x-1 = \int_1^x 1\ dt

and

\displaystyle \sum_{n \leq x} \frac{1}{n} = H_{\lfloor x \rfloor} \approx \log x = \int_1^x \frac{1}{t}\ dt

where {H_n} is the {n^{th}} harmonic number, and we are deliberately vague as to what the symbol {\approx} means. Continuing this analogy, we would expect

\displaystyle {\mathcal D}[1](s) = \zeta(s) \approx \frac{1}{s-1} = {\mathcal D}_\infty[1_\infty](s)

which reflects the fact that {\zeta} has a simple pole at {s=1} with residue {1}, and no other poles. Note that the identity {{\mathcal D}_\infty[1_\infty](s) = \frac{1}{s-1}} is initially only valid in the region {\mathrm{Re} s > 1}, but clearly the right-hand side can be continued analytically to the entire complex plane except for the pole at {1}, and so one can define {{\mathcal D}_\infty[1_\infty]} in this region also.
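(For completeness, the identity is just the elementary integral

\displaystyle {\mathcal D}_\infty[1_\infty](s) = \int_1^\infty \frac{1}{t^s}\ dt = \left[\frac{t^{1-s}}{1-s}\right]_{t=1}^{t=\infty} = \frac{1}{s-1}

valid whenever {\mathrm{Re} s > 1}.)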

In a similar vein, the logarithm function {L: {\bf N} \rightarrow {\bf C}} is approximately similar to the logarithm function {L_\infty: {\bf N}_\infty \rightarrow {\bf C}}, giving for instance the crude form

\displaystyle \sum_{n \leq x} L(n) = \log \lfloor x \rfloor! \approx x \log x - x = \int_1^x L_\infty(t)\ dt

of Stirling’s formula, or the Dirichlet series approximation

\displaystyle {\mathcal D}[L](s) = -\zeta'(s) \approx \frac{1}{(s-1)^2} = {\mathcal D}_\infty[L_\infty](s).

The continuous analogue of Dirichlet convolution is multiplicative convolution using the multiplicative Haar measure {\frac{dt}{t}}: given two continuous arithmetic functions {f_\infty, g_\infty: {\bf N}_\infty \rightarrow {\bf C}}, one can define their convolution {f_\infty *_\infty g_\infty: {\bf N}_\infty \rightarrow {\bf C}} by the formula

\displaystyle f_\infty *_\infty g_\infty(t) := \int_1^t f_\infty(s) g_\infty(\frac{t}{s}) \frac{ds}{s}.

Thus for instance {1_\infty * 1_\infty = L_\infty}. A short computation using Fubini’s theorem shows the analogue

\displaystyle D_\infty[f_\infty *_\infty g_\infty](s) = D_\infty[f_\infty](s) D_\infty[g_\infty](s)

of (3) whenever the real part of {s} is large enough that Fubini’s theorem can be justified; similarly, differentiation under the integral sign shows that

\displaystyle D_\infty[L_\infty f_\infty](s) = -\frac{d}{ds} D_\infty[f_\infty](s) \ \ \ \ \ (5)

 

again assuming that the real part of {s} is large enough that differentiation under the integral sign (or some other tool like this, such as the Cauchy integral formula for derivatives) can be justified.

Direct calculation shows that for any complex number {\rho}, one has

\displaystyle \frac{1}{s-\rho} = D_\infty[ t \mapsto t^{\rho-1} ](s)

(at least for the real part of {s} large enough), and hence by several applications of (5)

\displaystyle \frac{1}{(s-\rho)^k} = D_\infty[ t \mapsto \frac{1}{(k-1)!} t^{\rho-1} \log^{k-1} t ](s)

for any natural number {k}. This can lead to the following heuristic: if a Dirichlet series {D[f](s)} behaves like a linear combination of poles {\frac{1}{(s-\rho)^k}}, in that

\displaystyle D[f](s) \approx \sum_\rho \frac{c_\rho}{(s-\rho)^{k_\rho}}

for some set {\rho} of poles and some coefficients {c_\rho} and natural numbers {k_\rho} (where we again are vague as to what {\approx} means, and how to interpret the sum {\sum_\rho} if the set of poles is infinite), then one should expect the arithmetic function {f} to behave like the continuous arithmetic function

\displaystyle t \mapsto \sum_\rho \frac{c_\rho}{(k_\rho-1)!} t^{\rho-1} \log^{k_\rho-1} t.

In particular, if we only have simple poles,

\displaystyle D[f](s) \approx \sum_\rho \frac{c_\rho}{s-\rho}

then we expect to have {f} behave like continuous arithmetic function

\displaystyle t \mapsto \sum_\rho c_\rho t^{\rho-1}.

Integrating this from {1} to {x}, this heuristically suggests an approximation

\displaystyle \sum_{n \leq x} f(n) \approx \sum_\rho c_\rho \frac{x^\rho-1}{\rho}

for the summatory function, and similarly

\displaystyle \sum_{n \leq x} \frac{f(n)}{n} \approx \sum_\rho c_\rho \frac{x^{\rho-1}-1}{\rho-1},

with the convention that {\frac{x^\rho-1}{\rho}} is {\log x} when {\rho=0}, and similarly {\frac{x^{\rho-1}-1}{\rho-1}} is {\log x} when {\rho=1}. One can make these sorts of approximations more rigorous by means of Perron’s formula (or one of its variants) combined with the residue theorem, provided that one has good enough control on the relevant Dirichlet series, but we will not pursue these rigorous calculations here. (But see for instance this previous blog post for some examples.)

For instance, using the more refined approximation

\displaystyle \zeta(s) \approx \frac{1}{s-1} + \gamma

to the zeta function near {s=1}, we have

\displaystyle {\mathcal D}[d_2](s) = \zeta^2(s) \approx \frac{1}{(s-1)^2} + \frac{2 \gamma}{s-1}

we would expect that

\displaystyle d_2 \approx L_\infty + 2 \gamma

and thus for instance

\displaystyle \sum_{n \leq x} d_2(n) \approx x \log x - x + 2 \gamma x

which matches what one actually gets from the Dirichlet hyperbola method (see e.g. equation (44) of this previous post).
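As a quick numerical check (my own, not from the post), the exact summatory function can be computed via {\sum_{n \leq x} d_2(n) = \sum_{d \leq x} \lfloor x/d \rfloor} and compared against the heuristic:

import math

def divisor_summatory(x):
    # sum_{n <= x} d_2(n) = sum_{d <= x} floor(x/d), counting the multiples of each d
    return sum(x // d for d in range(1, x + 1))

x = 10**5
gamma = 0.5772156649015329                  # Euler-Mascheroni constant
approx = x * math.log(x) - x + 2 * gamma * x
print(divisor_summatory(x), round(approx))  # the two agree to within O(sqrt(x))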

Or, noting that {\zeta(s)} has a simple pole at {s=1} and assuming simple zeroes elsewhere, the log derivative {-\zeta'(s)/\zeta(s)} will have simple poles of residue {+1} at {s=1} and {-1} at all the zeroes, leading to the heuristic

\displaystyle {\mathcal D}[\Lambda](s) = -\frac{\zeta'(s)}{\zeta(s)} \approx \frac{1}{s-1} - \sum_\rho \frac{1}{s-\rho}

suggesting that {\Lambda} should behave like the continuous arithmetic function

\displaystyle t \mapsto 1 - \sum_\rho t^{\rho-1}

leading for instance to the summatory approximation

\displaystyle \sum_{n \leq x} \Lambda(n) \approx x - \sum_\rho \frac{x^\rho-1}{\rho}

which is a heuristic form of the Riemann-von Mangoldt explicit formula (see Exercise 45 of these notes for a rigorous version of this formula).

Exercise 1 Go through some of the other explicit formulae listed at this Wikipedia page and give heuristic justifications for them (up to some lower order terms) by similar calculations to those given above.

Given the “adelic” perspective on number theory, I wonder if there are also {p}-adic analogues of arithmetic functions to which a similar set of heuristics can be applied, perhaps to study sums such as {\sum_{n \leq x: n = a \hbox{ mod } p^j} f(n)}. A key problem here is that there does not seem to be any good interpretation of the expression {\frac{1}{t^s}} when {s} is complex and {t} is a {p}-adic number, so it is not clear that one can analyse a Dirichlet series {p}-adically. For similar reasons, we don’t have a canonical way to define {\chi(t)} for a Dirichlet character {\chi} (unless its conductor happens to be a power of {p}), so there doesn’t seem to be much to say in the {q}-aspect either.



Tommaso Dorigo Anomaly Reviewed On Physics Today

Another quite positive review of my book "Anomaly! Collider Physics and the Quest for New Phenomena at Fermilab"  (which these days is 40% off at the World Scientific site I am linking) has appeared on Physics Today this month.

read more

Backreaction Naturalness is dead. Long live naturalness.

I was elated when I saw that Gian Francesco Giudice announced the “Dawn of the Post-Naturalness Era,” as the title of his recent paper promises. The craze in particle physics, I thought, might finally come to an end; data brought reason back to Earth after all. But disillusionment followed swiftly when I read the paper. Gian Francesco Giudice is a theoretical physicist at CERN. He is maybe

n-Category Café Topology Puzzles

Let’s say the closed unit interval [0,1] maps onto a metric space X if there is a continuous map from [0,1] onto X. Similarly for the Cantor set.

Puzzle 0. Does the Cantor set map onto the closed unit interval, and/or vice versa?

Puzzle 1. Which metric spaces does the closed unit interval map onto?

Puzzle 2. Which metric spaces does the Cantor set map onto?

The first one is easy; the second two are well-known… but still, perhaps, not well-known enough!

The answers to Puzzles 1 and 2 can be seen as ‘versal’ properties of the closed unit interval and Cantor set — like universal properties, but without the uniqueness clause.

n-Category Café Applied Category Theory Papers

In preparation for the Applied Category Theory special session at U.C. Riverside this weekend, my crew dropped three papers on the arXiv.

My student Adam Yassine has been working on Hamiltonian and Lagrangian mechanics from an ‘open systems’ point of view:

  • Adam Yassine, Open systems in classical mechanics.

    Abstract. Using the framework of category theory, we formalize the heuristic principles that physicists employ in constructing the Hamiltonians for open classical systems as sums of Hamiltonians of subsystems. First we construct a category where the objects are symplectic manifolds and the morphisms are spans whose legs are surjective Poisson maps. Using a slight variant of Fong’s theory of decorated cospans, we then decorate the apices of our spans with Hamiltonians. This gives a category where morphisms are open classical systems, and composition allows us to build these systems from smaller pieces.

He also gets a functor from a category of Lagrangian open systems to this category of Hamiltonian systems.

Kenny Courser and I have been continuing my work with Blake Pollard and Brendan Fong on open Markov processes, bringing 2-morphisms into the game. It seems easiest to use a double category:

Abstract. Coarse-graining is a standard method of extracting a simple Markov process from a more complicated one by identifying states. Here we extend coarse-graining to open Markov processes. An ‘open’ Markov process is one where probability can flow in or out of certain states called ‘inputs’ and ‘outputs’. One can build up an ordinary Markov process from smaller open pieces in two basic ways: composition, where we identify the outputs of one open Markov process with the inputs of another, and tensoring, where we set two open Markov processes side by side. In previous work, Fong, Pollard and the first author showed that these constructions make open Markov processes into the morphisms of a symmetric monoidal category. Here we go further by constructing a symmetric monoidal double category where the 2-morphisms are ways of coarse-graining open Markov processes. We also extend the already known ‘black-boxing’ functor from the category of open Markov processes to our double category. Black-boxing sends any open Markov process to the linear relation between input and output data that holds in steady states, including nonequilibrium steady states where there is a nonzero flow of probability through the process. To extend black-boxing to a functor between double categories, we need to prove that black-boxing is compatible with coarse-graining.

Finally, the Complex Adaptive Systems Composition and Design Environment project with John Foley of Metron Scientific Solutions and my students Joseph Moeller and Blake Pollard has finally given birth to a paper! I hope this is just the first; it starts laying down the theoretical groundwork for designing networked systems. John is here now and we’re coming up with a bunch of new ideas:

  • John Baez, John Foley, Joseph Moeller and Blake Pollard, Network models.

Abstract. Networks can be combined in many ways, such as overlaying one on top of another or setting two side by side. We introduce network models to encode these ways of combining networks. Different network models describe different kinds of networks. We show that each network model gives rise to an operad, whose operations are ways of assembling a network of the given kind from smaller parts. Such operads, and their algebras, can serve as tools for designing networks. Technically, a network model is a lax symmetric monoidal functor from the free symmetric monoidal category on some set to Cat, and the construction of the corresponding operad proceeds via a symmetric monoidal version of the Grothendieck construction.

I blogged about this last one here:

David Hogg noise, calibration, and GALAH

Today I gave my second of five Hunstead Lectures at University of Sydney. It was about finding planets in the Kepler and K2 data, using our non-stationary Gaussian Process or linear model as a noise model. This is the model we wrote up in our Research Note of the AAS. In the question period, the question of confirmation or validation of planets came up. It is very real that the only way to validate most tiny planets is to make predictions for other data. But when will we have data more sensitive than Kepler? This is a significant problem for much of bleeding-edge astronomy.

Early in the morning I had a long call with Jason Wright (PSU) and Bedell (Flatiron) about the assessment of the calibration programs for extreme-precision RV surveys. My position is that it is possible to assess the end-to-end error budget in a data-driven way. That is, we can use ideas from causal inference to figure out what parts of the RV noise are coming from telescope plus instrument plus software. Wright didn't agree: He believes that large parts of the error budget can't be seen or calibrated. I guess we better start writing some kind of paper here.

In the afternoon I had a great discussion with Buder (MPIA), Sharma (USyd), and Bland-Hawthorn (USyd) about the current status of detailed elemental abundance measurements in GALAH. The element–element plots look fantastic, and clear trends and high precision are evident, just looking at the data. To extract these abundances, Buder has made a clever variant of The Cannon which makes use of the residuals away from a low-dimensional model to measure the detailed abundances. They are planning on doing a large data release in April.

n-Category Café The Polycategory of Multivariable Adjunctions

Adjunctions are well-known and fundamental in category theory. Somewhat less well-known are two-variable adjunctions, consisting of functors f : A \times B \to C, g : A^{op} \times C \to B, and h : B^{op} \times C \to A and natural isomorphisms

C(f(a,b),c) \cong B(b,g(a,c)) \cong A(a,h(b,c)).

These are also ubiquitous in mathematics, for instance in the notion of closed monoidal category, or in the hom-power-copower situation of an enriched category. But it seems that only fairly recently has there been a wider appreciation that it is worth defining and studying them in their own right (rather than simply as a pair of parametrized adjunctions f(a,-) \dashv g(a,-) and f(-,b) \dashv h(b,-)).
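A familiar concrete instance (my example, not from the post): take A = B = C = Set, with f(a,b) = a \times b, g(a,c) = c^a, and h(b,c) = c^b. The isomorphisms above are then just currying in each variable:

\mathrm{Set}(a \times b, c) \cong \mathrm{Set}(b, c^a) \cong \mathrm{Set}(a, c^b).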

Now, ordinary adjunctions are the morphisms of a 2-category Adj (with an arbitrary choice of direction, say pointing in the direction of the left adjoint), whose 2-cells are compatible pairs of natural transformations (a fundamental result being that either uniquely determines the other). It’s obvious to guess that two-variable adjunctions should be the binary morphisms in a multicategory of “n-ary adjunctions”, and this is indeed the case. In fact, Eugenia, Nick, and Emily showed that multivariable adjunctions form a cyclic multicategory, and indeed even a cyclic double multicategory.

In this post, however, I want to argue that it’s even better to regard multivariable adjunctions as forming a slightly different structure called a polycategory.

What is a polycategory? The first thing to say about it is that it’s like a multicategory, but it allows the codomain of a morphism to contain multiple objects, as well as the domain. Thus we have morphisms like f : (A,B) \to (C,D). However, this description is incomplete, even informally, because it doesn’t tell us how we are allowed to compose such morphisms. Indeed, there are many different structures that admit this same description, but differ in the ways that morphisms can be composed.

One such structure is a prop, which John and his students have been writing a lot about recently. In a prop, we compose by simply matching domains and codomains as lists — given f : (A,B) \to (C,D) and g : (C,D) \to (E,F) we get g \circ f : (A,B) \to (E,F) — and we can also place morphisms side by side — given f : (A,B) \to (C,D) and f' : (A',B') \to (C',D') we get (f,f') : (A,B,A',B') \to (C,D,C',D').

A polycategory is different: in a polycategory we can only “compose along single objects”, with the “leftover” objects in the codomain of f and the domain of g surviving into the codomain and domain of g \circ f. For instance, given f : (A,B) \to (C,D) and g : (E,C) \to (F,G) we get g \circ_C f : (E,A,B) \to (F,G,D). This may seem a little weird at first, and the usual examples (semantics for two-sided sequents in linear logic) are rather removed from the experience of most mathematicians. But in fact it’s exactly what we need for multivariable adjunctions!

I claim there is a polycategory MVar whose objects are categories and whose “poly-arrows” are multivariable adjunctions. What is a multivariable adjunction (A,B) \to (C,D)? There’s really only one possible answer, once you think to ask the question: it consists of four functors

f : C^{op} \times A \times B \to D \quad g : A \times B \times D^{op} \to C \quad h : A^{op} \times C \times D \to B \quad k : C \times D \times B^{op} \to A

and natural isomorphisms

D(f(c,a,b),d) \cong C(g(a,b,d),c) \cong B(b,h(a,c,d)) \cong A(a,k(c,d,b)).

I find this definition quite illuminating already. One of the odd things about a two-variable adjunction, as usually defined, is the asymmetric placement of opposites. (Indeed, I suspect this oddness may have been a not insignificant inhibitor to their formal study.) The polycategorical perspective reveals that this arises simply from the asymmetry of having a 2-ary domain but a 1-ary codomain: a “(2,2)-variable adjunction” as above looks much more symmetrical.

At this point it’s an exercise for the reader to write down the general notion of (n,m)-variable adjunction. Of course, a (1,1)-variable adjunction is an ordinary adjunction, and a (2,1)-variable adjunction is a two-variable adjunction in the usual sense. It’s also a nice exercise to convince yourself that polycategory-style composition “along one object” is also exactly right for multivariable adjunctions. For instance, suppose in addition to (f,g,h,k) : (A,B) \to (C,D) as above, we have a two-variable adjunction (\ell,m,n) : (D,E) \to Z with Z(\ell(d,e),z) \cong D(d,m(e,z)) \cong E(e,n(d,z)). Then we have a composite multivariable adjunction (A,B,E) \to (C,Z) defined by C(g(a,b,m(e,z)),c) \cong Z(\ell(f(c,a,b),e),z) \cong A(a,k(c,m(e,z),b)) \cong \cdots

It’s also interesting to consider what happens when the domain or codomain is empty. For instance, a (0,2)-variable adjunction () \to (A,B) consists of functors f : A^{op}\to B and g : B^{op}\to A and a natural isomorphism B(b,f(a)) \cong A(a,g(b)). This is sometimes called a mutual right adjunction or dual adjunction, and such things do arise in plenty of examples. Many Galois connections are mutual right adjunctions between posets, and also for instance the contravariant powerset functor is mutually right adjoint to itself. Similarly, a (2,0)-variable adjunction (A,B) \to () is a mutual left adjunction B(f(a),b) \cong A(g(b),a). Of course a mutual right or left adjunction can also be described as an ordinary adjunction between A^{op} and B, or between A and B^{op}, but the choice of which category to oppositize is arbitrary; the polycategory MVar respects mutual right and left adjunctions as independent objects rather than forcing them into the mold of ordinary adjunctions.
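The powerset example can be made very concrete. Here is a small Python illustration of my own (not from the post) of the bijection underlying that mutual right adjunction for finite sets: functions B → P(A) correspond to functions A → P(B), both amounting to relations between A and B.

    # Toy illustration of the contravariant powerset functor being mutually
    # right adjoint to itself: transpose a map B -> P(A) into a map A -> P(B).
    def transpose(f, A, B):
        """Given f: B -> P(A) as a dict b -> subset of A, return g: A -> P(B)."""
        return {a: {b for b in B if a in f[b]} for a in A}

    A = {1, 2}
    B = {"x", "y"}
    f = {"x": {1}, "y": {1, 2}}      # a map B -> P(A)
    g = transpose(f, A, B)           # the corresponding map A -> P(B)
    print(g)                         # {1: {'x', 'y'}, 2: {'y'}}
    assert transpose(g, B, A) == f   # transposing twice recovers f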

More generally, a (0,n)-variable adjunction () \to (A_1,\dots,A_n) is a “mutual right multivariable adjunction” between n contravariant functors f_i : A_{i+1}\times \cdots \times A_n \times A_1 \times \cdots \times A_{i-1}\to A_i^{op}. Just as a (0,2)-variable adjunction can be forced into the mold of a (1,1)-variable adjunction by oppositizing one category, an (n,1)-variable adjunction can be forced into the mold of a (0,n+1)-variable adjunction by oppositizing all but one of the categories — Eugenia, Nick, and Emily found this helpful in describing the cyclic action. But the polycategory MVar again treats them as independent objects.

What role, then, do opposite categories play in the polycategory MVar? Or put differently, what happened to the cyclic action on the multicategory? The answer is once again quite beautiful: opposite categories are duals. The usual notion of dual pair (A,B) in a monoidal category consists of a unit and counit \eta : I \to A\otimes B and \varepsilon : B \otimes A \to I satisfying the triangle identities. This cannot be phrased in a mere multicategory, because \eta involves two objects in its codomain (and \varepsilon involves zero), whereas in a multicategory every morphism has exactly one object in its codomain. But in a polycategory, with this restriction lifted, we can write \eta : () \to (A, B) and \varepsilon : (B,A)\to (), and it turns out that the composition rule of a polycategory is exactly what we need for the triangle identities to make sense: \varepsilon \circ_A \eta = 1_{B} and \varepsilon \circ_{B} \eta = 1_A.
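Continuing the toy Python sketch from above (again only an illustration of the typing, not of the identities themselves), composing \eta and \varepsilon along A or along B does produce arrows of the right shape to be identities:

    # eta : () -> (A,B) and eps : (B,A) -> (), composed along A and along B.
    eta = ([], ["A", "B"])
    eps = (["B", "A"], [])
    print(compose_along(eps, eta, "A"))   # (['B'], ['B'])  -- the shape of 1_B
    print(compose_along(eps, eta, "B"))   # (['A'], ['A'])  -- the shape of 1_A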

What is a dual pair in MVar? As we saw above, \eta is a mutual right adjunction B(b,\eta_1(a)) \cong A(a,\eta_2(b)), and \varepsilon is a mutual left adjunction B(\varepsilon_1(a),b) \cong A(\varepsilon_2(b),a). The triangle identities (suitably weakened up to isomorphism) say that \varepsilon_2 \circ \eta_1 \cong 1_A and \eta_2 \circ \varepsilon_1 \cong 1_A and \varepsilon_1 \circ \eta_2 \cong 1_B and \eta_1 \circ \varepsilon_2 \cong 1_B; thus these two adjunctions are actually both the same dual equivalence B\simeq A^{op}. In particular, there is a canonical dual pair (A,A^{op}), and any other dual pair is equivalent to this one.

Let me say that again: in the polycategory MVar, opposite categories are duals. I find this really exciting: opposite categories are one of the more mysterious parts of category theory to me, largely because they don’t have a universal property in Cat; but in MVar, they do! To be sure, they also have universal properties in other places. In 1606.05058 I noted that you can give them a universal property as a representing object for contravariant functors; but this is fairly tautological. And it’s also well-known that they are duals in the usual monoidal sense (not our generalized polycategory sense) in the monoidal bicategory Prof; but this characterizes them only up to Morita equivalence, whereas the duality in MVar characterizes them up to ordinary equivalence of categories. Of course, we did already use opposite categories in defining the notion of multivariable adjunction, so it’s not as if this produces them out of thin air; but I do feel that it does give an important insight into what they are.

In particular, the dual pair (A,A^{op}) allows us to implement the “cyclic action” on multivariable adjunctions by simple composition. Given a (2,1)-variable adjunction (A,B) \to C, we can compose it polycategorically with \eta : () \to (A,A^{op}) to obtain a (1,2)-variable adjunction B \to (A^{op},C). Then we can compose that with \varepsilon : (C^{op},C)\to () to obtain another (2,1)-variable adjunction (B,C^{op})\to A^{op}. This is exactly the action of the cyclic structure described by Eugenia, Nick, and Emily on our original multivariable adjunction. (In fact, there’s a precise sense in which a cyclic multicategory is “almost” equivalent to a polycategory with duals; for now I’ll leave that as an exercise for the reader.)

Note the similarity to how dual pairs in a monoidal category shift back and forth: Hom(A\otimes B, C) \cong Hom(B, A^\ast \otimes C) \cong Hom(B\otimes C^\ast, A^\ast). In string diagram notation, the latter is represented by “turning strings around”, regarding the unit and counit of the dual pair (A,A^\ast) as a “cup” and “cap”. Pleasingly, there is also a string diagram notation for polycategories, in which dual pairs behave exactly the same way; we simply restrict the ways that strings are allowed to be connected together — for instance, no two vertices can be joined by more than one string. (More generally, the condition is that the string diagram should be “simply connected”.)

In future posts I’ll explore some other neat things related to the polycategory MVar. For now, let me leave you with some negative thinking puzzles:

  • What is a (0,1)-variable adjunction?
  • How about a (1,0)-variable adjunction?
  • How about a (0,0)-variable adjunction?

November 10, 2017

John BaezBiology as Information Dynamics (Part 3)

On Monday I’m giving this talk at Caltech:

Biology as information dynamics, November 13, 2017, 4:00–5:00 pm, General Biology Seminar, Kerckhoff 119, Caltech.

If you’re around, please check it out! I’ll be around all day talking to people, including Erik Winfree, my graduate student host Fangzhou Xiao, and other grad students.

If you can’t make it, you can watch this video! It’s a neat subject, and I want to do more on it:

Abstract. If biology is the study of self-replicating entities, and we want to understand the role of information, it makes sense to see how information theory is connected to the ‘replicator equation’ — a simple model of population dynamics for self-replicating entities. The relevant concept of information turns out to be the information of one probability distribution relative to another, also known as the Kullback–Leibler divergence. Using this we can get a new outlook on free energy, see evolution as a learning process, and give a clearer, more general formulation of Fisher’s fundamental theorem of natural selection.


November 09, 2017

Robert HellingWhy is there a supercontinent cycle?

One of the most influential books of my early childhood was my "Kinderatlas".
There were many things to learn about the world (maps were actually only the last third of the book) and for example I blame my fascination for scuba diving on this book. Also last year, when we visited the Mont-Doré in Auvergne and I had to explain to my kids how volcanoes are formed, to make them forget how many stairs were still ahead of them to the summit, I did that while mentally picturing the pages in that book about plate tectonics.


But there is one thing about tectonics that has been bothering me for a long time and I still haven't found a good explanation for it (or at least an acknowledgement that there is something to explain): Since the days of Alfred Wegener we know that the jigsaw-puzzle pieces of the continents fit together in such a way that geologists believe that some hundred million years ago they were all connected as a supercontinent, Pangea.
[Animation: the breakup of Pangea (USGS animation A08, public domain, via Wikipedia user Tbower)]

In fact, that was only the last in a series of supercontinents that keep forming and breaking up in the "supercontinent cycle".
[Image: the supercontinent cycle (by SimplisticReps, own work, CC BY-SA 4.0)]

So here is the question: I am happy with the idea of several (say $N$) plates, each roughly containing a continent, floating around on the magma, driven by all kinds of convection processes in the liquid part of the earth. They move around in a pattern that looks pretty chaotic to me (in the non-technical sense), and of course for random motion you would expect that from time to time two of them collide and then maybe stick together for a while.

Then it would be possible that a third plate also collides with those two, but that would be a coincidence (like two random lines typically intersect, but three random lines typically intersect in pairs and not in a triple intersection). But to form a supercontinent, you need all $N$ plates to miraculously collide at the same time. This order-$N$ process seems highly unlikely if the motion is random, let alone the fact that it seems to repeat. So this motion cannot be random (yes, Sabine, this is a naturalness argument). This needs an explanation.

So, why, every few hundred million years, do all the land masses of the earth assemble on one side of the earth?

One explanation could for example be that during those times, the center of mass of the earth is not at the symmetry center, so the water of the oceans flows to one side of the earth and reveals the seabed on the opposite side. Then you would have essentially one big island. But this seems not to be the case, as the continents (those parts that are above sea-level) appear to be stable on much longer time scales. It is not that the seabed comes up on one side while the land on the other side goes under water; rather, the land masses actually move around to meet on one side.

I have already asked this question whenever I ran into people with a geosciences education, but it is still open (and I have to admit that in a non-zero number of cases I failed to even make clear why an $N$-body collision needs an explanation). But I am sure you, my readers, know the answer, or even better can come up with one.

November 08, 2017

Terence TaoIPAM program in quantitative linear algebra, Mar 19-Jun 15 2018

Alice Guionnet, Assaf Naor, Gilles Pisier, Sorin Popa, Dimitri Shlyakhtenko, and I are organising a three-month program here at the Institute for Pure and Applied Mathematics (IPAM) on the topic of Quantitative Linear Algebra.  The purpose of this program is to bring together mathematicians and computer scientists (both junior and senior) working in various quantitative aspects of linear operators, particularly in large finite dimension.  Such aspects include, but are not restricted to, discrepancy theory, spectral graph theory, random matrices, geometric group theory, ergodic theory, von Neumann algebras, as well as specific research directions such as the Kadison-Singer problem, the Connes embedding conjecture and the Grothendieck inequality.  There will be several workshops and tutorials during the program (for instance I will be giving a series of introductory lectures on random matrix theory).

While we already have several confirmed participants, we are still accepting applications for this program until Dec 4; details of the application process may be found at this page.


Filed under: advertising Tagged: ipam, linear algebra

November 07, 2017

Doug NatelsonTaxes and grad student tuition

As has happened periodically over the last couple of decades (I remember a scare about this when Newt Gingrich's folks ran Congress in the mid-1990s), a tax bill has been put forward in the US House that would treat graduate student tuition waivers like taxable income (roughly speaking).   This is discussed a little bit here, and here.

Here's an example of why this is an ill-informed idea.  Suppose a first-year STEM grad student comes to a US university, and they are supported by, say, departmental fellowship funds or a TA position during that first year.  Their stipend is something like $30K.  These days the university waives their graduate tuition - that is, they do not expect the student to pony up tuition funds.  At Rice, that tuition is around $45K.  Under the proposed legislation, the student would end up getting taxed as if their income was $75K, when their actual gross pay is $30K.   
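To put rough numbers on it (purely hypothetical ones — the 25% marginal rate below is an assumption for illustration, not an actual bracket), the extra tax from treating the waiver as income would be on the order of ten thousand dollars a year for a student whose cash income is $30K:

    # Hypothetical back-of-the-envelope; the marginal rate is an assumption.
    stipend = 30_000          # actual cash income
    tuition_waiver = 45_000   # waived tuition, newly counted as income
    marginal_rate = 0.25      # assumed marginal rate, for illustration only
    extra_tax = tuition_waiver * marginal_rate
    print(f"Taxed as if income were ${stipend + tuition_waiver:,}")
    print(f"Extra tax at an assumed {marginal_rate:.0%} rate: ~${extra_tax:,.0f}")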

That would be extremely bad for both graduate students and research universities.  Right off the bat this would create unintended (I presume) economic incentives for grad students to drop out of their programs, and/or for universities to play funny games with what they say is graduate tuition.

This has been pitched multiple times before, and my hypothesis is that it's put forward by congressional staffers who do not understand graduate school (and/or think that this is the same kind of tuition waiver as when a faculty member's child gets a vastly reduced tuition for attending the parent's employing university).  Because it is glaringly dumb, it has been fixed whenever it's come up before.  In the present environment, the prudent thing to do would be to exercise caution and let legislators know that this is a problem that needs to be fixed.

November 06, 2017

BackreactionHow Popper killed Particle Physics

Popper, upside-down. Image: Wikipedia. Popper is dead. Has been dead since 1994 to be precise. But also his philosophy, that a scientific idea needs to be falsifiable, is dead. And luckily so, because it was utterly impractical. In practice, scientists can’t falsify theories. That’s because any theory can be amended in hindsight so that it fits new data. Don’t roll your eyes – updating your

Tommaso DorigoThings That Can Decay To Boson Pairs

Writing a serious review of research in particle physics is a refreshing job - all the things that you already knew on that specific topic once sat on a fuzzy cloud somewhere in your brain, and now find their place in a tidily organized space, with clear interdependence among them. That's what I am experiencing as I progress with a 60-pageish thing on hadron collider searches for diboson resonances, which will appear sometime next year in a very high impact factor journal.

read more

November 03, 2017

John BaezComplex Adaptive Systems (Part 6)

I’ve been slacking off on writing this series of posts… but for a good reason: I’ve been busy writing a paper on the same topic! In the process I caught a couple of mistakes in what I’ve said so far. But more importantly, there’s a version out now that you can read:

• John Baez, John Foley, Blake Pollard and Joseph Moeller, Network models.

There will be two talks about this at the AMS special session on Applied Category Theory this weekend at U. C. Riverside: one by John Foley of Metron Inc., and one by my grad student Joseph Moeller. I’ll try to get their talk slides someday. But for now, here’s the basic idea.

Our goal is to build operads suited for designing networks. These could be networks where the vertices represent fixed or moving agents and the edges represent communication channels. More generally, they could be networks where the vertices represent entities of various types, and the edges represent relationships between these entities—for example, that one agent is committed to take some action involving the other. This paper arose from an example where the vertices represent planes, boats and drones involved in a search and rescue mission in the Caribbean. However, even for this one example, we wanted a flexible formalism that can handle networks of many kinds, described at a level of detail that the user is free to adjust.

To achieve this flexibility, we introduced a general concept of ‘network model’. Simply put, a network model is a kind of network. Any network model gives an operad whose operations are ways to build larger networks of this kind by gluing smaller ones. This operad has a ‘canonical’ algebra where the operations act to assemble networks of the given kind. But it also has other algebras, where it acts to assemble networks of this kind equipped with extra structure and properties. This flexibility is important in applications.

What exactly is a ‘kind of network’? That’s the question we had to answer. We started with some examples. At the crudest level, we can model networks as simple graphs. If the vertices are agents of some sort and the edges represent communication channels, this means we allow at most one channel between any pair of agents.

However, simple graphs are too restrictive for many applications. If we allow multiple communication channels between a pair of agents, we should replace simple graphs with ‘multigraphs’. Alternatively, we may wish to allow directed channels, where the sender and receiver have different capabilities: for example, signals may only be able to flow in one direction. This requires replacing simple graphs with ‘directed graphs’. To combine these features we could use ‘directed multigraphs’.

But none of these are sufficiently general. It’s also important to consider graphs with colored vertices, to specify different types of agents, and colored edges, to specify different types of channels. This leads us to ‘colored directed multigraphs’.

All these are examples of what we mean by a ‘kind of network’, but none is sufficiently general. More complicated kinds, such as hypergraphs or Petri nets, are likely to become important as we proceed.

Thus, instead of separately studying all these kinds of networks, we introduced a unified notion that subsumes all these variants: a ‘network model’. Namely, given a set C of ‘vertex colors’, a network model is a lax symmetric monoidal functor

F: \mathbf{S}(C) \to \mathbf{Cat}

where \mathbf{S}(C) is the free strict symmetric monoidal category on C and \mathbf{Cat} is the category of small categories.

Unpacking this somewhat terrifying definition takes a little work. It simplifies in the special case where F takes values in \mathbf{Mon}, the category of monoids. It simplifies further when C is a singleton, since then \mathbf{S}(C) is the groupoid \mathbf{S}, where objects are natural numbers and morphisms from m to n are bijections

\sigma: \{1,\dots,m\} \to \{1,\dots,n\}

If we impose both these simplifying assumptions, we have what we call a one-colored network model: a lax symmetric monoidal functor

F : \mathbf{S} \to \mathbf{Mon}

As we shall see, the network model of simple graphs is a one-colored network model, and so are many other motivating examples. If you like André Joyal’s theory of ‘species’, then one-colored network models should be pretty fun, since they’re species with some extra bells and whistles.

But if you don’t, there’s still no reason to panic. In relatively down-to-earth terms, a one-colored network model amounts to roughly this. If we call elements of F(n) ‘networks with n vertices’, then:

• Since F(n) is a monoid, we can overlay two networks with the same number of vertices and get a new one. We call this operation

\cup \colon F(n) \times F(n) \to F(n)

• Since F is a functor, the symmetric group S_n acts on the monoid F(n). Thus, for each \sigma \in S_n, we have a monoid automorphism that we call simply

\sigma \colon F(n) \to F(n)

• Since F is lax monoidal, we also have an operation

\sqcup \colon F(m) \times F(n) \to F(m+n)

We call this operation the disjoint union of networks. In examples like simple graphs, it looks just like what it sounds like.

Unpacking the abstract definition further, we see that these operations obey some equations, which we list in Theorem 11 of our paper. They’re all obvious if you draw pictures of examples… and don’t worry, our paper has a few pictures. (We plan to add more.) For example, the ‘interchange law’

(g \cup g') \sqcup (h \cup h') = (g \sqcup h) \cup (g' \sqcup h')

holds whenever g,g' \in F(m) and h, h' \in F(n). This is a nice relationship between overlaying networks and taking their disjoint union.
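To see these operations in one concrete case, here is a small Python sketch of my own (not code from our paper) of the one-colored network model of simple graphs: a network with n vertices is a set of undirected edges on {0, …, n−1}, overlay is union of edge sets, the symmetric group acts by relabelling vertices, and disjoint union shifts the second graph's vertices.

    # Toy sketch of the one-colored network model of simple graphs.
    def overlay(g, h):
        """Monoid operation on F(n): union of edge sets on the same vertices."""
        return g | h

    def act(sigma, g):
        """Action of a permutation sigma (a dict i -> sigma(i)) on a graph."""
        return {frozenset({sigma[i], sigma[j]}) for (i, j) in map(tuple, g)}

    def disjoint_union(g, m, h):
        """Lax structure map F(m) x F(n) -> F(m+n): shift h's vertices by m."""
        shifted = {frozenset({i + m, j + m}) for (i, j) in map(tuple, h)}
        return g | shifted

    # A quick check of the interchange law with g, g' in F(2) and h, h' in F(2).
    g, g2 = {frozenset({0, 1})}, set()
    h, h2 = {frozenset({0, 1})}, set()
    assert act({0: 1, 1: 0}, g) == g          # relabelling a single edge
    lhs = disjoint_union(overlay(g, g2), 2, overlay(h, h2))
    rhs = overlay(disjoint_union(g, 2, h), disjoint_union(g2, 2, h2))
    assert lhs == rhs                          # both are {{0,1}, {2,3}}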

In Section 2 of our paper we study one-colored network models, and give lots of examples. In Section 3 we describe a systematic procedure for getting one-colored network models from monoids. In Section 4 we study general network models and give examples of these. In Section 5 we describe a category \mathbf{NetMod} of network models, and show that the procedure for getting network models from monoids is functorial. We also make \mathbf{NetMod} into a symmetric monoidal category, and give examples of how to build new network models by tensoring old ones.

Our main result is that any network model gives a typed operad, also known as a ‘colored operad’. This operad has operations that describe how to stick networks of the given kind together to form larger networks of this kind. This operad has a ‘canonical algebra’, where it acts on networks of the given kind—but the real point is that it has lots of other algebras, where it acts on networks of the given kind equipped with extra structure and properties.

The technical heart of our paper is Section 6, mainly written by Joseph Moeller. This provides the machinery to construct operads from network models in a functorial way. Category theorists should find this section interesting, because it describes enhancements of the well-known ‘Grothendieck construction’ of the category of elements \int F of a functor

F: \mathbf{C} \to \mathbf{Cat}

where \mathbf{C} is any small category. For example, if \mathbf{C} is symmetric monoidal and F : \mathbf{C} \to (\mathbf{Cat}, \times) is lax symmetric monoidal, then we show \int F is symmetric monoidal. Moreover, we show that the construction sending the lax symmetric monoidal functor F to the symmetric monoidal category \int F is functorial.

In Section 7 we apply this machinery to build operads from network models. In Section 8 we describe some algebras of these operads, including an algebra whose elements are networks of range-limited communication channels. In future work we plan to give many more detailed examples, and to explain how these algebras, and the homomorphisms between them, can be used to design and optimize networks.

I want to explain all this in more detail—this is a pretty hasty summary, since I’m busy this week. But for now you can read the paper!


John BaezApplied Category Theory at UCR (Part 2)

I’m running a special session on applied category theory, and now the program is available:

Applied category theory, Fall Western Sectional Meeting of the AMS, 4-5 November 2017, U.C. Riverside.

This is going to be fun.

My former student Brendan Fong is now working with David Spivak at MIT, and they’re both coming. My collaborator John Foley at Metron is also coming: we’re working on the CASCADE project for designing networked systems.

Dmitry Vagner is coming from Duke: he wrote a paper with David and Eugene Lerman on operads and open dynamical systems. Christina Vasilakopoulou, who has worked with David and Patrick Schultz on dynamical systems, has just joined our group at UCR, so she will also be here. And the three of them have worked with Ryan Wisnesky on algebraic databases. Ryan will not be here, but his colleague Peter Gates will: together with David they have a startup called Categorical Informatics, which uses category theory to build sophisticated databases.

That’s not everyone—for example, most of my students will be speaking at this special session, and other people too—but that gives you a rough sense of some people involved. The conference is on a weekend, but John Foley and David Spivak and Brendan Fong and Dmitry Vagner are staying on for longer, so we’ll have some long conversations… and Brendan will explain decorated corelations in my Tuesday afternoon network theory seminar.

Here’s the program. Click on talk titles to see abstracts. For a multi-author talk, the person with the asterisk after their name is doing the talking. All the talks will be in Room 268 of the Highlander Union Building or ‘HUB’.

Saturday November 4, 2017, 9:00 a.m.-10:50 a.m.

9:00 a.m.
A higher-order temporal logic for dynamical systems.
David I. Spivak, MIT

10:00 a.m.
Algebras of open dynamical systems on the operad of wiring diagrams.
Dmitry Vagner*, Duke University
David I. Spivak, MIT
Eugene Lerman, University of Illinois at Urbana-Champaign

10:30 a.m.
Abstract dynamical systems.
Christina Vasilakopoulou*, University of California, Riverside
David Spivak, MIT
Patrick Schultz, MIT

Saturday November 4, 2017, 3:00 p.m.-5:50 p.m.

3:00 p.m.
Black boxes and decorated corelations.
Brendan Fong, MIT

4:00 p.m.
Compositional modelling of open reaction networks.
Blake S. Pollard*, University of California, Riverside
John C. Baez, University of California, Riverside

4:30 p.m.
A bicategory of coarse-grained Markov processes.
Kenny Courser, University of California, Riverside

5:00 p.m.
A bicategorical syntax for pure state qubit quantum mechanics.
Daniel M. Cicala, University of California, Riverside

5:30 p.m.
Open systems in classical mechanics.
Adam Yassine, University of California Riverside

Sunday November 5, 2017, 9:00 a.m.-10:50 a.m.

9:00 a.m.
Controllability and observability: diagrams and duality.
Jason Erbele, Victor Valley College

9:30 a.m.
Frobenius monoids, weak bimonoids, and corelations.
Brandon Coya, University of California, Riverside

10:00 a.m.
Compositional design and tasking of networks.
John D. Foley*, Metron, Inc.
John C. Baez, University of California, Riverside
Joseph Moeller, University of California, Riverside
Blake S. Pollard, University of California, Riverside

10:30 a.m.
Operads for modeling networks.
Joseph Moeller*, University of California, Riverside
John Foley, Metron Inc.
John C. Baez, University of California, Riverside
Blake S. Pollard, University of California, Riverside

Sunday November 5, 2017, 2:00 p.m.-4:50 p.m.

2:00 p.m.
Reeb graph smoothing via cosheaves.
Vin de Silva, Department of Mathematics, Pomona College

3:00 p.m.
Knowledge representation in bicategories of relations.
Evan Patterson*, Stanford University, Statistics Department

3:30 p.m.
The multiresolution analysis of flow graphs.
Steve Huntsman*, BAE Systems

4:00 p.m.
Data modeling and integration using the open source tool Algebraic Query Language (AQL).
Peter Y. Gates*, Categorical Informatics
Ryan Wisnesky, Categorical Informatics


November 02, 2017

BackreactionBook Review: Max Tegmark “Our Mathematical Universe”

Our Mathematical Universe: My Quest for the Ultimate Nature of Reality Knopf (January 2014) Max Tegmark just published his second book, “Life 3.0.” I gracefully declined reviewing it, seeing that three years weren’t sufficient to finish his first book. But thusly reminded of my shortfall, I made another attempt and finally got to the end. So here’s a late review or, if you haven’t made it

October 31, 2017

Doug NatelsonLinks + coming soon

Real life is a bit busy right now, but I wanted to point out a couple of links and talk about what's coming up.
  • I've been looking for ways to think about and discuss topological materials that might be more broadly accessible to non-experts, and I found this paper and videos like this one and this one.  Very cool, and I'm sorry I'd missed it back in '15 when it came out.
  • In the experimental literature talking about realizations of Majorana fermions in the solid state, a key signature is a peak in the conductance at zero voltage - that's an indicator that there is a "zero-energy mode" in the system.  There are other ways to get zero-bias peaks, though, and nailing down whether this has the expected properties (magnitude, response to magnetic fields) has been a lingering issue.  This seems to nail down the situation more firmly.
  • Discussions about "quantum supremacy" strictly in terms of how many qubits can be simulated on a classical computer right now seem a bit silly to me.  Ok, so IBM managed to simulate a handful of additional qubits (56 rather than 49).  It wouldn't shock me if they could get up to 58 - supercomputers are powerful and programmers can be very clever.  Are we going to get a flurry of news stories every time about how this somehow moves the goalposts for quantum computers?    
  • I'm hoping to put out a review of Max the Demon and the Entropy of Doom, since I received my beautifully printed copies this past weekend.

Chad OrzelGo On Till You Come to the End; Then Stop

ScienceBlogs is coming to an end. I don’t know that there was ever a really official announcement of this, but the bloggers got email a while back letting us know that the site will be closing down. I’ve been absolutely getting crushed between work and the book-in-progress and getting Charlie the pupper, but I did manage to export and re-import the content to an archive site back on steelypips.org. (The theme there is an awful default WordPress one, but I’m too slammed with work to make it look better; the point is just to have an online archive for the temporary redirects to work with.)

I’m one of a handful who were there from the very beginning to the bitter end– I got asked to join up in late 2005, and the first new post here was on January 11, 2006 (I copied over some older content before it went live, so it wasn’t just a blank page with a “Welcome to my new blog!” post). It seems fitting to have the last post be on the site’s last day of operation.

The history of ScienceBlogs and my place in it was… complicated. There were some early efforts to build a real community among the bloggers, but we turned out to be an irascible lot, and after a while that kind of fell apart. The site was originally associated with Seed magazine, which folded, then it was a stand-alone thing for a bit, then partnered with National Geographic, and the last few years it’s been an independent entity again. I’ve been mostly blogging at Forbes since mid-2015, so I’ve been pretty removed from the network– I’m honestly not even sure what blogs have been active in the past few years. I’ll continue to blog at Forbes, and may or may not re-launch more personal blogging at the archive site. A lot of that content is now posted to

What led to the slow demise of ScienceBlogs? Like most people who’ve been associated with it over the years, I have Thoughts on the subject, but I don’t really feel like airing them at this point. (If somebody else wants to write an epic oral history of SB, email me, and we can talk…) I don’t think it was ever going to be a high-margin business, and there were a number of mis-steps over the years that undercut the money-making potential even more. I probably burned or at least charred some bridges by staying with the site as long as I did, but whatever. And it’s not like anybody else is getting fabulously wealthy from running blog networks that pay reasonable rates.

ScienceBlogs unquestionably gave an enormous boost to my career. I’ve gotten any number of cool opportunities as a direct result of blogging here, most importantly my career as a writer of pop-physics books. There were some things along the way that didn’t pan out as I’d hoped, but this site launched me to what fame I have, and I’ll always be grateful for that.

So, ave atque vale, ScienceBlogs. It was a noble experiment, and the good days were very good indeed.

October 30, 2017

Chad OrzelMeet Charlie

It’s been a couple of years since we lost the Queen of Niskayuna, and we’ve held off getting a dog until now because we were planning a big home renovation– adding on to the mud room, creating a new bedroom on the second floor, and gutting and replacing the kitchen. This was quite the undertaking, and we would not have wanted to put a dog through that. It was bad enough putting us through that…

With the renovation complete, we started looking for a dog a month or so back, and eventually ended up working with a local rescue group with the brilliantly unsubtle name Help Orphan Puppies. This weekend, we officially adopted this cutie:

Charlie, the new pupper at Chateau Steelypips, showing off his one pointy ear.

He was listed on the website as “Prince,” but his foster family had been calling him “Charlie,” and the kids liked that name a lot, so we’re keeping it. He’s a Plott Hound mix (the “mix” being evident in the one ear that sticks up while the other flops down), one of six puppies found with his mother back in May in a ravine in (I think they said) South Carolina. He’s the last of the litter to find a permanent home. The name change is appropriate, as Emmy was listed as “Princess” before we adopted her and changed her name.

Charlie’s a sweet and energetic boy, who’s basically housebroken, and sorta-kinda crate trained, which is about the same as Emmy when we got her. He knows how to sit, and is learning other commands. He’s very sweet with people, and we haven’t really met any other dogs yet, but he was fostered in a home with two other dogs, so we hope he’ll do well. And he’s super good at jumping– he cleared a 28″ child safety gate we were attempting to use to keep him in the mud room– and does a zoom with the best of them:

Charlie does a zoom.

The kids are absolutely over the moon about having a dog again, as you can see from their paparazzi turn:

Charlie poses for the paparazzi.

He’s a very good boy, all in all, and we’re very pleased to have him. I can’t really describe how good it felt on Saturday afternoon to once again settle down on the couch with a football game on tv, and drop my hand down to pet a dog lying on the floor next to me. I still miss some things about Emmy, but Charlie’s already filling a huge void.

Tommaso DorigoThe Future Of The LHC, And The Human Factor

Today at CERN a workshop started on the physics of the High-Luminosity and High-Energy phases of Large Hadron Collider operations. This is a three-day event meant to prepare the ground for the decision on which, among the several possible scenarios that have been pictured for the future of particle physics in Europe, will be the one in which the European community invests over the next few decades. The so-called "European Strategy for particle physics" will be decided in a couple of years, but getting the hard data on which to base that crucial decision is today's job. 

Some context

read more

October 28, 2017

BackreactionNo, you still cannot probe quantum gravity with quantum optics

Srsly? Several people asked me to comment on a paper that is hyped by phys.org as a test of quantum gravity. I’ll make this brief. First things first, why are you still following phys.org? Second, the paper in question is on the arXiv and is titled “Probing noncommutative theories with quantum optical experiments.” The paper is as wrong as a very similar paper was in 2012. It is correct

John BaezApplied Category Theory 2018 — Adjoint School

The deadline for applying to this ‘school’ on applied category theory is Wednesday November 1st.

Applied Category Theory: Adjoint School: online sessions starting in January 2018, followed by a meeting 23–27 April 2018 at the Lorentz Center in Leiden, the Netherlands. Organized by Bob Coecke (Oxford), Brendan Fong (MIT), Aleks Kissinger (Nijmegen), Martha Lewis (Amsterdam), and Joshua Tan (Oxford).

The name ‘adjoint school’ is a bad pun, but the school should be great. Here’s how it works:

Overview

The Workshop on Applied Category Theory 2018 takes place in May 2018. A principal goal of this workshop is to bring early career researchers into the applied category theory community. Towards this goal, we are organising the Adjoint School.

The Adjoint School will run from January to April 2018. By the end of the school, each participant will:

  • be familiar with the language, goals, and methods of four prominent, current research directions in applied category theory;
  • have worked intensively on one of these research directions, mentored by an expert in the field; and
  • know other early career researchers interested in applied category theory.

They will then attend the main workshop, well equipped to take part in discussions across the diversity of applied category theory.

Structure

The Adjoint School comprises (1) an Online Reading Seminar from January to April 2018, and (2) a four day Research Week held at the Lorentz Center, Leiden, The Netherlands, from Monday April 23rd to Thursday April 26th.

In the Online Reading Seminar we will read papers on current research directions in applied category theory. The seminar will consist of eight iterations of a two week block. Each block will have one paper as assigned reading, two participants as co-leaders, and three phases:

  • A presentation (over WebEx) on the assigned reading delivered by the two block co-leaders.
  • Reading responses and discussion on a private forum, facilitated by Brendan Fong and Nina Otter.
  • Publication of a blog post on the n-Category Café written by the co-leaders.

Each participant will be expected to co-lead one block.

The Adjoint School is taught by mentors John Baez, Aleks Kissinger, Martha Lewis, and Pawel Sobocinski. Each mentor will mentor a working group comprising four participants. During the second half of the Online Reading Seminar, these working groups will begin to meet with their mentor (again over video conference) to learn about open research problems related to their reading.

In late April, the participants and the mentors will convene for a four day Research Week at the Lorentz Center. After opening lectures by the mentors, the Research Week will be devoted to collaboration within the working groups. Morning and evening reporting sessions will keep the whole school informed of the research developments of each group.

The following week, participants will attend Applied Category Theory 2018, a discussion-based 60-attendee workshop at the Lorentz Center. Here they will have the chance to meet senior members across the applied category theory community and learn about ongoing research, as well as industry applications.

Following the school, successful working groups will be invited to contribute to a new, to be launched, CUP book series.

Reading list

Meetings will be on Mondays; we will determine a time depending on the locations of the chosen participants.

Research projects

John Baez: Semantics for open Petri nets and reaction networks
Petri nets and reaction networks are widely used to describe systems of interacting entities in computer science, chemistry and other fields, but the study of open Petri nets and reaction networks is new and raises many new questions connected to Lawvere’s “functorial semantics”.
Reading: Fong; Baez and Pollard.

Aleks Kissinger: Unification of the logic of causality
Employ the framework of (pre-)causal categories to unite notions of causality and techniques for causal reasoning which occur in classical statistics, quantum foundations, and beyond.
Reading: Kissinger and Uijlen; Henson, Lal, and Pusey.

Martha Lewis: Compositional approaches to linguistics and cognition
Use compact closed categories to integrate compositional models of meaning with distributed, geometric, and other meaning representations in linguistics and cognitive science.
Reading: Coecke, Sadrzadeh, and Clark; Bolt, Coecke, Genovese, Lewis, Marsden, and Piedeleu.

Pawel Sobocinski: Modelling of open and interconnected systems
Use Carboni and Walters’ bicategories of relations as a multidisciplinary algebra of open and interconnected systems.
Reading: Carboni and Walters; Willems.

Applications

We hope that each working group will comprise both participants who specialise in category theory and in the relevant application field. As a prerequisite, those participants specialising in category theory should feel comfortable with the material found in Categories for the Working Mathematician or its equivalent; those specialising in applications should have a similar graduate-level introduction.

To apply, please fill out the form here. You will be asked to upload a single PDF file containing the following information:

  • Your contact information and educational history.
  • A brief paragraph explaining your interest in this course.
  • A paragraph or two describing one of your favorite topics in category theory, or your application field.
  • A ranked list of the papers you would most like to present, together with an explanation of your preferences. Note that the paper you present determines which working group you will join.

You may add your CV if you wish.

Anyone is welcome to apply, although preference may be given to current graduate students and postdocs. Women and members of other underrepresented groups within applied category theory are particularly encouraged to apply.

Some support will be available to help with the costs (flights, accommodation, food, childcare) of attending the Research Week and the Workshop on Applied Category Theory; please indicate in your application if you would like to be considered for such support.

If you have any questions, please feel free to contact Brendan Fong (bfo at mit dot edu) or Nina Otter (otter at maths dot ox dot ac dot uk).

Application deadline: November 1st, 2017.


October 27, 2017

Terence TaoUCLA Math Undergraduate Merit Scholarship for 2018

In 2010, the UCLA mathematics department launched a scholarship opportunity for entering freshman students with exceptional background and promise in mathematics. We are able to offer one scholarship each year.  The UCLA Math Undergraduate Merit Scholarship provides for full tuition, and a room and board allowance for 4 years, contingent on continued high academic performance. In addition, scholarship recipients follow an individualized accelerated program of study, as determined after consultation with UCLA faculty.   The program of study leads to a Masters degree in Mathematics in four years.

More information and an application form for the scholarship can be found on the web at:

http://www.math.ucla.edu/ugrad/mums

To be considered for Fall 2018, candidates must apply for the scholarship and also for admission to UCLA on or before November 30, 2017.


Filed under: advertising Tagged: scholarship, UCLA, undergraduate study

October 26, 2017

Tommaso DorigoA Simple Two-Mover

My activity as a chessplayer has seen a steady decline in the past three years, due to overwhelming work obligations. To play in chess tournaments at a decent level, you not only need to be physically fit and well trained for the occasion, but also have your mind free from other thoughts. Alas, I have been failing miserably in the second and third of the above requirements. So I have essentially retired from competitive chess, and my only connection to the chess world is through the occasional 5-minute blitz game over the internet.

read more

October 25, 2017

Doug NatelsonThoughts after a NSF panel

I just returned from a NSF proposal review panel.  I had written about NSF panels back in the early days of this blog here, back when I may have been snarkier.

  • Some things have gotten better.  We can work from our own laptops, and I think we're finally to the point where everyone at these things is computer literate and can use the online review system.  The program officers do a good job making sure that the reviews get in on time (ahead of the meeting).
  • Some things remain the same.  I'm still mystified at how few people from top-ranked programs (e.g., Harvard, Stanford, MIT, Cornell, Cal Tech, Berkeley) I see at these.  Maybe I just don't move in the right circles.  
  • Best quote of the panel:  "When a review of one of my papers or proposals starts with 'Author says' rather than 'The author says', I know that the referee is Russian and I'm in trouble."
  • Why does the new NSF headquarters have tighter security screenings than Reagan National Airport?  
  • The growth of funding costs and eight years of numerically flat budgets have made this process more painful.  Sure looks like morale is not great at the agency.  Really not clear where this is all going to go over the next few years.  There was a lot of gallows humor about having "taxpayer advocates" on panels.  (Everyone on the panel is a US taxpayer already, though apparently that doesn't count for anything because we are scientists.)
  • NSF is still the most community-driven of the research agencies. 
  • I cannot overstate the importance of younger scientists going to one of these and seeing how the system works, so you learn how proposals are evaluated.




October 24, 2017

Jordan EllenbergThe greatest Astro/Dodger

The World Series is here and so it’s time again to figure out which player in the history of baseball has had the most distinguished joint record of contributions to both teams in contention for the title.  (Last year:  Riggs Stephenson was the greatest Cub/Indian.)  Astros history just isn’t that long, so it’s a little surprising to find we come up with a really solid winner this year:  Jimmy Wynn, “The Toy Cannon,” a longtime Astro who moved to LA in 1974 and had arguably his best season, finishing 5th in MVP voting and leading the Dodgers to a pennant.  Real three-true-outcomes guy:  led the league in walks twice and strikeouts once, and was top-10 in the National League in home runs four times in the Astrodome.  Career total of 41.4 WAR for the Astros, and 12.3 for the Dodgers in just two years there.

As always, thanks to the indispensable Baseball Reference Play Index for making this search possible.

Other contenders:  Don Sutton is clearly tops among pitchers.  Sutton was the flip side of Wynn; he had just two seasons for Houston but they were pretty good.  Beyond that it’s slim pickings.  Jeff Kent put in some years for both teams.  So did Joe Ferguson.

Who are we rooting for?  On the “ex-Orioles on the WS roster” front, I guess the Dodgers have the advantage, with Rich Hill and Justin Turner (I have to admit I have no memory of Turner playing for the Orioles at all, even though it wasn’t that long ago!  It was in 2009, a season I have few occasions to recall.)  But both these teams are stocked with players I just plain like:  Kershaw, Puig, Altuve, the great Carlos Beltran…



Andrew JaffeThe Chandrasekhar Mass and the Hubble Constant

The first direct detection of gravitational waves was announced in February of 2016 by the LIGO team, after decades of planning, building and refining their beautiful experiment. Since that time, the US-based LIGO has been joined by the European Virgo gravitational wave telescope (and more are planned around the globe).

The first four events that the teams announced were from the spiralling in and eventual mergers of pairs of black holes, with masses ranging from about seven to about forty times the mass of the sun. These masses are perhaps a bit higher than we expect to be typical, which might raise intriguing questions about how such black holes were formed and evolved, although even comparing the results to the predictions is a hard problem depending on the details of the statistical properties of the detectors and the astrophysical models for the evolution of black holes and the stars from which (we think) they formed.

Last week, the teams announced the detection of a very different kind of event, the collision of two neutron stars, each about 1.4 times the mass of the sun. Neutron stars are one possible end state of the evolution of a star, when its atoms are no longer able to withstand the pressure of the gravity trying to force them together. This was first understood by S Chandrasekhar in the early years of the 20th Century, who realised that there was a limit to the mass of a star held up simply by the quantum-mechanical repulsion of the electrons at the outskirts of the atoms making up the star. When you surpass this mass, known, appropriately enough, as the Chandrasekhar mass, the star will collapse in upon itself, combining the electrons and protons into neutrons and likely releasing a vast amount of energy in the form of a supernova explosion. After the explosion, the remnant is likely to be a dense ball of neutrons, whose properties are actually determined fairly precisely by similar physics to that of the Chandrasekhar limit (discussed for this case by Oppenheimer, Volkoff and Tolman), giving us the magic 1.4 solar mass number.

(Last week also coincidentally would have seen Chandrasekhar’s 107th birthday, and Google chose to illustrate their home page with an animation in his honour for the occasion. I was a graduate student at the University of Chicago, where Chandra, as he was known, spent most of his career. Most of us students were far too intimidated to interact with him, although it was always seen as an auspicious occasion when you spotted him around the halls of the Astronomy and Astrophysics Center.)

This process can therefore make a single 1.4 solar-mass neutron star, and we can imagine that in some rare cases we can end up with two neutron stars orbiting one another. Indeed, the fact that LIGO saw one, but only one, such event during its year-and-a-half run allows the teams to constrain how often that happens, albeit with very large error bars, between 320 and 4740 events per cubic gigaparsec per year; a cubic gigaparsec is about 3 billion light-years on each side, so these are rare events indeed. These results and many other scientific inferences from this single amazing observation are reported in the teams’ overview paper.

A series of other papers discuss those results in more detail, covering everything from the physics of neutron stars to limits on departures from Einstein’s theory of gravity (for more on some of these other topics, see this blog, or this story from the NY Times). To me as a cosmologist, the most exciting of the results was the use of the event as a “standard siren”, an object whose gravitational wave properties are well-enough understood that we can deduce the distance to the object from the LIGO results alone. Although the idea came from Bernard Schutz in 1986, the term “standard siren” was coined somewhat later (by Sean Carroll) in analogy to the (heretofore?) more common cosmological standard candles and standard rulers: objects whose intrinsic brightness or size is known and so whose distances can be measured by observations of their apparent brightness or size, just as you can roughly deduce how far away a light bulb is by how bright it appears, or how far away a familiar object or person is by how big it looks.

Gravitational wave events are standard sirens because our understanding of relativity is good enough that an observation of the shape of the gravitational wave pattern as a function of time can tell us the properties of its source. Knowing that, we also then know the amplitude of that pattern when it was released. Over the time since then, as the gravitational waves have travelled across the Universe toward us, the amplitude has gone down (further objects sound quieter, just as they look dimmer); the expansion of the Universe also causes the frequency of the waves to decrease — this is the cosmological redshift that we observe in the spectra of distant objects’ light.

Unlike LIGO’s previous detections of binary-black-hole mergers, this new observation of a binary-neutron-star merger was also seen in photons: first as a gamma-ray burst, and then as a “nova”: a new dot of light in the sky. Indeed, the observation of the afterglow of the merger by teams of literally thousands of astronomers in gamma and x-rays, optical and infrared light, and in the radio, is one of the more amazing pieces of academic teamwork I have seen.

And these observations allowed the teams to identify the host galaxy of the original neutron stars, and to measure the redshift of its light (the lengthening of the light’s wavelength due to the movement of the galaxy away from us). It is most likely a previously unexceptional galaxy called NGC 4993, with a redshift z=0.009, putting it about 40 megaparsecs away, relatively close on cosmological scales.

But this means that we can measure all of the factors in one of the most celebrated equations in cosmology, Hubble’s law: cz=H₀d, where c is the speed of light, z is the redshift just mentioned, and d is the distance measured from the gravitational wave burst itself. This just leaves H₀, the famous Hubble Constant, giving the current rate of expansion of the Universe, usually measured in kilometres per second per megaparsec. The old-fashioned way to measure this quantity is via the so-called cosmic distance ladder, bootstrapping up from nearby objects of known distance to more distant ones whose properties can only be calibrated by comparison with those more nearby. But errors accumulate in this process and we can be susceptible to the weakest rung on the chain (see recent work by some of my colleagues trying to formalise this process). Alternately, we can use data from cosmic microwave background (CMB) experiments like the Planck Satellite (see here for lots of discussion on this blog); the typical size of the CMB pattern on the sky is something very like a standard ruler. Unfortunately, it, too, needs to be calibrated, implicitly by other aspects of the CMB pattern itself, and so ends up being a somewhat indirect measurement. Currently, the best cosmic-distance-ladder measurement gives something like 73.24 ± 1.74 km/sec/Mpc whereas Planck gives 67.81 ± 0.92 km/sec/Mpc; these numbers disagree by “a few sigma”, enough that it is hard to explain as simply a statistical fluctuation.
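As a sanity check (my own back-of-the-envelope, ignoring the peculiar-velocity correction and the careful distance inference in the paper), plugging the rough numbers quoted above for NGC 4993 into Hubble’s law already lands in the right ballpark:

    # Back-of-the-envelope H0 = c z / d with the rough numbers for NGC 4993.
    c_km_s = 299_792.458   # speed of light in km/s
    z = 0.009              # redshift of NGC 4993
    d_mpc = 40.0           # approximate distance in megaparsecs
    print(f"H0 ~ {c_km_s * z / d_mpc:.1f} km/s/Mpc")   # roughly 67 km/s/Mpc

The paper’s headline value comes out a bit different because it accounts for the peculiar velocity of the host galaxy and uses the full gravitational-wave distance posterior rather than a single number.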

Unfortunately, the new LIGO results do not solve the problem. Because we cannot observe the inclination of the neutron-star binary (i.e., the orientation of its orbit), this blows up the error on the distance to the object, due to the Bayesian marginalisation over this unknown parameter (just as the Planck measurement requires marginalization over all of the other cosmological parameters to fully calibrate the results). Because the host galaxy is relatively nearby, the teams must also account for the fact that the redshift includes the effect not only of the cosmological expansion but also the movement of galaxies with respect to one another due to the pull of gravity on relatively large scales; this so-called peculiar velocity has to be modelled which adds further to the errors.

This procedure gives a final measurement of 70.0 +12.0/−8.0 km/sec/Mpc, with the full shape of the probability curve shown in the Figure, taken directly from the paper. Both the Planck and distance-ladder results are consistent with these rather large error bars. But this is calculated from a single object; as more of these events are seen these error bars will go down, typically by something like the square root of the number of events, so it might not be too long before this is the best way to measure the Hubble Constant.

[Figure: the probability distribution for the Hubble constant from the gravitational-wave standard-siren measurement, taken from the paper]

[Apologies: too long, too technical, and written late at night while trying to get my wonderful not-quite-three-week-old daughter to sleep through the night.]

Steinn Sigurðssonbusiness vs science: an anecdote

many years ago, when the Web was young, I was talking to an acquaintance - a friend-of-a-friend - a SoCal business person.
They had heard about this new Web thing, and were asking me about what use it was.

Now, if you had asked me, I'd have guessed this was end of '94,  but I checked and it must have been the summer of '95.

I had just seen Kelson and Trager order pizza on the web, so clearly I was an expert, but my acquaintance wanted to know if this Web thing could be used to help sell cars - not directly, SSL 3.0 had not been implemented yet, but to show models continuously, update inventory, specs, pricing etc.

I opined, "sure, why not", seemed like it'd be a good fit, and I would explore how feasible this might be.  So I searched (this was before google so AltaVista probably...[sigh]) and discovered that Edmunds had already done this.

At this point there was a bifurcation:  my acquaintance was very excited, this was clearly a proven concept and much more interesting than they had appreciated, and would I be interested in working with them on this;  and I was totally not interested,  somebody had done it, it was not new, booooring.

This was a revelation for me.  The concept was only interesting to me when it was new and innovative and unproven. Soon as I discovered it had been done and worked, I became uninterested.
My acquaintance, the businessman, really only became interested when they discovered it was a proven idea already done by a player.

So, they got some SoCal hack to put up a website and sold more cars, and got even more rich,
and I went off to think about black holes and shit.

Learned something.

October 23, 2017

Doug NatelsonWhither science blogging?

I read yesterday of the impending demise of scienceblogs, a site that has been around since late 2005 in one form or other.  I guess I shouldn't be surprised, since some of its bloggers have shifted to other sites in recent years, such as Ethan Siegel and Chad Orzel, who largely migrated to Forbes, and Rhett Allain, who went to Wired.  Steinn Sigurðsson is going back to his own hosted blog in the wake of this.

I hope this is just indicative of a poor business model at Seed Media, and not a further overall decline in blogging by scientists.  It's wonderful that online magazines like Quanta and Aeon and Nautilus are providing high quality, long-form science writing.  Still, I think everyone benefits when scientists themselves (in addition to professional science journalists) carve out some time to write about their fields.



John PreskillParadise

The word dominates chapter one of Richard Holmes’s book The Age of Wonder. Holmes writes biographies of Romantic-Era writers: Mary Wollstonecraft, Percy Shelley, and Samuel Taylor Coleridge populate his bibliography. They have cameos in Age. But their scientific counterparts star.

“Their natural-philosopher” counterparts, I should say. The word “scientist” emerged as the Romantic Era closed. Romanticism, a literary and artistic movement, flourished between the 1700s and the 1800s. Romantics championed self-expression, individuality, and emotion over convention and artificiality. Romantics wondered at, and drew inspiration from, the natural world. So, Holmes argues, did Romantic-Era natural philosophers. They explored, searched, and innovated with Wollstonecraft’s, Shelley’s, and Coleridge’s zest.


Holmes depicts Wilhelm and Caroline Herschel, a German brother and sister, discovering the planet Uranus. Humphry Davy, an amateur poet from Penzance, inventing a lamp that saved miners’ lives. Michael Faraday, a working-class Londoner, inspired by Davy’s chemistry lectures.

Joseph Banks in paradise.

So Holmes entitled chapter one.

Banks studied natural history as a young English gentleman during the 1760s. He then sailed around the world, a botanist on exploratory expeditions. The second expedition brought Banks aboard the HMS Endeavour. Captain James Cook steered the ship to Brazil, Tahiti, Australia, and New Zealand. Banks brought a few colleagues onboard. They studied the native flora, fauna, skies, and tribes.

Banks, with fellow botanist Daniel Solander, accumulated over 30,000 plant samples. Artist Sydney Parkinson drew the plants during the voyage. Parkinson’s drawings underlay 743 copper engravings that Banks commissioned upon returning to England. Banks planned to publish the engravings as the book Florilegium. He never succeeded. Two institutions executed Banks’s plan more than 200 years later.

Banks’s Florilegium crowns an exhibition at the University of California at Santa Barbara (UCSB). UCSB’s Special Research Collections will host “Botanical Illustrations and Scientific Discovery—Joseph Banks and the Exploration of the South Pacific, 1768–1771” until May 2018. The exhibition features maps of Banks’s journeys, biographical sketches of Banks and Cook, contemporary art inspired by the engravings, and the Florilegium.


The exhibition spotlights “plants that have subsequently become important ornamental plants on the UCSB campus, throughout Santa Barbara, and beyond.” One sees, roaming Santa Barbara, slivers of Banks’s paradise.

[Photo: two bougainvilleas]

In Santa Barbara resides the Kavli Institute for Theoretical Physics (KITP). The KITP is hosting a program about the physics of quantum information (QI). QI scientists are congregating from across the world. Everyone visits for a few weeks or months, meeting some participants and missing others (those who have left or will arrive later). Participants attend and present tutorials, explore beyond their areas of expertise, and initiate research collaborations.

A conference capstoned the program, one week this October. Several speakers had founded subfields of physics: quantum error correction (how to fix errors that dog quantum computers), quantum computational complexity (how quickly quantum computers can solve hard problems), topological quantum computation, AdS/CFT (a parallel between certain gravitational systems and certain quantum systems), and more. Swaths of science exist because of these thinkers.


One evening that week, I visited the Joseph Banks exhibition.

Joseph Banks in paradise.

I’d thought that, by “paradise,” Holmes had meant “physical attractions”: lush flowers, vibrant colors, fresh fish, and warm sand. Another meaning occurred to me, after the conference talks, as I stood before a glass case in the library.

Joseph Banks, disembarking from the Endeavour, didn’t disembark onto just an island. He disembarked onto terra incognita. Never had he or his colleagues seen the blossoms, seed pods, or sprouts before him. Swaths of science awaited. What could the natural philosopher have craved more?

QI scientists of a certain age reminisce about the 1990s, the cowboy days of QI. When impactful theorems, protocols, and experiments abounded. When they dangled, like ripe fruit, just above your head. All you had to do was look up, reach out, and prove a pineapple.

[Image: a cowboy — caption: “Typical 1990s quantum-information scientist”]

That generation left mine few simple theorems to prove. But QI hasn’t suffered extinction. Its frontiers have advanced into other fields of science. Researchers are gaining insight into thermodynamics, quantum gravity, condensed matter, and chemistry from QI. The KITP conference highlighted connections with quantum gravity.

…in paradise.

What could a natural philosopher crave more?

[Image: “Sprawling Neobiotic Chimera (After Banks’ Florilegium),” by Rose Briccetti — artwork commissioned by the UCSB library]

Most KITP talks are recorded and released online. You can access talks from the conference here. My talk, about quantum chaos and thermalization, appears here. 

With gratitude to the KITP, and to the program organizers and the conference organizers, for the opportunity to participate. 


October 22, 2017

Terence TaoThe logarithmically averaged and non-logarithmically averaged Chowla conjectures

Let {\lambda: {\bf N} \rightarrow \{-1,1\}} be the Liouville function, thus {\lambda(n)} is defined to equal {+1} when {n} is the product of an even number of primes, and {-1} when {n} is the product of an odd number of primes. The Chowla conjecture asserts that {\lambda} has the statistics of a random sign pattern, in the sense that

\displaystyle  \lim_{N \rightarrow \infty} \mathbb{E}_{n \leq N} \lambda(n+h_1) \dots \lambda(n+h_k) = 0 \ \ \ \ \ (1)

for all {k \geq 1} and all distinct natural numbers {h_1,\dots,h_k}, where we use the averaging notation

\displaystyle  \mathbb{E}_{n \leq N} f(n) := \frac{1}{N} \sum_{n \leq N} f(n).

For {k=1}, this conjecture is equivalent to the prime number theorem (as discussed in this previous blog post), but the conjecture remains open for any {k \geq 2}.
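
As a purely numerical illustration (it plays no role in the arguments below), one can compute the {k=2} average directly for moderate {N}; here is a minimal Python sketch, with the cutoff {10^6} chosen arbitrarily:

```python
# Numerically estimate E_{n <= N} lambda(n) lambda(n+1), which the Chowla
# conjecture predicts should tend to zero as N -> infinity.
import numpy as np

def liouville(N):
    """Return lam with lam[n] = lambda(n) for 1 <= n <= N, by sieving
    Omega(n), the number of prime factors of n counted with multiplicity."""
    omega = np.zeros(N + 1, dtype=np.int64)
    for p in range(2, N + 1):
        if omega[p] == 0:            # no smaller prime divides p, so p is prime
            pk = p
            while pk <= N:
                omega[pk::pk] += 1   # each multiple of p**k gains one prime factor
                pk *= p
    return np.where(omega % 2 == 0, 1, -1)

N = 10**6
lam = liouville(N + 1)
avg = np.mean(lam[1:N + 1] * lam[2:N + 2])   # E_{n <= N} lambda(n) lambda(n+1)
print(avg)   # small in magnitude for this N -- consistent with (1), but of course no proof
```

Such experiments say nothing about the limit, of course; the content of the conjecture is the behaviour as {N \rightarrow \infty}.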

In recent years, it has been realised that one can make more progress on this conjecture if one works instead with the logarithmically averaged version

\displaystyle  \lim_{N \rightarrow \infty} \mathbb{E}_{n \leq N}^{\log} \lambda(n+h_1) \dots \lambda(n+h_k) = 0 \ \ \ \ \ (2)

of the conjecture, where we use the logarithmic averaging notation

\displaystyle  \mathbb{E}_{n \leq N}^{\log} f(n) := \frac{\sum_{n \leq N} \frac{f(n)}{n}}{\sum_{n \leq N} \frac{1}{n}}.

Using the summation by parts (or telescoping series) identity

\displaystyle  \sum_{n \leq N} \frac{f(n)}{n} = \sum_{M < N} \frac{1}{M(M+1)} (\sum_{n \leq M} f(n)) + \frac{1}{N} \sum_{n \leq N} f(n) \ \ \ \ \ (3)

it is not difficult to show that the Chowla conjecture (1) for a given {k,h_1,\dots,h_k} implies the logarithmically averaged conjecture (2). However, the converse implication is not at all clear. For instance, for {k=1}, we have already mentioned that the Chowla conjecture

\displaystyle  \lim_{N \rightarrow \infty} \mathbb{E}_{n \leq N} \lambda(n) = 0

is equivalent to the prime number theorem; but the logarithmically averaged analogue

\displaystyle  \lim_{N \rightarrow \infty} \mathbb{E}^{\log}_{n \leq N} \lambda(n) = 0

is significantly easier to show (a proof with the Liouville function {\lambda} replaced by the closely related Möbius function {\mu} is given in this previous blog post). And indeed, significantly more is now known for the logarithmically averaged Chowla conjecture; in this paper of mine I had proven (2) for {k=2}, and in this recent paper with Joni Teravainen, we proved the conjecture for all odd {k} (with a different proof also given here).

In view of this emerging consensus that the logarithmically averaged Chowla conjecture was easier than the ordinary Chowla conjecture, it was thus somewhat of a surprise for me to read a recent paper of Gomilko, Kwietniak, and Lemanczyk who (among other things) established the following statement:

Theorem 1 Assume that the logarithmically averaged Chowla conjecture (2) is true for all {k}. Then there exists a sequence {N_i} going to infinity such that the Chowla conjecture (1) is true for all {k} along that sequence, that is to say

\displaystyle  \lim_{N_i \rightarrow \infty} \mathbb{E}_{n \leq N_i} \lambda(n+h_1) \dots \lambda(n+h_k) = 0

for all {k} and all distinct {h_1,\dots,h_k}.

This implication does not use any special properties of the Liouville function (other than that it is bounded), and in fact proceeds by ergodic theoretic methods, focusing in particular on the ergodic decomposition of invariant measures of a shift into ergodic measures. Ergodic methods have proven remarkably fruitful in understanding these sorts of number theoretic and combinatorial problems, as could already be seen by the ergodic theoretic proof of Szemerédi’s theorem by Furstenberg, and more recently by the work of Frantzikinakis and Host on Sarnak’s conjecture. (My first paper with Teravainen also uses ergodic theory tools.) Indeed, many other results in the subject were first discovered using ergodic theory methods.

On the other hand, many results in this subject that were first proven ergodic theoretically have since been reproven by more combinatorial means; my second paper with Teravainen is an instance of this. As it turns out, one can also prove Theorem 1 by a standard combinatorial (or probabilistic) technique known as the second moment method. In fact, one can prove slightly more:

Theorem 2 Let {k} be a natural number. Assume that the logarithmically averaged Chowla conjecture (2) is true for {2k}. Then there exists a set {{\mathcal N}} of natural numbers of logarithmic density {1} (that is, {\lim_{N \rightarrow \infty} \mathbb{E}_{n \leq N}^{\log} 1_{n \in {\mathcal N}} = 1}) such that

\displaystyle  \lim_{N \rightarrow \infty: N \in {\mathcal N}} \mathbb{E}_{n \leq N} \lambda(n+h_1) \dots \lambda(n+h_k) = 0

for any distinct {h_1,\dots,h_k}.

It is not difficult to deduce Theorem 1 from Theorem 2 using a diagonalisation argument. Unfortunately, the known cases of the logarithmically averaged Chowla conjecture ({k=2} and odd {k}) are currently insufficient to use Theorem 2 for any purpose other than to reprove what is already known to be true from the prime number theorem. (Indeed, the even cases of Chowla, in either logarithmically averaged or non-logarithmically averaged forms, seem to be far more powerful than the odd cases; see Remark 1.7 of this paper of myself and Teravainen for a related observation in this direction.)

We now sketch the proof of Theorem 2. For any distinct {h_1,\dots,h_k}, we take a large number {H} and consider the limiting second moment

\displaystyle  \limsup_{N \rightarrow \infty} \mathop{\bf E}_{n \leq N}^{\log} |\mathop{\bf E}_{m \leq H} \lambda(n+m+h_1) \dots \lambda(n+m+h_k)|^2.

We can expand this as

\displaystyle  \limsup_{N \rightarrow \infty} \mathop{\bf E}_{m,m' \leq H} \mathop{\bf E}_{n \leq N}^{\log} \lambda(n+m+h_1) \dots \lambda(n+m+h_k)

\displaystyle \lambda(n+m'+h_1) \dots \lambda(n+m'+h_k).

If all the {m+h_1,\dots,m+h_k,m'+h_1,\dots,m'+h_k} are distinct, the hypothesis (2) tells us that the inner average goes to zero as {N \rightarrow \infty}. The remaining averages are {O(1)}, and there are {O( k^2 H )} of these averages. We conclude that

\displaystyle  \limsup_{N \rightarrow \infty} \mathop{\bf E}_{n \leq N}^{\log} |\mathop{\bf E}_{m \leq H} \lambda(n+m+h_1) \dots \lambda(n+m+h_k)|^2 \ll k^2 / H.

By Markov’s inequality (and (3)), we conclude that for any fixed {h_1,\dots,h_k, H}, there exists a set {{\mathcal N}_{h_1,\dots,h_k,H}} of upper logarithmic density at least {1-k/H^{1/2}}, thus

\displaystyle  \limsup_{N \rightarrow \infty} \mathbb{E}_{n \leq N}^{\log} 1_{n \in {\mathcal N}_{h_1,\dots,h_k,H}} \geq 1 - k/H^{1/2}

such that

\displaystyle  \mathop{\bf E}_{n \leq N} |\mathop{\bf E}_{m \leq H} \lambda(n+m+h_1) \dots \lambda(n+m+h_k)|^2 \ll k / H^{1/2}.

By deleting at most finitely many elements, we may assume that {{\mathcal N}_{h_1,\dots,h_k,H}} consists only of elements of size at least {H^2} (say).

For any {H_0}, if we let {{\mathcal N}_{h_1,\dots,h_k, \geq H_0}} be the union of {{\mathcal N}_{h_1,\dots,h_k, H}} for {H \geq H_0}, then {{\mathcal N}_{h_1,\dots,h_k, \geq H_0}} has logarithmic density {1}. By a diagonalisation argument (using the fact that the set of tuples {(h_1,\dots,h_k)} is countable), we can then find a set {{\mathcal N}} of natural numbers of logarithmic density {1}, such that for every {h_1,\dots,h_k,H_0}, every sufficiently large element of {{\mathcal N}} lies in {{\mathcal N}_{h_1,\dots,h_k,\geq H_0}}. Thus for every sufficiently large {N} in {{\mathcal N}}, one has

\displaystyle  \mathop{\bf E}_{n \leq N} |\mathop{\bf E}_{m \leq H} \lambda(n+m+h_1) \dots \lambda(n+m+h_k)|^2 \ll k / H^{1/2}.

for some {H \geq H_0} with {N \geq H^2}. By Cauchy-Schwarz, this implies that

\displaystyle  \mathop{\bf E}_{n \leq N} \mathop{\bf E}_{m \leq H} \lambda(n+m+h_1) \dots \lambda(n+m+h_k) \ll k^{1/2} / H^{1/4};

interchanging the sums and using {N \geq H^2} and {H \geq H_0}, this implies that

\displaystyle  \mathop{\bf E}_{n \leq N} \lambda(n+h_1) \dots \lambda(n+h_k) \ll k^{1/2} / H^{1/4} \leq k^{1/2} / H_0^{1/4}.

We conclude on taking {H_0} to infinity that

\displaystyle  \lim_{N \rightarrow \infty; N \in {\mathcal N}} \mathop{\bf E}_{n \leq N} \lambda(n+h_1) \dots \lambda(n+h_k) = 0

as required.


Filed under: expository, math.CO, math.DS, math.NT, math.PR Tagged: Chowla conjecture, second moment method

October 20, 2017

Steinn Sigurðsson"In the beginning..."


"In the beginning was the command line..."

a lot of people who ought to have read Neal Stephenson's essay on the UI as a metaphor have not done so.

This is a public service.

Go get a copy, then carry it with you until you get stuck at O'Hare long enough to read it, or whatever works for you.


October 19, 2017

Steinn Sigurðssonthe best things in life are free

The arXiv wants your $

arXiv  is an e-print service in the fields of physics, mathematics, computer science, quantitative biology, quantitative finance, statistics, electrical engineering and systems science, and economics

each day it receives several hundred e-prints, mostly preprints, categorizes them and distributes the list of papers, with full access to the pdf and, when available, the TeX source  - which is most of the time, TeX rocks

authors submit the e-prints, with license to distribute, and users receive the list of the day's papers, for the categories they express interest in, and access to content, free, originally by e-mail, now web

almost all papers in theoretical physics, mathematics and astrophysics now go on the arXiv, as does an increasing fraction from the newer fields
there are multiple other *Xivs  covering other subject areas with varying degree of success

the arXiv now holds over a million e-prints, going back to 1991, and a little bit beyond, as people have sent in old stuff to archive on the arXiv; e-prints are coming in at about 10,000 per month and growing

the service is slim, by design, almost minimalistic, but oh so powerful

you all use it, obsessively, you couldn't do without it!

arXiv is not actually cost free, it has an IT staff, its own servers, and a management team and development team
a lot of the current cost is covered by member institutions, and Cornell University

but... we could use more, so the annual fundraising drive is under way, this week only
 - ok, you can give any time, but this is sorta special

Steinn SigurðssonWhy, yes, it is all about me...


Dynamics of Cats is back.

This is the original, Ye Olde, blog, started back in March of 2005; it had a decent run through June 2006, at which point I was invited to join the Scienceblogs collective at what was then SEED magazine.

A fun decade ensued, as blogs boomed,  markets crashed, SEED realized Sb was keeping the rest of the group going, and then National Geographic ate Sb, which then started to shrivel.

I blog strictly to amuse myself, no promises.

I find blogging is good warmup for serious writing.  Actual output is very sensitive to Real Life,  I tend to be more prolific when busy doing stuff, and less prolific when subsumed by administrivia and other intrusions from The World.

I'm a physicist, educated in England, PhD from a small private university in California.
Postdocs in NoCal and back to UK, and now a Professor of Astronomy and Astrophysics at The Pennsylvania State University.
Which is Good In Parts.

I am a member of:

I have never taken a class in astronomy, for credit.

In my copious spare time I am a Science Editor for the AAS Journals,
I am also a member of the Aspen Center for Physics 
and most recently I became the Scientific Director of arXiv

Steinn Sigurdsson Appointed as arXiv Scientific Director

As noted above (you did read the disclaimer...),
I do not speak for any of these Institutions upon this here blog.

We also have a dog.
Gunnar.
Named after the poet-warrior of the Sagas.
Don't ask, he had the name when he moved in with us.



We used to have cats, but they, sadly, died.

October 17, 2017

Matt StrasslerThe Significance of Yesterday’s Gravitational Wave Announcement: an FAQ

Yesterday’s post on the results from the LIGO/VIRGO network of gravitational wave detectors was aimed at getting information out, rather than providing the pedagogical backdrop.  Today I’m following up with a post that attempts to answer some of the questions that my readers and my personal friends asked me.  Some wanted to understand better how to visualize what had happened, while others wanted more clarity on why the discovery was so important.  So I’ve put together a post which (1) explains what neutron stars and black holes are and what their mergers are like, (2) clarifies why yesterday’s announcement was important — and there were many reasons, which is why it’s hard to reduce it all to a single soundbite — and (3) collects some miscellaneous questions and answers at the end.

First, a disclaimer: I am *not* an expert in the very complex subject of neutron star mergers and the resulting explosions, called kilonovas.  These are much more complicated than black hole mergers.  I am still learning some of the details.  Hopefully I’ve avoided errors, but you’ll notice a few places where I don’t know the answers … yet.  Perhaps my more expert colleagues will help me fill in the gaps over time.

Please, if you spot any errors, don’t hesitate to comment!!  And feel free to ask additional questions whose answers I can add to the list.

BASIC QUESTIONS ABOUT NEUTRON STARS, BLACK HOLES, AND MERGERS

What are neutron stars and black holes, and how are they related?

Every atom is made from a tiny atomic nucleus, made of neutrons and protons (which are very similar), and loosely surrounded by electrons. Most of an atom is empty space, so it can, under extreme circumstances, be crushed — but only if every electron and proton convert to a neutron (which remains behind) and a neutrino (which heads off into outer space.) When a giant star runs out of fuel, the pressure from its furnace turns off, and it collapses inward under its own weight, creating just those extraordinary conditions in which the matter can be crushed. Thus: a star’s interior, with a mass one to several times the Sun’s mass, is all turned into a several-mile(kilometer)-wide ball of neutrons — the number of neutrons approaching a 1 with 57 zeroes after it.

If the star is big but not too big, the neutron ball stiffens and holds its shape, and the star explodes outward, blowing itself to pieces in what is called a core-collapse supernova. The ball of neutrons remains behind; this is what we call a neutron star. It’s a ball of the densest material that we know can exist in the universe — a pure atomic nucleus many miles(kilometers) across. It has a very hard surface; if you tried to go inside a neutron star, your experience would be a lot worse than running into a closed door at a hundred miles per hour.

If the star is very big indeed, the neutron ball that forms may immediately (or soon) collapse under its own weight, forming a black hole. A supernova may or may not result in this case; the star might just disappear. A black hole is very, very different from a neutron star. Black holes are what’s left when matter collapses irretrievably upon itself under the pull of gravity, shrinking down endlessly. While a neutron star has a surface that you could smash your head on, a black hole has no surface — it has an edge that is simply a point of no return, called a horizon. In Einstein’s theory, you can just go right through, as if passing through an open door. You won’t even notice the moment you go in. [Note: this is true in Einstein’s theory. But there is a big controversy as to whether the combination of Einstein’s theory with quantum physics changes the horizon into something novel and dangerous to those who enter; this is known as the firewall controversy, and would take us too far afield into speculation.]  But once you pass through that door, you can never return.

Black holes can form in other ways too, but not those that we’re observing with the LIGO/VIRGO detectors.

Why are their mergers the best sources for gravitational waves?

One of the easiest and most obvious ways to make gravitational waves is to have two objects orbiting each other.  If you put your two fists in a pool of water and move them around each other, you’ll get a pattern of water waves spiraling outward; this is in rough (very rough!) analogy to what happens with two orbiting objects, although, since the objects are moving in space, the waves aren’t in a material like water.  They are waves in space itself.

To get powerful gravitational waves, you want objects each with a very big mass that are orbiting around each other at very high speed. To get the fast motion, you need the force of gravity between the two objects to be strong; and to get gravity to be as strong as possible, you need the two objects to be as close as possible (since, as Isaac Newton already knew, gravity between two objects grows stronger when the distance between them shrinks.) But if the objects are large, they can’t get too close; they will bump into each other and merge long before their orbit can become fast enough. So to get a really fast orbit, you need two relatively small objects, each with a relatively big mass — what scientists refer to as compact objects. Neutron stars and black holes are the most compact objects we know about. Fortunately, they do indeed often travel in orbiting pairs, and do sometimes, for a very brief period before they merge, orbit rapidly enough to produce gravitational waves that LIGO and VIRGO can observe.

Why do we find these objects in pairs in the first place?

Stars very often travel in pairs… they are called binary stars. They can start their lives in pairs, forming together in large gas clouds, or even if they begin solitary, they can end up pairing up if they live in large densely packed communities of stars where it is common for multiple stars to pass nearby. Perhaps surprisingly, their pairing can survive the collapse and explosion of either star, leaving two black holes, two neutron stars, or one of each in orbit around one another.

What happens when these objects merge?

Not surprisingly, there are three classes of mergers which can be detected: two black holes merging, two neutron stars merging, and a neutron star merging with a black hole. The first class was observed in 2015 (and announced in 2016), the second was announced yesterday, and it’s a matter of time before the third class is observed. The two objects may orbit each other for billions of years, very slowly radiating gravitational waves (an effect observed in the 70’s, leading to a Nobel Prize) and gradually coming closer and closer together. Only in the last day of their lives do their orbits really start to speed up. And just before these objects merge, they begin to orbit each other once per second, then ten times per second, then a hundred times per second. Visualize that if you can: objects a few dozen miles (kilometers) across, a few miles (kilometers) apart, each with the mass of the Sun or greater, orbiting each other 100 times each second. It’s truly mind-boggling — a spinning dumbbell beyond the imagination of even the greatest minds of the 19th century. I don’t know any scientist who isn’t awed by this vision. It all sounds like science fiction. But it’s not.

How do we know this isn’t science fiction?

We know, if we believe Einstein’s theory of gravity (and I’ll give you a very good reason to believe in it in just a moment). Einstein’s theory predicts that such a rapidly spinning, large-mass dumbbell formed by two orbiting compact objects will produce a telltale pattern of ripples in space itself — gravitational waves. That pattern is both complicated and precisely predicted. In the case of black holes, the predictions go right up to and past the moment of merger, to the ringing of the larger black hole that forms in the merger. In the case of neutron stars, the instants just before, during and after the merger are more complex and we can’t yet be confident we understand them, but during tens of seconds before the merger Einstein’s theory is very precise about what to expect. The theory further predicts how those ripples will cross the vast distances from where they were created to the location of the Earth, and how they will appear in the LIGO/VIRGO network of three gravitational wave detectors. The prediction of what to expect at LIGO/VIRGO thus involves not just one prediction but many: the theory is used to predict the existence and properties of black holes and of neutron stars, the detailed features of their mergers, the precise patterns of the resulting gravitational waves, and how those gravitational waves cross space. That LIGO/VIRGO have detected the telltale patterns of these gravitational waves, and that these wave patterns agree with Einstein’s theory in every detail, is the strongest evidence ever obtained that there is nothing wrong with Einstein’s theory when used in these combined contexts.  That then in turn gives us confidence that our interpretation of the LIGO/VIRGO results is correct, confirming that black holes and neutron stars really exist and really merge. (Notice the reasoning is slightly circular… but that’s how scientific knowledge proceeds, as a set of detailed consistency checks that gradually and eventually become so tightly interconnected as to be almost impossible to unwind.  Scientific reasoning is not deductive; it is inductive.  We do it not because it is logically ironclad but because it works so incredibly well — as witnessed by the computer, and its screen, that I’m using to write this, and the wired and wireless internet and computer disk that will be used to transmit and store it.)

THE SIGNIFICANCE(S) OF YESTERDAY’S ANNOUNCEMENT OF A NEUTRON STAR MERGER

What makes it difficult to explain the significance of yesterday’s announcement is that it consists of many important results piled up together, rather than a simple takeaway that can be reduced to a single soundbite. (That was also true of the black hole mergers announcement back in 2016, which is why I wrote a long post about it.)

So here is a list of important things we learned.  No one of them, by itself, is earth-shattering, but each one is profound, and taken together they form a major event in scientific history.

First confirmed observation of a merger of two neutron stars: We’ve known these mergers must occur, but there’s nothing like being sure. And since these things are too far away and too small to see in a telescope, the only way to be sure these mergers occur, and to learn more details about them, is with gravitational waves.  We expect to see many more of these mergers in coming years as gravitational wave astronomy increases in its sensitivity, and we will learn more and more about them.

New information about the properties of neutron stars: Neutron stars were proposed almost a hundred years ago and were confirmed to exist in the 60’s and 70’s.  But their precise details aren’t known; we believe they are like a giant atomic nucleus, but they’re so vastly larger than ordinary atomic nuclei that we can’t be sure we understand all of their internal properties, and there are debates in the scientific community that can’t be easily answered… until, perhaps, now.

From the detailed pattern of the gravitational waves of this one neutron star merger, scientists have already learned two things. First, we confirm that Einstein’s theory correctly predicts the basic pattern of gravitational waves from orbiting neutron stars, as it does for orbiting and merging black holes. Unlike black holes, however, there are more questions about what happens to neutron stars when they merge. The question of what happened to this pair after they merged is still open — did they form a neutron star, an unstable neutron star that, slowing its spin, eventually collapsed into a black hole, or a black hole straightaway?

But something important was already learned about the internal properties of neutron stars. The stresses of being whipped around at such incredible speeds would tear you and me apart, and would even tear the Earth apart. We know neutron stars are much tougher than ordinary rock, but how much more? If they were too flimsy, they’d have broken apart at some point during LIGO/VIRGO’s observations, and the simple pattern of gravitational waves that was expected would have suddenly become much more complicated. That didn’t happen until perhaps just before the merger.   So scientists can use the simplicity of the pattern of gravitational waves to infer some new things about how stiff and strong neutron stars are.  More mergers will improve our understanding.  Again, there is no other simple way to obtain this information.

First visual observation of an event that produces both immense gravitational waves and bright electromagnetic waves: Black hole mergers aren’t expected to create a brilliant light display, because, as I mentioned above, they’re more like open doors to an invisible playground than they are like rocks, so they merge rather quietly, without a big bright and hot smash-up.  But neutron stars are big balls of stuff, and so the smash-up can indeed create lots of heat and light of all sorts, just as you might naively expect.  By “light” I mean not just visible light but all forms of electromagnetic waves, at all wavelengths (and therefore at all frequencies.)  Scientists divide up the range of electromagnetic waves into categories. These categories are radio waves, microwaves, infrared light, visible light, ultraviolet light, X-rays, and gamma rays, listed from lowest frequency and largest wavelength to highest frequency and smallest wavelength.  (Note that these categories and the dividing lines between them are completely arbitrary, but the divisions are useful for various scientific purposes.  The only fundamental difference between yellow light, a radio wave, and a gamma ray is the wavelength and frequency; otherwise they’re exactly the same type of thing, a wave in the electric and magnetic fields.)

So if and when two neutron stars merge, we expect both gravitational waves and electromagnetic waves, the latter of many different frequencies created by many different effects that can arise when two huge balls of neutrons collide.  But just because we expect them doesn’t mean they’re easy to see.  These mergers are pretty rare — perhaps one every hundred thousand years in each big galaxy like our own — so the ones we find using LIGO/VIRGO will generally be very far away.  If the light show is too dim, none of our telescopes will be able to see it.

But this light show was plenty bright.  Gamma ray detectors out in space detected it instantly, confirming that the gravitational waves from the two neutron stars led to a collision and merger that produced very high frequency light.  Already, that’s a first.  It’s as though one had seen lightning for years but never heard thunder; or as though one had observed the waves from hurricanes for years but never observed one in the sky.  Seeing both allows us a whole new set of perspectives; one plus one is often much more than two.

Over time — hours and days — effects were seen in visible light, ultraviolet light, infrared light, X-rays and radio waves.  Some were seen earlier than others, which itself is a story, but each one contributes to our understanding of what these mergers are actually like.

Confirmation of the best guess concerning the origin of “short” gamma ray bursts:  For many years, bursts of gamma rays have been observed in the sky.  Among them, there seems to be a class of bursts that are shorter than most, typically lasting just a couple of seconds.  They come from all across the sky, indicating that they come from distant intergalactic space, presumably from distant galaxies.  Among other explanations, the most popular hypothesis concerning these short gamma-ray bursts has been that they come from merging neutron stars.  The only way to confirm this hypothesis is with the observation of the gravitational waves from such a merger.  That test has now been passed; it appears that the hypothesis is correct.  That in turn means that we have, for the first time, both a good explanation of these short gamma ray bursts and, because we know how often we observe these bursts, a good estimate as to how often neutron stars merge in the universe.

First distance measurement to a source using both a gravitational wave measure and a redshift in electromagnetic waves, allowing a new calibration of the distance scale of the universe and of its expansion rate:  The pattern over time of the gravitational waves from a merger of two black holes or neutron stars is complex enough to reveal many things about the merging objects, including a rough estimate of their masses and the orientation of the spinning pair relative to the Earth.  The overall strength of the waves, combined with the knowledge of the masses, reveals how far the pair is from the Earth.  That by itself is nice, but the real win comes when the discovery of the object using visible light, or in fact any light with frequency below gamma-rays, can be made.  In this case, the galaxy that contains the neutron stars can be determined.

Once we know the host galaxy, we can do something really important.  We can, by looking at the starlight, determine how rapidly the galaxy is moving away from us.  For distant galaxies, the speed at which the galaxy recedes should be related to its distance because the universe is expanding.

How rapidly the universe is expanding has been recently measured with remarkable precision, but the problem is that there are two different methods for making the measurement, and they disagree.   This disagreement is one of the most important problems for our understanding of the universe.  Maybe one of the measurement methods is flawed, or maybe — and this would be much more interesting — the universe simply doesn’t behave the way we think it does.

What gravitational waves do is give us a third method: the gravitational waves directly provide the distance to the galaxy, and the electromagnetic waves directly provide the speed of recession.  There is no other way to make this type of joint measurement directly for distant galaxies.  The method is not accurate enough to be useful in just one merger, but once dozens of mergers have been observed, the average result will provide important new information about the universe’s expansion.  When combined with the other methods, it may help resolve this all-important puzzle.

Best test so far of Einstein’s prediction that the speed of light and the speed of gravitational waves are identical: Since gamma rays from the merger and the peak of the gravitational waves arrived within two seconds of one another after traveling 130 million years — that is, about 4 thousand million million seconds — we can say that the speed of light and the speed of gravitational waves are both equal to the cosmic speed limit to within one part in 2 thousand million million.  Such a precise test requires the combination of gravitational wave and gamma ray observations.
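
For what it’s worth, the arithmetic behind that bound is easy to check (a sketch, using the two-second arrival-time difference and the 130-million-year travel time quoted above):

```python
# Bound on any fractional difference between the speed of light and the
# speed of gravitational waves, from the numbers quoted above.
seconds_per_year = 3.156e7
travel_time = 130e6 * seconds_per_year   # ~4e15 seconds of travel
delay = 2.0                              # seconds between gravitational-wave peak and gamma rays
print(f"travel time ~ {travel_time:.1e} s")
print(f"fractional speed difference < {delay / travel_time:.1e}")
# roughly 5e-16, i.e. about one part in 2 thousand million million
```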

Efficient production of heavy elements confirmed:  It’s long been said that we are star-stuff, or stardust, and it’s been clear for a long time that it’s true.  But there’s been a puzzle when one looks into the details.  While it’s known that all the chemical elements from hydrogen up to iron are formed inside of stars, and can be blasted into space in supernova explosions to drift around and eventually form planets, moons, and humans, it hasn’t been quite as clear how the other elements with heavier atoms — atoms such as iodine, cesium, gold, lead, bismuth, uranium and so on — predominantly formed.  Yes they can be formed in supernovas, but not so easily; and there seem to be more atoms of heavy elements around the universe than supernovas can explain.  There are many supernovas in the history of the universe, but the efficiency for producing heavy chemical elements is just too low.

It was proposed some time ago that the mergers of neutron stars might be a suitable place to produce these heavy elements.  Even though these mergers are rare, they might be much more efficient, because the nuclei of heavy elements contain lots of neutrons and, not surprisingly, a collision of two neutron stars would produce lots of neutrons in its debris, suitable perhaps for making these nuclei.   A key indication that this is going on would be the following: if a neutron star merger could be identified using gravitational waves, and if its location could be determined using telescopes, then one would observe a pattern of light that would be characteristic of what is now called a “kilonova” explosion.   Warning: I don’t yet know much about kilonovas and I may be leaving out important details. A kilonova is powered by the process of forming heavy elements; most of the nuclei produced are initially radioactive — i.e., unstable — and they break down by emitting high energy particles, including the particles of light (called photons) which are in the gamma ray and X-ray categories.  The resulting characteristic glow would be expected to have a pattern of a certain type: it would be initially bright but would dim rapidly in visible light, with a long afterglow in infrared light.  The reasons for this are complex, so let me set them aside for now.  The important point is that this pattern was observed, confirming that a kilonova of this type occurred, and thus that, in this neutron star merger, enormous amounts of heavy elements were indeed produced.  So we now have a lot of evidence, for the first time, that almost all the heavy chemical elements on and around our planet were formed in neutron star mergers.  Again, we could not know this if we did not know that this was a neutron star merger, and that information comes only from the gravitational wave observation.

MISCELLANEOUS QUESTIONS

Did the merger of these two neutron stars result in a new black hole, a larger neutron star, or an unstable rapidly spinning neutron star that later collapsed into a black hole?

We don’t yet know, and maybe we won’t know.  Some scientists involved appear to be leaning toward the possibility that a black hole was formed, but others seem to say the jury is out.  I’m not sure what additional information can be obtained over time about this.

If the two neutron stars formed a black hole, why was there a kilonova?  Why wasn’t everything sucked into the black hole?

Black holes aren’t vacuum cleaners; they pull things in via gravity just the same way that the Earth and Sun do, and don’t suck things in some unusual way.  The only crucial thing about a black hole is that once you go in you can’t come out.  But just as when trying to avoid hitting the Earth or Sun, you can avoid falling in if you orbit fast enough or if you’re flung outward before you reach the edge.

The point in a neutron star merger is that the forces at the moment of merger are so intense that one or both neutron stars are partially ripped apart.  The material that is thrown outward in all directions, at an immense speed, somehow creates the bright, hot flash of gamma rays and eventually the kilonova glow from the newly formed atomic nuclei.  Those details I don’t yet understand, but I know they have been carefully studied both with approximate equations and in computer simulations such as this one and this one.  However, the accuracy of the simulations can only be confirmed through the detailed studies of a merger, such as the one just announced.  It seems, from the data we’ve seen, that the simulations did a fairly good job.  I’m sure they will be improved once they are compared with the recent data.


Filed under: Astronomy, Gravitational Waves Tagged: black holes, Gravitational Waves, LIGO, neutron stars

Matt StrasslerLIGO and VIRGO Announce a Joint Observation of a Black Hole Merger

Welcome, VIRGO!  Another merger of two big black holes has been detected, this time by both LIGO’s two detectors and by VIRGO as well.

Aside from the fact that this means that the VIRGO instrument actually works, which is great news, why is this a big deal?  By adding a third gravitational wave detector, built by the VIRGO collaboration, to LIGO’s Washington and Louisiana detectors, the scientists involved in the search for gravitational waves now can determine fairly accurately the direction from which a detected gravitational wave signal is coming.  And this allows them to do something new: to tell their astronomer colleagues roughly where to look in the sky, using ordinary telescopes, for some form of electromagnetic waves (perhaps visible light, gamma rays, or radio waves) that might have been produced by whatever created the gravitational waves.

The point is that with three detectors, one can triangulate.  The gravitational waves travel for billions of years, traveling at the speed of light, and when they pass by, they are detected at both LIGO detectors and at VIRGO.  But because it takes light a few thousandths of a second to travel the diameter of the Earth, the waves arrive at slightly different times at the LIGO Washington site, the LIGO Louisiana site, and the VIRGO site in Italy.  The precise timing tells the scientists what direction the waves were traveling in, and therefore roughly where they came from.  In a similar way, using the fact that sound travels at a known speed, the times that a gunshot is heard at multiple locations can be used by police to determine where the shot was fired.
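
To see how timing alone pins down a direction, here is a minimal sketch of the geometry; the baseline length and time delay below are illustrative stand-ins, not the real detector positions or a measured delay:

```python
# For a plane wave, the arrival-time difference between two detectors
# separated by a baseline of length D is dt = (D/c) * cos(theta), where
# theta is the angle between the baseline and the direction to the source.
import math

c = 2.998e5        # km/s
D = 3000.0         # km, an illustrative continental-scale baseline
dt = 0.003         # s, a hypothetical measured arrival-time difference
cos_theta = c * dt / D            # must lie between -1 and +1
theta = math.degrees(math.acos(cos_theta))
print(f"source lies on a cone about {theta:.0f} degrees from this baseline")
# One baseline only gives a ring on the sky; adding a third detector gives
# a second, non-parallel baseline, and the intersection is a small patch.
```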

You can see the impact in the picture below, which is an image of the sky drawn as a sphere, as if seen from outside the sky looking in.  In previous detections of black hole mergers by LIGO’s two detectors, the scientists could only determine a large swath of sky where the observed merger might have occurred; those are the four colored regions that stretch far across the sky.  But notice the green splotch at lower left.  That’s the region of sky where the black hole merger announced today occurred.  The fact that this region is many times smaller than the other four reflects what including VIRGO makes possible.  It’s a small enough region that one can search using an appropriate telescope for something that is making visible light, or gamma rays, or radio waves.

Skymap of the LIGO/Virgo black hole mergers.

Image credit: LIGO/Virgo/Caltech/MIT/Leo Singer (Milky Way image: Axel Mellinger)


While a black hole merger isn’t expected to be observable by other telescopes, and indeed nothing was observed by other telescopes this time, other events that LIGO might detect, such as a merger of two neutron stars, may create an observable effect. We can hope for such exciting news over the next year or two.


Filed under: Astronomy, Gravitational Waves Tagged: black holes, Gravitational Waves, LIGO

October 16, 2017

Matt StrasslerA Scientific Breakthrough! Combining Gravitational and Electromagnetic Waves

Gravitational waves are now the most important new tool in the astronomer’s toolbox.  Already they’ve been used to confirm that large black holes — with masses ten or more times that of the Sun — and mergers of these large black holes to form even larger ones, are not uncommon in the universe.   Today it goes a big step further.

It’s long been known that neutron stars, remnants of collapsed stars that have exploded as supernovas, are common in the universe.  And it’s been known almost as long that sometimes neutron stars travel in pairs.  (In fact that’s how gravitational waves were first discovered, indirectly, back in the 1970s.)  Stars often form in pairs, and sometimes both stars explode as supernovas, leaving their neutron star relics in orbit around one another.  Neutron stars are small — just ten or so kilometers (miles) across.  According to Einstein’s theory of gravity, a pair of stars should gradually lose energy by emitting gravitational waves into space, and slowly but surely the two objects should spiral in on one another.   Eventually, after many millions or even billions of years, they collide and merge into a larger neutron star, or into a black hole.  This collision does two things.

  1. It makes some kind of brilliant flash of light — electromagnetic waves — whose details are only guessed at.  Some of those electromagnetic waves will be in the form of visible light, while much of it will be in invisible forms, such as gamma rays.
  2. It makes gravitational waves, whose details are easier to calculate and which are therefore distinctive, but couldn’t have been detected until LIGO and VIRGO started taking data, LIGO over the last couple of years, VIRGO over the last couple of months.

It’s possible that we’ve seen the light from neutron star mergers before, but no one could be sure.  Wouldn’t it be great, then, if we could see gravitational waves AND electromagnetic waves from a neutron star merger?  It would be a little like seeing the flash and hearing the sound from fireworks — seeing and hearing is better than either one separately, with each one clarifying the other.  (Caution: scientists are often speaking as if detecting gravitational waves is like “hearing”.  This is only an analogy, and a vague one!  It’s not at all the same as acoustic waves that we can hear with our ears, for many reasons… so please don’t take it too literally.)  If we could do both, we could learn about neutron stars and their properties in an entirely new way.

Today, we learned that this has happened.  LIGO, with the world’s first two gravitational wave observatories, detected the waves from two merging neutron stars, 130 million light years from Earth, on August 17th.  (Neutron star mergers last much longer than black hole mergers, so the two are easy to distinguish; and this one was so close, relatively speaking, that it was seen for a long while.)  VIRGO, with the third detector, allows scientists to triangulate and determine roughly where mergers have occurred.  They saw only a very weak signal, but that was extremely important, because it told the scientists that the merger must have occurred in a small region of the sky where VIRGO has a relative blind spot.  That told scientists where to look.

The merger was detected for more than a full minute… to be compared with black holes whose mergers can be detected for less than a second.  It’s not exactly clear yet what happened at the end, however!  Did the merged neutron stars form a black hole or a neutron star?  The jury is out.

At almost exactly the moment at which the gravitational waves reached their peak, a blast of gamma rays — electromagnetic waves of very high frequencies — was detected by a different scientific team, the one from FERMI. FERMI detects gamma rays from the distant universe every day, and a two-second gamma-ray-burst is not unusual.  And INTEGRAL, another gamma ray experiment, also detected it.   The teams communicated within minutes.   The FERMI and INTEGRAL gamma ray detectors can only indicate the rough region of the sky from which their gamma rays originate, and LIGO/VIRGO together also only give a rough region.  But the scientists saw that those regions overlapped.  The evidence was clear.  And with that, astronomy entered a new, highly anticipated phase.

Already this was a huge discovery.  Brief gamma-ray bursts have been a mystery for years.  One of the best guesses as to their origin has been neutron star mergers.  Now the mystery is solved; that guess is apparently correct. (Or is it?  Probably, but the gamma-ray burst was surprisingly dim, given how close it is.  So there are still questions to ask.)

Also confirmed by the fact that these signals arrived within a couple of seconds of one another, after traveling for over 100 million years from the same source, is that, indeed, the speed of light and the speed of gravitational waves are exactly the same — both of them equal to the cosmic speed limit, just as Einstein’s theory of gravity predicts.

Next, these teams quickly told their astronomer friends to train their telescopes in the general area of the source. Dozens of telescopes, from every continent and from space, and looking for electromagnetic waves at a huge range of frequencies, pointed in that rough direction and scanned for anything unusual.  (A big challenge: the object was near the Sun in the sky, so it could be viewed in darkness only for an hour each night!) Light was detected!  At all frequencies!  The object was very bright, making it easy to find the galaxy in which the merger took place.  The brilliant glow was seen in gamma rays, ultraviolet light, infrared light, X-rays, and radio.  (Neutrinos, particles that can serve as another way to observe distant explosions, were not detected this time.)

And with so much information, so much can be learned!

Most important, perhaps, is this: from the pattern of the spectrum of light, the conjecture seems to be confirmed that the mergers of neutron stars are important sources, perhaps the dominant one, for many of the heavy chemical elements — iodine, iridium, cesium, gold, platinum, and so on — that are forged in the intense heat of these collisions.  It used to be thought that the same supernovas that form neutron stars in the first place were the most likely source.  But now it seems that this second stage of neutron star life — merger, rather than birth — is just as important.  That’s fascinating, because neutron star mergers are much more rare than the supernovas that form them.  There’s a supernova in our Milky Way galaxy every century or so, but it’s tens of millennia or more between these “kilonovas”, created in neutron star mergers.

If there’s anything disappointing about this news, it’s this: almost everything that was observed by all these different experiments was predicted in advance.  Sometimes it’s more important and useful when some of your predictions fail completely, because then you realize how much you have to learn.  Apparently our understanding of gravity, of neutron stars, and of their mergers, and of all sorts of sources of electromagnetic radiation that are produced in those mergers, is even better than we might have thought. But fortunately there are a few new puzzles.  The X-rays were late; the gamma rays were dim… we’ll hear more about this shortly, as NASA is holding a second news conference.

Some highlights from the second news conference:

  • New information about neutron star interiors, which affects how large they are and therefore how exactly they merge, has been obtained
  • The first ever visual-light image of a gravitational wave source, from the Swope telescope, at the outskirts of a distant galaxy; the galaxy’s center is the blob of light, and the arrow points to the explosion.

  • The theoretical calculations for a kilonova explosion suggest that debris from the blast should rather quickly block the visual light, so the explosion dims quickly in visible light — but infrared light lasts much longer.  The observations by the visible and infrared light telescopes confirm this aspect of the theory; and you can see evidence for that in the picture above, where four days later the bright spot is both much dimmer and much redder than when it was discovered.
  • Estimate: the total mass of the gold and platinum produced in this explosion is vastly larger than the mass of the Earth.
  • Estimate: these neutron stars were formed about 10 or so billion years ago.  They’ve been orbiting each other for most of the universe’s history, and ended their lives just 130 million years ago, creating the blast we’ve so recently detected.
  • Big Puzzle: all of the previous gamma-ray bursts seen up to now have always shone in ultraviolet light and X-rays as well as gamma rays.   But X-rays didn’t show up this time, at least not initially.  This was a big surprise.  It took 9 days for the Chandra telescope to observe X-rays, too faint for any other X-ray telescope.  Does this mean that the two neutron stars created a black hole, which then created a jet of matter that points not quite directly at us but off-axis, and shines by illuminating the matter in interstellar space?  This had been suggested as a possibility twenty years ago, but this is the first time there’s been any evidence for it.
  • One more surprise: it took 16 days for radio waves from the source to be discovered, with the Very Large Array, the most powerful existing radio telescope.  The radio emission has been growing brighter since then!  As with the X-rays, this seems also to support the idea of an off-axis jet.
  • Nothing quite like this gamma-ray burst has been seen — or rather, recognized — before.  When a gamma ray burst doesn’t have an X-ray component showing up right away, it simply looks odd and a bit mysterious.  It’s harder to observe than most bursts, because without a jet pointing right at us, its afterglow fades quickly.  Moreover, a jet pointing at us is bright, so it blinds us to the more detailed and subtle features of the kilonova.  But this time, LIGO/VIRGO told scientists that “Yes, this is a neutron star merger”, leading to detailed study from all electromagnetic frequencies, including patient study over many days of the X-rays and radio.  In other cases those observations would have stopped after just a short time, and the whole story couldn’t have been properly interpreted.



Filed under: Astronomy, Gravitational Waves

Sean CarrollStandard Sirens

Everyone is rightly excited about the latest gravitational-wave discovery. The LIGO observatory, recently joined by its European partner VIRGO, had previously seen gravitational waves from coalescing black holes. Which is super-awesome, but also a bit lonely — black holes are black, so we detect the gravitational waves and little else. Since our current gravitational-wave observatories aren’t very good at pinpointing source locations on the sky, we’ve been completely unable to say which galaxy, for example, the events originated in.

This has changed now, as we’ve launched the era of “multi-messenger astronomy,” detecting both gravitational and electromagnetic radiation from a single source. The event was the merger of two neutron stars, rather than black holes, and all that matter coming together in a giant conflagration lit up the sky in a large number of wavelengths simultaneously.

Look at all those different observatories, and all those wavelengths of electromagnetic radiation! Radio, infrared, optical, ultraviolet, X-ray, and gamma-ray — soup to nuts, astronomically speaking.

A lot of cutting-edge science will come out of this; see e.g. this main science paper. Apparently some folks are very excited by the fact that the event produced an amount of gold equal to several times the mass of the Earth. But it’s my blog, so let me highlight the aspect of personal relevance to me: using “standard sirens” to measure the expansion of the universe.

We’re already pretty good at measuring the expansion of the universe, using something called the cosmic distance ladder. You build up distance measures step by step, determining the distance to nearby stars, then to more distant clusters, and so forth. Works well, but of course is subject to accumulated errors along the way. This new kind of gravitational-wave observation is something else entirely, allowing us to completely jump over the distance ladder and obtain an independent measurement of the distance to cosmological objects. See this LIGO explainer.

The simultaneous observation of gravitational and electromagnetic waves is crucial to this idea. You’re trying to compare two things: the distance to an object, and the apparent velocity with which it is moving away from us. Usually velocity is the easy part: you measure the redshift of light, which is easy to do when you have an electromagnetic spectrum of an object. But with gravitational waves alone, you can’t do it — there isn’t enough structure in the spectrum to measure a redshift. That’s why the exploding neutron stars were so crucial; in this event, GW170817, we can for the first time determine the precise redshift of a distant gravitational-wave source.
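For concreteness, the redshift is read off by comparing observed and emitted wavelengths of spectral lines,

1 + z = \frac{\lambda_{\mathrm{obs}}}{\lambda_{\mathrm{emit}}},

and at the small redshift relevant here the recession velocity is simply v \approx c z.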

Measuring the distance is the tricky part, and this is where gravitational waves offer a new technique. The favorite conventional strategy is to identify “standard candles” — objects for which you have a reason to believe you know their intrinsic brightness, so that by comparing to the brightness you actually observe you can figure out the distance. To discover the acceleration of the universe, for example,  astronomers used Type Ia supernovae as standard candles.
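Concretely, a standard candle turns a brightness measurement into a distance through the inverse-square law: if the intrinsic luminosity is L and the measured flux is f, then (schematically, ignoring cosmological corrections)

f = \frac{L}{4\pi d_L^2} \qquad\Longrightarrow\qquad d_L = \sqrt{\frac{L}{4\pi f}},

which astronomers usually phrase in magnitudes as m - M = 5 \log_{10}(d_L / 10\,\mathrm{pc}).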

Gravitational waves don’t quite give you standard candles; every one will generally have a different intrinsic gravitational “luminosity” (the amount of energy emitted). But by looking at the precise way in which the source evolves — the characteristic “chirp” waveform in gravitational waves as the two objects rapidly spiral together — we can work out precisely what that total luminosity actually is. Here’s the chirp for GW170817, compared to the other sources we’ve discovered — much more data, almost a full minute!
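To give a flavor of how the chirp encodes the intrinsic loudness: at leading (quadrupole) order, the gravitational-wave frequency and its rate of change determine the so-called chirp mass, which in turn fixes the intrinsic amplitude of the signal. Here is a minimal sketch of that textbook relation; the input numbers at the bottom are rough values assumed purely for illustration, not anything extracted from the actual GW170817 analysis.

import math

# Physical constants in SI units.
G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
c = 2.998e8         # speed of light, m/s
M_SUN = 1.989e30    # solar mass, kg

def chirp_mass(f, fdot):
    """Invert the leading-order chirp relation
        fdot = (96/5) * pi**(8/3) * (G*Mc/c**3)**(5/3) * f**(11/3)
    for the chirp mass Mc, given the GW frequency f (Hz) and its
    time derivative fdot (Hz/s)."""
    gm_over_c3 = ((5.0 / 96.0) * math.pi**(-8.0 / 3.0) * f**(-11.0 / 3.0) * fdot) ** (3.0 / 5.0)
    return gm_over_c3 * c**3 / G

# Rough, assumed-for-illustration numbers for a binary neutron star
# sweeping through the detector band (not measured GW170817 values):
f_example, fdot_example = 100.0, 17.0   # Hz, Hz/s
print(chirp_mass(f_example, fdot_example) / M_SUN)   # roughly 1.2 solar masses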

So we have both distance and redshift, without using the conventional distance ladder at all! This is important for all sorts of reasons. An independent way of getting at cosmic distances will allow us to measure properties of the dark energy, for example. You might also have heard that there is a discrepancy between different ways of measuring the Hubble constant, which either means someone is making a tiny mistake or there is something dramatically wrong with the way we think about the universe. Having an independent check will be crucial in sorting this out. Just from this one event, we are able to say that the Hubble constant is 70 kilometers per second per megaparsec, albeit with large error bars (+12, -8 km/s/Mpc). That will get much better as we collect more events.
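Just to make the arithmetic transparent: at this small a redshift, the Hubble constant is essentially the recession velocity (from the electromagnetic redshift of the host galaxy) divided by the distance (from the gravitational waveform). Here is a toy version, with round numbers chosen only to sit in the neighborhood of the GW170817 values rather than taken from the published analysis.

# Back-of-the-envelope standard-siren Hubble constant, valid at low redshift
# where v ~ c*z and H0 ~ v / d_L.  The inputs are illustrative guesses.
c_km_s = 299792.458    # speed of light, km/s
z_host = 0.0098        # assumed redshift of the host galaxy
d_L_Mpc = 42.0         # assumed luminosity distance from the GW signal, Mpc

v = c_km_s * z_host    # recession velocity, km/s
H0 = v / d_L_Mpc       # km/s per Mpc
print(round(H0))       # ~70, consistent with the value quoted above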

So here is my (infinitesimally tiny) role in this exciting story. The idea of using gravitational-wave sources as standard sirens was put forward by Bernard Schutz all the way back in 1986. But it’s been developed substantially since then, especially by my friends Daniel Holz and Scott Hughes. Years ago Daniel told me about the idea, as he and Scott were writing one of the early papers. My immediate response was “Well, you have to call these things `standard sirens.'” And so a useful label was born.

Sadly for my share of the glory, my Caltech colleague Sterl Phinney also suggested the name simultaneously, as the acknowledgments to the paper testify. That’s okay; when one’s contribution is this extremely small, sharing it doesn’t seem so bad.

By contrast, the glory attaching to the physicists and astronomers who pulled off this observation, and the many others who have contributed to the theoretical understanding behind it, is substantial indeed. Congratulations to all of the hard-working people who have truly opened a new window on how we look at our universe.

October 14, 2017

Richard Easther: The Unforgivable Curses

Fabrication and plagiarism are the unforgivable curses of science – crimes of no return. If you are caught committing them you will not wind up in an academic Azkaban but you would be hard put to find another job in a university as a parking warden, much less a research role. Ironically, to outsiders these infractions may appear to be relatively victimless crimes. Do a few faked graphs really hurt anyone? If music can be downloaded with impunity is plagiarism a terrible sin? However, these transgressions are unforgivable because they undermine the integrity of the system, not as a result of their impact on individuals. An expert counterfeiter whose bogus Benjamins are never spotted by banks might claim that no-one was hurt by their escapades, but financial systems can be damaged by a flood of fake notes. Likewise, we trust the integrity of our colleagues when we build on their work. We tell students to "check everything" but this is an impossible goal, since at some point you would do nothing but verify the work of others, so dishonesty undermines science just as debased coinage threatens an economy. 

Last month the American Geophysical Union revised its understanding of "scientific misconduct", a term encompassing plagiarism and data-faking, to explicitly include a new category of crime – discrimination, sexual harassment, and bullying. These are transgressions against individuals, but the AGU's decision recognises that they weaken science itself; systematically burdening those who are disproportionately on the receiving end of "poor behaviour", blighting lives and careers, boosting inequality, and robbing the field of talent. This recognises that many female geoscientists experience harassment or worse in the field, often while they are physically isolated and far from help. Just last week sickening allegations of bullying and assault during trips to remote Antarctic valleys were levelled against David Marchant, a geoscientist at Boston University. It was telling that while many of the worst allegations were corroborated by others, some of Marchant's defenders pointed out that they themselves had never witnessed such behaviour by Marchant and that these infractions were "historical", with no recent allegations of misconduct coming to light. However, if this had been a case of data-faking there would be no ill-defined statute of limitations and "some of his work is legitimate" would in no way constitute a defence.

A similar paradox was visible last year, thanks to a defamation case pursued by astrophysicist Mike Bode against Carole Mundell, who intervened after he wrote a glowing letter of recommendation for a mutual colleague facing an active harassment investigation. The claim that Mundell had defamed Bode by this action was witheringly rejected but I cannot imagine anyone writing a letter of reference – much less a good one – for a person facing live allegations of intellectual misconduct. Moreover, the position in question was Chief Scientist in the Square Kilometre Array - South Africa, an organisation which will have a key role supporting South Africa's engagement with a multi-hundred million dollar international collaboration involving vast amounts of public money from a half-dozen countries. This hire fell through, but the job was later filled by a scientist who had left (and was apparently "dismissed" from) a leadership position at the Arecibo Observatory while "under a cloud", demonstrating just how hard it can be for a senior scientist to definitively torpedo his own career.* [This situation may also lead one to draw inferences about the institutional health of SKA-South Africa, but that is another matter.]

This is the same month that the Harvey Weinstein story broke and his serial sexual assaults appear to have been an open secret within the entertainment industry. Despite this, any number of male stars who had benefited from their association with Weinstein were shocked, shocked to hear the news, or to think that their industry may suffer from an endemic harassment problem. And while we lack Hollywood's glamour, senior academics have a similar ability to make or break the careers of young people vying for their big chance, and academia is similarly a breeding ground for abusive behaviour. Likewise, just as many men averred that they had personally never seen poor behaviour by Weinstein, many scientists assert that because they have never witnessed a colleague harass anyone, that colleague cannot be a harasser – a stunning lack of logic for people who spend their professional lives drawing inferences about events we cannot hope to witness with our own eyes. Likewise, we all know that many senior harassers in science are yet to have their "Geoff Marcy moment" and, like Weinstein, some are rumoured to have aggressively lawyered up to keep a lid on simmering scandals.

If this is our truth, all scientists – and in particular all white men in science whose progress has never been potentially impeded by our gender, our race, or our sexual identity – are complicit in having allowed it to happen. And it is on us to sort it out.



FOOTNOTE: *To be clear, the investigation of the original applicant was apparently never completed so these allegations were never formally substantiated. However, published sources refer to them as "sexual assault" so it does appear that they were of a serious nature. Likewise, the specific employment issues faced by the person now in the role have not been fully disclosed.

COMMENTS: Comments off on this one. Getting way more than normal and they didn't seem to be heading anywhere constructive.

October 13, 2017

Scott Aaronson: Not the critic who counts

There’s a website called Stop Timothy Gowers! !!! —yes, that’s the precise name, including the exclamation points.  The site is run by a mathematician who for years went under the pseudonym “owl / sowa,” but who’s since outed himself as Nikolai Ivanov.

For those who don’t know, Sir Timothy Gowers is a Fields Medalist, known for seminal contributions including the construction of Banach spaces with strange properties, the introduction of the Gowers norm, explicit bounds for the regularity lemma, and more—but who’s known at least as well for explaining math, in his blog, books, essays, MathOverflow, and elsewhere, in a remarkably clear, friendly, and accessible way.  He’s also been a leader in the fight to free academia from predatory publishers.

So why on earth would a person like that need to be stopped?  According to sowa, because Gowers, along with other disreputable characters like Terry Tao and Endre Szemerédi and the late Paul Erdös, represents a dangerous style of doing mathematics: a style that’s just as enamored of concrete problems as it is of abstract theory-building, and that doesn’t even mind connections to other fields like theoretical computer science.  If that style becomes popular with young people, it will prevent faculty positions and prestigious prizes from going to the only deserving kind of mathematics: the kind exemplified by Bourbaki and by Alexander Grothendieck, which builds up theoretical frameworks with principled disdain for the solving of simple-to-state problems.  Mathematical prizes going to the wrong people—or even going to the right people but presented by the wrong people—are constant preoccupations of sowa’s.  Read his blog and let me know if I’ve unfairly characterized it.


Now for something totally unrelated.  I recently discovered a forum on Reddit called SneerClub, which, as its name suggests, is devoted to sneering.  At whom?  Basically, at anyone who writes anything nice about nerds or Silicon Valley, or who’s associated with the “rationalist community,” or the Effective Altruist movement, or futurism or AI risk.  Typical targets include Scott Alexander, Eliezer Yudkowsky, Robin Hanson, Michael Vassar, Julia Galef, Paul Graham, Ray Kurzweil, Elon Musk … and with a list like that, I guess I should be honored to be a regular target too.

The basic SneerClub M.O. is to seize on a sentence that, when ripped from context and reflected through enough hermeneutic funhouse mirrors, can make nerds out to look like right-wing villains, oppressing the downtrodden with rays of disgusting white maleness (even, it seems, ones who aren’t actually white or male).  So even if the nerd under discussion turns out to be, say, a leftist or a major donor to anti-Trump causes or malaria prevention or whatever, readers can feel reassured that their preexisting contempt was morally justified after all.

Thus: Eliezer Yudkowsky once wrote a piece of fiction in which a character, breaking the fourth wall, comments that another character seems to have no reason to be in the story.  This shows that Eliezer is a fascist who sees people unlike himself as having no reason to exist, and who’d probably exterminate them if he could.  Or: many rationalist nerds spend a lot of effort arguing against Trumpists, alt-righters, and neoreactionaries.  The fact that they interact with those people, in order to rebut them, shows that they’re probably closet neoreactionaries themselves.


When I browse sites like “Stop Timothy Gowers! !!!” or SneerClub, I tend to get depressed about the world—and yet I keep browsing, out of a fascination that I don’t fully understand.  I ask myself: how can a person read Gowers’s blog, or Slate Star Codex, without seeing what I see, which is basically luminous beacons of intellectual honesty and curiosity and clear thought and sparkling prose and charity to dissenting views, shining out far across the darkness of online discourse?

(Incidentally, Gowers lists “Stop Timothy Gowers! !!!” in his blogroll, and I likewise learned of SneerClub only because Scott Alexander linked to it.)

I’m well aware that this very question will only prompt more sneers.  From the sneerers’ perspective, they and their friends are the beacons, while Gowers or Scott Alexander are the darkness.  How could a neutral observer possibly decide who was right?

But then I reflect that there’s at least one glaring asymmetry between the sides.

If you read Timothy Gowers’s blog, one thing you’ll constantly notice is mathematics.  When he’s not weighing in on current events—for example, writing against Brexit, Elsevier, or the destruction of a math department by cost-cutting bureaucrats—Gowers is usually found delighting in exploring a new problem, or finding a new way to explain a known result.  Often, as with his dialogue with John Baez and others about the recent “p=t” breakthrough, Gowers is struggling to understand an unfamiliar piece of mathematics—and, completely unafraid of looking like an undergrad rather than a Fields Medalist, he simply shares each step of his journey, mistakes and all, inviting you to follow for as long as you can keep up.  Personally, I find it electrifying: why can’t all mathematicians write like that?

By contrast, when you read sowa’s blog, for all the anger about the sullying of mathematics by unworthy practitioners, there’s a striking absence of mathematical exposition.  Not once does sowa ever say: “OK, forget about the controversy.  Since you’re here, instead of just telling you about the epochal greatness of Grothendieck, let me walk you through an example.  Let me share a beautiful little insight that came out of his approach, in so self-contained a way that even a physicist or computer scientist will understand it.”  In other words, sowa never uses his blog to do what Gowers does every day.  Sowa might respond that that’s what papers are for—but the thing about a blog is that it gives you the chance to reach a much wider readership than your papers do.  If someone is already blogging anyway, why wouldn’t they seize that chance to share something they love?

Similar comments apply to Slate Star Codex versus r/SneerClub.  When I read an SSC post, even if I vehemently disagree with the central thesis (which, yes, happens sometimes), I always leave the diner intellectually sated.  For the rest of the day, my brain is bloated with new historical tidbits, or a deep-dive into the effects of a psychiatric drug I’d never heard of, or a jaw-dropping firsthand account of life as a medical resident, or a different way to think about a philosophical problem—or, if nothing else, some wicked puns and turns of phrase.

But when I visit r/SneerClub—well, I get exactly what’s advertised on the tin.  Once you’ve read a few, the sneers become pretty predictable.  I thought that for sure, I’d occasionally find something like: “look, we all agree that Eliezer Yudkowsky and Elon Musk and Nick Bostrom are talking out their asses about AI, and are coddled white male emotional toddlers to boot.  But even granting that, what do we think about AI?  Are intelligences vastly smarter than humans possible?  If not, then what principle rules them out?  What, if anything, can be said about what a superintelligent being would do, or want?  Just for fun, let’s explore this a little: I mean the actual questions themselves, not the psychological reasons why others explore them.”

That never happens.  Why not?


There’s another fascinating Reddit forum called “RoastMe”, where people submit a photo of themselves holding a sign expressing their desire to be “roasted”—and then hundreds of Redditors duly oblige, savagely mocking the person’s appearance and anything else they can learn about the person from their profile.  Many of the roasts are so merciless that one winces vicariously for the poor schmucks who signed up for this, hopes that they won’t be driven to self-harm or suicide.  But browse enough roasts, and a realization starts to sink in: there’s no person, however beautiful or interesting they might’ve seemed a priori, for whom this roasting can’t be accomplished.  And that very generality makes the roasting lose much of its power—which maybe, optimistically, was the point of the whole exercise?

In the same way, spend a few days browsing SneerClub, and the truth hits you: once you’ve made their enemies list, there’s nothing you could possibly say or do that they wouldn’t sneer at.  Like, say it’s a nice day outside, and someone will reply:

“holy crap how much of an entitled nerdbro do you have to be, to erase all the marginalized people for whom the day is anything but ‘nice’—or who might be unable to go outside at all, because of limited mobility or other factors never even considered in these little rich white boys’ geek utopia?”

For me, this realization is liberating.  If appeasement of those who hate you is doomed to fail, why bother even embarking on it?


I’ve spent a lot of time on this blog criticizing D-Wave, and cringeworthy popular articles about quantum computing, and touted arXiv preprints that say wrong things.  But I hope regular readers feel like I’ve also tried to offer something positive: y’know, actual progress in quantum computing that actually excites me, or a talk about big numbers, or an explanation of the Bekenstein bound, whatever.  My experience with sites like “Stop Timothy Gowers! !!!” and SneerClub makes me feel like I ought to be doing less criticizing and more positive stuff.

Why, because I fear turning into a sneerer myself?  No, it’s subtler than that: because reading the sneerers drives home for me that it’s a fool’s quest to try to become what Scott Alexander once called an “apex predator of the signalling world.”

At the risk of stating the obvious: if you write, for example, that Richard Feynman was a self-aggrandizing chauvinist showboater, then even if your remarks have a nonzero inner product with the truth, you don’t thereby “transcend” Feynman and stand above him, in the same way that set theory transcends and stands above arithmetic by constructing a model for it.  Feynman’s achievements don’t thereby become your achievements.

When I was in college, I devoured Ray Monk’s two-volume biography of Bertrand Russell.  This is a superb work of scholarship, which I warmly recommend to everyone.  But there’s one problem with it: Monk is constantly harping on his subject’s failures, and he has no sense of humor, and Russell does.  The result is that, whenever Monk quotes Russell’s personal letters at length to prove what a jerk Russell was, the quoted passages just leap off the page—as if old Bertie has come back from the dead to share a laugh with you, the reader, while his biographer looks on sternly and says, “you two think this is funny?”

For a writer, I can think of no higher aspiration than that: to write like Bertrand Russell or like Scott Alexander—in such a way that, even when people quote you to stand above you, your words break free of the imprisoning quotation marks, wiggle past the critics, and enter the minds of readers of your generation and of generations not yet born.


Update (Nov. 13): Since apparently some people didn’t know (?!), the title of this post comes from the famous Teddy Roosevelt quote:

It is not the critic who counts; not the man who points out how the strong man stumbles, or where the doer of deeds could have done them better. The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood; who strives valiantly; who errs, who comes short again and again, because there is no effort without error and shortcoming; but who does actually strive to do the deeds; who knows great enthusiasms, the great devotions; who spends himself in a worthy cause; who at the best knows in the end the triumph of high achievement, and who at the worst, if he fails, at least fails while daring greatly, so that his place shall never be with those cold and timid souls who neither know victory nor defeat.

Sean Carroll: Mind-Blowing Quantum Mechanics

I’m trying to climb out from underneath a large pile of looming (and missed) deadlines, and in the process I’m hoping to ramp the real blogging back up. In the meantime, here are a couple of videos to tide you over.

First, an appearance a few weeks ago on Joe Rogan’s podcast. Rogan is a professional comedian and mixed-martial arts commentator, but has built a great audience for his wide-ranging podcast series. One of the things that makes him a good interviewer is his sincere delight in the material, as evidenced here by noting repeatedly that his mind had been blown. We talked for over two and a half hours, covering cosmology and quantum mechanics but also some bits about AI and pop culture.

And here’s a more straightforward lecture, this time at King’s College in London. The topic was “Extracting the Universe from the Wave Function,” which I’ve used for a few talks that ended up being pretty different in execution. This one was aimed at undergraduate physics students, some of whom hadn’t yet taken quantum mechanics. So the first half is a gentle introduction to many-worlds theory and why it’s the best version of quantum mechanics, and the second half tries to explain our recent efforts to show how space itself can emerge from quantum entanglement.

I was invited to King’s by Eugene Lim, one of my former grad students and now an extremely productive faculty member in his own right. It’s always good to see your kids grow up to do great things!

October 09, 2017

John Preskill: A Few Words With Caltech Research Scientist, David Boyd

Twenty years ago, David Boyd began his career at Caltech as a Postdoctoral Scholar with Dave Goodwin, and since 2012 he has held the position of Research Scientist in the Division of Physics, Mathematics and Astronomy.  A 20-year career at Caltech is in itself a significant achievement, considering Caltech’s flair for amassing the very best scientists from around the world.  Throughout Boyd’s career he has secured 7 patents, and most recently he discovered a revolutionary single-step method for growing graphene.  The method allows for unprecedented continuity in graphene growth, which is essential to significantly scaling up production capacity.  Boyd worked with a number of great scientists at the outset of his career.  Notably, as an undergraduate he gained a passion for science from Professor Thomas Wdowiak (Mars’ Wdowiak Ridge is named in his honor) at the University of Alabama at Birmingham, and he worked at Caltech as a postdoc for David Goodwin (best known for developing methods for growing thin-film, high-purity diamonds).  Currently, Boyd is formulating a way to apply Goodwin’s reaction modeling code to graphene.  Considering Boyd’s accomplishments and extensive scientific knowledge, I feel fortunate to have been afforded the opportunity to work in his lab for the past six summers.  I have learned much from Boyd, but I still have more questions (not all scientific), so I requested an interview and he graciously accepted.

On the day of the interview, I meet Boyd at his office on campus at Caltech.  We walk a ways down a sunlit hallway and out to a balcony through two glass doors.  There’s a slight breeze in the air, a smell of nearby roses, and the temperature is perfect.  It’s a picturesque day in Pasadena.  We sit at a table and I ask my first question.

How many patents do you own?

I have seven patents.  The graphene patent was really hard to get, but we got it.  We just got it executed in China, so they are allowed to use it.  This is particularly exciting because of all the manufacturing in China.  The patent system has changed a bit, so it’s getting harder and harder.  You can come up with the idea, but if disparate components have already been patented, then you can’t get the patent for combining them in a unique way.  The invention has to provide a result that is unexpected or not obvious, and the patent for growing graphene with a one-step process was just that.  The one-step process refers to cleaning the copper substrate and growing graphene under the same chemistry in a continuous manner.  What used to be a two-step process can be done in one.

You don’t have to anneal the substrate to 1000 degrees before growing.

Exactly.  Annealing the copper first and then growing doesn’t allow for a nice continuous process.  Removing the annealing step means the graphene is growing in an environment with significantly lower temperatures, which is important for CMOS or computer chip manufacturing.

Which patents do you hold most dear?

Usually in the research areas that are really cutting edge.  I have three patents in plasmonics, and that was a fun area 10 years ago.  It was a new area and we were doing something really exciting.  When you patent something, an application may never be realized, sometimes they get used and sometimes they don’t.  The graphene patent has already been licensed, so we’ve received quite a bit of traction.  As far as commercial success, the graphene has been much more successful than the other ones, but plasmonics were a lot of fun.  Water desalinization may be one application, and now there is a whole field of plasmonic chemistry.  A company has not yet licensed it, so it may have been too far ahead of its time for application anytime soon.

When did you realize you wanted to be a scientist?

I liked Physics in high school, and then I had a great mentor in college, Thomas Wdowiak.  Wdowiak showed me how to work in the lab.  Science is one of those things where an initial spark of interest drives you into action.  I became hooked, because of my love for science, the challenge it offers, and the simple fact I have fun with it.  I feel it’s very important to get into the lab and start learning science as early as possible in your education.

Were you identified as a gifted student?

I don’t think that’s a good marker.  I went to a private school early on, but no, I don’t think I was good at what they were looking for, no I wasn’t.  It comes down to what you want to do.  If you want to do something and you’re motivated to do it, you’ll find ways to make it happen.  If you want to code, you start coding, and that’s how you get good at it.  If you want to play music and have a passion for it, at first it may be your parents saying you have to go practice, but in the end it’s the passion that drives everything else.

Did you like high school?

I went to high school in Alabama and I had a good Physics teacher.  It was not the most academic of places, and if you were into academics the big thing there was to go to medical school.  I just hated memorizing things so I didn’t go that route.

Were AP classes offered at your high school, and if so, were you an AP student?

Yeah, I did take AP classes.  My high school only had AP English and AP Math, but it was just coming onboard at that time.  I took AP English because I liked the challenge and I love reading.

Were you involved in any extracurricular activities in school?

I earned the rank of Eagle Scout in the Boy Scouts.  I also raced bicycles in high school, and I was a several time state champion.  I finished high school (in America) and wanted to be a professional cyclist.  So, I got involved in the American Field Service (AFS), and did an extra year of high school in Italy as an exchange student where I ended up racing with some of the best cyclists in the world all through Italy.  It was a fantastic experience.

Did you have a college in mind for your undergraduate studies?  

No, I didn’t have a school in mind.  I had thought about the medical school path, so I considered taking pre-med courses at the local college, University of Alabama at Birmingham (UAB), because they have a good medical school.  Then UAB called me and said I earned an academic scholarship.  My father advised me that it would be a good idea to go there since it’s paid for.  I could take pre-med courses and then go to medical school afterwards if I wanted.  Well, I was in an honors program at the university and met an astronomer by the name Thomas Wdowiak.  I definitely learned from him how to be a scientist.  He also gave me a passion for being a scientist.  So, after working with Wdowiak for a while, I decided I didn’t want to go to medical school, I wanted to study Physics.  They just named a ridge on Mars after him, Wdowiak Ridge.  He was a very smart guy, and a great experimentalist who really grew my interest in science… he was great.

Did you do research while earning your undergraduate degree?  

Yes, Wdowiak had me in the lab working all the time.  We were doing real stuff in the lab.  I did a lot of undergraduate research in Astronomy, and the whole point was to get in the lab and work on science.  Because I worked with Wdowiak I had one or two papers published by the time I graduated.  Wdowiak taught me how to do science.   And that’s the thing, you have to want to do science, have a lab or a place to practice, and then start working.  

So, he was professor and experimentalist.

He was a very hands-on lab guy.  I was in the lab breaking things and fixing things. Astronomers are fun to work with.  He was an experimental astronomer who taught me, among other things, spectroscopy, vacuum technology, and much about the history of science.  In fact, it was Professor Wdowiak who told me about Millikan’s famous “Machine Shop in a Vacuum” experiment that inspired the graphene discovery… it all comes back to Caltech!

Name another scientist, other than Wdowiak, who has influenced you.

Richard Feynman also had a big influence on me.  I did not know him, but I love his books.

Were you focused solely on academics in college, or did you have a social life as well?

I was part of a concert committee that brought bands to the college.  We had some great bands like R.E.M. and the Red Hot Chili Peppers play, and I would work as a stagehand and a roadie for the shows.

So, you weren’t doing keg stands at fraternity parties?

No, it wasn’t like that.  I liked to go out and socialize, but no keg stands.  Though, I have had friends that were very successful that did do keg stands.

What’s your least favorite part of your job?

You’re always having to raise funds for salaries, equipment, and supplies.  It can be difficult, but once you get the funding it is a relief for the moment.  As a scientist, your focus isn’t always on just the science.

What are your responsibilities related to generating revenue for the university?

I raise funds for my projects via grants.  Part of the money goes to Caltech as overhead to pay for the facilities, lab space, and to keep the lights on.

What do you wish you could do more of in your job?

Less raising money.  I like working in the lab, which is fun.  Now that I have worked out the technique to grow graphene, I’m looking for applications.  I’m searching for the next impactful thing, and then I’ll figure out the necessary steps that need to be taken to get there.

Is there an aspect of your job that you believe would surprise people?

You have to be entrepreneurial, you have to sell your ideas to raise money for these projects.  You have to go with what’s hot in research.  There are certain things that get funded and things that don’t.

There may be some things you’re interested in, but other people aren’t, so there’s no funding.

Yeah, there may not be a need, therefore, no funding.  Right now, graphene is a big thing, because there are many applications and problems to be solved.  For example, diamonds were huge back in the ‘80’s.  But once they solved all the problems, research cooled off and industrial application took over.

Is there something else you’d really rather be researching, or are the trending ideas right now in line with your interests?

There is nothing else I’d rather be researching.  I’m in a good place right now.  We’re trying to commercialize the graphene research.  You try to do research projects that are complementary to one another.  For example, there’s a project underway, where graphene is being used for hydrogen storage in cars, that really interests me.  I do like the graphene work, it’s exciting, we’ll see where that goes.

What are the two most important personality traits essential to being a good scientist?

Creativity.  You have to think outside the box.  Perseverance.  I’m always reading and trying to understand something better.  Curiosity is, of course, a huge part of it as well. You gotta be obsessive too, I guess.  That’s more than two, sorry.

What does it take for someone to become a scientist?

You must have the desire to be a scientist, otherwise you’ll go be a stockbroker or something else.  It’s more of a passion thing, your personality.  You do have to have an aptitude for it though.  If you’re getting D’s in math, physics is probably not the place for you.  There’s an old joke, the medical student in physics class asks the professor, “Why do we have to take physics?  We’ll never use it.”  The Physics professor answers, “Physics saves lives, because it keeps idiots out of medical school.”  If you like science, but you’re not so good at math, then look at less quantitative areas of science where math is not as essential.  Computational physics and experimental physics will require you to be very good at math.  It takes a different temperament, a different set of skills.  Same curiosity, same drive and intelligence, but different temperament.

Do you ever doubt your own abilities?  Do you have insecurities about not being smart enough?

Sure, but there’s always going to be someone out there smarter.  Although, you really don’t want to ask yourself these types of questions.  If you do, you’re looking down the wrong end of the telescope.  Everyone has their doubts, but you need to listen to the feedback from the universe.  If you’re doing something for a long time and not getting results, then that’s telling you something.  Like I said, you must have a passion for what you’re doing.  If people are in doubt they should read biographies of scientists and explore their mindset to discover if science seems to be a good fit for them.  For a lot of people, it’s not the most fun job, it’s not the most social job, and certainly not the most glamorous type of job.  Some people need more social interaction, researchers are usually a little more introverted.  Again, it really depends on the person’s temperament. There are some very brilliant people in business, and it’s definitely not the case that only the brilliant people in a society go into science.  It doesn’t mean you can’t be doing amazing things just because you’re not in a scientific field.  If you like science and building things, then follow that path.  It’s also important not to force yourself to study something you don’t enjoy.

Scientists are often thought to work with giant math problems that are far above the intellectual capabilities of mere mortals.  Have you ever been in a particular situation where the lack of a solution to a math problem was impeding progress in the lab?  If so, what was the problem and did you discover the solution?

I’m attempting to model the process of graphene growth, so I’m facing this situation right now.  That’s why I have this book here.  I’m trying to adapt Professor Dave Goodwin’s Cantera reactor modeling code to model the reaction kinetics in graphene (Goodwin originally developed and wrote the modeling software called Cantera).  Dave was a big pioneer in diamond and he died almost 5 years ago here in Pasadena.  He developed a reaction modeling code for diamond, and I’m trying to apply that to graphene.  So, yeah, it’s a big math problem that I’ve been spending weeks on trying to figure out.  It’s not that I’m worried about the algebra or the coding, it’s trying to figure things out conceptually.

Do you love your job?

I do, I’ve done it for awhile, it’s fun, and I really enjoy it.  When it works, it’s great. Discovering stuff is fun and possesses a great sense of satisfaction.  But it’s not always that way, it can be very frustrating.  Like any good love affair, it has its peaks and valleys.  Sometimes you hate it, but that’s part of the relationship, it’s like… aaarrgghh!!



Alexey Petrov: Non-linear teaching

I wanted to share some ideas about a teaching method I am trying to develop and implement this semester. Please let me know if you’ve heard of someone doing something similar.

This semester I am teaching our undergraduate mechanics class. This is the first time I am teaching it, so I started looking into the possibility of shaking things up and maybe applying some new method of teaching. And there are plenty on offer: flipped classroom, peer instruction, Just-in-Time teaching, etc.  They all look to “move away from the inefficient old model” where the professor is lecturing and students are taking notes. I have things to say about that, but not in this post. It suffices to say that most of those approaches are essentially trying to make students work (both with the lecturer and their peers) in class and outside of it. At the same time those methods attempt to “compartmentalize teaching”, i.e. make large classes “smaller” by bringing up each individual student’s contribution to class activities (by using “clickers”, small discussion groups, etc). For several reasons those approaches did not fit my goal this semester.

Our Classical Mechanics class is a gateway class for our physics majors. It is the first class they take after they are done with general physics lectures. So the students are already familiar with the (simpler version of the) material they are going to be taught. The goal of this class is to start molding physicists out of students: they learn to simplify problems so physics methods can be properly applied (that’s how “a Ford Mustang improperly parked at the top of the icy hill slides down…” turns into “a block slides down the incline…”), learn to always derive the final formula before plugging in numbers, look at the asymptotics of their solutions as a way to see if the solution makes sense, and many other wonderful things.

So with all that I started doing something I’d like to call non-linear teaching. The gist of it is as follows. I give a lecture (and don’t get me wrong, I do make my students talk and work: I ask questions, we do “duels” (students argue different sides of a question), etc — all of that can be done efficiently in a class of 20 students). But instead of one homework with 3-4 problems per week I have two types of homework assignments for them: short homeworks and projects.

Short homework assignments are single-problem assignments given after each class that must be done by the next class. They are designed such that a student needs to re-derive material that we discussed previously in class, with a small new twist added. For example, in the block-down-the-incline problem discussed in class I ask them to choose coordinate axes in a different way and prove that the result is independent of the choice of the coordinate system. Or I ask them to find at which angle one should throw a stone to get the maximal possible range (including air resistance), etc.  This way, instead of doing an assignment at the last minute at the end of the week, students have to work out what they just learned in class every day! More importantly, I get to change how I teach. Depending on how they did on the previous short homework, I adjust the material (both speed and volume) discussed in class. I also design examples for the future sections in such a way that I can repeat parts of the topics that were hard for the students previously. Hence, instead of a linear progression through the course, we are moving along something akin to helical motion, returning and spending more time on topics that students find more difficult. That’s why my teaching is “non-linear”.
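(To give a concrete sense of the stone-throwing example above: with air resistance there is no tidy closed-form answer, so a natural short-homework version is a quick numerical scan over launch angles. A minimal sketch, with the drag coefficient and launch speed chosen arbitrarily for illustration, might look like this.)

import math

# Projectile with quadratic air drag: integrate the trajectory for each launch
# angle and pick the angle that maximizes the horizontal range.
g, k, v0, dt = 9.81, 0.05, 30.0, 1e-3   # gravity [m/s^2], drag/mass [1/m], speed [m/s], time step [s]

def horizontal_range(angle_deg):
    vx = v0 * math.cos(math.radians(angle_deg))
    vy = v0 * math.sin(math.radians(angle_deg))
    x, y = 0.0, 0.0
    while y >= 0.0:
        v = math.hypot(vx, vy)
        ax, ay = -k * v * vx, -g - k * v * vy   # quadratic drag opposes the velocity
        vx, vy = vx + ax * dt, vy + ay * dt
        x, y = x + vx * dt, y + vy * dt
    return x

best = max(range(1, 90), key=horizontal_range)
print(best, horizontal_range(best))   # the optimal angle drops below 45 degrees once drag is included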

Project homework assignments are designed to develop understanding of how topics in a given chapter relate to each other. There are as many project assignments as there are chapters. Students get two weeks to complete them.

Overall, students solve exactly the same number of problems they would in a normal lecture class. Yet those problems are scheduled in a different way. With my schedule, students are forced to learn by constantly re-working what was just discussed in a lecture. And for me, I can quickly react (by adjusting lecture material and speed) using the constant feedback I get from students in the form of short homeworks. Win-win!

I will do benchmarking at the end of the class by comparing my class performance to aggregate data from previous years. I’ll report on it later. But for now I would be interested to hear your comments!



October 08, 2017

Scott Aaronson: Coming to Nerd Central

While I’m generally on sabbatical in Tel Aviv this year, I’ll be in the Bay Area from Saturday Oct. 14 through Wednesday Oct. 18, where I look forward to seeing many friends new and old.  On Wednesday evening, I’ll be giving a public talk in Berkeley, through the Simons Institute’s “Theoretically Speaking” series, entitled Black Holes, Firewalls, and the Limits of Quantum Computers.  I hope to see at least a few of you there!  (I do have readers in the Bay Area, don’t I?)

But there’s more: on Saturday Oct. 14, I’m thinking of having a first-ever Shtetl-Optimized meetup, somewhere near the Berkeley campus.  Which will also be a Slate Star Codex meetup, because Scott Alexander will be there too.  We haven’t figured out many details yet, except that it will definitely involve getting fruit smoothies from one of the places I remember as a grad student.  Possible discussion topics include what the math, CS, and physics research communities could be doing better; how to advance Enlightenment values in an age of recrudescent totalitarianism; and (if we’re feeling really ambitious) the interpretation of quantum mechanics.  If you’re interested, shoot me an email and let me know if there are times that don’t work; then other Scott and I will figure out a plan and make an announcement.

On an unrelated note, some people might enjoy my answer to a MathOverflow question about why one should’ve expected number theory to be so rife with ridiculously easy-to-state yet hard-to-prove conjectures, like Fermat’s Last Theorem and the Goldbach Conjecture.  As I’ve discussed on this blog before, I’ve been deeply impressed with MathOverflow since the beginning, but never more so than today, when a decision to close the question as “off-topic” was rightfully overruled.  If there’s any idea that unites all theoretical computer scientists, I’d say it’s the idea that what makes a given kind of mathematics “easy” or “hard” is, itself, a proper subject for mathematical inquiry.

Richard Easther: Bike Bridges of Auckland


A city can simply happen, its location and form shaped by millions of people choosing where to live and what to build. This is a classic example of emergence or complex phenomena arising from simple ingredients, one of the key ideas science uses to make sense of the world.

Of course, cities don't just happen; they reflect decisions made by a society or, more accurately, by those with power within a society. Where will the roads go? What are the planning rules and regulations? What activities and assets will be taxed? What do the laws say? Where are the schools and hospitals?

Ironically these decisions bring their own, additional layer of emergence. No-one decides that traffic jams make for a better city but the roads in many cities are clogged morning and evening, a consequence of policy choices which see private cars as the primary means of transportation. To my physicist's eyes, efficient transport is a matter of density and energy: cars take up more space per passenger than bikes, buses and trains, so building infrastructure to properly support cars is expensive* and moving people inside cars costs more energy than the alternatives.  

Unpicking the unintended consequences of design decisions is hard; new resources must be threaded through a complex environment. To its credit, Auckland is solving this problem with trains, buses and bicycles, although "mode share" for bikes is still a small slice of the transport pie. However, cycling is perhaps the most energy efficient transport option known to human beings, better even than walking,** and usage of the Northwestern bike path, my regular commute, is growing at an annual rate of 20%. This may even pick up a notch in the coming year, as the Northwestern is connected to the new Waterview Shared Path which opened in the last few days. I went for a ride along the new tracks today and was stunned at how well this set of cycling arteries has been stitched into the suburbs of Waterview and Mount Albert.

Te Whitinga Bridge across SH 20; south of the Waterview Tunnel.

Unlike almost any other form of transport, regular cycling leaves you fitter and healthier than sitting in a car, a bus, or a train. Not only that, Auckland's best cycle paths run through parks and incorporate a series of stunning bridges, so you are likely to arrive at work with a smile on your face.


Another highlight is the Lightpath (the Light Path / Te Ara I Whiti to its designers), which was built on a disused motorway off-ramp, and in the coming years "Skypath" will hopefully add walking and cycling options to the eight lanes of traffic on our Harbour Bridge. Personally, I can't wait.


The local community campaigned for it, both to mitigate the impact of the motorway construction and to deliver more comprehensive transport connectivity, which was the overall goal of the project.

* Ironically, building better roads to beat traffic jams tends to increase the amount of driving people want to do, making the traffic jams come back – what planners call induced demand. 

** Including the embodied energy used to make the cars and bicycles as well as the energy costs of making both food and fuel complicates this simple picture. But for many short journeys cycling is a time and energy efficient way to get around. 

IMAGES: All by me except the Lightpath picture (BikeAuckland) and the Skypath mock-ups (Reset Urban Design)

October 06, 2017

Jordan Ellenberg: Trace test

Jose Rodriguez gave a great seminar here yesterday about his work on the trace test, a numerical way of identifying irreducible components of varieties.  In Jose’s world, you do a lot of work with homotopy; if a variety X intersects a linear subspace V in points p1, p2, .. pk, you can move V a little bit and numerically follow those k points around.  If you move V more than a little bit — say in a nice long path in the Grassmannian that loops around and returns to its starting point — you’ll come back to p1, p2, .. pk, but maybe in a different order.  In this way you can compute the monodromy of those points; if it’s transitive, and if you’re careful about avoiding some kind of discriminant locus, you’ve proven that p1, p2, …, pk are all on the same component of X.

But the trace test is another thing; it’s about short paths, not long paths.  For somebody like me, who never thinks about numerical methods, this means “oh, we should work in the local ring.”  And then it says something interesting!  It comes down to this.  Suppose F(x,y) is a form (not necessarily homogeneous) of degree at most d over a field k.  Hitting it with a linear transformation if need be, we can assume the x^d term is nonzero.  Now think of F as an element of k((y))[x]:  namely

F = x^d + a_1(y) x^{d-1} + \ldots + a_d(y).

Letting K be the algebraic closure of k((y)), we can then factor F as (x-r_1) … (x-r_d).  Each of these roots can be computed explicitly to any desired precision by Hensel’s lemma.  While the r_i may be power series in k((y)) (or in some algebraic extension), the sum of the r_i is -a_1(y), which is a linear function A+by.

Suppose you are wondering whether F factors in k[x,y], and whether, for instance, r_1 and r_2 are the roots of an irreducible factor of F.  For that to be true, r_1 + r_2 must be a linear function of y!  (In Jose’s world, you grab a set of points, you homotopy them around, and observe that if they lie on an irreducible component, their centroid moves linearly as you translate the plane V.)

Anyway, you can falsify this easily; it’s enough for e.g. the quadratic term of r_1 + r_2 to be nonzero.  If you want to prove F is irreducible, you just check that every proper subset of the r_i sums to something nonlinear.
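To see the criterion in action, here is a small sympy sketch on a toy example of my own choosing, where F is built to factor as (x^2 - x - y)(x - 2 - y): expand each root as a series in y, then check which subsets of roots have sums with vanishing quadratic term.

import sympy as sp
from itertools import combinations

x, y = sp.symbols('x y')

# Toy example chosen for illustration: F is secretly (x**2 - x - y) * (x - 2 - y).
F = sp.expand((x**2 - x - y) * (x - 2 - y))

# Exact roots in x, each expanded as a power series in y.
roots = sp.solve(F, x)
series_roots = [sp.series(r, y, 0, 4).removeO() for r in roots]

# A proper subset of the roots can only be the full root set of an irreducible
# factor if its sum is linear in y; a nonzero y**2 coefficient rules it out.
n = len(series_roots)
for size in range(1, n):
    for subset in combinations(range(n), size):
        s = sp.expand(sum(series_roots[i] for i in subset))
        print(subset, "sum =", s, "| y^2 coefficient:", sp.simplify(s.coeff(y, 2)))

Here the two roots of x^2 - x - y sum to exactly 1, so that pair (and the single root 2 + y) passes the linearity test, while every other proper subset fails at the quadratic term.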

  1.  Is this something I already know in another guise?
  2.  Is there a nice way to express the condition (which implies irreducibility) that no proper subset of the r_i sums to something with zero quadratic term?