Every year the Los Angeles Institute for the Humanities holds a luncheon at the Getty jointly with the Getty Research Institute, and the LAIH fellows get to hang out with the Getty Scholars and people on the Getty Visiting Scholars program (Alexa Sekyra, the head of the program, was at the luncheon today, so I got to meet her). The talk is usually given by a curator of an exhibition or program that's either current or coming up. The first time I went, a few years ago, it was the Spring before the launch of Pacific Standard Time, the region-wide celebration of 35 years of Southern California art and art movements ('45-'80) that broke away from letting New York and Western Europe call the tune and began to define some of the distinctive voices of their own that are now so well known worldwide; that time we had a talk from a group of curators about the multi-museum collaboration that made it all happen. One of the things I learned today from Andrew Perchuk, the Deputy Director of the Getty Research Institute, who welcomed us all in a short address, was that there will be a new Pacific Standard Time event coming up in 2018, so stay tuned. This time it will have more of a focus on Latino and Latin American art. See here.
Today we had Nancy Perloff tell us about the current exhibit (for which she is [...]

Sometimes I think I am really lucky to have grown convinced that the Standard Model will not be broken by LHC results. It gives me peace of mind, detachment, and the opportunity to look at every new result found in disagreement with predictions in the right spirit - the "what's wrong with it?" attitude that every physicist should have in his or her genes.

It is Monday afternoon and the day seems to be a productive one, if not yet quite memorable. As I revise some notes on my desk, Beni Yoshida walks into my office to remind me that the high-energy physics seminar is about to start. I hesitate, somewhat apprehensive of the near-certain frustration of being lost during the first few minutes of a talk in an unfamiliar field. I normally avoid such a situation, but in my email I find John’s forecast for an accessible talk by Daniel Harlow and a title with three words I can cling onto. “Quantum error correction” has driven my curiosity for the last seven years. The remaining acronyms in the title will become much more familiar in the four months to come.

Most of you are probably familiar with holograms, these shiny flat films representing a 3D object from essentially any desired angle. I find it quite remarkable how all the information of a 3D object can be printed on an essentially 2D film. True, the colors are not represented as faithfully as in a traditional photograph, but it looks as though we have taken a photograph from every possible angle! The speaker’s main message that day seemed even more provocative than the idea of holography itself. Even if the hologram is broken into pieces, and some of these are lost, we may still use the remaining pieces to recover parts of the 3D image or even the full thing given a sufficiently large portion of the hologram. The 3D object is not only recorded in 2D, it is recorded redundantly!

Halfway through Daniel's exposition, Beni and I exchange a knowing glance. We recognize a familiar pattern from our latest project, one which has gained the moniker of "cleaning lemma" within the quantum information community and which can be thought of as a quantitative analog of reconstructing the 3D image from pieces of the hologram. Daniel makes connections using a language that we are familiar with. Beni and I discuss what we have understood and how to make it more concrete as we stride back through campus. We scribble diagrams on the whiteboard and string words such as tensor, encoder, MERA and negative curvature into our discussion. An image from the web gives us some intuition on the latter. We are onto something. We have a model. It is simple. It is new. It is exciting.

Food has not come our way, so we head to my apartment as we enthusiastically continue our discussion. I can only provide two avocados and some leftover pasta, but that is not important; we are sharing the joy of insight. We arrange a meeting with Daniel to present our progress. By Wednesday, Beni and I introduce the holographic pentagon code at the group meeting. The core of a new project is already there, but we need some help to navigate the high-energy waters. Who better to guide us in such an endeavor than our mentor, John Preskill, who recognized the importance of quantum information in holography as early as 1999 and has repeatedly proven himself a master of both trades?

“I feel that the idea of holography has a strong whiff of entanglement—for we have seen that in a profoundly entangled state the amount of information stored locally in the microscopic degrees of freedom can be far less than we would naively expect. For example, in the case of the quantum error-correcting codes, the encoded information may occupy a small ‘global’ subspace of a much larger Hilbert space. Similarly, the distinct topological phases of a fractional quantum Hall system look alike locally in the bulk, but have distinguishable edge states at the boundary.”

-J. Preskill, 1999

As Beni puts it, the time for using modern quantum information tools in high-energy physics has come. By this he means quantum error correction and maybe tensor networks. First privately, then more openly, we continue to sharpen and shape our project. Through conferences, Skype calls and emails, we further our discussion and progressively refine our ideas. Many speculations mature to conjectures and fall victim to counterexamples. Some stand the test of simulations or are even promoted to theorems by virtue of mathematical proofs.

I publicly present the project for the first time at a select quantum information conference in Australia. Two months later, after a particularly intense writing, revising and editing process, the article is almost complete. As we finalize the text and relabel the figures, Daniel and Beni unveil our work to quantum entanglement experts in Puerto Rico. The talks are a hit and it is time to let all our peers read about it.

You are invited to do so and Beni will even be serving a reader’s guide in an upcoming post.

The lessons we learned from the Ryu-Takayanagi formula, the firewall paradox and the ER=EPR conjecture have convinced us that quantum information theory can become a powerful tool to sharpen our understanding of various problems in high-energy physics. But, many of the concepts utilized so far rely on entanglement entropy and its generalizations, quantities developed by von Neumann more than 60 years ago. We live in the 21st century. Why don't we use more modern concepts, such as the theory of quantum error-correcting codes?

In a recent paper with Daniel Harlow, Fernando Pastawski and John Preskill, we have proposed a toy model of the AdS/CFT correspondence based on quantum error-correcting codes. Fernando has already written about how this research project started after a fateful visit by Daniel to Caltech and John's remarkable prediction in 1999. In this post, I hope to write an introduction which may serve as a reader's guide to our paper, explaining why I'm so fascinated by the beauty of the toy model.

This is certainly a challenging task because I need to make it accessible to everyone while explaining the real physics behind the paper. My personal philosophy is that a toy model must be as simple as possible while capturing the key properties of the system of interest. In this post, I will try to extract some key features of the AdS/CFT correspondence and construct a toy model which captures these features. This post may be a bit technical compared to other recent posts, but anyway, let me give it a try…

**Bulk locality paradox and quantum error-correction**

The AdS/CFT correspondence says that there is some kind of correspondence between quantum gravity on (d+1)-dimensional asymptotically-AdS space and d-dimensional conformal field theory on its boundary. But how are they related?

The AdS-Rindler reconstruction tells us how to “reconstruct” a bulk operator from boundary operators. Consider a bulk operator $\phi$ and a boundary region A on a hyperbolic space (in other words, a negatively-curved plane). On a fixed time-slice, the causal wedge of A is the bulk region enclosed by the geodesic line of A (a curve with minimal length). The AdS-Rindler reconstruction says that $\phi$ can be represented by some integral of local boundary operators supported on A if and only if $\phi$ is contained inside the causal wedge of A. Of course, there are multiple regions A, B, C, … whose causal wedges contain $\phi$, and the reconstruction should work for any such region.

That a bulk operator in the causal wedge can be reconstructed by local boundary operators, however, leads to a rather perplexing paradox in the AdS/CFT correspondence. Consider a bulk operator $\phi$ at the center of a hyperbolic space, and split the boundary into three pieces, A, B, C. Then the geodesic line for the union BC encloses the bulk operator, that is, $\phi$ is contained inside the causal wedge of BC. So, $\phi$ can be represented by local boundary operators supported on BC. But the same argument applies to AB and CA, implying that the bulk operator $\phi$ corresponds to local boundary operators which are supported inside AB, BC and CA simultaneously. It would seem then that the bulk operator must correspond to an identity operator times a complex phase. In fact, similar arguments apply to any bulk operators, and thus, all the bulk operators must correspond to identity operators on the boundary. Then, the AdS/CFT correspondence seems so boring…

Almheiri, Dong and Harlow have recently proposed an intriguing way of reconciling this paradox with the AdS/CFT correspondence. They proposed that *the AdS/CFT correspondence can be viewed as a quantum error-correcting code*. Their idea is as follows. Instead of corresponding to a single boundary operator, $\phi$ may correspond to different operators in different regions, say $\phi_{AB}$, $\phi_{BC}$, $\phi_{CA}$, living in AB, BC, CA respectively. Even though $\phi_{AB}$, $\phi_{BC}$, $\phi_{CA}$ are different boundary operators, they may be equivalent inside a certain low-energy subspace on the boundary.

This situation resembles the so-called quantum secret-sharing code. The quantum information at the center of the bulk cannot be accessed from any single party A, B or C because $\phi$ does not have a representation on A, B, or C alone. It can be accessed only if multiple parties cooperate and perform joint measurements. It seems that a quantum secret is shared among three parties, and the AdS/CFT correspondence somehow realizes the three-party quantum secret-sharing code!
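The quantum code shares its secret with genuinely quantum information, but the flavor of the scheme is captured by its classical cousin, a (2,3) Shamir secret-sharing scheme: any two of the three parties can recover the secret, while any single share reveals nothing. A minimal sketch (the prime modulus and share points are arbitrary choices for the example; requires Python 3.8+ for modular inverses via `pow`):

```python
import random

P = 2**61 - 1  # a prime modulus, chosen arbitrarily

def share(secret, n=3):
    """Split a secret into n points on a random line f(x) = secret + c*x mod P."""
    c = random.randrange(P)  # random slope, discarded after sharing
    return [(x, (secret + c * x) % P) for x in range(1, n + 1)]

def reconstruct(two_shares):
    """Recover f(0) = secret from any two shares by Lagrange interpolation."""
    (x1, y1), (x2, y2) = two_shares
    inv = pow(x2 - x1, -1, P)
    return (y1 * x2 - y2 * x1) * inv % P

shares = share(42)
assert reconstruct(shares[:2]) == 42           # parties 1 and 2 cooperate
assert reconstruct([shares[0], shares[2]]) == 42  # so can parties 1 and 3
```

A single share is a point on a line with a uniformly random slope, so it gives no information about the intercept; that is the classical analogue of no single boundary region having a representation of the bulk operator.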

**Entanglement wedge reconstruction?**

Recently, causal wedge reconstruction has been further generalized to the notion of entanglement wedge reconstruction. Imagine we split the boundary into four pieces A, B, C, D such that A, C are larger than B, D. Then the geodesic lines for A and C do not form the geodesic line for the *union* of A and C because we can draw shorter arcs by connecting endpoints of A and C, which form the global geodesic line. The entanglement wedge of AC is the bulk region enclosed by this global geodesic line of AC. And entanglement wedge reconstruction predicts that $\phi$ can be represented as an integral of local boundary operators on AC if and only if $\phi$ is inside the entanglement wedge of AC [1].

**Building a minimal toy model; the five-qubit code**

Okay, now let’s try to construct a toy model which admits causal and entanglement wedge reconstructions of bulk operators. Because I want a simple toy model, I take a rather bold assumption that *the bulk consists of a single qubit while the boundary consists of five qubits, denoted by A, B, C, D, E*.

What does causal wedge reconstruction teach us in this minimal setup of five and one qubits? First, we split the boundary system into two pieces, ABC and DE, and observe that the bulk operator $\phi$ is contained inside the causal wedge of ABC. From the rotational symmetries, we know that the bulk operator must have representations on ABC, BCD, CDE, DEA, EAB. Next, we split the boundary system into four pieces, AB, C, D and E, and observe that the bulk operator is contained inside the entanglement wedge of the (disconnected) union of AB and D. So, the bulk operator must have representations on ABD, BCE, CDA, DEB, EAC. In summary, we have the following:

- The bulk operator $\phi$ has a representation on a boundary region R if and only if R contains three or more qubits.

This is the property I want my toy model to possess.

What kinds of physical systems have such a property? Luckily, we quantum information theorists know the answer: the five-qubit code. The five-qubit code, proposed here and here, encodes one logical qubit into five-qubit entangled states and corrects any single-qubit error. We can view the five-qubit code as a quantum encoding isometry from one-qubit states to five-qubit states:

$\alpha|0\rangle + \beta|1\rangle \ \rightarrow \ \alpha|\tilde{0}\rangle + \beta|\tilde{1}\rangle,$

where $|\tilde{0}\rangle$ and $|\tilde{1}\rangle$ are the basis states for a logical qubit. In quantum coding theory, logical Pauli operators $\tilde{X}$ and $\tilde{Z}$ are Pauli operators which act like Pauli X (bit flip) and Z (phase flip) on the logical qubit spanned by $|\tilde{0}\rangle$ and $|\tilde{1}\rangle$. In the five-qubit code, for any set R of qubits with volume 3, some representations of the logical Pauli X and Z operators, $\tilde{X}_R$ and $\tilde{Z}_R$, can be found supported on R. While $\tilde{X}_R$ and $\tilde{Z}_R$ are different operators for different R, they act in exactly the same manner on the codeword subspace spanned by $|\tilde{0}\rangle$ and $|\tilde{1}\rangle$. This is exactly the property I was looking for.
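For readers who want to poke at the five-qubit code concretely: in one standard stabilizer presentation, the code is generated by the Pauli string XZZXI and its cyclic shifts, with logical operators XXXXX and ZZZZZ. A few lines of Python can verify the commutation relations these claims require, using the standard binary symplectic test (two Pauli strings commute iff they anticommute on an even number of sites):

```python
def to_bits(pauli):
    """Binary symplectic representation: X-part and Z-part bit vectors."""
    x = [1 if p in "XY" else 0 for p in pauli]
    z = [1 if p in "ZY" else 0 for p in pauli]
    return x, z

def commute(p, q):
    """Pauli strings commute iff x1.z2 + z1.x2 is even."""
    (x1, z1), (x2, z2) = to_bits(p), to_bits(q)
    return sum(a * b for a, b in zip(x1, z2)) % 2 == \
           sum(a * b for a, b in zip(z1, x2)) % 2

stabilizers = ["XZZXI", "IXZZX", "XIXZZ", "ZXIXZ"]  # cyclic shifts of XZZXI
logical_X, logical_Z = "XXXXX", "ZZZZZ"

# the stabilizer generators commute with each other...
assert all(commute(s, t) for s in stabilizers for t in stabilizers)
# ...and with both logical operators,
assert all(commute(s, L) for s in stabilizers for L in (logical_X, logical_Z))
# while logical X and Z anticommute, as a qubit's Pauli operators must
assert not commute(logical_X, logical_Z)
```

Multiplying the logical operators by stabilizer elements produces the alternative representations $\tilde{X}_R$, $\tilde{Z}_R$ mentioned in the text, since stabilizers act trivially on the codeword subspace.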

**Holographic quantum error-correcting codes**

We just found possibly the smallest toy model of the AdS/CFT correspondence, the five-qubit code! The remaining task is to construct a larger model. For this goal, we view the encoding isometry of the five-qubit code as a six-leg tensor. The holographic quantum code is a network of such six-leg tensors covering a hyperbolic space, where each tensor has one open leg. These open legs in the bulk are interpreted as logical input legs of a quantum error-correcting code, while open legs on the boundary are identified as outputs where the quantum information is encoded. Then the entire tensor network can be viewed as an encoding isometry.

The six-leg tensor has some nice properties. Imagine we inject some Pauli operator into one of six legs in the tensor. Then, for any given choice of three legs, there always exists a Pauli operator acting on them which counteracts the effect of the injection. An example is shown below:

In other words, if an operator is injected into one tensor leg, one can “push” it into three other tensor legs.

Finally, let’s demonstrate causal wedge reconstruction of bulk logical operators. Pick an arbitrary open tensor leg in the bulk and inject some Pauli operator into it. We can “push” it into three tensor legs, which are then injected into neighboring tensors. By repeatedly pushing operators to the boundary in the network, we eventually have some representation of the operator living on a piece of boundary region A. And the bulk operator is contained inside the causal wedge of A. (Here, the length of the curve can be defined as the number of tensor legs cut by the curve). You can also push operators into the boundary by choosing different tensor legs which lead to different representations of a logical operator. You can even have a rather exotic representation which is supported non-locally over two disjoint pieces of the boundary, realizing entanglement wedge reconstruction.

**What’s next?**

This post is already pretty long and I need to wrap it up…

Shor’s quantum factoring algorithm is a revolutionary invention which opened a whole new research avenue of quantum information science. It is often forgotten, but the first quantum error-correcting code is another important invention by Peter Shor (and independently by Andrew Steane), which enabled a proof that quantum computation can be performed fault-tolerantly. The theory of quantum error-correcting codes has found interesting applications in studies of condensed matter physics, such as topological phases of matter. Perhaps, then, quantum coding theory will also find applications in high energy physics.

Indeed, many interesting open problems are awaiting us. Is entanglement wedge reconstruction a generic feature of tensor networks? How do we describe black holes by quantum error-correcting codes? Can we build a fast scrambler by tensor networks? Is entanglement a wormhole (or maybe a perfect tensor)? Can we resolve the firewall paradox by holographic quantum codes? Can the physics of quantum gravity be described by tensor networks? Or can the theory of quantum gravity provide us with novel constructions of quantum codes?

I feel that now is the time for quantum information scientists to jump into the research of black holes. We don’t know if we will be burned by a firewall or not … , but it is worth trying.

That’s the title of the talk I gave yesterday at Vanderbilt, and here are the slides:

The central idea is the same as in past versions of the talk– stealing Robert Krulwich’s joke contrasting the publication styles of Newton and Galileo to argue that scientists spend too much time writing technical articles aimed at an audience of other experts, and need to do more “Galilean” publication aimed at a broad audience. And that social media technologies offer powerful tools that can enable those who are interested to do this kind of communication with relatively little effort.

This version of the talk is a little more image-based than older versions, reflecting a general shift in the way I give talks these days, which might make it less comprehensible from just the slides than older versions. But then, that’s just more reason to invite me to give it live and in person at *your* place of work…

Recent years have seen a significant increase in the overall accuracy of lattice QCD calculations of various hadronic observables. Results for quark and hadron masses, decay constants, form factors, the strong coupling constant and many other quantities are becoming increasingly important for testing the validity of the Standard Model. Prominent examples include calculations of Standard Model parameters, such as quark masses and the strong coupling constant, as well as the determination of CKM matrix elements, which is based on a variety of input quantities from experiment and theory. In order to make lattice QCD calculations more accessible to the entire particle physics community, several initiatives and working groups have sprung up, which collect the available lattice results and produce global averages.

The scientific programme "Fundamental Parameters from Lattice QCD" at the Mainz Institute of Theoretical Physics (MITP) is designed to bring together lattice practitioners with members of the phenomenological and experimental communities who are using lattice estimates as input for phenomenological studies. In addition to sharing the expertise among several communities, the aim of the programme is to identify key quantities which allow for tests of the CKM paradigm with greater accuracy and to discuss the procedures in order to arrive at more reliable global estimates.

The deadline for registration is **Tuesday, 31 March 2015**.

API is an abbreviation that stands for “Application Program Interface.” Roughly speaking, an API is a specification of a software component in terms of the operations one can perform with that component. For example, a common kind of API is the set of methods supported by an encapsulated bit of code, a.k.a. a library (for example, a library could have the purpose of “drawing pretty stuff on the screen”; the API is then the set of commands like “draw a rectangle”, along with a specification of how you pass parameters to this method, how rectangles overlay on each other, etc.). Importantly, the API is supposed to specify how the library functions, but does this in a way that is independent of the inner workings of the library (though this wall is often broken in practice). Another common API is found when a service exposes remote calls that can be made to manipulate and perform operations on that service. For example, Twitter supports an API for reading and writing Twitter data. This latter example, of a service exposing a set of calls that can manipulate the data stored on a remote server, is particularly powerful, because it allows one to gain access to data through simple access to a communication network. (As an interesting aside, see this rant for why APIs are likely key to some of Amazon’s success.)
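As a toy illustration of the library-API idea, here is a miniature “drawing” library; the class and method names are invented for the example, not from any real graphics library. Callers see only the public methods, not how shapes are stored internally:

```python
class Canvas:
    """A tiny 'library' whose API is its public methods; the internal
    representation of shapes is a hidden implementation detail."""

    def __init__(self):
        self._shapes = []  # private: callers shouldn't rely on this

    def draw_rectangle(self, x, y, width, height):
        """The API specifies the parameters; not how drawing happens inside."""
        self._shapes.append(("rect", x, y, width, height))

    def shape_count(self):
        """Part of the API: expose *what* callers may ask, not internals."""
        return len(self._shapes)

canvas = Canvas()
canvas.draw_rectangle(0, 0, 4, 3)
canvas.draw_rectangle(1, 1, 2, 2)
assert canvas.shape_count() == 2
```

Swapping the internal list for any other data structure would leave every caller unaffected, which is exactly the point of specifying the component by its operations.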

As you might guess (see, for example, my latest flop, Should Papers Have Unit Tests?), I like smooshing together disparate concepts and seeing what comes out the other side. Thinking about APIs led me to consider the question: “What if papers had APIs?”

In normal settings, academic papers are considered to be relatively static objects. Sure, papers on the arXiv, for example, have versions (some more than others!), and there are efforts like Living Reviews in Relativity, where review articles are updated by the authors. But in general papers exist as fixed “complete” works. In programming terms we would say that they are “immutable”. So if we consider the question of exposing an API for papers, one might think that this would just be a read-only API. And indeed this form of API exists for many journals, and also for the arXiv. These forms of “paper APIs” allow one to read information, mostly metadata, about a paper.

But what about a paper API that allows mutation? At first glance this heresy is rather disturbing: allowing calls from outside of a paper to change the content of the paper seems dangerous. It also isn’t clear what benefit could come from this. With, I think, one exception. Citations are the currency of academia (last I checked they were still, however, not fungible with bitcoins). But citations really only go in one direction (with exceptions for simultaneous works): you cite a paper whose work you build upon (or whose work you demonstrate is wrong, etc.). What if a paper exposed a reverse citation index? That is, if I put my paper on the arXiv, then, when you write your paper showing how my paper is horribly wrong, you can make a call to my paper’s API that mutates my paper and adds links to your paper. Of course, this seems crazy: what is to stop rampant citation spamming, especially by *ahem* cranks? Here it seems that one could implement a simple approval system for the receiving paper. If this were done on some common system, then you could expose the mutated paper either A) with approved mutations or B) with unapproved mutations (or one could go ‘social’ on this problem and allow voting on the changes).
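Here is a toy sketch of what such a mutable reverse-citation API might look like. Everything in it is hypothetical; no journal or preprint server exposes these calls, and the identifiers are made up. The point is just to make the approval-queue idea concrete:

```python
class PaperAPI:
    """Hypothetical mutable API for a single paper: other papers may submit
    reverse citations, which sit in a pending queue until approved."""

    def __init__(self, paper_id):
        self.paper_id = paper_id
        self.approved = []  # reverse citations the author has accepted
        self.pending = []   # submitted but not yet approved

    def submit_reverse_citation(self, citing_id, location=None):
        """Another paper calls this: 'I cite you (at this specific location).'"""
        self.pending.append({"id": citing_id, "location": location})

    def approve(self, citing_id):
        """The receiving author approves a pending reverse citation."""
        for entry in list(self.pending):
            if entry["id"] == citing_id:
                self.pending.remove(entry)
                self.approved.append(entry)

    def view(self, include_unapproved=False):
        """Option A: only approved mutations; option B: include unapproved."""
        return self.approved + (self.pending if include_unapproved else [])

paper = PaperAPI("paper-A")
paper.submit_reverse_citation("paper-B", location="Sec. 3, second theorem")
assert paper.view() == []                        # nothing approved yet
assert len(paper.view(include_unapproved=True)) == 1
paper.approve("paper-B")
assert len(paper.view()) == 1                    # now publicly visible
```

The `location` field is what would make these reverse citations more precise than a plain cited-by list: the citing paper points at a specific theorem or section rather than the paper as a whole.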

What benefit would such a system confer? In some ways it would make more accessible something that we all use: the “cited by” index of services like Google Scholar. One difference is that it could be more precise in the reverse citation: for example, while Scholar provides a list of relevant papers, if the API exposed the ability to add links to specific locations in a paper, one could arguably get better reverse citations (because, frankly, the weakness of cited-by indices is their lack of specificity).

What else might a paper API expose? I’m not convinced this isn’t an interesting question to ponder. Thanks for reading another wacko mashup episode of the Quantum Pontiff!

Talks from QIP 2015 are now available on this YouTube channel. Great to see! I’m still amazed by the wondrous technology that allows me to watch talks given on the other side of the world, at my own leisure, on such wonderful quantum esoterica.

*guest post by Marc Harper*

A while back, in the article Relative entropy minimization in evolutionary dynamics, we looked at extensions of the information geometry / evolutionary game theory story to more general time-scales, incentives, and geometries. Today we’ll see how to make this all work in finite populations!

Let’s recall the basic idea from last time, which John also described in his information geometry series. The main theorem is this: when there’s an evolutionarily stable state for a given fitness landscape, the relative entropy between the stable state and the population distribution decreases along the population trajectories as they converge to the stable state. In short, relative entropy is a Lyapunov function. This is a nice way to look at the action of a population under natural selection, and it has interesting analogies to Bayesian inference.

The replicator equation is a nice model from an intuitive viewpoint, and it’s mathematically elegant. But it has some drawbacks when it comes to modeling real populations. One major issue is that the replicator equation implicitly assumes that the population proportions of each type are differentiable functions of time, obeying a differential equation. This only makes sense in the limit of large populations. Other closely related models, such as the Lotka-Volterra model, focus on the number of individuals of each type (e.g. predators and prey) instead of the proportion. But they often assume that the number of individuals is a differentiable function of time, and a population of 3.5 isn’t very realistic either.

Real populations of replicating entities are not infinitely large; in fact they are often relatively small and of course have whole numbers of each type, at least for large biological replicators (like animals). They take up space and only so many can interact meaningfully. There are quite a few models of evolution that handle finite populations and some predate the replicator equation. Models with more realistic assumptions typically have to leave the realm of derivatives and differential equations behind, which means that the analysis of such models is more difficult, but the behaviors of the models are often much more interesting. Hopefully by the end of this post, you’ll see how all of these diagrams fit together:

One of the best-known finite population models is the Moran process, which is a Markov chain on a finite population. This is the quintessential birth-death process. For a moment, consider a population of just two types, $A$ and $B$. The state of the population is given by a pair of nonnegative integers $(a, b)$, with $N = a + b$ the total number of replicators in the population, and $a$ and $b$ the number of individuals of types $A$ and $B$ respectively. Though it may seem artificial to fix the population size $N$, this often turns out not to be that big of a deal, and you can assume the population is at its carrying capacity to make the assumption realistic. (Lots of people study populations that can change size and that have replicators spatially distributed, say on a graph, but we’ll assume they can all interact with each other whenever they want for now.)

A Markov model works by transitioning from state to state in each round of the process, so we need to define the transition probabilities to complete the model. Let’s put a fitness landscape on the population, given by two functions $f_a$ and $f_b$ of the population state $(a, b)$. Now we choose an individual to reproduce proportionally to fitness, e.g. we choose an individual of type $A$ to reproduce with probability

$\displaystyle \frac{a f_a(a,b)}{a f_a(a,b) + b f_b(a,b)},$

since there are $a$ individuals of type $A$ and they each have fitness $f_a(a,b)$. This is analogous to the ratio of fitness to mean fitness from the discrete replicator equation, since

$\displaystyle \frac{a f_a}{a f_a + b f_b} = \frac{(a/N)\, f_a}{(a/N)\, f_a + (b/N)\, f_b},$

where the denominator on the right is the mean fitness of the population,

and the discrete replicator equation is typically similar to the continuous replicator equation (this can be made precise), so the Moran process captures the idea of natural selection in a similar way. Actually there is a way to recover the replicator equation from the Moran process in large populations—details at the end!
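The birth step is easy to write down in code. A minimal sketch (the constant fitnesses in the usage example are arbitrary choices for illustration):

```python
def birth_probability_A(a, b, f_a, f_b):
    """Probability that a type-A individual is chosen to reproduce,
    proportional to (count) x (fitness), as in the Moran process."""
    total = a * f_a(a, b) + b * f_b(a, b)
    return a * f_a(a, b) / total

# e.g. constant fitnesses f_a = 2, f_b = 1 in a population (a, b) = (3, 7):
p = birth_probability_A(3, 7, lambda a, b: 2, lambda a, b: 1)
assert abs(p - 6 / 13) < 1e-12  # 3*2 / (3*2 + 7*1)
```

The fitness functions are passed in as callables so the same code works for frequency-dependent landscapes, where $f_a$ and $f_b$ depend on the state $(a, b)$.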

We’ll assume that the fitnesses are nonnegative and that the total fitness (the denominator) is never zero; if that seems artificial, some people prefer to transform the fitness landscape by $f \mapsto e^{\beta f}$, which gives a ratio reminiscent of the Boltzmann or Fermi distribution from statistical physics, with the parameter $\beta$ playing the role of **intensity of selection** rather than inverse temperature. This is sometimes called **Fermi selection**.

That takes care of the birth part. The death part is easier: we just choose an individual at random (uniformly) to be replaced. Now we can form the transition probabilities of moving between population states. For instance, the probability of moving from state $(a, b)$ to $(a+1, b-1)$ is given by the product of the birth and death probabilities, since they are independent:

$\displaystyle T_{(a,b) \to (a+1,\,b-1)} = \frac{a f_a(a,b)}{a f_a(a,b) + b f_b(a,b)} \cdot \frac{b}{N},$

since we have to choose a replicator of type $A$ to reproduce and one of type $B$ to be replaced. Similarly for $(a, b)$ to $(a-1, b+1)$ (switch all the a’s and b’s), and we can write the probability of staying in the state as

$\displaystyle T_{(a,b) \to (a,b)} = 1 - T_{(a,b) \to (a+1,\,b-1)} - T_{(a,b) \to (a-1,\,b+1)}.$

Since we only replace one individual at a time, this covers all the possible transitions, and keeps the population constant.

We’d like to analyze this model, and many people have come up with clever ways to do so, computing quantities like **fixation probabilities** (also known as **absorption probabilities**), indicating the chance that the population will end up with one type completely dominating, i.e. in state $(N, 0)$ or $(0, N)$. If we assume that the fitness of type $B$ is constant and simply equal to 1, and the fitness of type $A$ is $r$, we can calculate the probability $\rho$ that a single mutant of type $A$ will take over a population of type $B$ using standard Markov chain methods:

$\displaystyle \rho = \frac{1 - r^{-1}}{1 - r^{-N}}.$

For neutral relative fitness ($r = 1$), $\rho = 1/N$, which is the probability that a neutral mutant invades by drift alone, since selection is neutral. Since the two boundary states $(N, 0)$ and $(0, N)$ are absorbing (no transitions out), in the long run *every* population ends up in one of these two states, i.e. the population is homogeneous. (This is the formulation referred to by Matteo Smerlak in The mathematical origins of irreversibility.)
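The standard fixation-probability formula for a single mutant of constant relative fitness r in a population of size N, rho = (1 - 1/r) / (1 - r^(-N)), together with its neutral limit rho = 1/N, is easy to sanity-check numerically:

```python
def fixation_probability(r, N):
    """Chance a single type-A mutant (relative fitness r) takes over
    a population of N-1 type-B individuals in the Moran process."""
    if r == 1:
        return 1 / N  # neutral drift: every individual is equally likely to fix
    return (1 - 1 / r) / (1 - r ** (-N))

N = 100
assert abs(fixation_probability(1, N) - 1 / N) < 1e-12
# an advantageous mutant (r > 1) fixes with probability approaching 1 - 1/r
assert abs(fixation_probability(2, N) - 0.5) < 1e-6
# the formula approaches the neutral value continuously as r -> 1
assert abs(fixation_probability(1.0001, N) - 1 / N) < 1e-4
```

Note that even a strongly advantageous mutant usually fails to fix: with r = 2 it is eliminated by drift about half the time.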

That’s a bit different flavor of result than what we discussed previously, since before we had stable states where both types were present, and now that’s impossible, which is a bit disappointing. We need to make the population model a bit more complex to get more interesting behaviors, and we can do this in a very nice way by adding the effects of mutation. At the time of reproduction, we’ll allow either type to mutate into the other with probability $\mu$. This changes the transition probabilities to something like

$\displaystyle T_{(a,b) \to (a+1,\,b-1)} = \frac{a f_a (1-\mu) + b f_b\, \mu}{a f_a + b f_b} \cdot \frac{b}{N}.$

Now the process never stops wiggling around, but it does have something known as a **stationary distribution**, which gives the probability that the population is in any given state *in the long run*.

For populations with more than two types the basic ideas are the same, but there are more neighboring states that the population could move to, and many more states in the Markov process. One can also use more complicated mutation matrices, but this setup is typically enough to guarantee that no one species completely takes over. For interesting behaviors, a mutation rate on the order of 1/N is typically a good choice (there’s some biological evidence that mutation rates are inversely proportional to genome size).

Without mutation, once the population reached a homogeneous state it stayed there. Now the population bounces between states, driven by drift, selection, and mutation. Based on our stability theorems for evolutionarily stable states, it’s reasonable to hope that for small mutation rates and larger populations (less drift), the population should spend most of its time near the evolutionarily stable state. This can be measured by the stationary distribution, which gives the long-run probability of the process being in any given state.
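A quick way to see this numerically (everything here, the landscape r, the mutation rate, and the population size, is an illustrative assumption): build the transition matrix of the mutation-augmented process and extract its stationary distribution as the left eigenvector for eigenvalue 1.

```python
import numpy as np

def stationary_distribution(N, r, mu):
    """Two-type Moran process with mutation: type a has fitness r, type b
    has fitness 1; a reproducing parent mutates to the other type w.p. mu."""
    T = np.zeros((N + 1, N + 1))
    for a in range(N + 1):
        b = N - a
        total = a * r + b * 1.0
        # probability the new individual is of type a (birth plus mutation)
        birth_a = (a * r * (1 - mu) + b * 1.0 * mu) / total
        up = birth_a * b / N              # a -> a + 1
        down = (1 - birth_a) * a / N      # a -> a - 1
        if a < N: T[a, a + 1] = up
        if a > 0: T[a, a - 1] = down
        T[a, a] = 1 - up - down
    vals, vecs = np.linalg.eig(T.T)       # left eigenvectors of T
    pi = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
    return pi / pi.sum()

pi = stationary_distribution(N=40, r=1.0, mu=0.025)   # mu ~ 1/N
```

For the neutral landscape r = 1 the distribution comes out symmetric about N/2, and shrinking the mutation rate pushes the mass out toward the homogeneous boundary states.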

Previous work by Claussen and Traulsen:

• Jens Christian Claussen and Arne Traulsen, Non-Gaussian fluctuations arising from finite populations: exact results for the evolutionary Moran process, *Physical Review E* **71** (2005), 025101.

suggested that the stationary distribution is at least sometimes maximal around evolutionarily stable states. Specifically, they showed that for a very similar model with fitness landscape given by

the stationary state is essentially a binomial distribution centered at the evolutionarily stable state.

Unfortunately, the stationary distribution can be very difficult to compute for an arbitrary Markov chain. While it can be computed for the Markov process described above without mutation, and in the case studied by Claussen and Traulsen, there’s no general analytic formula for the process with mutation, nor for more than two types, because the processes are not *reversible*. Since we can’t compute the stationary distribution analytically, we’ll have to find another way to show that the local maxima of the stationary distribution are “evolutionarily stable”. We can approximate the stationary distribution fairly easily with a computer, so it’s easy to plot the results for just about any landscape and reasonable population size.

It turns out that we can use a relative entropy minimization approach, just like for the continuous replicator equation! But how? We lack some essential ingredients such as deterministic and differentiable trajectories. Here’s what we do:

• We show that the local maxima and minima of the stationary distribution satisfy a *complex balance* criterion.

• We then show that these states minimize an *expected* relative entropy.

• This will mean that the current state and the *expected next state* are ‘close’.

• Lastly, we show that these states satisfy an analogous definition of evolutionary stability (now incorporating mutation).

The relative entropy allows us to measure how close the current state is to the expected next state, which captures the idea of stability in another way. This ports the relative entropy minimization Lyapunov result to some more realistic Markov chain models. The only downside is that we’ll assume the populations are “sufficiently large”, though in practice, for populations of three types, modest sizes are typically enough for common fitness landscapes (there are lots of examples here, with the larger populations prettier than the smaller ones). The reason is that the population state needs enough “resolution” to get sufficiently close to the stable state, which is not necessarily a ratio of integers. If you allow some wiggle room, smaller populations are still typically pretty close.

Evolutionarily stable states are closely related to Nash equilibria, which have a nice intuitive description in traditional game theory as “states that no player has an incentive to deviate from”. But in evolutionary game theory we don’t use a game matrix to compute, say, maximum-payoff strategies; rather, the game matrix defines a fitness landscape, which then determines how natural selection unfolds.

We’re going to see this idea again in a moment, and to help get there, let’s introduce a function called an **incentive** that encodes how a fitness landscape is used for selection. One way is to simply replace the fitness-weighted quantities in the fitness-proportionate selection ratio above, which now becomes (for two population types):

Here the incentive function components determine how the fitness landscape is used for natural selection (if at all). We have seen two examples above:

for the Moran process and fitness-proportionate selection, and

for an alternative that incorporates a strength-of-selection term, preventing division by zero for fitness landscapes defined by zero-sum game matrices, such as a rock-paper-scissors game. Using an incentive function also simplifies the transition probabilities and results as we move to populations of more than two types. Introducing mutation, we can describe the ratio for incentive-proportionate selection with mutation for the ith population type, when the population is in a given state, as

for some matrix of mutation probabilities. This is just the probability that we get a new individual of the ith type (by birth and/or mutation). A common choice for the mutation matrix is to use a single mutation probability and spread it out over all the types, such as letting

and

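For n types this common choice looks like the following sketch (the uniform spreading of a single rate mu over the other n − 1 types is the convention just described; the example values of n and mu are arbitrary):

```python
import numpy as np

def mutation_matrix(n, mu):
    """M[i][j]: probability a type-i parent produces a type-j offspring.
    Diagonal: 1 - mu (faithful copy); off-diagonal: mu split evenly."""
    M = np.full((n, n), mu / (n - 1))
    np.fill_diagonal(M, 1.0 - mu)
    return M

M = mutation_matrix(3, 0.03)
# Every row sums to 1: the offspring always has *some* type.
```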
Now we are ready to define the **expected next state** for the population and see how it captures a notion of stability. For a given state of a multitype population, consider all the neighboring states that the population could move to in one step of the process (one birth-death cycle). These neighboring states are the result of increasing one population type by one (birth) and decreasing another by one (death, possibly of the same type), of course excluding cases on the boundary where the number of individuals of any type would drop below zero or rise above the population size. Now we can define the expected next state as the sum of the neighboring states weighted by the transition probabilities

with transition probabilities given by

for states that differ from the current one by +1 in the ith coordinate and −1 in the jth coordinate. The death factor is just the probability of the random death of an individual of the jth type, so the transition probabilities are still just birth (with mutation) and death, as for the Moran process we started with.

Skipping some straightforward algebraic manipulations, we can show that

Then it’s easy to see that the expected next state equals the current state if and only if the current state is a fixed point of the incentive, so we have a nice description of ‘stability’ in terms of fixed points of the expected next state function and the incentive function
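As a sanity check, here is a sketch of the expected next state in code (the incentive phi, the mutation matrix, the neutral landscape, and the population size are all illustrative assumptions; the one-step identity E[x'] = x + (p − x)/N follows from uniform random death):

```python
import numpy as np

def expected_next_state(x, phi, M, N):
    """x: normalized population state; phi: incentive; M: mutation matrix.

    With uniform random death, E[change in count of type i] = p_i - x_i,
    where p_i is the probability the newborn has type i, so the expected
    next (normalized) state is x + (p - x) / N.
    """
    inc = phi(x)
    p = M.T @ inc / inc.sum()       # birth-with-mutation probabilities
    return x + (p - x) / N

mu = 0.01
M = np.full((3, 3), mu / 2)
np.fill_diagonal(M, 1 - mu)
phi = lambda x: x                   # neutral fitness-proportionate incentive
x = np.array([1.0, 1.0, 1.0]) / 3
# At the barycenter with a neutral landscape, the state is a fixed point:
# expected_next_state(x, phi, M, 60) equals x.
```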

and we’ve gotten back to “no one has an incentive to deviate”. More precisely, for the Moran process

and we get back the equal-fitness condition for every type. So we take this fixed-point condition as our analogue of an evolutionarily stable state, though it’s just the ‘no motion’ part and not also the ‘stable’ part. That’s what we need the stationary distribution for!

To turn this into a useful number that measures stability, we use the relative entropy of the expected next state and the current state, in analogy with the Lyapunov theorem for the replicator equation. The relative entropy

has the really nice property of being zero if and only if its two arguments are equal, so we can use the relative entropy as a measure of *how close to stable* any particular state is! Here the expected next state takes the place of the ‘evolutionarily stable state’ in the result described last time for the replicator equation.
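As a sketch, the measure in question is just the Kullback–Leibler divergence of the expected next state from the current state (the example states below are illustrative):

```python
import numpy as np

def relative_entropy(p, q):
    """D(p || q) = sum_i p_i * log(p_i / q_i).

    Zero if and only if p == q (for strictly positive distributions),
    so D(E[x'] || x) measures how far the state x is from being a
    fixed point of the expected-next-state map.
    """
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(np.sum(p * np.log(p / q)))

x = np.array([0.5, 0.3, 0.2])
relative_entropy(x, x)               # a fixed point scores exactly zero
```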

Finally, we need to show that the maxima (and minima) of the stationary distribution are these fixed points, by showing that these states minimize the expected relative entropy.

Seeing that local maxima and minima of the stationary distribution minimize the expected relative entropy is more involved, so let’s just sketch the details. In general, these Markov processes are not reversible, so they don’t satisfy the detailed-balance condition, but the stationary probabilities do satisfy something called the global balance condition, which says that for the stationary distribution we have

When the stationary distribution is at a local maximum (or minimum), we can show that this essentially implies (up to an error term that vanishes for a large enough population) that

a sort of probability inflow-outflow equation, which is very similar to the condition of complex balanced equilibrium described by Manoj Gopalkrishnan in this Azimuth post. With some algebraic manipulation, we can show that these states have

Now let’s look again at the figures from the start. The first shows the vector field of the replicator equation:

You can see rest points at the center, on the center of each boundary edge, and on the corner points. The center point is evolutionarily stable, the center points of the boundary are semi-stable (but stable when the population is restricted to a boundary simplex), and the corner points are unstable.

This one shows the stationary distribution for a finite population model with a Fermi incentive on the same landscape, for a population of size 80:

A fixed population size gives a partitioning of the simplex, and each triangle of the partition is colored by the value of the stationary distribution. So you can see that there are local maxima in the center and on the centers of the triangle boundary edges. In this case, the size of the mutation probability determines how much of the stationary distribution is concentrated on the center of the simplex.

This shows one-half of the Euclidean distance squared between the current state and the expected next state:

And finally, this shows the same thing but with the relative entropy as the ‘distance function’:

As you can see, the Euclidean distance is locally minimal at each of the local maxima and minima of the stationary distribution (including the corners); the relative entropy is only guaranteed to be so on the interior states (because the relative entropy doesn’t play nicely with the boundary, and unlike the replicator equation, the Markov process can jump on and off the boundary). It turns out that the relative Rényi entropies for parameters between 0 and 1 also work just fine, but in the large-population limit (the replicator dynamic), the relative entropy is somehow the right choice for the replicator equation (it has the derivative that easily gives Lyapunov stability), which is due to the connections between relative entropy and Fisher information in the information geometry of the simplex. The Euclidean distance and the ordinary relative entropy are the two extreme cases of this family of Rényi entropies.

As it turns out, something very similar holds for another popular finite population model, the Wright–Fisher process! This model is more complicated, so if you are interested in the details, check out our paper, which has many nice examples and figures. We also define a process that bridges the gap between the atomic nature of the Moran process and the generational nature of the Wright–Fisher process, and prove the general result for that model.

Finally, let’s see how the Moran process relates back to the replicator equation (see also the appendix in this paper), and how we recover the stability theory of the replicator equation. We can use the transition probabilities of the Moran process to define a stochastic differential equation (called a Langevin equation) with drift and diffusion terms that are essentially (for populations with two types):

As the population size gets larger, the diffusion term drops out, and the stochastic differential equation becomes essentially the replicator equation. The variance of the stationary distribution (e.g. for the binomial example above) also has an inverse dependence on the population size, so the distribution limits to a delta function that vanishes everywhere except at the evolutionarily stable state!
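In symbols, a Langevin equation of the kind being described takes the standard form below (a sketch following the usual finite-size expansion of birth-death processes; the notation $T^{\pm}$ for the up and down transition probabilities is an assumption of mine, not the post's):

```latex
\mathrm{d}x \;=\; \bigl(T^{+}(x) - T^{-}(x)\bigr)\,\mathrm{d}t
\;+\; \sqrt{\frac{T^{+}(x) + T^{-}(x)}{N}}\;\mathrm{d}W
```

with $W$ a Wiener process; the $1/\sqrt{N}$ scaling of the noise is exactly what makes the diffusion term drop out in the large-population limit.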

What about the relative entropy? Loosely speaking, as the population size gets larger, the iteration of the expected next state also becomes deterministic. Then an evolutionarily stable state is a fixed point of the expected next state function, and the expected relative entropy is essentially the same as the ordinary relative entropy, at least in a neighborhood of the evolutionarily stable state. This is good enough to establish local stability.

Earlier I said both the local maxima and minima minimize the expected relative entropy. Dash and I haven’t proven that the local maxima always correspond to evolutionarily stable states (and the minima to unstable states). That’s because the generalization of evolutionarily stable state we use is really just a ‘no motion’ condition, and isn’t strong enough to imply stability in a neighborhood for the deterministic replicator equation. So for now we are calling the local maxima **stationary stable** states.

We’ve also tried a similar approach to populations evolving on networks, which is a popular topic in evolutionary graph theory, and the results are encouraging! But there are many more ‘states’ in such a process, since the configuration of the network has to be taken into account, and whether the population is clustered together or not. See the end of our paper for an interesting example of a population on a cycle.

It took a while, but I got this task done. (Click for a slightly larger view.) Things take a lot longer these days, because... newborn. You'll recall that I did a little drawing of the youngster very soon after his arrival in December. Well, it was decided a while back that it should be on display on a wall in the house rather than hide in my notebooks like my other sketches tend to do. This was a great honour, but presented me with difficulty. I have a rule to not take any pages out of my notebooks. You'll think it is nuts, but you'll find that this madness is shared by many people who keep notebooks/sketchbooks. Somehow the whole thing is a Thing, if you know what I mean. To tear a page out would be a distortion of the record.... it would spoil the archival aspect of the book. (Who am I kidding? I don't think it likely that future historians will be poring over my notebooks... but I know that future Clifford will be, and it will be annoying to find a gap.) (It is sort of like deleting comments from a discussion on a blog post. I try not to do that without good reason, and I leave a trail to show that it was done if I must.)
Anyway, where was I? Ah. Pages. Well, I had to find a way of making a framed version of the drawing that kept the spirit and feel of the drawing intact while [...] Click to continue reading this post

I mentioned last week that I’m giving a talk at Vanderbilt tomorrow, but as they went to the trouble of writing a press release, the least I can do is share it:

It’s clear that this year’s Forman lecturer at Vanderbilt University, Chad Orzel, will talk about physics to almost anyone.

After all, two of his popular science books are How to Teach Physics to Your Dog and How to Teach Relativity to Your Dog. Orzel, an associate professor of physics at Union College in New York and author of the ScienceBlog “Uncertain Principles,” is scheduled to speak on campus at 3 p.m. Thursday, March 26.

As is the custom among my people, I sent them a title and abstract:

Title: Talking Dogs and Galileian Blogs: Social Media for Communicating Science

Abstract: Modern social media technologies provide an unprecedented opportunity to engage and inform a broad audience about the practice and products of science. Such outreach efforts are critically important in an era of funding cuts and global crises that demand scientific solutions. In this talk I’ll offer examples and advice on the use of social media for science communication, drawn from more than a dozen years of communicating science online.

This shares some DNA with the evangelical blogging-as-outreach talk I’ve been giving off and on for several years, but that was getting a little outdated. So I decided to blow it up and make a new version, which I have nearly finished… with less than 24 hours before my flight to Tennessee. Whee!

Anyway, if you’re in the Nashville area or could be on really short notice, stop by. Otherwise, stay tuned for Exciting! Blogging! News! early next week (give or take).

Evidence for rainbow gravity by butterfly production at the LHC.

The most recent news about quantum gravity phenomenology going through the press is that the LHC upon restart at higher energies will make contact with parallel universes, excuse me, with PARALLEL UNIVERSES. The Telegraph even wants you to believe that this would disprove the Big Bang, and tomorrow maybe it will cause global warming, cure Alzheimer’s, and lead to the production of butterflies at the LHC, who knows. This story is so obviously nonsense that I thought it would be unnecessary to comment on it, but I have underestimated the willingness of news outlets to promote shallow science, and also the willingness of authors to feed that fire.

This story is based on the paper:

Ahmed Farag Ali, Mir Faizal, Mohammed M. Khalil,
arXiv:1410.4765 [hep-th],
Phys. Lett. B 743 (2015) 295.

Here is a summary of what they have done. In models with large additional dimensions, the Planck scale, where effects of quantum gravity become important, can be lowered to energies accessible at colliders. This is an old story that was big 15 years ago or so, and I wrote my PhD thesis on this. In the new paper they use a modification of general relativity that is called "rainbow gravity" and revisit the story in this framework.

In rainbow gravity the metric is energy-dependent, which it normally is not. This energy-dependence is a non-standard modification that is not confirmed by any evidence. It is neither a theory nor a model; it is just an idea that, despite more than a decade of work, never developed into a proper model. Rainbow gravity has not been shown to be compatible with the standard model. There is no known quantization of this approach, and one cannot describe interactions in this framework at all. Moreover, it is known to lead to non-localities which are already ruled out. As far as I am concerned, no papers should get published on the topic until these issues have been resolved.

Rainbow gravity enjoys some popularity because it leads to Planck scale effects that can affect the propagation of particles, which could potentially be observable. Alas, no such effects have been found, and that is with the Planck scale at its normal value! The absolutely last thing you want to do at this point is argue that rainbow gravity should be combined with large extra dimensions, because then its effects would get stronger and would probably be ruled out already. At the very least you would have to revisit all existing constraints on modified dispersion relations, reaction thresholds, and so on. This isn’t even mentioned in the paper.

That isn't all there is to say though. In their paper, the authors also unashamedly claim that such a modification has been predicted by Loop Quantum Gravity, and that it is a natural incorporation of effects found in string theory. Both of these statements are manifestly wrong. Modifications like this have been motivated by, but never been derived from Loop Quantum Gravity. And String Theory gives rise to some kind of minimal length, yes, but certainly not to rainbow gravity; in fact, the expression of the minimal length relation in string theory is known to be

In the rest of the paper, the authors then reconsider the emission rate of black holes in extra dimensions with the energy-dependent metric.

They erroneously state that the temperature diverges when the mass goes to zero and that this leads to a “catastrophic evaporation”. This has been known to be wrong for 20 years. The supposed catastrophic evaporation is due to an incorrect thermodynamical treatment; see for example section 3.1 of this paper. You do not need quantum gravitational effects to avoid this, you just have to get the thermodynamics right. Another reason not to publish the paper. To be fair though, this point is pretty irrelevant for the rest of the authors’ calculation.

They then argue that rainbow gravity leads to black hole remnants because the temperature of the black hole decreases towards the Planck scale. This isn't so surprising and is something that happens generically in models with modifications at the Planck scale, because they can bring down the final emission rate so that it converges and eventually stops.

The authors then further claim that the modification from rainbow gravity affects the cross-section for black hole production, which is probably correct, or at least not wrong. They then take constraints on the lowered Planck scale from existing searches for gravitons (i.e. missing energy) that should also be produced in this case. They use the constraints obtained from the graviton limits to argue that, within these limits, black hole production should not yet have been seen, but might appear in the upcoming LHC runs. Of course they should not have used constraints from a paper that were obtained in a scenario without the rainbow gravity modification, because the production of gravitons would likewise be modified.

Having said all that, the conclusion they come to, that rainbow gravity may lead to black hole remnants and make it more difficult to produce black holes, is probably right, but it is nothing new. The reason is that these types of models lead to a generalized uncertainty principle, and all these calculations have been done before in this context. As the authors nicely point out, I wrote a paper already in 2004 saying that black hole production at the LHC should be suppressed if one takes into account that the Planck length acts as a minimal length.

Yes, in my youth I worked on black hole production at the LHC. I gracefully got out of this when it became obvious there wouldn't be black holes at the LHC, some time in 2005. And my paper, I should add, doesn't work with rainbow gravity but with a Lorentz-invariant high-energy deformation that only becomes relevant in the collision region and thus does not affect the propagation of free particles. In other words, in contrast to the model that the authors use, my model is not already ruled out by astrophysical constraints. The relevant aspects of the argument however are quite similar, thus the similar conclusions: If you take into account Planck length effects, it becomes more difficult to squeeze matter together to form a black hole because the additional space-time distortion acts against your efforts. This means you need to invest more energy than you thought to get particles close enough to collapse and form a horizon.

What does any of this have to do with parallel universes? Nothing, really, except that one of the authors, Mir Faizal, told some journalist there is a connection. In the phys.org piece one can read:

“Normally, when people think of the multiverse, they think of the many-worlds interpretation of quantum mechanics, where every possibility is actualized,” Faizal told Phys.org. “This cannot be tested and so it is philosophy and not science. This is not what we mean by parallel universes. What we mean is real universes in extra dimensions. As gravity can flow out of our universe into the extra dimensions, such a model can be tested by the detection of mini black holes at the LHC. We have calculated the energy at which we expect to detect these mini black holes in gravity’s rainbow [a new theory]. If we do detect mini black holes at this energy, then we will know that both gravity’s rainbow and extra dimensions are correct.”

To begin with, rainbow gravity is neither new nor a theory, but that addition seems to be the journalist’s fault. As far as the parallel universes are concerned, to get these in extra dimensions you would need to have additional branes next to our own, and there is nothing like this in the paper. What this has to do with the multiverse I don’t know; that’s an entirely different story. Maybe this quote was taken out of context.

Why does the media hype this nonsense? Three reasons I can think of. First, the next LHC startup is near and they're looking for a hook to get the story across. Black holes and parallel universes sound good, regardless of whether this has anything to do with reality. Second, the paper shamelessly overstates the relevance of the investigation, makes claims that are manifestly wrong, and fails to point out the miserable state that the framework they use is in. Third, the authors willingly feed the hype in the press.

Did the topic of rainbow gravity and the author’s name, Mir Faizal, sound familiar? That’s because I wrote about both only a month ago, when the press was hyping another nonsense story about black holes in rainbow gravity by the same author. In that previous paper they claimed that black holes in rainbow gravity don’t have a horizon, and nothing was mentioned about them forming remnants. I don’t see how both of these supposed consequences of rainbow gravity are even compatible with each other. If anything, this just reinforces my impression that this isn’t physics; it’s just fanciful interpretation of algebraic manipulations that have no relation to reality whatsoever.

Here are a couple of interesting things I've come across in terms of public science outreach lately:

- I generally f-ing love "I f-ing love science" - they reach a truly impressive number of people, and they usually do a good job of conveying why science itself (beyond just particular results) is *fun*. That being said, I've started to notice lately that in the physics and astro stories they run they sometimes either use inaccurate/hype-y headlines or report what is basically a press release completely uncritically. For instance, while it fires the mind of science fiction fans everywhere, I don't think it's actually good that IFLS decided to highlight a paper from the relatively obscure journal Phys. Lett. B and claim in a headline that the LHC could detect extra spatial dimensions by making mini black holes. Sure. And SETI might detect a signal next week. What are the odds that this will actually take place? Similarly, the headline "Spacetime foam discovery proves Einstein right" implies that someone has actually observed signatures of spacetime foam. In fact, the story is the exact opposite: observations of photons from gamma ray bursts have shown *no* evidence of "foaminess" of spacetime, meaning that general relativity (without any exotic quantumness) can explain the results. A little improved quality control on the selection and headlines, particularly on the high energy/astro stories, would be great, thanks.
- There was an article in the most recent APS News that got me interested in Alan Alda's efforts at Stony Brook on communicating science to the public. Alda, who hosted Scientific American Frontiers and played Feynman on Broadway, has dedicated a large part of his time in recent years to the cause of trying to spread the word to the general public about what science is, how it works, how it often involves compelling narratives, and how it is in many ways a pinnacle of human achievement. He is a fan of "challenge" contests, where participants are invited to submit a 300-word non-jargony explanation of some concept or phenomenon (e.g., "What is a flame?", "What is sleep?"). This is really hard to do well!
- Vox has an article that isn't surprising at all: uncritical, hype-filled reporting of medical studies leads to news articles that give conflicting information to the public, and contributes to a growing sense among laypeople that science is untrustworthy or a matter of opinion. Sigh.
- Occasionally deficit-hawk politicians realize that science research can benefit them by, e.g., curing cancer. If only they thought that basic research itself was valuable.

I took a physical-health day today, which means I stayed at home and worked on my students' projects, including commenting on drafts, manuscripts, or plots from Malz, Vakili, and Wang.

Bernhard Schölkopf arrived for a couple of days of work. We spent the morning discussing radio interferometry, *Kepler* light-curve modeling, and various things philosophical. We headed up to the *Simons Foundation* to the Simons Center for Data Analysis for lunch. We had lunch with Marina Spivak (Simons) and Jim Simons (Simons). With the latter I discussed the issues of finding exoplanet rings, moons, and Trojans.

After lunch we ran into Leslie Greengard (Simons) and Alex Barnett (Dartmouth), with whom we had a long conversation about the linear algebra of non-compact kernel matrices on the sphere. This all relates to tractable non-approximate likelihood functions for the cosmic microwave background. The conversation ranged from cautiously optimistic (that we could do this for *Planck*-like data sets) to totally pessimistic, ending on an optimistic note. The day ended with a talk by Laura Haas (IBM) about infrastructure (and social science) she has been building (at IBM and in academic projects) around data-driven science and discovery. She showed a great example of drug discovery (for cancer) by automated "reading" of the literature.

Yesterday’s post about VPython simulation of the famous bicycle wheel demo showed that you can get the precession and nutation from a simulation that only includes forces. But this is still kind of mysterious, from the standpoint of basic physics intuition. Specifically, it’s sort of hard to see how any of this produces a force up and to the left, as required for the precession to happen.

I spent a bunch of time last night drawing pictures and writing equations, and I think I have the start of an explanation. It all comes down to the picture of rigid objects as really stiff springs– the grey lines in the “featured image” above. If we imagine a start condition where all the springs are at their relaxed length, then look a short instant of time later, I think I can see where the force is coming from.

The two instants I want to imagine are shown schematically here– think of this as an end-on view of the rotating “wheel”:

All five of the balls (the four on the “rim” and the one at the hub) are subject to a downward gravitational force, and also have some linear momentum. If they start out exactly horizontal and vertical with the springs making up the spokes at their relaxed length, gravity is the only force that matters, and it’s indicated by short downward-pointing green arrows. The linear momentum for each ball is indicated by the reddish arrow (which deliberately doesn’t quite touch the ball, the usual convention I employ to try to avoid mixing up force and momentum arrows). The top ball is moving to the right, the bottom to the left, and so on.

A short time later, you end up with the situation on the right in the figure: each ball has moved a bit in the direction of its initial momentum, and also fallen down slightly due to gravity. You might be saying “Wait, the left ball actually shifted *up* from its initial position,” which is true, but that’s because the initial momentum was exactly vertical. It’s moved up by *less* than it would’ve without gravity, though.

Each of the “spokes” here has now stretched a tiny bit, and will thus be exerting a force pulling the ball toward the center. That’s a good thing, because it’s exactly the centripetal force you need to keep the balls spinning around in a more-or-less circular path (since I started the springs with no stretch, they’ll actually wobble in and out a bit as they go around, but it’s just cleaner to see what’s going on if we start with no forces). All four spokes have stretched by exactly the same amount, though, because the hub has also fallen by the same amount (you can convince yourself of this with half a page or so of equations; I’m not going to bother). All four spokes will exert exactly the same force, so they won’t have any net effect on the motion of the hub. So, this won’t get you the precession effect; but we knew that, because just having the spokes wasn’t enough in the simulation, either.

However, there are four more “springs” here, namely the braces stretching back toward the pivot point, which in this diagram would be straight back into the screen some distance from the initial position of the hub. If we imagine those four springs also started at their relaxed length, the motion of the balls makes *them* stretch, as well.

And while another half-page of equations can show it conclusively, I think you can see from the picture that these four springs do *not* all stretch by the same amount. On the right half of the figure, the distance from the center of the left ball to the center of the dotted outline showing the original position of the hub is smaller than the distance from the right ball to the original position of the hub, showing that the right spring will be more stretched than the left. And likewise, the top ball is slightly closer to where the hub was than the bottom is, so the bottom spring is stretched more than the top.
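
The unequal brace stretch is easy to check numerically. Here is a minimal sketch of the two-instant setup (the numbers are illustrative choices of mine, not values from the simulation in the post): each ball drifts tangentially by v·dt and falls by ½g·dt², with the pivot sitting straight behind the hub. The asymmetry, not the particular values, is the point; note the rim speed is chosen fast enough (v² ≫ gR) that all four braces stretch rather than slacken.

```python
# Positions a short time dt after the "no stretch" start: tangential
# drift v*dt plus gravitational fall (1/2)*g*dt**2.
import math

R, d = 1.0, 1.0                  # wheel radius, hub-to-pivot distance
v, g = 10.0, 9.8                 # rim speed (fast spin: v**2 >> g*R), gravity
dt = 1e-3
a, b = v * dt, 0.5 * g * dt**2   # tangential drift, gravitational fall

pivot = (0.0, 0.0, -d)
balls = {                        # top moves right, bottom left, and so on
    "top":    (a, R - b, 0.0),
    "bottom": (-a, -R - b, 0.0),
    "right":  (R, -a - b, 0.0),
    "left":   (-R, a - b, 0.0),
}

rest = math.hypot(R, d)          # relaxed brace length
stretch = {name: math.dist(p, pivot) - rest for name, p in balls.items()}
print({name: f"{s:.3e}" for name, s in stretch.items()})
```

The printout shows exactly the ordering argued for in the text: all four braces are stretched, with the right brace stretched more than the left and the bottom more than the top.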

That means that the bottom and right braces will exert larger forces on the bottom and right balls. What are the directions of these forces? Well, mostly into the screen, but the bottom brace will also pull upward, and a little bit to the right. And the right brace will pull to the left, and a little bit up. The leftward pull of the right brace will be considerably bigger than the rightward pull of the bottom brace, so these don’t cancel each other out. There’s a down-and-left pull from the top brace on the top ball, and a right-and-down pull from the left brace on the left ball, as well, but these will be smaller than the pulls from the bottom and right braces.

If we carried this forward another step into the future, those forces would get communicated to the hub: the right ball would shift left as well as down, and the bottom ball would fall less quickly, compressing those spokes a bit beyond what they otherwise would be, which will then push the hub up and to the left. And the hub will push on the other two balls, so the whole thing moves up and to the left. But moving up and to the left is exactly what you need to get the precession effect seen in the simulation, and in the cool bicycle wheel demo.

Now, working all this out in detail, and carrying it forward for many more time steps is just a miserable mathematical grind. Which is why I simulated it on a computer in the first place, and why you’d be a fool to try to calculate this on paper without using angular momentum. But having confirmed from the simulation that this really does work with only forces, the two-time-step business above convinces me that I understand the origin of the precession and nutation at least in a qualitative way, which is more than good enough for a blog.
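
For anyone who wants to poke at this themselves, here is a bare-bones springs-only sketch in the spirit described above. This is not the original simulation; the masses, stiffness, rim speed, and the choice of exactly eight springs (four spokes, four braces, fixed pivot, no hub-to-pivot axle) are assumptions of mine.

```python
# Velocity-Verlet integration of five masses (hub plus four rim balls),
# four spoke springs (hub <-> ball) and four brace springs (pivot <-> ball).
import numpy as np

R, d, g, m, k, dt = 1.0, 1.0, 9.8, 1.0, 100.0, 1e-3
pivot = np.array([0.0, 0.0, -d])

# Row 0 = hub; rows 1..4 = right, top, left, bottom rim balls.
pos = np.array([[0, 0, 0], [R, 0, 0], [0, R, 0], [-R, 0, 0], [0, -R, 0]], float)
vel = np.array([[0, 0, 0], [0, -10, 0], [10, 0, 0], [0, 10, 0], [-10, 0, 0]], float)

spoke_rest, brace_rest = R, np.hypot(R, d)

def spring(p, q, rest):
    """Force on the mass at p from a spring anchored at q."""
    delta = q - p
    length = np.linalg.norm(delta)
    return k * (length - rest) * delta / length

def forces(p):
    f = np.zeros_like(p)
    f[:, 1] -= m * g                               # gravity on every mass
    for i in range(1, 5):
        fs = spring(p[i], p[0], spoke_rest)        # spoke: ball <-> hub
        f[i] += fs
        f[0] -= fs
        f[i] += spring(p[i], pivot, brace_rest)    # brace: ball <-> pivot
    return f

def simulate(steps):
    p, w = pos.copy(), vel.copy()
    f = forces(p)
    for _ in range(steps):                         # velocity Verlet
        p += w * dt + 0.5 * (f / m) * dt**2
        f_new = forces(p)
        w += 0.5 * (f + f_new) / m * dt
        f = f_new
    return p

print("hub position after 2000 steps:", simulate(2000)[0])
```

After a single step the geometry reproduces the two-instant picture above (hub falling, right brace stretched more than left, bottom more than top); carrying it further shows the wobbly, spring-mediated version of the precession.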

Got the Sunday Times, which I don’t usually do, and in the Book Review letters section I saw a familiar name: David English, of Somerville, MA. I started noticing this guy when I was in grad school. He writes letters to the editor. A lot of letters to the editor. Google finds about 10 pages of hits for his letters to the Times, starting in 1993 and continuing at a steady clip through the present. He wrote to the New Yorker and New York Magazine, too. And I thought I remembered him showing up in the Globe letter column as well, but Google can’t find that.

Who is David English of Somerville, MA? And has he actually had more letters to the New York Times published than anyone else alive?

*guest post by David Spivak*

The idea that’s haunted me, and motivated me, for the past seven years or so came to me while reading a book called *The Moment of Complexity: our Emerging Network Culture*, by Mark C. Taylor. It was a fascinating book about how our world is becoming increasingly networked—wired up and connected—and how this is leading to a dramatic increase in complexity. I’m not sure if it was stated explicitly there, but I got the idea that with the advent of the World Wide Web in 1991, a new neural network had been born. The lights had been turned on, and planet Earth now had a brain.

I wondered how far this idea could be pushed. Is the world alive, is it a single living thing? If it is, in the sense I meant, then its primary job is to survive, and to survive it’ll have to make decisions. So there I was in my living room thinking, “oh my god, we’ve got to steer this thing!”

Taylor pointed out that as complexity increases, it’ll become harder to make sense of what’s going on in the world. That seemed to me like a big problem on the horizon, because in order to make good decisions, we need to have a good grasp on what’s occurring. I became obsessed with the idea of helping my species through this time of unprecedented complexity. I wanted to understand what was needed in order to help humanity make good decisions.

What seemed important as a first step is that we humans need to unify our understanding—to come to agreement—on matters of fact. For example, humanity still doesn’t know whether global warming is happening. Sure, almost all credible scientists have agreed that it is happening, but does that steer money into programs that will slow it or mitigate its effects? This isn’t an issue of what course to take to solve a given problem; it’s about whether the problem even exists! It’s like when people were talking about Obama being a Muslim, born in Kenya, etc., and some people were denying it, saying he was born in Hawaii. If that’s true, why did he repeatedly refuse to show his birth certificate?

It is important, as a first step, to improve the extent to which we agree on the most obvious facts. This kind of “sanity check” is a necessary foundation for discussions about what course we should take. If we want to steer the ship, we have to make committed choices, like “we’re turning left now,” and we need to do so as a group. That is, there needs to be some amount of agreement about the way we should steer, so we’re not fighting ourselves.

Luckily there are many cases of a group that needs to, and is able to, steer itself as a whole. For example, as a human, my neural brain works with my cells to steer my body. Similarly, corporations steer themselves based on boards of directors, and on flows of information, which run bureaucratically and/or informally between different parts of the company. Note that in neither case is there any suggestion that each part—cell, employee, or corporate entity—is “rational”; they’re all just doing their thing. What we do see in these cases is that the group members work together in a context where information and internal agreement are valued and often attained.

It seemed to me that intelligent, group-directed steering is possible. It does occur. But what’s the mechanism by which it happens, and how can we think about it? I figured that the way we steer, i.e., make decisions, is by using information.

I should be clear: whenever I say information, I never mean it “in the sense of Claude Shannon”. As beautiful as Shannon’s notion of information is, he’s not talking about the kind of information I mean. He explicitly said in his seminal paper that information in his sense is not concerned with *meaning*:

Frequently the messages have meaning; that is they refer to or are correlated according to some system with certain physical or conceptual entities. These semantic aspects of communication are irrelevant to the engineering problem. The significant aspect is that the actual message is one selected from a set of possible messages.

In contrast, I’m interested in the semantic stuff, which flows between humans, and which makes possible decisions about things like climate change. Shannon invented a very useful *quantitative measure* of meaningless probability distributions.
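
Shannon’s point is easy to make concrete: his measure depends only on the probabilities of the symbols, never on what they mean. A small illustration (mine, not from the post) — two “languages” about entirely different subjects, with identical symbol statistics, carry exactly the same Shannon information:

```python
import math

def entropy(probs):
    """Shannon entropy, in bits, of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Different meanings, same statistics, same entropy.
weather = {"sunny": 0.5, "rain": 0.25, "snow": 0.25}
votes   = {"yes": 0.5, "no": 0.25, "abstain": 0.25}

print(entropy(weather.values()))   # 1.5 bits
print(entropy(votes.values()))     # 1.5 bits
```

Both come out to exactly 1.5 bits, which is precisely the sense in which Shannon’s measure is blind to semantics.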

That’s not the kind of information I’m talking about. When I say “I want to know what information is”, I’m saying I want to formulate the notion of human-usable semantic meaning, in as mathematical a way as possible.

Back to my problem: we need to steer the ship, and to do so we need to use information properly. Unfortunately, I had no idea what information is, nor how it’s used to make decisions (let alone to make good ones), nor how it’s obtained from our interaction with the world. Moreover, I didn’t have a clue how the minute information-handling at the micro-level, e.g., done by cells inside a body or employees inside a corporation, would yield information-handling at the macro (body or corporate) level.

I set out to try to understand what information *is* and how it can be communicated. What kind of stuff is information? It seems to follow rules: facts can be put together to form new facts, but only in certain ways. I was once explaining this idea to Dan Kan, and he agreed saying, “Yes, information is inherently a combinatorial affair.” What is the combinatorics of information?

Communication is similarly difficult to understand, once you dig into it. For example, my brain somehow enables me to use information and so does yours. But our brains are wired up in personal and *ad hoc* ways, when you look closely, a bit like a fingerprint or retinal scan. I found it fascinating that two highly personalized semantic networks could interface well enough to effectively collaborate.

There are two issues that I wanted to understand, and by *understand* I mean to make mathematical to my own satisfaction. The first is what information *is*, as structured stuff, and what communication *is*, as a transfer of structured stuff. The second is how communication at micro-levels can create, or be, understanding at macro-levels, i.e., how a group can steer as a singleton.

Looking back on this endeavor now, I remain concerned. Things are getting increasingly complex, in the sorts of ways predicted by Mark C. Taylor in his book, and we seem to be losing some control: of the NSA, of privacy, of people 3D printing guns or germs, of drones, of big financial institutions, etc.

Can we expect or hope that our species as a whole will make decisions that are healthy, like keeping the temperature down, given the information we have available? Are we in the driver’s seat, or is our ship currently in the process of spiraling out of our control?

Let’s assume that we don’t want to panic but that we do want to participate in helping the human community to make appropriate decisions. A possible first step could be to formalize the notion of “using information well”. If we could do this rigorously, it would go a long way toward helping humanity get onto a healthy course. Further, mathematics is one of humanity’s best inventions. Using this tool to improve our ability to use information properly is a non-partisan approach to addressing the issue. It’s not about fighting, it’s about figuring out what’s happening, and weighing all our options in an informed way.

So, I ask: What kind of mathematics might serve as a formal ground for the notion of meaningful information, including both its successful communication and its role in decision-making?

Spring is finally here, and with it great expectations for a new run of the Large Hadron Collider, which will restart in a month or so with a 62.5% increase in the center-of-mass energy of its proton-proton collisions: 13 TeV. At 13 TeV, the production of a 2-TeV Z' boson, say, would not be so terribly rare, making a signal soon visible in the data that ATLAS and CMS are eager to collect.
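
The quoted 62.5% is just the jump from the 8 TeV center-of-mass energy of the previous run to 13 TeV:

```python
# Relative increase in LHC proton-proton center-of-mass energy, in TeV.
previous, new = 8.0, 13.0
increase = (new - previous) / previous
print(f"{increase:.1%}")   # 62.5%
```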

I am always looking for new ways to repeat myself, so I cannot possibly leave out this opportunity to point out yet another possibility to not test quantum gravity. Chris Lee from Arstechnica informed the world last week that “Deflecting X-rays due to gravity may provide view on quantum gravity”, which is a summary of the paper

By Wen-Te Liao and Sven Ahrens

**Summary**: Just because it’s something with quantum and something with gravity doesn’t mean it’s quantum gravity.
The idea is to shine light on a crystal at frequencies high enough so that it excites nuclear resonances. This excitation is delocalized, and the energy is basically absorbed and reemitted systematically, which leads to a propagation of the light-induced excitation through the crystal. How this propagation proceeds depends on the oscillations of the nuclei, which again depends on the local proper time. If you place the crystal in a gravitational field, the proper time will depend on the strength of the field. As a consequence, the propagation of the excitation through the crystal depends on the gradient of the gravitational field. The authors argue that in principle this influence of gravity on the passage of time in the crystal should be measurable.

They then look at a related but slightly different effect, in which the crystal rotates and the time dilation resulting from the (non-inertial!) motion gives rise to a similar effect, though one much larger in magnitude. The authors do not claim that this experiment would be more sensitive than already existing ones; I assume that if it were, they’d have pointed it out. Instead, they write that the main advantage of this new method is that it allows both special and general relativistic effects to be tested in tabletop experiments.
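
The orders of magnitude involved follow from the standard weak-field and low-velocity formulas; the centimeter-scale crystal height and the rim speed below are illustrative guesses of mine, not numbers from the paper:

```python
# Back-of-envelope fractional time-dilation estimates.
c = 2.998e8    # speed of light, m/s
g = 9.81       # gravitational acceleration, m/s^2

# Gravitational time dilation across a crystal of height h: ~ g*h/c^2.
h = 0.01       # 1 cm tall crystal (illustrative)
grav = g * h / c**2

# Time dilation from rotation at rim speed v: ~ v^2 / (2*c^2).
v = 300.0      # rim speed in m/s (illustrative)
rot = v**2 / (2 * c**2)

print(f"gravitational: {grav:.1e}, rotational: {rot:.1e}")
```

With these inputs the gravitational effect comes out around 10⁻¹⁸ while the rotational one is around 10⁻¹³, consistent with the rotational effect being much the larger of the two.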

It’s a neat paper. What does it have to do with quantum gravity? Well, nothing. Indeed, the whole paper doesn’t say anything about quantum gravity. Quantum gravity, I remind you, is the quantization of the gravitational interaction, which plays no role here whatsoever. Chris Lee, in his Arstechnica piece, explains:

“Experiments like these may even be sensitive enough to see the influence of quantum mechanics on space and time.”

Which is just plainly wrong. The influence of quantum mechanics on space-time is far too weak to be measurable in this experiment, or in any other known laboratory experiment. If you figure out how to do this on a tabletop, book your trip to Stockholm right away. Though I recommend you show me the paper before you waste your money.

Here is what Chris Lee had to say about the question of what he thinks it’s got to do with quantum gravity:

`@skdh @arstechnica no not entirely. Every test of general relativity is trying to find a discrepancy, which is relevant to quantum gravity`

— Chris Lee (@exmamaku) March 22, 2015

Deviations from general relativity aren’t the same as quantum gravity. And besides, as far as I can tell the authors haven’t claimed that they can test a new parameter regime that hasn’t been tested before. The reference to quantum gravity is an obvious attempt to sex up the piece and has no scientific credibility whatsoever.