Planet Musings

February 05, 2023

John Baez: The Italian Sixth

I showed my wife Lisa a nice video of Tommaso Zillio explaining a chord called the ‘Italian 6th’:

But she had a complaint: where the chords finally resolve to a C major triad, he writes the final chord as CCE—but those first two notes don’t sound like they’re an octave apart!

What do you think? Here’s the video:

Just for my own sake, let me explain how an Italian 6th works.

Zillio explains it in the key of C but I prefer numbers. One of the big motors that drives classical music is the progression

IV
V
I

from IV, called the ‘subdominant’, to V, called the ‘dominant’, to I, called the ‘tonic’. Tension followed by release! If we write these chords as triads they are

IV = 4 6 8
V = 5 7 9
I = 1 3 5

But we’d probably invert these chords for good voice leading: we don’t want notes making big jumps.

Most importantly, the 7 wants to climb up to the 8, which is one octave higher than the 1. Also, the 9 would be happy to climb up to the 10—which is an octave higher than the 3. So let’s raise the 1 and 3 an octave and get 8 and 10:

IV = 4 6 8
V = 5 7 9
I = 5   8 10

Since we’ve raised two notes, we now say the I is in ‘second inversion’.

Next: people like to heighten the tension of the V by adding an extra note and getting a 7th chord, called V7 or the ‘dominant 7th’:

IV = 4 6 8
V7 = 5 7 9 11
I =   5   8 10

This is an incredibly common chord progression, often decorated in various ways. Indeed, the music theorist Richard Goldman wrote:

The demand of the V7 for resolution is, to our ears, almost inescapably compelling. The dominant seventh is, in fact, the central propulsive force in our music; it is unambiguous and unequivocal.

Next, we could invert the IV chord by pushing the 4 up an octave to get 11:

IV =   6 8   11
V7 = 5 7 9 11
I =   5   8 10

Now the 11 just sits there for the first two chords. We could get a cooler sound by sharping that 11:

IV#° = 6 8   11#
V7 =   5 7 9 11
I =     5 8 10

The first chord is more fancy now: it’s called a ‘diminished sharp IV in first inversion’. (A IV chord is 4 6 8. If we push up the 4 by a half-tone we get 4# 6 8: now consecutive notes are a minor third apart, so it’s a ‘diminished’ triad, called a diminished sharp IV. To get a diminished sharp IV in first inversion we move the bottom note an octave higher and get 6 8 11#, which is what we have here.)

In this new chord progression the top voice nicely descends by half-steps from the 11# to the 11 to the 10:

IV#° = 6 8   11#
V7 =   5 7 9 11
I =     5 8 10

or in other words, subtracting an octave so it’s easier to understand, from the 4# to the 4 to the 3.

Next, we can flat the 6 in that first chord:

It6 = 6♭ 8   11#
V7 = 5 7 9 11
I =   5 8   10

Now the first chord is called the Italian 6th.

This progression is a much loved variant of our original basic IV V I. The interval from the 6♭ to the 4# is called an augmented 6th, so you may read about the Italian 6th in discussions of how to use the augmented 6th interval.
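
If you want to check these voicings at a keyboard, here is a minimal sketch in Python (my own, just for the arithmetic, not something from the video): it converts the scale-degree numbers above into semitones above the tonic, so in C the three chords come out as A♭ C F♯, then G B D F, then G C E, and the interval from the 6♭ up to the 11# spans ten semitones, an augmented sixth.

    # Map scale degrees (1-7 in the major scale, 8-11 an octave up) to semitones.
    MAJOR_SCALE = [0, 2, 4, 5, 7, 9, 11]           # semitones for degrees 1..7
    NOTE_NAMES = ["C", "Db", "D", "Eb", "E", "F",  # spellings fixed to the key of C,
                  "F#", "G", "Ab", "A", "Bb", "B"]  # so enharmonics are approximate

    def semitone(degree, accidental=0):
        """Semitones above the tonic for a scale degree, with optional sharp/flat."""
        octave, idx = divmod(degree - 1, 7)
        return 12 * octave + MAJOR_SCALE[idx] + accidental

    chords = {
        "It6": [(6, -1), (8, 0), (11, +1)],        # 6b  8  11#
        "V7":  [(5, 0), (7, 0), (9, 0), (11, 0)],
        "I":   [(5, 0), (8, 0), (10, 0)],          # I in second inversion
    }

    for name, chord in chords.items():
        pitches = [semitone(d, a) for d, a in chord]
        print(name, [NOTE_NAMES[p % 12] for p in pitches], pitches)

    # The augmented sixth: from the 6b up to the 11# is ten semitones.
    print(semitone(11, +1) - semitone(6, -1))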

There are also two popular variants of the Italian 6th which add an extra note to the chord, called the ‘French 6th’:

and the ‘German 6th’:

But watch the videos for more on those!

n-Category Café Applied Category Theory 2023

You can now submit a paper if you want to give a talk here:

The Sixth International Conference on Applied Category Theory will take place at the University of Maryland from 31 July to 4 August 2023, preceded by the Adjoint School 2023 from 24 to 28 July. This conference follows previous events at Strathclyde (UK), Cambridge (UK), Cambridge (MA), Oxford (UK) and Leiden (NL).

Applied category theory is important to a growing community of researchers who study computer science, logic, engineering, physics, biology, chemistry, social science, systems, linguistics and other subjects using category-theoretic tools. The background and experience of our members is as varied as the systems being studied. The goal of the Applied Category Theory conference series is to bring researchers together, strengthen the applied category theory community, disseminate the latest results, and facilitate further development of the field.

Submissions

We accept submissions in English of original research papers, talks about work accepted/submitted/published elsewhere, and demonstrations of relevant software. Accepted original research papers will be published in a proceedings volume. The conference will include an industry showcase event and community meeting. We particularly encourage people from underrepresented groups to submit their work, and the organizers are committed to non-discrimination, equity, and inclusion.

Original research papers intended for conference proceedings should present original, high-quality work in the style of a computer science conference paper (up to 12 pages, not counting the bibliography; more detailed parts of proofs may be included in an appendix for the convenience of the reviewers). Please use the EPTCS style files available at

http://style.eptcs.org

Such submissions should not be an abridged version of an existing journal article although pre-submission arXiv preprints are permitted. These submissions will be adjudicated for both a talk and publication in the conference proceedings.

Important dates

The following dates are all in 2023, and Anywhere On Earth.

• Submission Deadline: Wednesday 3 May

• Author Notification: Wednesday 7 June

• Camera-ready version due: Tuesday 27 June

• Conference begins: 31 July

Program committee

Benedikt Ahrens

Mario Álvarez Picallo

Matteo Capucci

Titouan Carette

Bryce Clarke

Carmen Constantin

Geoffrey Cruttwell

Giovanni de Felice

Bojana Femic

Marcelo Fiore

Fabio Gadducci

Zeinab Galal

Richard Garner

Neil Ghani

Tamara von Glehn

Amar Hadzihasanovic

Masahito Hasegawa

Martha Lewis

Sophie Libkind

Rory Lucyshyn-Wright

Sandra Mantovani

Jade Master

Konstantinos Meichanetzidis

Stefan Milius

Mike Mislove

Sean Moss

David Jaz Myers

Susan Niefield

Paige Randall North

Jason Parker

Evan Patterson

Sophie Raynor

Emily Roff

Morgan Rogers

Mario Román

Maru Sarazola

Bas Spitters

Sam Staton (co-chair)

Dario Stein

Eswaran Subrahmanian

Walter Tholen

Christina Vasilakopoulou (co-chair)

Christine Vespa

Simon Willerton

Glynn Winskel

Vladimir Zamdzhiev

Fabio Zanasi

Organizing committee

James Fairbanks, University of Florida

Joe Moeller, National Institute for Standards and Technology, USA

Sam Staton, Oxford University

Priyaa Varshinee Srinivasan, National Institute for Standards and Technology, USA

Christina Vasilakopoulou, National Technical University of Athens

Steering committee

John Baez, University of California, Riverside

Bob Coecke, Cambridge Quantum

Dorette Pronk, Dalhousie University

David Spivak, Topos Institute

David Hogg: thermodynamics of cosmic gas

The day ended today at Flatiron with a great Colloquium by Eiichiro Komatsu (MPA) about the temperature of cosmic gas. Gravitational collapse heats the gas, and that takes it up to something like 2 million degrees. This was computed ages ago by Peebles and others, but is now measured. There was a lot of discussion during and after about other heating mechanisms, and what things constitute gravitational heating. I'm interested in whether this result meaningfully constrains scattering interactions between the dark matter and baryons; if they scatter, and the dark matter is heavier, the baryons will (eventually) get exceedingly hot.

Doug Natelson: Some interesting links - useful lecture notes, videos

Proposal writing, paper writing, and course prep are eating a lot of my bandwidth right now, but I wanted to share a few things:

  • David Tong at Cambridge is a gifted educator and communicator who has written lecture notes that span a wide swath of the physics curriculum, from introductory material on mechanics through advanced graduate-level treatments of quantum field theory.  Truly, these are a fantastic resource, made freely available.  The link above goes to a page with links to all of these.
  • In a similar vein, Daniel Arovas at UC San Diego has also written up lecture notes on multiple components of physics, though usually aimed at the graduate level and not all linked in one place.  These include (links to pdf files) mechanics, thermodynamics and statistical mechanics, condensed matter physics, nonlinear dynamics, the quantum Hall effect, and group theory (unfinished).
  • I long ago should have mentioned this youtube channel (Kathy Loves Physics and History), by Kathy Joseph.  Her videos are a great blend of (like it says on the label) physics and history of science.  As a great example, check out the story of Ohm's Law.  I had never heard about the dispute between Ohm and Ampère (who didn't know about the internal resistance of batteries, and thus thought his experiments disproved Ohm's law).  
  • This twitter thread pointing out that current in quantum Hall and related systems is not, in fact, purely carried by states at the sample edges, is thought-provoking.  

February 03, 2023

Clifford Johnson: The Life Scientific Interview

After doing a night bottle feed of our youngest in the wee hours of the morning some nights earlier this week, in order to help me get back to sleep I decided to turn on BBC Sounds to find a programme to listen to... and lo and behold, look what had just aired live! The programme that I'd recorded at Broadcasting House a few weeks ago in London.

So it is out now. It is an episode of Jim Al-Khalili's excellent BBC Radio 4 programme "The Life Scientific". The show is very much in the spirit of what (as you know) I strive to do in my work in the public sphere (including this blog): discuss the science an individual does right alongside aspects of the broader life of that individual. I recommend listening to [...]

Matt von Hippel: All About the Collab

Sometimes, some scientists work alone. But mostly, scientists collaborate. We team up, getting more done together than we could alone.

Over the years, I’ve realized that theoretical physicists like me collaborate in a bit of a weird way, compared to other scientists. Most scientists do experiments, and those experiments require labs. Each lab typically has one principal investigator, or “PI”, who hires most of the other people in that lab. For any given project, scientists from the lab will be organized into particular roles. Some will be involved in the planning, some not. Some will do particular tests, gather data, manage lab animals, or do statistics. The whole experiment is at least roughly planned out from the beginning, and everyone has their own responsibility, to the extent that journals will sometimes ask scientists to list everyone’s roles when they publish papers. In this system, it’s rare for scientists from two different labs to collaborate. Usually it happens for a reason: a lab needs a statistician for a particularly subtle calculation, or one lab must process a sample so another lab can analyze it.

In contrast, theoretical physicists don’t have labs. Our collaborators sometimes come from the same university, but often they’re from a different one, frequently even in a different country. The way we collaborate is less like other scientists, and more like artists.

Sometimes, theoretical physicists have collaborations with dedicated roles and a detailed plan. This can happen when there is a specific calculation that needs to be done, that really needs to be done right. Some of the calculations that go into making predictions at the LHC are done in this way. I haven’t been in a collaboration like that (though in retrospect one collaborator may have had something like that in mind).

Instead, most of the collaborations I’ve been in have been more informal. They tend to start with a conversation. We chat by the coffee machine, or after a talk, anywhere there’s a blackboard nearby. It starts with “I’ve noticed something odd”, or “here’s something I don’t understand”. Then, we jam. We go back and forth, doing our thing and building on each other. Sometimes this happens in person, a barrage of questions and doubts until we hammer out something solid. Sometimes we go back to our offices, to calculate and look up references. Coming back the next day, we compare results: what did you manage to show? Did you get what I did? If not, why?

I make this sound spontaneous, but it isn’t completely. That starting conversation can be totally unplanned, but usually one of the scientists involved is trying to make it happen. There’s a different way you talk when you’re trying to start a collaboration, compared to when you just want to talk. If you’re looking for a collaboration, you go into more detail. If the other person is on the same wavelength, you start using “we” instead of “I”, or you start suggesting plans of action: “you could do X, while I do Y”. If you just want someone’s opinion, or just want to show off, then your conversation is less detailed, and less personal.

This is easiest to do with our co-workers, but we do it with people from other universities too. Sometimes this happens at conferences, more often during short visits for seminars. I’ve been on almost every end of this. As a visitor, I’ve arrived to find my hosts with a project in mind. As a host, I’ve invited a visitor with the goal of getting them involved in a collaboration, and I’ve received a visitor who came with their own collaboration idea.

After an initial flurry of work, we’ll have a rough idea of whether the project is viable. If it is, things get a bit more organized, and we sort out what needs to be done and a rough idea of who will do it. While the early stages really benefit from being done in person, this part is easier to do remotely. The calculations get longer but the concepts are clear, so each of us can work by ourselves, emailing when we make progress. If we get confused again, we can always schedule a Zoom to sort things out.

Once things are close (but often not quite done), it’s time to start writing the paper. In the past, I used Dropbox for this: my collaborators shared a folder with a draft, and we’d pass “control” back and forth as we wrote and edited. Now, I’m more likely to use something built for this purpose. Git is a tool used by programmers to collaborate on code. It lets you roll back edits you don’t like, and merge edits from two people to make sure they’re consistent. For other collaborations I use Overleaf, an online interface for the document-writing language LaTeX that lets multiple people edit in real-time. Either way, this part is also more or less organized, with a lot of “can you write this section?” that can shift around depending on how busy people end up being.

Finally, everything comes together. The edits stabilize, everyone agrees that the paper is good (or at least, that any dissatisfaction they have is too minor to be worth arguing over). We send it to a few trusted friends, then a few days later up on the arXiv it goes.

Then, the cycle begins again. If the ideas are still clear enough, the same collaboration might keep going, planning follow-up work and follow-up papers. We meet new people, or meet up with old ones, and establish new collaborations as we go. Our fortunes ebb and flow based on the conversations we have, the merits of our ideas and the strengths of our jams. Sometimes there’s more, sometimes less, but it keeps bubbling up if you let it.

Tommaso Dorigo: Two Possible Sites For The SWGO Gamma-Ray Detector Array

Yesterday I took advantage of the kindness of Cesar Ocampo, the site manager of the Parque Astronomico near San Pedro de Atacama, in northern Chile, to visit a couple of places that the SWGO collaboration is considering as the site of a large array of particle detectors meant to study ultra-high-energy gamma rays from the sky.

SWGO and cosmic ray showers

February 01, 2023

Terence Tao: Infinite partial sumsets in the primes

Tamar Ziegler and I have just uploaded to the arXiv our paper “Infinite partial sumsets in the primes”. This is a short paper inspired by a recent result of Kra, Moreira, Richter, and Robertson (discussed for instance in this Quanta article from last December) showing that for any set {A} of natural numbers of positive upper density, there exists a sequence {b_1 < b_2 < b_3 < \dots} of natural numbers and a shift {t} such that {b_i + b_j + t \in A} for all {i<j} (this answers a question of Erdős). In view of the “transference principle”, it is then plausible to ask whether the same result holds if {A} is replaced by the primes. We can show the following results:

Theorem 1
  • (i) If the Hardy-Littlewood prime tuples conjecture (or the weaker conjecture of Dickson) is true, then there exists an increasing sequence {b_1 < b_2 < b_3 < \dots} of primes such that {b_i + b_j + 1} is prime for all {i < j}.
  • (ii) Unconditionally, there exist increasing sequences {a_1 < a_2 < \dots} and {b_1 < b_2 < \dots} of natural numbers such that {a_i + b_j} is prime for all {i<j}.
  • (iii) These conclusions fail if “prime” is replaced by “positive (relative) density subset of the primes” (even if the density is equal to 1).

We remark that it was shown by Balog that there (unconditionally) exist arbitrarily long but finite sequences {b_1 < \dots < b_k} of primes such that {b_i + b_j + 1} is prime for all {i < j \leq k}. (This result can also be recovered from the later results of Ben Green, myself, and Tamar Ziegler.) Also, it had previously been shown by Granville that on the Hardy-Littlewood prime tuples conjecture, there existed increasing sequences {a_1 < a_2 < \dots} and {b_1 < b_2 < \dots} of natural numbers such that {a_i+b_j} is prime for all {i,j}.
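
Just to get a feel for the objects involved, here is a small brute-force sketch (my own illustration, not anything from the paper): it greedily collects primes {b_1 < b_2 < \dots}, keeping every pairwise sum {b_i + b_j + 1} prime, and quickly finds short examples of the kind guaranteed by Balog's theorem, such as 5, 7, 11, 101 (the pairwise sums 13, 17, 19, 107, 109, 113 are all prime). Of course a naive greedy search comes with no guarantee of continuing forever; the infinite statements above are what require the Maynard sieve or the prime tuples conjecture.

    def is_prime(n):
        # trial division is plenty for a toy search
        if n < 2:
            return False
        if n % 2 == 0:
            return n == 2
        d = 3
        while d * d <= n:
            if n % d == 0:
                return False
            d += 2
        return True

    def greedy_sequence(start=5, limit=10**5):
        """Greedily collect primes b with b + b' + 1 prime for every b' already chosen."""
        seq = [start]
        for p in range(start + 2, limit, 2):
            if is_prime(p) and all(is_prime(p + b + 1) for b in seq):
                seq.append(p)
        return seq

    print(greedy_sequence())   # starts 5, 7, 11, 101, ...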

The conclusion of (i) is stronger than that of (ii) (which is of course consistent with the former being conditional and the latter unconditional). The conclusion (ii) also implies the well-known theorem of Maynard that for any given {k}, there exist infinitely many {k}-tuples of primes of bounded diameter, and indeed our proof of (ii) uses the same “Maynard sieve” that powers the proof of that theorem (though we use a formulation of that sieve closer to that in this blog post of mine). Indeed, the failure of (iii) basically arises from the failure of Maynard’s theorem for dense subsets of primes, simply by removing those clusters of primes that are unusually closely spaced.

Our proof of (i) was initially inspired by the topological dynamics methods used by Kra, Moreira, Richter, and Robertson, but we managed to condense it to a purely elementary argument (taking up only half a page) that makes no reference to topological dynamics and builds up the sequence {b_1 < b_2 < \dots} recursively by repeated application of the prime tuples conjecture.

The proof of (ii) takes up the majority of the paper. It is easiest to phrase the argument in terms of “prime-producing tuples” – tuples {(h_1,\dots,h_k)} for which there are infinitely many {n} with {n+h_1,\dots,n+h_k} all prime. Maynard’s theorem is equivalent to the existence of arbitrarily long prime-producing tuples; our theorem is equivalent to the stronger assertion that there exists an infinite sequence {h_1 < h_2 < \dots} such that every initial segment {(h_1,\dots,h_k)} is prime-producing. The main new tool for achieving this is the following cute measure-theoretic lemma of Bergelson:

Lemma 2 (Bergelson intersectivity lemma) Let {E_1,E_2,\dots} be subsets of a probability space {(X,\mu)} of measure uniformly bounded away from zero, thus {\inf_i \mu(E_i) > 0}. Then there exists a subsequence {E_{i_1}, E_{i_2}, \dots} such that

\displaystyle  \mu(E_{i_1} \cap \dots \cap E_{i_k} ) > 0

for all {k}.

This lemma has a short proof, though not an entirely obvious one. Firstly, by deleting a null set from {X}, one can assume that all finite intersections {E_{i_1} \cap \dots \cap E_{i_k}} are either positive measure or empty. Secondly, a routine application of Fatou’s lemma shows that the maximal function {\limsup_N \frac{1}{N} \sum_{i=1}^N 1_{E_i}} has a positive integral, hence must be positive at some point {x_0}. Thus there is a subsequence {E_{i_1}, E_{i_2}, \dots} whose finite intersections all contain {x_0}, thus have positive measure as desired by the previous reduction.

It turns out that one cannot quite combine the standard Maynard sieve with the intersectivity lemma because the events {E_i} that show up (which roughly correspond to the event that {n + h_i} is prime for some random number {n} (with a well-chosen probability distribution) and some shift {h_i}) have their probability going to zero, rather than being uniformly bounded from below. To get around this, we borrow an idea from a paper of Banks, Freiberg, and Maynard, and group the shifts {h_i} into various clusters {h_{i,1},\dots,h_{i,J_i}}, chosen in such a way that the probability that at least one of {n+h_{i,1},\dots,n+h_{i,J_i}} is prime is bounded uniformly from below. One then applies the Bergelson intersectivity lemma to those events and uses many applications of the pigeonhole principle to conclude.

January 31, 2023

n-Category Café Talk on the Tenfold Way

There are ten ways that a substance can have symmetry under time reversal, switching particles and holes, both or neither. But this fact turns out to extend far beyond condensed matter physics! It’s really built into the fabric of mathematics in a deep way.

On Monday February 6, 2023 I’m giving a talk about this. It’s at 10 am Pacific Time, or 18:00 UTC. To attend, you need to register here. You can see my slides already here.

The tenfold way

Abstract. The importance of the tenfold way in physics was only recognized in this century. Simply put, it implies that there are ten fundamentally different kinds of matter. But it goes back to 1964, when the topologist C. T. C. Wall classified the associative real super division algebras and found ten of them. The three ‘purely even’ examples were already familiar: the real numbers, complex numbers and quaternions. The rest become important when we classify representations of groups or supergroups on \mathbb{Z}/2-graded Hilbert spaces. We explain this classification, its connection to Clifford algebras, and some of its implications.

John Baez: Talk on the Tenfold Way

There are ten ways that a substance can have symmetry under time reversal, switching particles and holes, both or neither. But this fact turns out to extend far beyond condensed matter physics! It’s really built into the fabric of mathematics in a deep way.

Next Monday I’m giving a talk about this. It’s at 10 am Pacific Time, or 18:00 UTC. To attend, you need to register here. You can see my slides already here.

January 30, 2023

David Hogg: citing things

I spent a big part of today working on finishing up a paper with Megan Bedell (Flatiron). My job was to fill in missing references. I'm still not efficient at this, more than 30 years into my astronomy career.

David Hogg: Gothamfest

Once a year (and differently every year), we get together as much of the astronomical community in New York City as we can and have them give fast talks. Today was great! I learned a huge amount, and no highlight reel would do. But here are some examples: Amanda Quirk (Columbia) has great data on M33 stars that maybe we could use to build images of the orbital toruses using technology that Price-Whelan and I developed over the last few years? Marc Huertas-Company (Paris) said (confidently?) that many of the star-forming galaxies found by JWST at very high redshift are likely prolate. Michael Higgins (CUNY) and Keaton Bell (CUNY) have a beautiful system to separate sources of variability out in NASA TESS data using structure in frequency space. Kate Storey-Fisher (NYU) showed results from Giulio Fabbian cross-correlating her ESA Gaia quasar sample with the ESA Planck lensing map, with better error bars than any previous survey! Ben Cassese (Columbia) showed a moving-object pipeline with NASA TESS imaging that detects outer Solar System objects, much like old work by Dustin Lang and myself.

John Preskill: A (quantum) complex legacy

Early in the fourth year of my PhD, I received a most John-ish email from John Preskill, my PhD advisor. The title read, “thermodynamics of complexity,” and the message was concise the way that the Amazon River is damp: “Might be an interesting subject for you.” 

Below the signature, I found a paper draft by Stanford physicists Adam Brown and Lenny Susskind. Adam is a Brit with an accent and a wit to match his Oxford degree. Lenny, known to the public for his books and lectures, is a New Yorker with an accent that reminds me of my grandfather. Before the physicists posted their paper online, Lenny sought feedback from John, who forwarded me the email.

The paper concerned a confluence of ideas that you’ve probably encountered in the media: string theory, black holes, and quantum information. String theory offers hope for unifying two physical theories: relativity, which describes large systems such as our universe, and quantum theory, which describes small systems such as atoms. A certain type of gravitational system and a certain type of quantum system participate in a duality, or equivalence, known since the 1990s. Our universe isn’t such a gravitational system, but never mind; the duality may still offer a toehold on a theory of quantum gravity. Properties of the gravitational system parallel properties of the quantum system and vice versa. Or so it seemed.

The gravitational system can have two black holes linked by a wormhole. The wormhole’s volume can grow linearly in time for a time exponentially long in the black holes’ entropy. Afterward, the volume hits a ceiling and approximately ceases changing. Which property of the quantum system does the wormhole’s volume parallel?

Envision the quantum system as many particles wedged close together, so that they interact with each other strongly. Initially uncorrelated particles will entangle with each other quickly. A quantum system has properties, such as average particle density, that experimentalists can measure relatively easily. Does such a measurable property—an observable of a small patch of the system—parallel the wormhole volume? No; such observables cease changing much sooner than the wormhole volume does. The same conclusion applies to the entanglement amongst the particles.

What about a more sophisticated property of the particles’ quantum state? Researchers proposed that the state’s complexity parallels the wormhole’s volume. To grasp complexity, imagine a quantum computer performing a computation. When performing computations in math class, you needed blank scratch paper on which to write your calculations. A quantum computer needs the quantum equivalent of blank scratch paper: qubits (basic units of quantum information, realized, for example, as atoms) in a simple, unentangled, “clean” state. The computer performs a sequence of basic operations—quantum logic gates—on the qubits. These operations resemble addition and subtraction but can entangle the qubits. What’s the minimal number of basic operations needed to prepare a desired quantum state (or to “uncompute” a given state to the blank state)? The state’s quantum complexity.1 
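
To make "minimal number of basic operations" concrete, here is a toy sketch (my own, not from the original post): preparing the three-qubit GHZ state from the blank state takes one Hadamard and two CNOT gates, so its exact circuit complexity in that gate set is at most three.

    import numpy as np

    I2 = np.eye(2)
    H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
    CNOT = np.array([[1, 0, 0, 0],
                     [0, 1, 0, 0],
                     [0, 0, 0, 1],
                     [0, 0, 1, 0]])

    # Three gates acting on three qubits (leftmost tensor factor = qubit 0):
    gates = [np.kron(H, np.kron(I2, I2)),   # Hadamard on qubit 0
             np.kron(CNOT, I2),             # CNOT from qubit 0 to qubit 1
             np.kron(I2, CNOT)]             # CNOT from qubit 1 to qubit 2

    state = np.zeros(8)
    state[0] = 1.0                          # the "blank scratch paper" state |000>
    for g in gates:
        state = g @ state

    print(np.round(state, 3))               # weight 1/sqrt(2) on |000> and |111>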

Quantum complexity has loomed large over multiple fields of physics recently: quantum computing, condensed matter, and quantum gravity. The latter, we established, entails a duality between a gravitational system and a quantum system. The quantum system begins in a simple quantum state that grows complicated as the particles interact. The state’s complexity parallels the volume of a wormhole in the gravitational system, according to a conjecture.2 

The conjecture would hold more water if the quantum state’s complexity grew similarly to the wormhole’s volume: linearly in time, for a time exponentially large in the quantum system’s size. Does the complexity grow so? The expectation that it does became the linear-growth conjecture.

Evidence supported the conjecture. For instance, quantum information theorists modeled the quantum particles as interacting randomly, as though undergoing a quantum circuit filled with random quantum gates. Leveraging probability theory,3 the researchers proved that the state’s complexity grows linearly at short times. Also, the complexity grows linearly for long times if each particle can store a great deal of quantum information. But what if the particles are qubits, the smallest and most ubiquitous unit of quantum information? The question lingered for years.

Jonas Haferkamp, a PhD student in Berlin, dreamed up an answer to an important version of the question.4 I had the good fortune to help formalize that answer with him and members of his research group: master’s student Teja Kothakonda, postdoc Philippe Faist, and supervisor Jens Eisert. Our paper, published in Nature Physics last year, marked step one in a research adventure catalyzed by John Preskill’s email 4.5 years earlier.

Imagine, again, qubits undergoing a circuit filled with random quantum gates. That circuit has some architecture, or arrangement of gates. Slotting different gates into the architecture effects different transformations5 on the qubits. Consider the set of all transformations implementable with one architecture. This set has some size, which we defined and analyzed.

What happens to the set’s size if you add more gates to the circuit—let the particles interact for longer? We can bound the size’s growth using the mathematical toolkits of algebraic geometry and differential topology. Upon bounding the size’s growth, we can bound the state’s complexity. The complexity, we concluded, grows linearly in time for a time exponentially long in the number of qubits.

Our result lends weight to the complexity-equals-volume hypothesis. The result also introduces algebraic geometry and differential topology into complexity as helpful mathematical toolkits. Finally, the set size that we bounded emerged as a useful concept that may elucidate circuit analyses and machine learning.

John didn’t have machine learning in mind when forwarding me an email in 2017. He didn’t even have in mind proving the linear-growth conjecture. The proof enables step two of the research adventure catalyzed by that email: thermodynamics of quantum complexity, as the email’s title stated. I’ll cover that thermodynamics in its own blog post. The simplest of messages can spin a complex legacy.

The links provided above scarcely scratch the surface of the quantum-complexity literature; for a more complete list, see our paper’s bibliography. For a seminar about the linear-growth paper, see this video hosted by Nima Lashkari’s research group.

1The term complexity has multiple meanings; forget the rest for the purposes of this article.

2According to another conjecture, the quantum state’s complexity parallels a certain space-time region’s action. (An action, in physics, isn’t a motion or a deed or something that Hamlet keeps avoiding. An action is a mathematical object that determines how a system can and can’t change in time.) The first two conjectures snowballed into a paper entitled “Does complexity equal anything?” Whatever it parallels, complexity plays an important role in the gravitational–quantum duality. 

3Experts: Such as unitary t-designs.

4Experts: Our work concerns quantum circuits, rather than evolutions under fixed Hamiltonians. Also, our work concerns exact circuit complexity, the minimal number of gates needed to prepare a state exactly. A natural but tricky extension eluded us: approximate circuit complexity, the minimal number of gates needed to approximate the state.

5Experts: Unitary operators.

January 28, 2023

John Baez: Mathematics for Humanity

We discussed this here earlier, but now it’s actually happening!

The International Centre for Mathematical Sciences, or ICMS, in Edinburgh, will host a new project entitled ‘Mathematics for Humanity’. This will be devoted to education, research, and scholarly exchange having direct relevance to the ways in which mathematics can contribute to the betterment of humanity. Submitted proposals will be reviewed on April 15, 2023.

The activities of the program will revolve around three interrelated themes:

A. Integrating the global research community (GRC)

B. Mathematical challenges for humanity (MCH)

C. Global history of mathematics (GHM)

Development of the three themes will facilitate the engagement of the international mathematical community with the challenges of accessible education, knowledge-driven activism, and transformative scholarship.

For theme A, a coherent plan of activities for an extended period can be presented (at least 2 weeks, and up to 3 months), comprising courses and seminars bringing together researchers from at least two different regions, which should be combined with networking activities and hybrid dissemination. Themes B and C would also comprise individual collaborative events.

Within each of the three themes, researchers can apply for one of the following activities:

  1. Research-in-groups. This is a proposal for a small group of 3 to 6 researchers to spend from 2 weeks to 3 months in Edinburgh on a reasonably well-defined research project. The researchers will be provided working space and funds for accommodation and subsistence.
  2. Research course or seminar. A group of researchers can propose a course or a seminar on topics relevant to one of the three themes. These should be planned as hybrid events with regular meetings in Edinburgh that can also be accessed online. Proposals should come with a detailed plan for attracting interest and for the dissemination of ideas.

  3. Research workshops. These are 5-day workshops in the standard ICMS format, of course with a focus on one of the three themes.

  4. Research school. These are hybrid schools of two-weeks length on one of the themes. These should come with substantial planning, a coherent structure, and be aimed towards post-graduate students and early career researchers.

The ICMS expects that up to 30 researchers will be in residence in Edinburgh at any given time over a 9-month period, which might be divided into three terms, mid-September to mid-December, mid-January to mid-April, and mid-April to mid-July. Every effort will be made to provide a unified facility for the activities of all groups working on all three themes, thereby encouraging a synergistic exchange of ideas and vision. The proposals will be reviewed twice a year soon after the spring deadline of 15 April and the autumn deadline of 15 November.

Project Summary

Submission Guidelines

Scientific Committee

Queries about the project should be sent to ICMS director Minhyong Kim or deputy director Beatrice Pelloni, who will be aided by the Scientific Committee in the selection of proposals:

• John Baez (UC Riverside)
• Karine Chemla (Paris)
• Sophie Dabo-Niang (Lille)
• Reviel Netz (Stanford)
• Bao Chau Ngo (Chicago and VIASM)
• Raman Parimala (Emory)
• Fernando Rodriguez Villegas (ICTP, Trieste)
• Terence Tao (UCLA)

January 27, 2023

Doug Natelson: Cavities and tuning physics

I've written before about cavity quantum electrodynamics.  An electromagnetic cavity - a resonator of some kind, like your microwave oven chamber is for microwaves, or like an optical cavity made using nearly perfect mirrors - picks out what electromagnetic modes are allowed inside it.  In the language of photons, the "density of states" for photons in the cavity is modified from what it would be in free space.  Matter placed in the cavity, e.g. an atom, then interacts with that modified environment, even if the cavity is not being excited.  Instead of thinking about just the matter, or just the radiation by itself, in the cavity you need to include the light-matter interaction, and you can end up with states called polaritons that are combinations of matter + radiation excitations.  There are various flavors of polaritons, as there are different kinds of cavities as well as different kinds of matter (atoms vs. excitons, for example).

I just heard a nice talk by Angel Rubio about recent advances in applying cavity effects to both chemistry and materials properties.  For a recent discussion of the former, you can try here (pdf file).  Similar in spirit, there is a great deal of interest in using cavity interactions to modify the ground states (or excited states) of solid materials.  Resonantly altering phonons might allow tuning of superconductivity, for example.  Or, you could take a material like SrTiO3, which is almost a ferroelectric, and try to stabilize ferroelectricity.  Or, you could take something that is almost a spin liquid and try to get it there by putting it in a cavity and pumping a little.

It's certainly interesting to ponder.  Achieving this in practice is very challenging, because getting matter-cavity couplings to be sufficiently large is not easy.  Nevertheless, the idea that you can take a material and potentially change something fundamental about its properties just by placing it in the right surroundings sounds almost magical.  Very cool to consider.

Matt von Hippel: LHC Black Holes for the Terminally Un-Reassured

Could the LHC have killed us all?

No, no it could not.

But…

I’ve had this conversation a few times over the years. Usually, the people I’m talking to are worried about black holes. They’ve heard that the Large Hadron Collider speeds up particles to amazingly high energies before colliding them together. They worry that these colliding particles could form a black hole, which would fall into the center of the Earth and busily gobble up the whole planet.

This pretty clearly hasn’t happened. But also, physicists were pretty confident that it couldn’t happen. That isn’t to say they thought it was impossible to make a black hole with the LHC. Some physicists actually hoped to make a black hole: it would have been evidence for extra dimensions, curled-up dimensions much larger than the tiny ones required by string theory. They figured out the kind of evidence they’d see if the LHC did indeed create a black hole, and we haven’t seen that evidence. But even before running the machine, they were confident that such a black hole wouldn’t gobble up the planet. Why?

The best argument is also the most unsatisfying. The LHC speeds up particles to high energies, but not unprecedentedly high energies. High-energy particles called cosmic rays enter the atmosphere every day, some of which are at energies comparable to the LHC. The LHC just puts the high-energy particles in front of a bunch of sophisticated equipment so we can measure everything about them. If the LHC could destroy the world, cosmic rays would have already done so.

That’s a very solid argument, but it doesn’t really explain why. Also, it may not be true for future colliders: we could build a collider with enough energy that cosmic rays don’t commonly meet it. So I should give another argument.

The next argument is Hawking radiation. In Stephen Hawking’s most famous accomplishment, he argued that because of quantum mechanics black holes are not truly black. Instead, they give off a constant radiation of every type of particle mixed together, shrinking as they do so. The radiation is faintest for large black holes, but gets more and more intense the smaller the black hole is, until the smallest black holes explode into a shower of particles and disappear. This argument means that a black hole small enough that the LHC could produce it would radiate away to nothing in almost an instant: not long enough to leave the machine, let alone fall to the center of the Earth.

This is a good argument, but maybe you aren’t as sure as I am about Hawking radiation. As it turns out, we’ve never measured Hawking radiation, it’s just a theoretical expectation. Remember that the radiation gets fainter the larger the black hole is: for a black hole in space with the mass of a star, the radiation is so tiny it would be almost impossible to detect even right next to the black hole. From here, in our telescopes, we have no chance of seeing it.

So suppose tiny black holes didn’t radiate, and suppose the LHC could indeed produce them. Wouldn’t that have been dangerous?

Here, we can do a calculation. I want you to appreciate how tiny these black holes would be.

From science fiction and cartoons, you might think of a black hole as a kind of vacuum cleaner, sucking up everything nearby. That’s not how black holes work, though. The “sucking” black holes do is due to gravity, no stronger than the gravity of any other object with the same mass at the same distance. The only difference comes when you get close to the event horizon, an invisible sphere close-in around the black hole. Pass that line, and the gravity is strong enough that you will never escape.

We know how to calculate the position of the event horizon of a black hole. It’s the Schwarzschild radius, and we can write it in terms of Newton’s constant G, the mass of the black hole M, and the speed of light c, as follows:

\frac{2GM}{c^2}

The Large Hadron Collider’s two beams each have an energy around seven tera-electron-volts, or TeV, so there are 14 TeV of energy in total in each collision. Imagine all of that energy being converted into mass, and that mass forming a black hole. That isn’t how it would actually happen: some of the energy would create other particles, and some would give the black hole a “kick”, some momentum in one direction or another. But we’re going to imagine a “worst-case” scenario, so let’s assume all the energy goes to form the black hole. Electron-volts are a weird physicist unit, but if we divide them by the speed of light squared (as we should if we’re using E=mc^2 to create a mass), then Wikipedia tells us that each electron-volt will give us 1.78\times 10^{-36} kilograms. “Tera” is the SI prefix for 10^{12}. Thus our tiny black hole starts with a mass of

14\times 10^{12}\times 1.78\times 10^{-36} = 2.49\times 10^{-23} \textrm{kg}

Plugging in Newton’s constant (6.67\times 10^{-11} meters cubed per kilogram per second squared) and the speed of light (3\times 10^8 meters per second), we get a radius of,

\frac{2\times 6.67\times 10^{-11}\times 14\times 10^{12}\times 1.78\times 10^{-36}}{\left(3\times 10^8\right)^2} = 3.7\times 10^{-50} \textrm{m}

That, by the way, is amazingly tiny. The size of an atom is about 10^{-10} meters. If every atom was a tiny person, and each of that person’s atoms was itself a person, and so on for five levels down, then the atoms of the smallest person would be the same size as this event horizon.

Now, we let this little tiny black hole fall. Let’s imagine it falls directly towards the center of the Earth. The only force affecting it would be gravity (if it had an electrical charge, it would quickly attract a few electrons and become neutral). That means you can think of it as if it were falling through a tiny hole, with no friction, gobbling up anything unfortunate enough to fall within its event horizon.

For our first estimate, we’ll treat the black hole as if it stays the same size through its journey. Imagine the black hole travels through the entire earth, absorbing a cylinder of matter. Using the Earth’s average density of 5515 kilograms per cubic meter, and the Earth’s maximum radius of 6378 kilometers, our cylinder adds a mass of,

\pi \times \left(3.7\times 10^{-50}\right)^2 \times 2 \times 6378\times 10^3\times 5515 = 3\times 10^{-88} \textrm{kg}

That’s absurdly tiny. That’s much, much, much tinier than the mass we started out with. Absorbing an entire cylinder through the Earth makes barely any difference.

You might object, though, that the black hole is gaining mass as it goes. So really we ought to use a differential equation. If the black hole travels a distance r, absorbing mass as it goes at average Earth density \rho, then we find,

\frac{dM}{dr}=\pi\rho\left(\frac{2GM(r)}{c^2}\right)^2

Solving this, we get

M(r)=\frac{M_0}{1- M_0 \pi\rho\left(\frac{2G}{c^2}\right)^2 r }

Where M_0 is the mass we start out with.
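
If you would rather not take the integration step on faith, here is a quick symbolic check (a sketch of mine, using sympy) that this closed form really does satisfy the differential equation above:

    import sympy as sp

    r, G, c, rho, M0 = sp.symbols('r G c rho M0', positive=True)
    M = M0 / (1 - M0 * sp.pi * rho * (2 * G / c**2)**2 * r)

    lhs = sp.diff(M, r)                          # dM/dr
    rhs = sp.pi * rho * (2 * G * M / c**2)**2    # right-hand side of the ODE
    print(sp.simplify(lhs - rhs))                # 0, so the solution checks out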

Plug in the distance through the Earth for r, and we find…still about 3\times 10^{-88} \textrm{kg}! It didn’t change very much, which makes sense, it’s a very very small difference!

But you might still object. A black hole falling through the Earth wouldn’t just go straight through. It would pass through, then fall back in. In fact, it would oscillate, from one side to the other, like a pendulum. This is actually a common problem to give physics students: drop an object through a hole in the Earth, neglect air resistance, and what does it do? It turns out that the time the object takes to fall through the Earth and come back (one full oscillation) is independent of its mass, and equal to roughly 84.5 minutes.

So let’s ask a question: how long would it take for a black hole, oscillating like this, to double its mass?

We want to solve,

2=\frac{1}{1- M_0 \pi\rho\left(\frac{2G}{c^2}\right)^2 r }

so we need the black hole to travel a total distance of

r=\frac{1}{2M_0 \pi\rho\left(\frac{2G}{c^2}\right)^2} = 5.3\times 10^{71} \textrm{m}

That’s a huge distance! The Earth’s radius, remember, is 6378 kilometers, and each 84.5-minute oscillation only covers about four Earth radii, or roughly 2.6\times 10^{7} meters. So traveling that far would take

\frac{5.3\times 10^{71}}{2.6\times 10^{7}} \times 84.5/60/24/365 \approx 3\times 10^{60} \textrm{y}

Around three times ten to the sixty years. Our universe is only about ten to the ten years old. In another five times ten to the nine years, the Sun will enter its red giant phase, and swallow the Earth. There simply isn’t enough time for this tiny tiny black hole to gobble up the world, before everything is already gobbled up by something else. Even in the most pessimistic way to walk through the calculation, it’s just not dangerous.

I hope that, if you were worried about black holes at the LHC, you’re not worried any more. But more than that, I hope you’ve learned three lessons. First, that even the highest-energy particle physics involves tiny energies compared to day-to-day experience. Second, that gravitational effects are tiny in the context of particle physics. And third, that with Wikipedia access, you too can answer questions like this. If you’re worried, you can make an estimate, and check!
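
In that spirit, here is the whole estimate as a short script (my own numbers, following the same worst-case assumptions as above: all 14 TeV goes into the black hole, no Hawking radiation, and the hole swallows a full cylinder of rock as it oscillates):

    import math

    G = 6.67e-11          # Newton's constant, m^3 / (kg s^2)
    c = 3e8               # speed of light, m/s
    eV_to_kg = 1.78e-36   # one electron-volt divided by c^2, in kilograms
    rho = 5515            # Earth's average density, kg/m^3
    R_earth = 6378e3      # Earth's radius, m

    M0 = 14e12 * eV_to_kg              # 14 TeV converted to mass
    r_s = 2 * G * M0 / c**2            # Schwarzschild radius

    # Mass swallowed in one straight pass through the Earth (a thin cylinder):
    m_pass = math.pi * r_s**2 * 2 * R_earth * rho

    # Distance needed for the mass to double, from M(r) above:
    r_double = 1 / (2 * M0 * math.pi * rho * (2 * G / c**2)**2)

    # Each 84.5-minute oscillation covers about four Earth radii:
    minutes = r_double / (4 * R_earth) * 84.5
    years = minutes / 60 / 24 / 365

    print(f"initial mass       {M0:.2e} kg")       # ~2.5e-23 kg
    print(f"event horizon      {r_s:.2e} m")       # ~3.7e-50 m
    print(f"one-pass mass gain {m_pass:.2e} kg")   # ~3e-88 kg
    print(f"doubling distance  {r_double:.2e} m")  # ~5e71 m
    print(f"doubling time      {years:.2e} years") # vastly longer than the age of the universe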

January 26, 2023

n-Category Café Mathematics for Humanity

I mentioned this earlier, but now it’s actually happening! I hope you can think of good workshops and apply to run them in Edinburgh.

The International Centre for Mathematical Sciences, or ICMS, in Edinburgh, will host a new project entitled Mathematics for Humanity. This will be devoted to education, research, and scholarly exchange having direct relevance to the ways in which mathematics can contribute to the betterment of humanity. Submitted proposals will be reviewed on April 15, 2023.

The activities of the program will revolve around three interrelated themes:

A. Integrating the global research community (GRC)

B. Mathematical challenges for humanity (MCH)

C. Global history of mathematics (GHM)

Development of the three themes will facilitate the engagement of the international mathematical community with the challenges of accessible education, knowledge-driven activism, and transformative scholarship.

For theme A, a coherent plan of activities for an extended period can be presented (at least 2 weeks, and up to 3 months), comprising courses and seminars bringing together researchers from at least two different regions, which should be combined with networking activities and hybrid dissemination. Themes B and C would also comprise individual collaborative events.

Within each of the three themes, researchers can apply for one of the following activities:

  1. Research-in-groups. This is a proposal for a small group of 3 to 6 researchers to spend from 2 weeks to 3 months in Edinburgh on a reasonably well-defined research project. The researchers will be provided working space and funds for accommodation and subsistence.

  2. Research course or seminar. A group of researchers can propose a course or a seminar on topics relevant to one of the three themes. These should be planned as hybrid events with regular meetings in Edinburgh that can also be accessed online. Proposals should come with a detailed plan for attracting interest and for the dissemination of ideas.

  3. Research workshops. These are 5-day workshops in the standard ICMS format, of course with a focus on one of the three themes.

  4. Research school. These are hybrid schools of two-weeks length on one of the themes. These should come with substantial planning, a coherent structure, and be aimed towards post-graduate students and early career researchers.

The ICMS expects that up to 30 researchers will be in residence in Edinburgh at any given time over a 9-month period, which might be divided into three terms, mid-September to mid-December, mid-January to mid-April, and mid-April to mid-July. Every effort will be made to provide a unified facility for the activities of all groups working on all three themes, thereby encouraging a synergistic exchange of ideas and vision. The proposals will be reviewed twice a year soon after the spring deadline of 15 April and the autumn deadline of 15 November.

Project Summary

Submission Guidelines

Scientific Committee

Queries about the project should be sent to ICMS director Minhyong Kim or deputy director Beatrice Pelloni, who will be aided by the Scientific Committee in the selection of proposals:

• John Baez (UC Riverside)
• Karine Chemla (Paris)
• Sophie Dabo (Lille)
• Reviel Netz (Stanford)
• Bao Chau Ngo (Chicago and VIASM)
• Raman Parimala (Emory)
• Fernando Rodriguez Villegas (ICTP, Trieste)
• Terence Tao (UCLA)

n-Category Café Mathematics for Humanity: a Plan

I’m working with an organization that may eventually fund proposals to fund workshops for research groups working on “mathematics for humanity”. This would include math related to climate change, health, democracy, economics, etc.

I can’t give details unless and until it solidifies.

However, it would help me to know a bunch of possible good proposals. Can you help me imagine some?

A good proposal needs:

  • a clearly well-defined subject where mathematics is already helping humanity but could help more, together with

  • a specific group of people who already have a track record of doing good work on this subject, and

  • some evidence that having a workshop, maybe as long as 3 months, bringing together this group and other people, would help them do good things.

I’m saying this because I don’t want vague ideas like “oh it would be cool if a bunch of category theorists could figure out how to make social media better”.

I asked for suggestions on Mathstodon and got these so far:

Each topic already has people working on it, so these are good examples. Can you think of more, and point me to groups of people working on these things?

January 24, 2023

Tommaso Dorigo: The Interest Of High-School Students For Hard Sciences

Yesterday I visited a high school in Venice to deliver a lecture on particle physics, and to invite the participating students to take part in an art and science contest. This is part of the INFN "Art and Science across Italy" project, which has reached its fourth edition and organizes art exhibits with the students' creations in several cities across Italy. The best works are then selected for a final exhibit in Naples, and the 24 winners are offered a week-long visit to the CERN laboratories in Geneva, Switzerland.

January 23, 2023

Scott Aaronson Movie Review: M3GAN

[WARNING: SPOILERS FOLLOW]


Update (Jan. 23): Rationalist blogger, Magic: The Gathering champion, and COVID analyst Zvi Mowshowitz was nerd-sniped by this review into writing his own much longer review of M3GAN, from a more Orthodox AI-alignment perspective. Zvi applies much of his considerable ingenuity to figuring out how even aspects of M3GAN that don’t seem to make sense in terms of M3GAN’s objective function—e.g., the robot offering up wisecracks as she kills people, attracting the attention of the police, or ultimately turning on her primary user Cady—could make sense after all, if you model M3GAN as playing the long, long game. (E.g., what if M3GAN planned even her own destruction, in order to bring Cady and her aunt closer to each other?) My main worry is that, much like Talmudic exegesis, this sort of thing could be done no matter what was shown in the movie: it’s just a question of effort and cleverness!


Tonight, on a rare date without the kids, Dana and I saw M3GAN, the new black-comedy horror movie about an orphaned 9-year-old girl named Cady who, under the care of her roboticist aunt, gets an extremely intelligent and lifelike AI doll as a companion. The robot doll, M3GAN, is given a mission to bond with Cady and protect her physical and emotional well-being at all times. M3GAN proceeds to take that directive more literally than intended, with predictably grisly results given the genre.

I chose this movie for, you know, work purposes. Research for my safety job at OpenAI.

So, here’s my review: the first 80% or so of M3GAN constitutes one of the finest movies about AI that I’ve seen. Judged purely as an “AI-safety cautionary fable” and not on any other merits, it takes its place alongside or even surpasses the old standbys like 2001, Terminator, and The Matrix. There are two reasons.

First, M3GAN tries hard to dispense with the dumb tropes that an AI differs from a standard-issue human mostly in its thirst for power, its inability to understand true emotions, and its lack of voice inflection. M3GAN is explicitly a “generative learning model”—and she’s shown becoming increasingly brilliant at empathy, caretaking, and even emotional manipulation. It’s also shown, 100% plausibly, how Cady grows to love her robo-companion more than any human, even as the robot’s behavior turns more and more disturbing. I’m extremely curious to what extent the script was influenced by the recent explosion of large language models—but in any case, it occurred to me that this is what you might get if you tried to make a genuinely 2020s AI movie, rather than a 60s AI movie with updated visuals.

Secondly, until near the end, the movie actually takes seriously that M3GAN, for all her intelligence and flexibility, is a machine trying to optimize an objective function, and that objective function can’t be ignored for narrative convenience. Meaning: sure, the robot might murder, but not to “rebel against its creators and gain power” (as in most AI flicks), much less because “chaos theory demands it” (Jurassic Park), but only to further its mission of protecting Cady. I liked that M3GAN’s first victims—a vicious attack dog, the dog’s even more vicious owner, and a sadistic schoolyard bully—are so unsympathetic that some part of the audience will, with guilty conscience, be rooting for the murderbot.

But then there’s the last 20% of the movie, where it abandons its own logic, as the robot goes berserk and resists her own shutdown by trying to kill basically everyone in sight—including, at the very end, Cady herself. The best I can say about the ending is that it’s knowing and campy. You can imagine the scriptwriters sighing to themselves, like, “OK, the focus groups demanded to see the robot go on a senseless killing spree … so I guess a senseless killing spree is exactly what we give them.”

But probably film criticism isn’t what most of you are here for. Clearly the real question is: what insights, if any, can we take from this movie about AI safety?

I found the first 80% of the film to be thought-provoking about at least one AI safety question, and a mind-bogglingly near-term one: namely, what will happen to children as they increasingly grow up with powerful AIs as companions?

In their last minutes before dying in a car crash, Cady’s parents, like countless other modern parents, fret that their daughter is too addicted to her iPad. But Cady’s roboticist aunt, Gemma, then lets the girl spend endless hours with M3GAN—both because Gemma is a distracted caregiver who wants to get back to her work, and because Gemma sees that M3GAN is making Cady happier than any human could, with the possible exception of Cady’s dead parents.

I confess: when my kids battle each other, throw monster tantrums, refuse to eat dinner or bathe or go to bed, angrily demand second and third desserts and to be carried rather than walk, run to their rooms and lock the doors … when they do such things almost daily (which they do), I easily have thoughts like, I would totally buy a M3GAN or two for our house … yes, even having seen the movie! I mean, the minute I’m satisfied that they’ve mostly fixed the bug that causes the murder-rampages, I will order that frigging bot on Amazon with next-day delivery. And I’ll still be there for my kids whenever they need me, and I’ll play with them, and teach them things, and watch them grow up, and love them. But the robot can handle the excruciating bits, the bits that require the infinite patience I’ll never have.

OK, but what about the part where M3GAN does start murdering anyone who she sees as interfering with her goals? That struck me, honestly, as a trivially fixable alignment failure. Please don’t misunderstand me here to be minimizing the AI alignment problem, or suggesting it’s easy. I only mean: supposing that an AI were as capable as M3GAN (for much of the movie) at understanding Asimov’s Second Law of Robotics—i.e., supposing it could brilliantly care for its user, follow her wishes, and protect her—such an AI would seem capable as well of understanding the First Law (don’t harm any humans or allow them to come to harm), and the crucial fact that the First Law overrides the Second.

In the movie, the catastrophic alignment failure is explained, somewhat ludicrously, by Gemma not having had time to install the right safety modules before turning M3GAN loose on her niece. While I understand why movies do this sort of thing, I find it often interferes with the lessons those movies are trying to impart. (For example, is the moral of Jurassic Park that, if you’re going to start a live dinosaur theme park, just make sure to have backup power for the electric fences?)

Mostly, though, it was a bizarre experience to watch this movie—one that, whatever its 2020s updates, fits squarely into a literary tradition stretching back to Faust, the Golem of Prague, Frankenstein’s monster, Rossum’s Universal Robots, etc.—and then pinch myself and remember that, here in actual nonfiction reality,

  1. I’m now working at one of the world’s leading AI companies,
  2. that company has already created GPT, an AI with a good fraction of the fantastical verbal abilities shown by M3GAN in the movie,
  3. that AI will gain many of the remaining abilities in years rather than decades, and
  4. my job this year—supposedly!—is to think about how to prevent this sort of AI from wreaking havoc on the world.

Incredibly, unbelievably, here in the real world of 2023, what still seems most science-fictional about M3GAN is neither her language fluency, nor her ability to pursue goals, nor even her emotional insight, but simply her ease with the physical world: the fact that she can walk and dance like a real child, and all-too-brilliantly resist attempts to shut her down, and have all her compute onboard, and not break. And then there’s the question of the power source. The movie was never explicit about that, except for implying that she sits in a charging port every night. The more the movie descends into grotesque horror, though, the harder it becomes to understand why her creators can’t avail themselves of the first and most elemental of all AI safety strategies—like flipping the switch or popping out the battery.

n-Category Café Question on Condensed Matter Physics

The tenfold way is a mathematical classification of Hamiltonians used in condensed matter physics, based on their symmetries. Nine kinds are characterized by choosing one of these 3 options:

  • antiunitary time-reversal symmetry with T^2 = 1, with T^2 = -1, or no such symmetry.

and one of these 3 options:

  • antiunitary charge conjugation symmetry with C^2 = 1, with C^2 = -1, or no such symmetry.

(Charge conjugation symmetry in condensed matter physics is usually a symmetry between particles - e.g. electrons or quasiparticles of some sort - and holes.)

The tenth kind has a unitary “S” symmetry, a symmetry that simultaneously reverses the direction of time and interchanges particles and holes. Since it is unitary and we’re free to multiply it by a phase, we can assume without loss of generality that S^2 = 1.
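For concreteness, here is a minimal enumeration sketch in Python. The Cartan-style class labels are my own addition, following the usual Altland–Zirnbauer table, and aren’t needed for the counting itself:

```python
from itertools import product

# Enumerate the tenfold way: T and C are each +1 or -1 (the square of the
# antiunitary symmetry) or 0 (no such symmetry).  If both T and C are present,
# the combined symmetry S = TC comes for free; if neither is present, S can
# still exist on its own, which is the tenth class.
labels = {  # standard Altland-Zirnbauer names (my addition, for orientation)
    ( 0,  0, 0): "A",    ( 0,  0, 1): "AIII",
    (+1,  0, 0): "AI",   (+1, +1, 1): "BDI",
    ( 0, +1, 0): "D",    (-1, +1, 1): "DIII",
    (-1,  0, 0): "AII",  (-1, -1, 1): "CII",
    ( 0, -1, 0): "C",    (+1, -1, 1): "CI",
}

classes = []
for T, C in product([0, +1, -1], repeat=2):
    S = 1 if (T != 0 and C != 0) else 0   # S is forced whenever both T and C exist
    classes.append((T, C, S))
classes.append((0, 0, 1))                  # the tenth class: S alone

for tcs in classes:
    print(tcs, labels[tcs])
```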

What are examples of real-world condensed matter systems of all ten kinds?

I’ll take what I can get! If you know materials of a few of the ten kinds, that’s a start!

January 22, 2023

Jordan EllenbergBounce

I was vacationing with the kids in San Francisco and it turned out for transit reasons to improve our day a lot to be able to store our suitcases somewhere in the city for the whole day. There is, as they say, an app for that, called Bounce. It’s a pretty clever idea! You pay $7.50 a bag and Bounce connects you with a location that’s willing to store luggage for you — in our case, a hotel (a budget option which is apparently famous for having the toilet just being out there openly in the room, to save space) but they use UPS locations and other stores too. A luggage locker at the train station would be cheaper, but of course that would mean you have to go to the train station, which might be out of your way.

Now the question is this — could I have saved some money and just shown up at a random hotel, handed the bellhop a twenty, and asked him to keep four bags in the back room for the day? Seems kind of reasonable. On the other hand, I can imagine hotels being under insurance instructions not to store bags for unknown non-guests. But why wouldn’t the same insurance caution keep them from signing up with Bounce? Maybe Bounce has taken on the liability somehow.

Anyway, this is not a service I anticipate needing often, but in a moment when it was exactly what I needed, it did exactly what I wanted, so I recommend it.

PS: My kids are now extremely into San Francisco.

January 20, 2023

Matt von HippelCabinet of Curiosities: The Train-Ladder

I’ve got a new paper out this week, with Andrew McLeod, Roger Morales, Matthias Wilhelm, and Chi Zhang. It’s yet another entry in this year’s “cabinet of curiosities”, quirky Feynman diagrams with interesting traits.

A while back, I talked about a set of Feynman diagrams I could compute with any number of “loops”, bypassing the approximations we usually need to use in particle physics. That wasn’t the first time someone did that. Back in the 90’s, some folks figured out how to do this for so-called “ladder” diagrams. These diagrams have two legs on one end for two particles coming in, two legs on the other end for two particles going out, and a ladder in between, like so:

There are infinitely many of these diagrams, but they’re all beautifully simple, variations on a theme that can be written down in a precise mathematical way.

Change things a little bit, though, and the situation gets wildly more intractable. Let the rungs of the ladder peek through the sides, and you get something looking more like the tracks for a train:

These traintrack integrals are much more complicated. Describing them requires the mathematics of Calabi-Yau manifolds, involving higher and higher dimensions as the tracks get longer. I don’t think there’s any hope of understanding these things for all loops, at least not any time soon.

What if we aimed somewhere in between? A ladder that just started to turn traintrack?

Add just a single pair of rungs, and it turns out that things remain relatively simple. We don’t need any complicated Calabi-Yau manifolds, just the simplest one, called an elliptic curve. It’s actually the same curve for every version of the diagram. And the situation is simple enough that, with some extra cleverness, it looks like we’ve found a trick to calculate these diagrams to any number of loops we’d like.

(Another group figured out the curve, but not the calculation trick. They’ve solved different problems, though, studying all sorts of different traintrack diagrams. They sorted out some confusion I used to have about one of those diagrams, showing it actually behaves precisely the way we expected it to. All in all, it’s been a fun example of the way different scientists sometimes home in on the same discovery.)

These developments are exciting, because Feynman diagrams with elliptic curves are still tough to deal with. We still have whole conferences about them. These new elliptic diagrams can be a long list of test cases, things we can experiment with at any number of loops. With time, we might truly understand them as well as the ladder diagrams!

January 19, 2023

David Hoggdoing cosmology differently

Today Chirag Modi (Flatiron) gave a really great lunchtime talk about new technologies in cosmology and inference or measurement of cosmological parameters. He beautifully summarized how cosmology is done now (or traditionally): Make summary statistics of the observables, make a theory of the summary statistics, make up a surrogate likelihood function for use in inference, measure covariance matrices to use in the latter, and go. He's trying to obviate all of these things by using the simulations directly to make the measurements. He has nice results in forward modeling of the galaxy field, and in simulation-based inferences. Many interesting things came up in his talk, including the idea that I have discussed over the years with Kate Storey-Fisher (NYU) of enumerating all possible cosmological statistics! So much interesting stuff in the future of large-scale structure.

David Hoggdefining passive and active symmetries

What is a passive symmetry, and what is an active symmetry? I think I know: A passive symmetry is a symmetry that emerges because there are choices (like coordinate system, units system, gauge choice) in the representation of the data. An active symmetry is a symmetry that is observed to be there (like energy conservation). The passive symmetries are true by definition or by construction. The active symmetries are subject to empirical test. Today Soledad Villar and I spent time talking about a truly formal definition in terms of commutative diagrams.
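A toy illustration of the distinction (my own example, not part of the formal definition): a dimensionless prediction cannot depend on the unit system, by construction, whereas whether some quantity is actually conserved is something the data have to tell us.

```python
# Passive symmetry: re-expressing the same data in different units cannot change
# a dimensionless prediction -- true by construction, whatever the data are.
# (Active symmetries, like energy conservation, would instead be checked against
# observations.)  Toy example using Kepler's third law, T proportional to a**1.5.
def period_ratio(a1, a2):
    return (a1 / a2) ** 1.5

a_earth, a_mars = 1.0, 1.524          # semi-major axes in AU
km_per_au = 1.495978707e8             # the unit choice is the "passive" freedom

print(period_ratio(a_earth, a_mars))                          # AU in, ratio out
print(period_ratio(a_earth * km_per_au, a_mars * km_per_au))  # km in, same ratio
```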

January 17, 2023

Tommaso DorigoCollaborative Science In Times Of War

I just finished reading a very nice piece in the Guardian, written by my friend and ex-colleague Eleni Petrakou, who collaborated with me in the CMS experiment at CERN and is now a science writer. The topic is the disruption that the war in Ukraine has caused to scientific collaboration. I urge you to read it if the matter is of any interest to you.

read more

January 15, 2023

Doug NatelsonCondensed matter’s rough start

 I’m teaching undergrad solid-state for the first time, and it has served as a reminder of how condensed matter physics got off the ground.  I suspect that one reason CM historically had not received a lot of respect in the early years (e.g. Pauli declaring that solid-state physics is the physics of dirt) is that it began very much as a grab bag of empirical observations, with the knowledge that the true underpinnings were well out of reach at the time.  Remember the order of a few key discoveries:

A whole host of materials physics observations predate the discovery of the electron, let alone modern statistical physics and quantum mechanics.  The early days of condensed matter had a lot of handwaving.  The derivation of the Hall effect in the classical Drude picture (modeling electrons in a metal based on the kinetic theory of gases) was viewed as a triumph, even though it clearly was incomplete and got the sign wrong (!) for a bunch of materials.  (Can you imagine trying to publish a result today and saying, ‘sure, it’s the wrong sign half the time, but it has to be sort of correct’?)
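To make the sign problem concrete, here is a tiny sketch (my own rough numbers, using a free-electron density for copper): in the Drude picture the Hall coefficient is R_H = 1/(nq), so its sign is pinned to the sign of the carrier charge, and a free-electron metal can only ever give a negative value; materials with positive measured Hall coefficients simply fall outside the model.

```python
# Drude Hall coefficient R_H = 1/(n*q): negative by construction for electrons,
# so any material with a positive measured R_H falsifies the simple picture.
e = 1.602176634e-19     # elementary charge, C
n = 8.5e28              # rough free-electron density of copper, m^-3
R_H = 1.0 / (n * (-e))
print(R_H)              # about -7.3e-11 m^3/C
```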

That we now actually understand so much about the physics of materials is one of the great intellectual accomplishments of the species, and the fact that so much of the explanation has real elegance is worth appreciating.

January 13, 2023

Clifford JohnsonWhat a Week!

[Image: Some Oxford scenes]

I’m sitting, for the second night in a row, in a rather pleasant restaurant in Oxford, somewhere on the walk between the physics department and my hotel. They pour a pretty good Malbec, and tonight I’ve had the wood-fired Guinea Fowl. I can hear snippets of conversation in the distance, telling me that many people who come here are regulars, and that correlates well with the fact that I liked the place immediately last night and decided I’d come back. The friendly staff remembered me and greeted me like a regular upon my return, which I liked. Gee’s is spacious with a high ceiling, and so I can sit away from everyone at a time when I’d still rather not be too cavalier as regards covid. On another occasion I might have sought out a famous pub with some good pub food and been elbow-to-elbow with students and tourists, but the phrase “too soon” came to mind when I walked by such establishments and glanced into the windows.

However, I am not here to do a restaurant review, although you might have thought that from the previous paragraph (the guinea fowl was excellent though, and the risotto last night was tasty, if a tiny bit over-salted for my tastes). Instead I find myself reflecting on […] Click to continue reading this post

The post What a Week! appeared first on Asymptotia.

Matt von HippelThe Problem of Quantum Gravity Is the Problem of High-Energy (Density) Quantum Gravity

I’ve said something like this before, but here’s another way to say it.

The problem of quantum gravity is one of the most famous problems in physics. You’ve probably heard someone say that quantum mechanics and general relativity are fundamentally incompatible. Most likely, this was narrated over pictures of a foaming, fluctuating grid of space-time. Based on that, you might think that all we have to do to solve this problem is to measure some quantum property of gravity. Maybe we could make a superposition of two different gravitational fields, see what happens, and solve the problem that way.

I mean, we could do that, some people are trying to. But it won’t solve the problem. That’s because the problem of quantum gravity isn’t just the problem of quantum gravity. It’s the problem of high-energy quantum gravity.

Merging quantum mechanics and general relativity is actually pretty easy. General relativity is a big conceptual leap, certainly, a theory in which gravity is really just the shape of space-time. At the same time, though, it’s also a field theory, the same general type of theory as electromagnetism. It’s a weirder field theory than electromagnetism, to be sure, one with deeper implications. But if we want to describe low energies, and weak gravitational fields, then we can treat it just like any other field theory. We know how to write down some pretty reasonable-looking equations, we know how to do some basic calculations with them. This part is just not that scary.

The scary part happens later. The theory we get from these reasonable-looking equations continues to look reasonable for a while. It gives formulas for the probability of things happening: things like gravitational waves bouncing off each other, as they travel through space. The problem comes when those waves have very high energy, and the nice reasonable probability formula now says that the probability is greater than one.

For those of you who haven’t taken a math class in a while, probabilities greater than one don’t make sense. A probability of one is a certainty, something guaranteed to happen. A probability greater than one isn’t more certain than certain, it’s just nonsense.

So we know something needs to change, we know we need a new theory. But we only know we need that theory when the energy is very high: when it’s the Planck energy. Before then, we might still have a different theory, but we might not: it’s not a “problem” yet.

Now, a few of you understand this part, but still have a misunderstanding. The Planck energy seems high for particle physics, but it isn’t high in an absolute sense: it’s about the energy in a tank of gasoline. Does that mean that all we have to do to measure quantum gravity is to make a quantum state out of your car?

Again, no. That’s because the problem of quantum gravity isn’t just the problem of high-energy quantum gravity either.
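(As an aside, the “tank of gasoline” comparison checks out numerically. A quick sketch, with rough constants and a typical 50-litre tank of my own choosing:)

```python
import math

# Planck energy E_P = sqrt(hbar * c^5 / G) versus the chemical energy in a
# full tank of gasoline (rough numbers: 50 litres at ~34 MJ per litre).
hbar = 1.054571817e-34   # J s
c    = 2.99792458e8      # m / s
G    = 6.67430e-11       # m^3 kg^-1 s^-2

E_planck = math.sqrt(hbar * c**5 / G)   # about 1.96e9 J
E_tank   = 50 * 34e6                    # about 1.7e9 J
print(E_planck, E_tank)                 # same order of magnitude
```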

Energy seems objective, but it’s not. It’s subjective, or more specifically, relative. Due to special relativity, observers moving at different speeds observe different energies. Because of that, high energy alone can’t be the requirement: it isn’t something either general relativity or quantum field theory can “care about” by itself.

Instead, the real thing that matters is something that’s invariant under special relativity. This is hard to define in general terms, but it’s best to think of it as a requirement not on energy, but on energy density.

(For the experts: I’m justifying this phrasing in part because of how you can interpret the quantity appearing in energy conditions as the energy density measured by an observer. This still isn’t the correct way to put it, but I can’t think of a better way that would be understandable to a non-technical reader. If you have one, let me know!)

Why do we need quantum gravity to fully understand black holes? Not just because they have a lot of mass, but because they have a lot of mass concentrated in a small area, a high energy density. Ditto for the Big Bang, when the whole universe had a very large energy density. Particle colliders are useful not just because they give particles high energy, but because they give particles high energy and put them close together, creating a situation with very high energy density.

Once you understand this, you can use it to think about whether some experiment or observation will help with the problem of quantum gravity. Does the experiment involve very high energy density, much higher than anything we can do in a particle collider right now? Is that telescope looking at something created in conditions of very high energy density, or just something nearby?

It’s not impossible for an experiment that doesn’t meet these conditions to find something. Whatever the correct quantum gravity theory is, it might be different from our current theories in a more dramatic way, one that’s easier to measure. But the only guarantee, the only situation where we know we need a new theory, is for very high energy density.

Clifford JohnsonBBC Fun!

As I mentioned in the previous post, I had business at BBC Broadcasting House this week. I was recording an interview that I’ll fill you in on later on, closer to release of the finished programme. Recall that in the post I mentioned how amusing it would be for me … Click to continue reading this post

The post BBC Fun! appeared first on Asymptotia.

January 12, 2023

Terence TaoIllustrating the Impact of the Mathematical Sciences

Over the last few years, I have served on a committee of the National Academy of Sciences to produce some posters and other related media to showcase twenty-first century mathematics and its applications in the real world, suitable for display in classrooms or math departments. Our posters (together with some associated commentary, webinars on related topics, and even a whimsical “comic”) are now available for download here.

Tommaso DorigoPerfect Play In A Blitz Chess Game

It rarely happens that one plays a regular chess game with no clear mistakes. When the game is a blitz one, though, this is exceedingly rare. A blitz game is one where both players have 5 minutes to make all their moves, and the first who runs out of time automatically loses (provided the opponent realizes it).
Because of the very short time to make decisions, blitz chess games are an adrenaline-producing, intense brain activity. So much so that when people talk to me during a blitz game I simply do not register the words they speak, for the whole duration of the game; after the end, I often find myself reckoning with a buffer of words that by then have no meaning anymore.

read more

January 11, 2023

Matt Strassler Busy Writing a Book

Happy 2023 everyone!  You’ve noticed, no doubt, that the blog has been quiet recently.  That’s because I’ve got a book contract, with a deadline of March 31, 2023.  [The book itself won’t be published til spring 2024.]  I’ll tell you more about this in future posts. But over the next couple of months I’ll be a bit slow to answer questions and even slower to write content.  Fortunately, much of the content on this website is still current — the universe seems to be much the same in 2023 as it was in 2011 when the site was born. So poke around; I’m sure you’ll find something that interests you!

Richard EastherArm The Disruptors

Last week, Science Twitter was roiled by claims that “disruptive science” was on the wane and that this might be reversed by “reading widely”, taking “year long sabbaticals” and “focussing less on quantity … and more on …quality”. It blew up, which is probably not surprising given that it first pandered to our collective angst and then suggested some highly congenial remedies.

The Nature paper that kicked off this storm in our social media teacup is profusely illustrated with graphs and charts. The data is not uninteresting and does suggest that something about the practice of science has changed over the course of the last eight or nine decades. The problem is that it could also be Exhibit A in a demonstration of how data science can generate buzz while remaining largely disconnected from reality.

“Disruption” is a useful framework for discussing technological innovation (digital cameras render film obsolete; Netflix kills your neighbourhood video store, streaming music replaces CDs) but it is less clear to me that it can be applied directly to high-value science. “What is good?” is perhaps the oldest question in the book but the paper seems to skate past it.

The problem is (at least as I see it) that many if not most scientific breakthroughs [1] extend the frontiers of knowledge rather than demolishing their forebears [2]. Even the biggest “paradigm shifts” often left their predecessors largely intact. Einstein arguably “disrupted” Newton, but while film cameras and vinyl records are now the preserve of hipsters and purists, Newtonian physics is still at the heart of the field – as anyone who has taken first year physics or built a bridge that stood up can attest.

Similarly, quantum mechanics shattered the then-prevailing clockwork conception of the cosmos. However, its technical content was effectively a greenfield development since at a detailed level there was nothing for quantum mechanics to replace. By the end of the 1920s, however, quantum mechanics had given us the tools to explain almost everything that happens inside of an atom.

Consequently, as I see it, neither relativity nor quantum mechanics really fits a conventional understanding of “disruption”, even though they combine to create one of the biggest revolutions ever seen in science. So that should be a problem if you are using “disruption” as a template for identifying interesting and important science.

Rather than making a qualitative assessment, the authors deploy a metric to measure disruption based on citation counts [3] – a widely cited paper whose own bibliographic antecedents then become less prominent is judged to be “disruptive” [4]. This leads to plots like the one below, which focuses on Nobel-winning papers and three “prestige” journals (Figure 5 from the paper).

If we take this study at its word, “disruption” has largely flatlined for the last fifty years. But one of the specific papers they identify – Riess et al.’s co-discovery of “dark energy” (or, more properly, observations suggesting that the rate at which the universe expands is picking up speed) – is not rated as “disruptive”, despite being the biggest upheaval in our understanding of the cosmos in a couple of generations.

Conversely, the discovery of the DNA double helix is measured to be “disruptive” — and it is certainly a watershed in our understanding of the chemistry of life. The authors explain that it displaced an earlier “triple helix” model proposed by Linus Pauling – but Pauling’s scenario was less than a year old at this point, so it was hardly an established incumbent knocked off its perch by an unexpected upstart. In fact, Watson and Crick’s 1953 discovery paper has only six references, and only one of those was published prior to 1952. Dirac’s 1928 paper scores well, and it likewise has only a handful of references, most of which were similarly only a year or so old at the time of publication. However, the “disruption metric” looks for changes in citation patterns five years either side of publication. Consequently, even though there is no way their metric can produce meaningful data for these papers (given its reliance on a five-year before-and-after comparison of citation counts), they single them out for special attention rather than filtering them, and papers like them, from their dataset.
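For readers who want to see how such a score works mechanically, here is a toy sketch. I am assuming the standard Funk/Owen-Smith-style CD index, since the post doesn’t spell the formula out: for each later paper in the window, f = 1 if it cites the focal paper and b = 1 if it cites any of the focal paper’s references, and the score averages f − 2fb over papers citing the focal paper or its references (+1 is maximally “disruptive”, −1 maximally “consolidating”).

```python
def cd_index(focal_refs, later_papers):
    """Toy CD-style disruption score (assumed Funk/Owen-Smith form).

    focal_refs:   set of reference ids of the focal paper
    later_papers: list of (cites_focal, refs) for papers in the citation window
    """
    scores = []
    for cites_focal, refs in later_papers:
        f = int(cites_focal)
        b = int(bool(focal_refs & refs))   # cites a predecessor of the focal paper
        if f or b:
            scores.append(f - 2 * f * b)
    return sum(scores) / len(scores) if scores else 0.0

# A focal paper with almost no references (think Watson & Crick's six) gives the
# metric very little "before" signal to compare against.
print(cd_index({"r1"}, [(True, set()), (True, set()), (False, {"r1"})]))  # about 0.67
```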

What this suggests to me is that there has not been a sufficiently rigorous sniff-testing of the output of this algorithm. So, on top of adopting a model of progress without really asking whether it captures the essence of “breakthrough” science, the output of the metric used to assess it was often reverse-engineered to justify the numerical values it yields.

The concern that science is increasingly driven by “bean counting” and a publish-or-perish mentality that is at odds with genuine progress is widespread, and my own view (like most scientists, I would guess) is that there is truth to it. There is certainly a lot of frog-boiling in academia: it is indeed a challenge for working scientists to get long periods to reflect and explore, and junior scientists are locked into a furiously competitive job market that offers little security to its participants.

Ironically, though, one key contributor to this pressure-cooker in which we find ourselves is Nature itself, the journal that published this paper. And Nature not only published it but hyped it in a news article – an incestuous coupling between peer reviewed content and “news” that can make the careers of those fortunate enough to participate in it. However, it is widely argued that this practice makes Nature itself a contributor to any decline of scientific quality that may be taking place by nudging authors to hype their work in ways not fully justified by their actual results. But “turning off the hype machine” is not one of the proposed solutions to our problems — and a cynic might suggest that this could be because it would also disable the money spigot that generates many millions of dollars a year for Nature’s very-definitely for-profit owners.

To some extent this is just me being cranky, since I spent part of last week at a slow simmer every time I saw this work flash by on a screen. But it matters, because this sort of analysis can find its way into debates about how to “fix” the supposed problems of science. And there certainly are many ways in which we could make science better. But before we prescribe, we would be wise to accurately determine the symptoms of its illness. Coming up with numerical metrics to measure quality and impact in science is enormously tempting, since it converts an otherwise laborious and qualitative process into something that is both quantitative and automated [5] — but it is also very difficult, and it hasn’t happened here.

Ironically, the authors of this work are a professor in a management school, his PhD student and a sociologist, all of whom claim expertise in “innovation” and “entrepreneurship”. Physicists are often seen as more willing than most to have opinions on matters outside of our professional domain, and we are increasingly likely to be rebuked for failures to “stay in our lane”. But that advice cuts both ways; if you want to have opinions on science, maybe you should work with people who have real expertise in the fields you hope to assess?


[1] I am going to focus on physics, since that is what I know best – but the pattern is claimed to be largely field-independent.

[2] There are exceptions. The heliocentric solar system supplanted the geocentric view and “caloric fluid” is no longer seen as a useful description of heat, but the norm for physics (and much of 20th century chemistry and biology, so far as I can see) is to “amend and extend”. There are often competing explanations for a phenomenon – e.g. Big Bang cosmology v. Steady State – only one of which can “win”, but these more closely resemble rivalries like the contest between BetaMax and VHS than “disruption”.

[3] They also make an argument that the language we use to talk about scientific results has changed over time, but most of the story has been based on their “disruption” metric.

[4] It had been used previously on patent applications (which must list “prior art”) by one of the authors, where it may actually make more sense.

[5] See also my views on the h-index.

Banner image: https://memory-alpha.fandom.com/wiki/Disruptor

January 09, 2023

Clifford JohnsonW1A

[Image: Brompton bicycle rental lockers]
I’ll be visiting Broadcasting House during my time here in London this week, for reasons I’ll mention later. Needless to say (almost), as a Brompton rider, and fan of the wonderful show W1A, I feel a sense of regret that I don’t have my bike here so that I can ride up to the front of the building on it. You won’t know what I’m talking about if you don’t know the show. Well, last night I was a-wandering and saw the rental option shown in the photo. It is very tempting…

-cvj
Click to continue reading this post

The post W1A appeared first on Asymptotia.

Clifford JohnsonBack East

[I was originally going to use the title “Back Home”, but then somehow this choice had a resonance to it that I liked. (Also reminds me of a lovely Joshua Redman album…)] So I am back in London, my home town. And since I’ve got 8 hour jet lag, I’m … Click to continue reading this post

The post Back East appeared first on Asymptotia.

January 08, 2023

Doug NatelsonNews items for the new year

After I was not chosen to be Speaker of the US House of Representatives, I think it’s time to highlight some brief items:

  • Here is a great blog post by a Rice grad alum, Daniel Gonzales, about how to approach faculty searches.  I had written a fair bit on this a number of years ago, but his take is much fresher and up to date.
  • My colleagues in Rice’s chem department have written a very nice obituary in PNAS for Bob Curl.
  • It’s taken nearly 2000 years, but people seem to have finally figured out the reason why Roman concrete lasts hundreds to thousands of years, while modern concrete often starts crumbling after 30 years or so.
  • Capabilities for quantum optomechanical widgets are improving all the time.  Now it’s possible to implement a model for graphene, following some exquisite fabrication and impressive measurement techniques. 
  • From the math perspective, this is just f-ing weird.  For more info, see here.

January 07, 2023

John BaezA Curious Integral

On Mathstodon, Robin Houston pointed out a video where Oded Margalit claimed that it’s an open problem why this integral:

\displaystyle{  \int_0^\infty\cos(2x)\prod_{n=1}^\infty\cos\left(\frac{x}{n} \right) d x }

is so absurdly close to \frac{\pi}{8}, but not quite equal.

They agree to 41 decimal places, but they’re not the same!

\displaystyle{  \int_0^\infty\cos(2x)\prod_{n=1}^\infty\cos\left(\frac{x}{n}\right) d x } =
0.3926990816987241548078304229099378605246454...

while

\frac\pi 8 =
0.3926990816987241548078304229099378605246461...

So, a bunch of us tried to figure out what was going on.

Jaded nonmathematicians told us it’s just a coincidence, so what is there to explain? But of course an agreement this close is unlikely to be “just a coincidence”. It might be, but you’ll never get anywhere in math with that attitude.

We were reminded of the famous cosine Borwein integral

\displaystyle{ \int_0^\infty 2 \cos(x) \prod_{n = 0}^{N}  \frac{\sin (x/(2n+1))}{x/(2n+1)}  \, d x}

which equals \frac{\pi}{2} for N up to and including 55, but not for any larger N:

\displaystyle{ \int_0^\infty 2 \cos(x) \prod_{n = 0}^{56} \frac{\sin (x/(2n+1))}{x/(2n+1)} \, d x  \approx \frac{\pi}{2} - 2.3324 \cdot 10^{-138} }

But it was Sean O who really cracked the case, by showing that the integral we were struggling with could actually be reduced to an N = \infty version of the cosine Borwein integral, namely

\displaystyle{ \int_0^\infty 2 \cos(x) \prod_{n = 0}^{\infty}  \frac{\sin (x/(2n+1))}{x/(2n+1)} \, d x}

The point is this. A little calculation using the Weierstrass factorizations

\displaystyle{  \frac{\sin x}{x} = \prod_{n = 1}^\infty \left( 1  - \frac{x^2}{\pi^2 n^2} \right) }

\displaystyle{  \cos x = \prod_{n = 0}^\infty \left( 1  - \frac{4x^2}{\pi^2 (2n+1)^2} \right) }

lets you show

\displaystyle{  \prod_{n = 1}^\infty \cos\left(\frac{x}{n}\right) = \prod_{n = 0}^\infty \frac{\sin (2x/(2n+1))}{2x/(2n+1)} }

and thus

\displaystyle{   \int_0^\infty \cos(2x) \prod_{n=1}^\infty \cos\left(\frac{x}{n} \right) \; d x  = }

\displaystyle{  \int_0^\infty\cos(2x) \prod_{n = 0}^\infty \frac{\sin (2x/(2n+1))}{2x/(2n+1)} d x  }

Then, a change of variables on the right-hand side gives

\displaystyle{  \int_0^\infty \cos(2x) \prod_{n=1}^\infty \cos\left(\frac{x}{n} \right) \; d x   = }

\displaystyle{ \frac{1}{4} \int_0^\infty 2\cos(x) \prod_{n = 0}^\infty \frac{\sin (x/(2n+1))}{x/(2n+1)} d x }
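As a quick numerical sanity check of the product identity above, here is a small script (mine) that truncates both infinite products at a large cutoff and compares them at a sample point; the truncations agree to about seven significant figures, as you’d expect from the size of the neglected tails.

```python
import math

def cos_product(x, M):
    """Truncation of the product of cos(x/n) over n = 1..M."""
    p = 1.0
    for n in range(1, M + 1):
        p *= math.cos(x / n)
    return p

def sinc_product(x, M):
    """Truncation of the product of sin(2x/(2n+1)) / (2x/(2n+1)) over n = 0..M."""
    p = 1.0
    for n in range(M + 1):
        t = 2 * x / (2 * n + 1)
        p *= math.sin(t) / t
    return p

x, M = 0.7, 10**6
print(cos_product(x, M))
print(sinc_product(x, M))   # agrees with the line above to ~7 digits
```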

So, showing that

\displaystyle{  \int_0^\infty\cos(2x)\prod_{n=1}^\infty\cos\left(\frac{x}{n} \right) d x }

is microscopically less than \frac{\pi}{8} is equivalent to showing that

\displaystyle{ \int_0^\infty 2\cos(x) \prod_{n = 0}^\infty \frac{\sin (x/(2n+1))}{x/(2n+1)} d x }

is microscopically less than \frac{\pi}{2}.

This sets up a clear strategy for solving the mystery! People understand why the cosine Borwein integral

\displaystyle{ \int_0^\infty 2 \cos(x) \prod_{n = 0}^{N}  \frac{\sin (x/(2n+1))}{x/(2n+1)}  \, d x}

equals \frac{\pi}{2} for N up to 55, and then drops ever so slightly below \frac{\pi}{2}. The mechanism is clear once you watch the right sort of movie. It’s very visual. Greg Egan explains it here with an animation, based on ideas by Hanspeter Schmid:

• John Baez, Patterns that eventually fail, Azimuth, September 20, 2018.

Or you can watch this video, which covers a simpler but related example:

• 3Blue1Brown, Researchers thought this was a bug (Borwein integrals).
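(Incidentally, the place where the dropping starts is easy to locate. As I understand the mechanism explained in those links, the integral stays exactly \frac{\pi}{2} as long as 1/3 + 1/5 + \cdots + 1/(2N+1) stays below 2, the 2 coming from the extra factor of 2\cos(x); here is a tiny script of mine that finds the first N where that fails.)

```python
from fractions import Fraction

# First N for which 1/3 + 1/5 + ... + 1/(2N+1) reaches 2 -- the point where the
# cosine Borwein integral first dips below pi/2 (per the mechanism linked above).
s, N = Fraction(0), 0
while s < 2:
    N += 1
    s += Fraction(1, 2 * N + 1)
print(N, float(s))   # 56  2.003...
```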

So, we just need to show that as N \to +\infty, the value of the cosine Borwein integral doesn’t drop much more! It drops by just a tiny amount: about 7 \times 10^{-43}.

Alas, this doesn’t seem easy to show. At least I don’t know how to do it yet. But what had seemed an utter mystery has now become a chore in analysis: estimating how much

\displaystyle{ \int_0^\infty 2 \cos(x) \prod_{n = 0}^{N}  \frac{\sin (x/(2n+1))}{x/(2n+1)}  \, d x}

drops each time you increase N a bit.

At this point if you’re sufficiently erudite you are probably screaming: “BUT THIS IS ALL WELL-KNOWN!”

And you’re right! We had a lot of fun discovering this stuff, but it was not new. When I was posting about it on MathOverflow, I ran into an article that mentions a discussion of this stuff:

• Eric W. Weisstein, Infinite cosine product integral, from MathWorld—A Wolfram Web Resource.

and it turns out Borwein and his friends had already studied it. There’s a little bit here:

• J. M. Borwein, D. H. Bailey, V. Kapoor and E. W. Weisstein, Ten problems in experimental mathematics, Amer. Math. Monthly 113 (2006), 481–509.

and a lot more in this book:

• J. M. Borwein, D. H. Bailey and R. Girgensohn, Experimentation in Mathematics: Computational Paths to Discovery, Wellesley, Massachusetts, A K Peters, 2004.

In fact the integral

\displaystyle{ \int_0^\infty 2 \cos(x) \prod_{n = 0}^{\infty}  \frac{\sin (x/(2n+1))}{x/(2n+1)}  \, d x}

was discovered by Bernard Mares at the age of 17. Apparently he posed the challenge of proving that it was less than \frac{\pi}{4}. Borwein and others dived into this and figured out how.

But there is still work left to do!

As far as I can tell, the known proofs that

\displaystyle{ \frac{\pi}{8} -  \int_0^\infty\cos(2x)\prod_{n=1}^\infty\cos\left(\frac{x}{n} \right) d x }  \; \approx \; 7.4073 \cdot 10^{-43}

all involve a lot of brute-force calculation. Is there a more conceptual way to understand this difference, at least approximately? There is a clear conceptual proof that

\displaystyle{ \frac{\pi}{8} -  \int_0^\infty\cos(2x)\prod_{n=1}^\infty\cos\left(\frac{x}{n} \right) d x }  \;\; > \;\; 0

That’s what Greg Egan explained in my blog article. But can we get a clear proof that

\displaystyle{ \frac{\pi}{8} -  \int_0^\infty\cos(2x)\prod_{n=1}^\infty\cos\left(\frac{x}{n} \right) d x }  \; \; < \; \; C

for some small constant C, say 10^{-40} or so?

One can argue that until we do, Oded Margalit is right: there’s an open problem here. Not a problem in proving that something is true. A problem in understanding why it is true.

Terence TaoSpecial relativity and Middle-Earth

This post is an unofficial sequel to one of my first blog posts from 2007, which was entitled “Quantum mechanics and Tomb Raider“.

One of the oldest and most famous allegories is Plato’s allegory of the cave. This allegory centers around a group of people chained to a wall in a cave that cannot see themselves or each other, but only the two-dimensional shadows of themselves cast on the wall in front of them by some light source they cannot directly see. Because of this, they identify reality with this two-dimensional representation, and have significant conceptual difficulties in trying to view themselves (or the world as a whole) as three-dimensional, until they are freed from the cave and able to venture into the sunlight.

There is a similar conceptual difficulty when trying to understand Einstein’s theory of special relativity (and more so for general relativity, but let us focus on special relativity for now). We are very much accustomed to thinking of reality as a three-dimensional space endowed with a Euclidean geometry that we traverse through in time, but in order to have the clearest view of the universe of special relativity it is better to think of reality instead as a four-dimensional spacetime that is endowed instead with a Minkowski geometry, which mathematically is similar to a (four-dimensional) Euclidean space but with a crucial change of sign in the underlying metric. Indeed, whereas the distance {ds} between two points in Euclidean space {{\bf R}^3} is given by the three-dimensional Pythagorean theorem

\displaystyle  ds^2 = dx^2 + dy^2 + dz^2

under some standard Cartesian coordinate system {(x,y,z)} of that space, and the distance {ds} in a four-dimensional Euclidean space {{\bf R}^4} would be similarly given by

\displaystyle  ds^2 = dx^2 + dy^2 + dz^2 + du^2

under a standard four-dimensional Cartesian coordinate system {(x,y,z,u)}, the spacetime interval {ds} in Minkowski space is given by

\displaystyle  ds^2 = dx^2 + dy^2 + dz^2 - c^2 dt^2

(though in many texts the opposite sign convention {ds^2 = -dx^2 -dy^2 - dz^2 + c^2dt^2} is preferred) in spacetime coordinates {(x,y,z,t)}, where {c} is the speed of light. The geometry of Minkowski space is then quite similar algebraically to the geometry of Euclidean space (with the sign change replacing the traditional trigonometric functions {\sin, \cos, \tan}, etc. by their hyperbolic counterparts {\sinh, \cosh, \tanh}, and with various factors involving “{c}” inserted in the formulae), but also has some qualitative differences to Euclidean space, most notably a causality structure connected to light cones that has no obvious counterpart in Euclidean space.
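To make the sign change concrete, here is a small numerical sketch of my own (with arbitrary numbers): an ordinary rotation preserves the Euclidean {ds^2}, while a Lorentz boost, built from {\cosh} and {\sinh} instead of {\cos} and {\sin}, preserves {dx^2 - c^2 dt^2}.

```python
import math

# Check that a Lorentz boost of rapidity phi preserves dx^2 - c^2 dt^2
# (arbitrary displacement and rapidity; an ordinary rotation would instead
# preserve dx^2 + c^2 dt^2).
c = 2.99792458e8                 # m/s
dx, dt = 1.0e9, 2.0              # a spacetime displacement: 1e9 m and 2 s
phi = 0.8                        # rapidity; the boost velocity is c*tanh(phi)

dx_b = math.cosh(phi) * dx - math.sinh(phi) * (c * dt)
dt_b = (math.cosh(phi) * (c * dt) - math.sinh(phi) * dx) / c

print(dx**2 - (c * dt)**2)       # interval in the original coordinates
print(dx_b**2 - (c * dt_b)**2)   # the same value after the boost (up to rounding)
```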

That said, the analogy between Minkowski space and four-dimensional Euclidean space is strong enough that it serves as a useful conceptual aid when first learning special relativity; for instance the excellent introductory text “Spacetime physics” by Taylor and Wheeler very much adopts this view. On the other hand, this analogy doesn’t directly address the conceptual problem mentioned earlier of viewing reality as a four-dimensional spacetime in the first place, rather than as a three-dimensional space that objects move around in as time progresses. Of course, part of the issue is that we aren’t good at directly visualizing four dimensions in the first place. This latter problem can at least be easily addressed by removing one or two spatial dimensions from this framework – and indeed many relativity texts start with the simplified setting of only having one spatial dimension, so that spacetime becomes two-dimensional and can be depicted with relative ease by spacetime diagrams – but still there is conceptual resistance to the idea of treating time as another spatial dimension, since we clearly cannot “move around” in time as freely as we can in space, nor do we seem able to easily “rotate” between the spatial and temporal axes, the way that we can between the three coordinate axes of Euclidean space.

With this in mind, I thought it might be worth attempting a Plato-type allegory to reconcile the spatial and spacetime views of reality, in a way that can be used to describe (analogues of) some of the less intuitive features of relativity, such as time dilation, length contraction, and the relativity of simultaneity. I have (somewhat whimsically) decided to place this allegory in a Tolkienesque fantasy world (similarly to how my previous allegory to describe quantum mechanics was phrased in a world based on the computer game “Tomb Raider”). This is something of an experiment, and (like any other analogy) the allegory will not be able to perfectly capture every aspect of the phenomenon it is trying to represent, so any feedback to improve the allegory would be appreciated.

— 1. Treefolk —

Tolkien’s Middle-Earth contains, in addition to humans, many fantastical creatures. Tolkien’s book “The Hobbit” introduces the trolls, who can move around freely at night but become petrified into stone during the day; and his book “The Two Towers” (the second of his three-volume work “The Lord of the Rings“) introduces the Ents, who are large walking sentient tree-like creatures.

In this Tolkienesque fantasy world of our allegory (readers, by the way, are welcome to suggest a name for this world), there are two intelligent species. On the one hand one has the humans, who can move around during the day much as humans in our world do, but must sleep at night without exception (one can invent whatever reason one likes for this, but it is not relevant to the rest of the allegory). On the other hand, inspired by the trolls and Ents of Tolkien, in this world we will have the treefolk, who in this world are intelligent creatures resembling a tree trunk (possibly with some additional branches or additional appendages, but these will not play a central role in the allegory). They are rooted to a fixed location in space, but during the night they have some limited ability to (slowly) twist their trunk around. On the other hand, during the day, they turn into non-sentient stone columns, frozen in whatever shape they last twisted themselves into. Thus the humans never see the treefolk during their active period, and vice versa; but we will assume that they are still somehow able to communicate asynchronously with each other through a common written language (more on this later).

Remark 1 In Middle-Earth there are also the Huorns, who are briefly mentioned in “The Two Towers” as intelligent trees kin to the Ents, but are not described in much detail. Being something of a blank slate, these would have been a convenient name to give these fantasy creatures; however, given that the works of Tolkien will not be public domain for a few more decades, I’ll refrain from using the Huorns explicitly, and instead use the more generic term “treefolk”.

When a treefolk makes its trunk vertical (or at least straight), it is roughly cylindrical in shape, and has horizontal “rings” on its exterior at intervals of precisely one inch apart; so for instance one can easily calculate the height of a treefolk in inches by counting how many rings it has. One could think of a treefolk’s trunk geometrically as a sequence of horizontal disks stacked on top of each other, with each disk being an inch in height and basically of constant radius horizontally, and separated by the aforementioned rings. Because my artistic abilities are close to non-existent, I will draw a treefolk schematically (and two-dimensionally), as a vertical rectangle, with the rings drawn as horizontal lines (and the disks being the thin horizontal rectangles between the rings):

But treefolks can tilt their trunk at an angle; for instance, if a treefolk tilts its trunk to be at a 30 degree angle from the vertical, then now the top of each ring is only {\cos 30^\circ = \frac{\sqrt{3}}{2} \approx 0.866} inches higher than the top of the preceding ring, rather than a full inch higher, though it is also displaced in space by a distance of {\sin 30^\circ = \frac{1}{2}} inches, all in accordance with the laws of trigonometry. It is also possible for treefolks to (slowly) twist their trunk into more crooked shapes, for instance in the picture below the treefolk has its trunk vertical in its bottom half, but at a {30^\circ} angle in its top half. (This will necessarily cause some compression or stretching of the rings at the turnaround point, so that those rings might no longer be exactly one inch apart; we will ignore this issue as we will only be analyzing the treefolk’s rings at “inertial” locations where the trunk is locally straight and it is possible for the rings to stay perfectly “rigid”. Curvature of the trunk in this allegory is the analogue of acceleration in our spacetime universe.)

Treefolks prefer to stay very close to being vertical, and only tilt significantly away from the vertical in rare circumstances; it is only in recent years that they have started experimenting with more extreme angles of tilt. Let us say that there is a hard limit of {45^\circ} as to how far a treefolk can tilt its trunk; thus for instance it is not possible for a treefolk to place its trunk at a 60 degree angle from the vertical. (This is analogous to how matter is not able to travel faster than the speed of light in our world.) [Removed this hypothesis as being unnatural for the underlying Euclidean geometry – T.]

Now we turn to the nature of the treefolk’s sentience, which is rather unusual. Namely – only one disk of the treefolk is conscious at any given time! As soon as the sun sets, a treefolk returns from stone to a living creature, and the lowest disk of that treefolk awakens and is able to sense its environment, as well as move the trunk above it. However, every minute, with the regularity of clockwork, the treefolk’s consciousness and memories transfer themselves to the next higher disk; the previous disk becomes petrified into stone and no longer mobile or receiving sensory input (somewhat analogous to the rare human disease of fibrodysplasia ossificans progressiva, in which the body becomes increasingly ossified and unable to move). As the night progresses, the locus of the treefolk’s consciousness moves steadily upwards and more and more of the treefolk turns to stone, until it reaches the end of its trunk, at which point the treefolk turns completely into a stone column until the next night, at which point the process starts again. (In particular, no treefolk has ever been tall enough to retain its consciousness all the way to the next sunrise.) Treefolk are aware of this process, and in particular can count intervals of time by keeping track of how many times their consciousness has had to jump from one disk to the next; they use rings as a measure of time. For instance, if a treefolk experiences ten shifts of consciousness between one event and the next, the treefolk will know that ten minutes have elapsed between the two events; in their language, they would say that the second event occurred ten rings after the first.

The second unusual feature of the treefolk’s sentience is that at any given time, the treefolk can sense the portions of all nearby objects that are in the same plane as the disk, but not portions that are above or below this plane; in particular, some objects may be completely “invisible” to the treefolk if they are completely above or completely below the treefolk’s current plane of “vision”. Exactly how the treefolk senses its environment is not of central importance, but one could imagine either some sort of visual organ on each disk that is activated during the minute in which that disk is conscious, but which has a limited field of view (similar to the one that a knight might experience when wearing a helmet with only a narrow horizontal slit in their visor to see through), or perhaps some sort of horizontal echolocation ability. (Or, since we are in a fantasy setting, we can simply attribute this sensory ability to “magic”.) For instance, the picture below (very crudely) depicts a treefolk standing vertically in an environment, fifty minutes after it first awakens, so that the disk that is fifty inches off the ground is currently sentient. The treefolk can sense any other object that is also fifty inches from the ground; for instance, it can “see” a slice of a bush to the left, and a slice of a boulder to the right, but cannot see the sign at all. (Let’s assume that this somewhat magical “vision” can penetrate through objects to some extent (much as “x-ray vision” would work in comic books), so it can get some idea for instance that the section of boulder it sees is somewhat wider than the slice of bush that it sees.) As the minutes pass and the treefolk’s consciousness moves to higher and higher rings, the bush will fluctuate in size and then disappear from the treefolk’s point of “view”, and the boulder will also gradually shrink in size until disappearing several rings after the bush disappeared.

If the treefolk’s trunk is tilted at an angle, then its visual plane of view tilts similarly, and so the objects that it can see, and their relative positions and sizes, change somewhat. For instance, in the picture below, the bush, boulder, and sign remain in the same location, but the treefolk’s trunk has tilted; as such, it now senses a small slice of the sign (that will shortly disappear), and a (now smaller) slice of the boulder (that will grow for a couple rings before ultimately shrinking away to nothingness), but the bush has already vanished from view several rings previously.

At any given time, the treefolk only senses a two-dimensional slice of its surroundings, much like how the prisoners in Plato’s cave only see the two-dimensional shadows on the cave wall. As such, treefolks do not view the world around them as three-dimensional; to them, it is a two-dimensional world that slowly changes once every ring even if the three-dimensional world is completely static, similarly to how flipping the pages of an otherwise static flip book can give the illusion of movement. In particular, they do not have a concept in their language for “height”, but only for horizontal notions of spatial measurement, such as width; for instance, if a tall treefolk is next to a shorter treefolk that is 100 inches tall, with both treefolk vertical, it will think of that shorter treefolk as “living for 100 rings” rather than being 100 inches in height, since from the tall treefolk’s perspective, the shorter treefolk would be visible for 100 rings, and then disappear. These treefolk would also see that their rings line up: every time a ring passes for one treefolk, the portion of the other treefolk that is in view also advances by one ring. So treefolk, who usually stay close to vertical for most of their lives, have come to view rings as being universal measurements of time. They also do not view themselves as three-dimensional objects; somewhat like the characters in Edwin Abbott classic book “Flatland“, they think of themselves as two-dimensional disks, with each ring slightly changing the nature of that disk, much as humans feel their bodies changing slightly with each birthday. While they can twist the portion of their trunk above their currently conscious disk at various angles, they do not think of this twisting in three-dimensional terms; they think of it as willing their two-dimensional disk-shaped self into motion in a horizontal direction of their choosing.

Treefolk cannot communicate directly with other treefolk (and in particular one treefolk is not aware of which ring of another treefolk is currently conscious); but they can modify the appearance of their exterior on their currently conscious ring (or on rings above that ring, but not on the petrified rings below) for other treefolk to read. Two treefolks standing vertically side by side will then be able to communicate with each other by a kind of transient text messaging system, since they awaken at the same time, and at any given later moment, their conscious rings will be at the same height and each treefolk be able to read the messages that the other treefolk leaves for them, although a message that one treefolk leaves for another for one ring will vanish when these treefolk both shift their consciousnesses to the next ring. A human coming across these treefolks the following day would be able to view these messages (similar to how one can review a chat log in a text messaging app, though with the oldest messages at the bottom); they could also leave messages for the treefolk by placing text on some sort of sign that the treefolk can then read one line at a time (from bottom to top) on a subsequent night as their consciousness ascends through its rings. (Here we will assume that at some point in the past the humans have somehow learned the treefolk’s written language.) But from the point of view of the treefolk, their messages seem as impermanent to them as spoken words are to us: they last for a minute and then they are gone.

— 2. Time contraction and width dilation —

In recent years, treefolk scientists (or scholars/sages/wise ones, if one wishes to adhere as much as possible to the fantasy setting), studying the effect of significant tilting on other treefolk, discovered a strange phenomenon which they might term “time contraction” (similar to time dilation in special relativity, but with the opposite sign): if a treefolk test subject tilts at a significant angle, then it begins to “age” more rapidly in the sense that test subject will be seen to pass by more rings than the observer treefolk that remains vertical. For instance, with the test subject tilted at a {30^\circ} angle, as 100 rings pass by for the vertical observer, {100 / \cos 30^\circ \approx 115} rings can be counted on the tilted treefolk. This is obvious to human observers, who can readily explain the situation when they come across it during the day, in terms of trigonometry:

This leads to the following “twin paradox”: if two identical treefolk awaken at the same time, but one stays vertical while the other tilts away and then returns, then when they rejoin, their rings will be out of sync, with the twisted treefolk being conscious at a given height several minutes after the vertical treefolk was conscious at that height. As such, communication now comes with a lag: a message left by the vertical treefolk at a given ring will take several minutes to be seen by the twisted treefolk, and the twisted treefolk would similarly have to leave its messages on a higher ring than it is currently conscious at in order to be seen by the vertical treefolk. Again, a human who comes across this situation in the day can readily explain the phenomenon geometrically, as the twisted treefolk takes longer (in terms of rings) to reach the same location as the vertical treefolk:

These treefolk scientists also observe a companion to the time contraction phenomenon, namely width dilation (the analogue of length contraction): a treefolk who is tilted at an angle will be seen by other (vertical) treefolk observers as having its shape distorted from a disk to an ellipse, with the width in the direction of the tilt being elongated (much like the slices of a carrot become longer and less circular when sliced diagonally). For instance, in the picture at the beginning of this section, the width of the tilted treefolk has increased by a factor of {1 / \cos 30^\circ \approx 1.15}, or about fifteen percent. Once again, this is a phenomenon that humans, with their ability to visualize horizontal and vertical dimensions simultaneously, can readily explain via trigonometry (suppressing the rings on the tilted treefolk to reduce clutter):

The treefolk scientists were able to measure these effects more quantitatively. As they cannot directly sense any non-horizontal notions of space, they cannot directly compute the angle at which a given treefolk deviates from the vertical; but they can measure how much a treefolk “moves” in their two-dimensional plane of vision. Let’s say that the humans use the metric system of length measurement and have taught it (through some well-placed horizontal rulers perhaps) to the treefolk, who are able to use this system to measure horizontal displacements in units of centimeters. (They are unable to directly observe the inch-long height of their rings, as that is a purely vertical measurement, and so cannot use inches to directly measure horizontal displacements.) A treefolk that is tilted at an angle will then be seen to be “moving” at some number of centimeters per ring; with each ring that the vertical observer passes through, the tilted treefolk would appear to have shifted its position by that number of centimeters. After many experiments, the treefolk scientists eventually hit upon the following empirical law: if a treefolk is “moving” at {v} centimeters per ring, then it will experience a time contraction of {\sqrt{1+\frac{v^2}{c^2}}} and a width dilation of {\sqrt{1+\frac{v^2}{c^2}}}, where {c} is a physical constant that they compute to be about {2.54} centimeters per ring. (Compare with special relativity, in which an object moving at {v} meters per second experiences a time dilation of {1/\sqrt{1-\frac{v^2}{c^2}}} and a length contraction of {1/\sqrt{1-\frac{v^2}{c^2}}}, where the physical constant {c} is now about {3.0 \times 10^8} meters per second.) However, they are unable to come up with a satisfactory explanation for this arbitrary-seeming law; it bears some resemblance to the Pythagorean theorem, which they would be familiar with from horizontal plane geometry, but until they view rings as a third spatial dimension rather than as a unit of time, they would struggle to describe this empirically observed time contraction and width dilation in purely geometric terms. But again, the analysis is simple to a human observer, who notices that the tilted treefolk is spatially displaced by {tv} centimeters whenever the vertical tree advances by {t} rings (or inches), at which point the computation is straightforward from Pythagoras (and the mysterious constant {c} is explained as being the number of centimeters in an inch):
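Here is a quick numerical restatement of that law (a sketch of my own, using the human explanation rather than anything the treefolk could check directly): a tilt of {\theta} from the vertical corresponds to an apparent “speed” of {v = c \tan \theta} centimeters per ring, and both the time contraction and the width dilation then equal {1/\cos \theta = \sqrt{1 + v^2/c^2}}.

```python
import math

# Tilt angle -> apparent speed (cm per ring) and contraction/dilation factor,
# with c = 2.54 cm per ring (one inch of trunk per ring of time).
c = 2.54
for theta_deg in (0, 15, 30, 45):
    theta = math.radians(theta_deg)
    v = c * math.tan(theta)                     # centimeters per ring
    factor = 1 / math.cos(theta)                # equals sqrt(1 + (v/c)**2)
    print(theta_deg, round(v, 3), round(factor, 4))
# At 30 degrees the factor is about 1.1547: roughly 115 rings of the tilted
# treefolk for every 100 rings of the vertical one, as in the example above.
```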

At some point, these scientists might discover (either through actual experiment, or thought-experiment) what we would call the principle of relativity: the laws of geometry for a tilted treefolk are identical to those of a vertical treefolk. For instance, as mentioned previously, if a tilted treefolk appears to be moving at {v} centimeters per ring from the vantage point of a vertical treefolk, then the vertical treefolk will observe the tilted treefolk as experiencing a time contraction of {\sqrt{1+\frac{v^2}{c^2}}} and a width dilation of {\sqrt{1+\frac{v^2}{c^2}}}, but from the tilted treefolk’s point of view, it is the vertical treefolk which is moving at {v} centimeters per ring (in the opposite direction), and it will be the vertical treefolk that experiences the time contraction of {\sqrt{1+\frac{v^2}{c^2}}} and width dilation of {\sqrt{1+\frac{v^2}{c^2}}}. In particular, both treefolk will think that the other one is aging more rapidly, as each treefolk will see slightly more than one ring of the other pass by every time they pass a ring of their own. However, this is not a paradox, due to the relativity of horizontality (the analogue in this allegory of the relativity of simultaneity in special relativity); two locations in space that are simultaneously visible to one treefolk (due to them lying on the same plane as one of the disks of that treefolk) need not be simultaneously visible to the other, if the two treefolk are tilted at different angles. Again, this would be obvious to humans who can see the higher-dimensional picture: compare the planes of sight of the tilted treefolk in the figure below with the planes of sight of the vertical treefolk as depicted in the first figure of this section.

Similarly, the twin paradox discussed earlier continues to hold even when the “inertial” treefolk is not vertical:

[Strictly speaking one would need to move the treefolk to start at the exact same location, rather than merely being very close to each other, to deal with the slight synchronization discrepancy at the very bottom of the two twins in this image.]

Given two locations {A} and {B} in (three-dimensional) space, therefore, one treefolk may view the second location {B} as displaced in space from the first location {A} by {dx} centimeters in one direction (say east-west) and {dy} centimeters in an orthogonal direction (say north-south), while also being displaced by time by {dt} rings; but a treefolk tilted at a different angle may come up with different measures {dx', dy'} of the spatial displacement as well as a different measure {dt'} of the ring displacement, due to the effects of time contraction, width dilation, relativity of horizontality, and the relative “motion” between the two treefolk. However, to an external human observer, it is clear from two applications of Pythagoras’s theorem that there is an invariant

\displaystyle  dx^2 + dy^2 + c^2 dt^2 = (dx')^2 + (dy')^2 + c^2 (dt')^2.

See the figure below, where the {y} dimension has been suppressed for simplicity.

From the principle of relativity, this invariance strongly suggests the laws of geometry should be invariant under transformations that preserve the interval {dx^2 + dy^2 + c^2 dt^2}. Humans would refer to such transformations as three-dimensional rigid motions, and the invariance of geometry under these motions would be an obvious fact to them; but it would be a highly unintuitive hypothesis for a treefolk used to viewing their environment as two dimensional space evolving one ring at a time.
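
One can also check this invariance numerically: treat the {dt} rings as a vertical length of {c\, dt} centimeters and apply an ordinary rotation in the vertical plane. The Python sketch below is my own illustration, not part of the original text, and the function names are purely illustrative:

  import math

  c = 2.54  # centimeters per ring

  def interval(dx, dy, dt):
      """The quantity dx^2 + dy^2 + c^2 dt^2 that all observers agree on."""
      return dx**2 + dy**2 + c**2 * dt**2

  def tilt(dx, dt, theta_deg):
      """Re-express (dx, dt) as measured by a treefolk tilted by theta in the x direction:
      an ordinary rotation in the x-height plane, with height = c * dt centimeters."""
      th = math.radians(theta_deg)
      x, h = dx, c * dt
      x2 = math.cos(th) * x - math.sin(th) * h
      h2 = math.sin(th) * x + math.cos(th) * h
      return x2, h2 / c

  dx, dy, dt = 3.0, 4.0, 2.0            # an arbitrary displacement
  dx2, dt2 = tilt(dx, dt, 30.0)
  print(interval(dx, dy, dt))           # these two numbers agree,
  print(interval(dx2, dy, dt2))         # because rotations preserve Euclidean length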

Humans could also explain to the treefolk that their calculations would be simplified if they used the same unit of measurement for both horizontal length and vertical length, for instance using the inch to measure horizontal distances as well as the vertical height of their rings. This would normalize {c} to be one, and is somewhat analogous to the use of Planck units in physics.

— 3. The analogy with relativity —

In this allegory, the treefolk are extremely limited in their ability to sense and interact with their environment, in comparison to the humans who can move (and look) rather freely in all three spatial dimensions, and who can easily explain the empirical scientific efforts of the treefolk to understand their environment in terms of three-dimensional geometry. But in the real four-dimensional spacetime that we live in, it is us who are the treefolk; we inhabit a worldline tracing through this spacetime, similar to the trunk of a treefolk, but at any given moment our consciousness only occupies a slice of that worldline, transferred from one slice to the next as we pass from moment to moment; the slices that we have already experienced are frozen in place, and it is only the present and future slices that we have some ability to still control. Thus, we experience the world as a three-dimensional body moving in time, as opposed to a “static” four-dimensional object. We can still map out these experiences in terms of four-dimensional spacetime diagrams (or diagrams in fewer dimensions, if we are able to omit some spatial directions for simplicity); this is analogous to how the humans in this world are easily able to map out the experiences of these treefolk using three-dimensional spatial diagrams (or the two-dimensional versions of them depicted here in which we suppress one of the two horizontal dimensions for simplicity). Even so, it takes a non-trivial amount of conceptual effort to identify these diagrams with reality, since we are so accustomed to the dynamic three-dimensional perspective. But one can try to adopt the static four-dimensional perspective instead, and perhaps this allegory can help in some cases to make this conceptual leap, and let us think more like the humans of this world than like the treefolk.

John BaezTopos Institute Positions

The Topos Institute is doing some remarkable work in applying category theory to real-world problems. And they’re growing!

They want to hire a Finance and Operations Manager and a Research Software Engineer. For more information, go here.

And if you’re a grad student working on category theory, you definitely want to check out their summer research positions! For more information on those, go here. Applications for these are due February 15th, 2023.

Topos Institute

January 06, 2023

Matt von HippelThe Many Varieties of Journal Club

Across disciplines, one tradition seems to unite all academics: the journal club. In a journal club, we gather together to discuss papers in academic journals. Typically, one person reads the paper in depth in advance, and comes prepared with a short presentation, then everyone else asks questions. Everywhere I’ve worked has either had, or aspired to have, a journal club, and every academic I’ve talked to recognizes the concept.

Beyond that universal skeleton, though, are a lot of variable details. Each place seems to interpret journal clubs just a bit differently. Sometimes a lot differently.

For example, who participates in journal clubs? In some places, journal clubs are a student thing, organized by PhD or Master’s students to get more experience with their new field. Some even have journal clubs as formal courses, for credit and everything. In other places, journal clubs are for everyone, from students up through the older professors.

What kind of papers? Some read old classic papers, knowing that without an excuse we’d never take the time to read them and would miss valuable insights. Some instead focus on the latest results, as a way to keep up with progress in the field.

Some variation is less intentional. Academics are busy, so it can be hard to find a volunteer to prepare a presentation on a paper every week. This leads journal clubs to cut corners, once again in a variety of ways. A journal club focused on the latest papers can sometimes only find volunteers interested in presenting their own work (which we usually already have a presentation prepared for). Sometimes this goes a step further, and the journal club becomes a kind of weekly seminar: a venue for younger visitors to talk about their work that’s less formal than a normal talk. Sometimes, instead of the topic, the corner cut is preparation: people still discuss new papers, but instead of preparing a presentation they just come and discuss on the fly. This gets dangerous, because after a certain point people may stop reading the papers altogether, hoping that someone else will come having read them and explain them!

Journal clubs are tricky. Academics are curious, but we’re also busy and lazy. We know it would be good for us to discuss, to keep up with new papers or read the old classics… but actually getting organized, that’s another matter!

January 05, 2023

Scott Aaronson Cargo Cult Quantum Factoring

Just days after we celebrated my wife’s 40th birthday, she came down with COVID, meaning she’s been isolating and I’ve been spending almost all my time dealing with our kids.

But if experience has taught me anything, it’s that the quantum hype train never slows down. In the past 24 hours, at least four people have emailed to ask me about a new paper entitled “Factoring integers with sublinear resources on a superconducting quantum processor.” Even the security expert Bruce Schneier, while skeptical, took the paper surprisingly seriously.

The paper claims … well, it’s hard to pin down what it claims, but it’s certainly given many people the impression that there’s been a decisive advance on how to factor huge integers, and thereby break the RSA cryptosystem, using a near-term quantum computer. Not by using Shor’s Algorithm, mind you, but by using the deceptively similarly named Schnorr’s Algorithm. The latter is a classical algorithm based on lattices, which the authors then “enhance” using the heuristic quantum optimization method called QAOA.

For those who don’t care to read further, here is my 3-word review:

No. Just No.

And here’s my slightly longer review:

Schnorr ≠ Shor. Yes, even when Schnorr’s algorithm is dubiously “enhanced” using QAOA—a quantum algorithm that, incredibly, for all the hundreds of papers written about it, has not yet been convincingly argued to yield any speedup for any problem whatsoever (besides, as it were, the problem of reproducing its own pattern of errors) (one possible recent exception from Sami Boulebnane and Ashley Montanaro).

In the new paper, the authors spend page after page saying-without-saying that it might soon become possible to break RSA-2048, using a NISQ (i.e., non-fault-tolerant) quantum computer. They do so via two time-tested stratagems:

  1. the detailed exploration of irrelevancies (mostly, optimization of the number of qubits, while ignoring the number of gates), and
  2. complete silence about the one crucial point.

Then, finally, they come clean about the one crucial point in a single sentence of the Conclusion section:

It should be pointed out that the quantum speedup of the algorithm is unclear due to the ambiguous convergence of QAOA.

“Unclear” is an understatement here. It seems to me that a miracle would be required for the approach here to yield any benefit at all, compared to just running the classical Schnorr’s algorithm on your laptop. And if the latter were able to break RSA, it would’ve already done so.

All told, this is one of the most actively misleading quantum computing papers I’ve seen in 25 years, and I’ve seen … many. Having said that, this actually isn’t the first time I’ve encountered the strange idea that the exponential quantum speedup for factoring integers, which we know about from Shor’s algorithm, should somehow “rub off” onto quantum optimization heuristics that embody none of the actual insights of Shor’s algorithm, as if by sympathetic magic. Since this idea needs a name, I’d hereby like to propose Cargo Cult Quantum Factoring.

And with that, I feel I’ve adequately discharged my duties here to sanity and truth. If I’m slow to answer comments, it’ll be because I’m dealing with two screaming kids.

January 03, 2023

Tommaso DorigoScientific Wish List For 2023

Another year just started, and this is as good a time as any to line up a few wishes. Not a bucket list, nor a "will do" set of destined-to-fail propositions. It is painful to have to reckon with the failure of our strength of will, so I'd say it is better to avoid that. Rather, it is a good exercise to put together a list of things that we would like to happen, and over which we have little or no control: it is much safer as we won't feel guilty if these wishes do not come true. 


January 01, 2023

Doug NatelsonFavorite science fiction invention?

 In the forward-looking spirit of the New Year, it might be fun to get readers’ opinions of their favorite science fiction inventions.  I wrote about favorite sci-fi materials back in 2015, but let’s broaden the field. Personally, I’m a fan of the farcaster (spoiler warning!) from the Hyperion Cantos of Dan Simmons.  I also have a long-time affection for Larry Niven’s Known Space universe, especially the General Products Hull (a single molecule transparent to the visible, but opaque at all other wavelengths, and with binding somehow strengthened by an external fusion power source) and the Slaver Disintegrator, which somehow turns off the negative charge of the electron, and thus makes matter tear itself apart from the unscreened Coulomb repulsion of the protons in atomic nuclei.  Please comment below with your favorites.

On another new year’s note, someone needs to do a detailed study of the solubility limit of crème de cassis in champagne.  Too high a cassis to champagne ratio in your kir royals, and you end up with extra cassis stratified at the bottom of your flute, as shown here.


Happy new year to all, and best wishes for a great 2023.



December 30, 2022

Scott Aaronson Happy 40th Birthday Dana!

The following is what I read at Dana’s 40th birthday party last night. Don’t worry, it’s being posted with her approval. –SA

I’d like to propose a toast to Dana, my wife and mother of my two kids.  My dad, a former speechwriter, would advise me to just crack a few jokes and then sit down … but my dad’s not here.

So instead I’ll tell you a bit about Dana.  She grew up in Tel Aviv, finishing her undergraduate CS degree at age 17—before she joined the army.  I met her when I was a new professor at MIT and she was a postdoc in Princeton, and we’d go to many of the same conferences. At one of those conferences, in Princeton, she finally figured out that my weird, creepy, awkward attempts to make conversation with her were, in actuality, me asking her out … at least in my mind!  So, after I’d returned to Boston, she then emailed me for days, just one email after the next, explaining everything that was wrong with me and all the reasons why we could never date.  Despite my general obliviousness in such matters, at some point I wrote back, “Dana, the absolute value of your feelings for me seems perfect. Now all I need to do is flip the sign!”

Anyway, the very next weekend, I took the Amtrak back to Princeton at her invitation. That weekend is when we started dating, and it’s also when I introduced her to my family, and when she and I planned out the logistics of getting married.

Dana and her family had been sure that she’d return to Israel after her postdoc. She made a huge sacrifice in staying here in the US for me. And that’s not even mentioning the sacrifice to her career that came with two very difficult pregnancies that produced our two very diffic … I mean, our two perfect and beautiful children.

Truth be told, I haven’t always been the best husband, or the most patient or the most grateful.  I’ve constantly gotten frustrated and upset, extremely so, about all the things in our life that aren’t going well.  But preparing the slideshow tonight, I had a little epiphany.  I had a few photos from the first two-thirds of Dana’s life, but of course, I mostly had the last third.  But what’s even happened in that last third?  She today feels like she might be close to a breakthrough on the Unique Games Conjecture.  But 13 years ago, she felt exactly the same way.  She even looks the same!

So, what even happened?

Well OK, fine, there was my and Dana’s first trip to California, a month after we started dating.  Our first conference together.  Our trip to Vegas and the Grand Canyon.  Our first trip to Israel to meet her parents, who I think are finally now close to accepting me. Her parents’ trip to New Hope, Pennsylvania to meet my parents. Our wedding in Tel Aviv—the rabbi rushing through the entire ceremony in 7 minutes because he needed to get home to his kids. Our honeymoon safari in Kenya.  Lily’s birth. Our trip to Israel with baby Lily, where we introduced Lily to Dana’s grandmother Rivka, an Auschwitz survivor, just a few months before Rivka passed away. Taking Lily to run around Harvard Yard with our Boston friends, Lily losing her beloved doll Tuza there, then finding Tuza the next day after multiple Harvard staff had been engaged in the quest. There’s me and Dana eating acai bowls in Rio de Janeiro, getting a personal tour of the LHC in Switzerland, with kangaroos and koalas in Australia. There’s our house here in Austin.  Oh, and here are all our Austin friends! Our trip to Disney World with Lily, while Dana was pregnant with Daniel (she did ride one rollercoaster). Daniel’s birth, which to my relief, went well despite the rollercoaster. Our sabbatical year in Israel. The birth of our nephews.

I confess I teared up a little going through all this. Because it’s like: if you showed all these photos to some third party, they’d probably be like, wow, that looks like a pretty good life. Even if you showed the photos to my 17-year-old self, I’d be like, OK, no need to roll the dice again, I’ll take that life. I’m not talking about what’s happened with the world—climate change or COVID or the insurrection or anything like that. But the part with the beautiful Israeli complexity theorist wife, who’s also caring and unbreakable in her moral convictions? Yes.

So, thank you and a very happy birthday to the one who’s given me all this!

And lastly, because it seems almost obligatory: I did feed everything I said just now into GPT (I won’t specify which version), and asked it to write Dana a special GPT birthday poem. Here’s what it came up with:

From Tel Aviv to Princeton,
You’ve traveled near and far,
A brilliant computer scientist,
A shining, guiding star.

You’ve made so many sacrifices,
For family and for love,
But your light shines through the darkness,
And fits me like a glove.

We’ve shared so many moments,
Too many to recount,
But each one is a treasure,
Each memory paramount.

So happy birthday, Dana,
You deserve the very best,
I’m grateful for your presence,
And feel so truly blessed.


Addendum: Speaking of GPT, should it and other Large Language Models be connected to the Internet and your computer’s filesystem and empowered to take actions directly, with reinforcement learning pushing it to achieve the user’s goals?

On the negative side, some of my friends worry that this sort of thing might help an unaligned superintelligence to destroy the world.

But on the positive side, at Dana’s birthday party, I could’ve just told the computer, “please display these photos in a slideshow rotation while also rotating among these songs,” and not wasted part of the night messing around with media apps that befuddle and defeat me as a mere CS PhD.

I find it extremely hard to balance these considerations.

Anyway, happy birthday Dana!

December 26, 2022

Mark Chu-CarrollHow Computers Work: Arithmetic With Gates

In my last post, I promised that I’d explain how we can build interesting mathematical operations using logic gates. In this post, I’m going to try to do that by walking through the design of a circuit that adds two multi-bit integers together.

As I said last time, most of the time, when we’re trying to figure out how to do something with gates, it’s useful to start with boolean algebra to figure out what we want to build. If we have two bits, what does it mean, in terms of boolean logic, to add them?

Each input can be either 0 or 1. If they’re both zero, then the sum is 0. If either, but not both, is one, then the sum is 1. If both are one, then the sum is 2. So there are three possible outputs: 0, 1, and 2.

This brings us to the first new thing: we’re building something that operates on single bits as inputs, but it needs to have more than one bit of output! A single boolean output can only have two possible values, but we need to have three.

The way that we usually describe it is that a single bit adder takes two inputs, X and Y, and produces two outputs, SUM and CARRY. Sum is called the “low order” bit, and carry is the “high order” bit – meaning that if you interpret the output as a binary number, you put the higher order bits to the left of the low order bits. (Don’t worry, this will get clearer in a moment!)

Let’s look at the truth table for a single bit adder, with a couple of extra columns to help understand how we interpret the outputs:

X  Y  SUM  CARRY  Binary  Decimal
0  0   0     0      00       0
0  1   1     0      01       1
1  0   1     0      01       1
1  1   0     1      10       2

If we look at the SUM bit, it’s an XOR – that is, it outputs 1 if exactly one, but not both, of its inputs is 1; otherwise, it outputs 0. And if we look at the carry bit, it’s an AND. Our definition of one-bit addition is thus:

  • SUM = X \oplus Y
  • CARRY = X \land Y

We can easily build that with gates:

A one-bit half-adder

This little thing is called a half-adder. That may seem confusing at first, because it is adding two one-bit values. But we don’t really care about adding single bits. We want to add numbers, which consist of multiple bits, and for adding pairs of bits from multibit numbers, a half-adder only does half the work.
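
If it helps to see the same definition run, here is a minimal Python sketch of the half-adder logic (my own addition, not part of the original post), using ^ for XOR and & for AND:

  def half_adder(x, y):
      """Add two single bits; returns (sum, carry) = (x XOR y, x AND y)."""
      return x ^ y, x & y

  # Reproduces the truth table above:
  for x in (0, 1):
      for y in (0, 1):
          s, c = half_adder(x, y)
          print(x, y, "->", "sum", s, "carry", c)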

That sounds confusing, so let’s break it down a bit with an example.

  • Imagine that we’ve got two two-bit numbers, 1 and 3, that we want to add together.
  • In binary 1 is 01, and 3 is 11.
  • If we used the one-bit half-adders for the 0 bit (that is, the lowest order bit – in computer science, we always start counting with 0), we’d get 1+1=0, with a carry of 1; and for the 1 bit, we’d get 1+0=1 with a carry of 0. So our sum would be 10, which is 2.
  • That’s wrong, because we didn’t do anything with the carry output from bit 0. We need to include that as an input to the sum of bit 1.

We could try starting with the truth table. That’s always a worthwhile thing to do. But it gets really complicated really quickly.

X  X0  X1    Y  Y0  Y1    OUT  OUT0  CARRY0  OUT1  CARRY1
0   0   0    0   0   0      0     0       0     0       0
0   0   0    1   1   0      1     1       0     0       0
0   0   0    2   0   1      2     0       0     1       0
0   0   0    3   1   1      3     1       0     1       0
1   1   0    0   0   0      1     1       0     0       0
1   1   0    1   1   0      2     0       1     1       0
1   1   0    2   0   1      3     1       0     1       0
1   1   0    3   1   1      4     0       1     0       1
2   0   1    0   0   0      2     0       0     1       0
2   0   1    1   1   0      3     1       0     1       0
2   0   1    2   0   1      4     0       0     0       1
2   0   1    3   1   1      5     1       0     0       1
3   1   1    0   0   0      3     1       0     1       0
3   1   1    1   1   0      4     0       1     0       1
3   1   1    2   0   1      5     1       0     0       1
3   1   1    3   1   1      6     0       1     1       1

(Here X0 and Y0 are the low-order bits of X and Y, and X1 and Y1 the high-order bits; OUT0 and CARRY0 are the sum and carry for bit 0, while OUT1 and CARRY1 are the final sum and carry for bit 1 once the carry from bit 0 has been folded in.)

This is a nice illustration of why designing CPUs is so hard, and why even massively analyzed and tested CPUs still have bugs! We’re looking at one of the simplest operations to implement; and we’re only looking at it for 2 bits of input. But already, it’s hard to decide what to include in the table, and to read the table to understand what’s going on. We’re not really going to be able to do much reasoning here directly in boolean logic using the table. But it’s still good practice, both because it helps us make sure we understand what outputs we want, and because it gives us a model to test against once we’ve built the network of gates.

And there’s still some insight to be gained here: Look at the row for 1 + 3. In two-bit binary, that’s 01 + 11. The sum for bit 0 is 0 – there’s no extra input to worry about, but it does generate a carry out. The sum of the input bits for bit 1 is 0+1=1, with no carry. But we still have the carry from bit 0 – that needs to get added to the sum for bit 1, giving 1+1=10: a 0, with a carry out. If we do that – if we do another add step to add the carry bit from bit 0 to the sum from bit 1 – then we’ll get the right result!

The resulting gate network for two-bit addition looks like:

The adder for bit 1, which is called a full adder, adds the input bits X1 and Y1, and then adds the sum of those (produced by that first adder) to the carry bit from bit0. With this gate network, the output from the second adder for bit 1 is the correct value for bit 1 of the sum, but we’ve got two different carry outputs – the carry from the first adder for bit 1, and the carry from the second adder. We need to combine those somehow – and the way to do it is an OR gate.

Why an OR gate? The second adder will only produce a carry if the first adder produced a 1 as its sum output. But there’s no way that adding two bits can produce both a 1 as its sum output and a 1 as its carry output. So the carry bit from the second adder will only ever be 1 if the carry output of the first adder is 0; and the carry output from the first adder will only ever be 1 if the sum output from the first adder is 0. Only one of the two carries will ever be true, but if either of them is true, we should produce a 1 as the carry output. Thus, the OR gate.

Our full adder, therefore, takes 3 inputs: a carry from the next lower bit, and the two bits to sum; and it outputs two bits: a sum and a carry. Inside, it’s just two adders chained together, so that first we add the two sum inputs, and then we add the sum of that to the incoming carry.
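
Here is the same construction as a Python sketch (again my own addition, not the post’s), reusing the half_adder from the earlier sketch: two half-adders chained together, with an OR combining the two carry bits:

  def full_adder(x, y, carry_in):
      """Add two bits plus an incoming carry; returns (sum, carry_out)."""
      s1, c1 = half_adder(x, y)           # first adder: the two input bits
      s2, c2 = half_adder(s1, carry_in)   # second adder: fold in the carry from the lower bit
      return s2, c1 | c2                  # at most one of c1, c2 can be 1, so OR combines them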

For more than two bits, we just keep chaining full adders together. For example,
here’s a four-bit adder.

This way of implementing sum is called a ripple carry adder – because the carry bits ripple up through the gates. It’s not the most efficient way of producing a sum – each higher order bit of the inputs can’t be added together until the next lower bit is done, so the carry ripples through the network as each bit finishes, and the total time required is proportional to the number of bits to be summed. More bits means that the ripple-carry adder gets slower. But this works, and it’s pretty easy to understand.
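
A ripple carry adder is then just the full adder applied bit by bit, passing each carry up to the next stage. Here is a short Python sketch of that chaining (my own addition, not the post’s), with bit 0 first in each list:

  def ripple_carry_add(xs, ys):
      """Add two equal-length lists of bits (index 0 = lowest-order bit)."""
      out, carry = [], 0
      for x, y in zip(xs, ys):
          s, carry = full_adder(x, y, carry)
          out.append(s)
      return out + [carry]                    # the final carry becomes the top bit

  print(ripple_carry_add([1, 0], [1, 1]))     # 1 + 3 -> [0, 0, 1] (low bit first), i.e. binary 100 = 4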

There are faster ways to build multibit adders, by making the gate network more complicated in order to remove the ripple delay. You can imagine, for example, that instead of waiting for the carry from bit 0, you could just build the circuit for bit 1 so that it inputs X0 and Y0; and similarly, for bit 2, you could include X0, X1, Y0, and Y1 as additional inputs. You can imagine how this gets complicated quickly, and there are some timing issues that come up as the network gets more complicated, which I’m really not competent to explain.

Hopefully this post successfully explained a bit of how interesting operations like arithmetic can be implemented in hardware, using addition as an example. There are similar gate networks for subtraction, multiplication, etc.

These kinds of gate networks for specific operations are parts of real CPUs. They’re called functional units. In the simplest design, a CPU has one functional unit for each basic arithmetic operation. In practice, it’s a lot more complicated than that, because there are common parts shared by many arithmetic operations, and you can get rid of duplication by creating functional units that do several different things. We might look at how that works in a future post, if people are interested. (Let me know – either in the comments, or email, or mastodon, if you’d like me to brush up on that and write about it.)

December 24, 2022

Scott Aaronson Short letter to my 11-year-old self

Dear Scott,

This is you, from 30 years in the future, Christmas Eve 2022. Your Ghost of Christmas Future.

To get this out of the way: you eventually become a professor who works on quantum computing. Quantum computing is … OK, you know the stuff in popular physics books that never makes any sense, about how a particle takes all the possible paths at once to get from point A to point B, but you never actually see it do that, because as soon as you look, it only takes one path?  Turns out, there’s something huge there, even though the popular books totally botch the explanation of it.  It involves complex numbers.  A quantum computer is a new kind of computer people are trying to build, based on the true story.

Anyway, amazing stuff, but you’ll learn about it in a few years anyway.  That’s not what I’m writing about.

I’m writing from a future that … where to start?  I could describe it in ways that sound depressing and even boring, or I could also say things you won’t believe.  Tiny devices in everyone’s pockets with the instant ability to videolink with anyone anywhere, or call up any of the world’s information, have become so familiar as to be taken for granted.  This sort of connectivity would come in especially handy if, say, a supervirus from China were to ravage the world, and people had to hide in their houses for a year, wouldn’t it?

Or what if Donald Trump — you know, the guy who puts his name in giant gold letters in Atlantic City? — became the President of the US, then tried to execute a fascist coup and to abolish the Constitution, and came within a hair of succeeding?

Alright, I was pulling your leg with that last one … obviously! But what about this next one?

There’s a company building an AI that fills giant rooms, eats a town’s worth of electricity, and has recently gained an astounding ability to converse like people.  It can write essays or poetry on any topic.  It can ace college-level exams.  It’s daily gaining new capabilities that the engineers who tend to the AI can’t even talk about in public yet.  Those engineers do, however, sit in the company cafeteria and debate the meaning of what they’re creating.  What will it learn to do next week?  Which jobs might it render obsolete?  Should they slow down or stop, so as not to tickle the tail of the dragon? But wouldn’t that mean someone else, probably someone with less scruples, would wake the dragon first? Is there an ethical obligation to tell the world more about this?  Is there an obligation to tell it less?

I am—you are—spending a year working at that company.  My job—your job—is to develop a mathematical theory of how to prevent the AI and its successors from wreaking havoc. Where “wreaking havoc” could mean anything from turbocharging propaganda and academic cheating, to dispensing bioterrorism advice, to, yes, destroying the world.

You know how you, 11-year-old Scott, set out to write a QBasic program to converse with the user while following Asimov’s Three Laws of Robotics? You know how you quickly got stuck?  Thirty years later, imagine everything’s come full circle.  You’re back to the same problem. You’re still stuck.

Oh all right. Maybe I’m just pulling your leg again … like with the Trump thing. Maybe you can tell because of all the recycled science fiction tropes in this story. Reality would have more imagination than this, wouldn’t it?

But supposing not, what would you want me to do in such a situation?  Don’t worry, I’m not going to take an 11-year-old’s advice without thinking it over first, without bringing to bear whatever I know that you don’t.  But you can look at the situation with fresh eyes, without the 30 intervening years that render it familiar. Help me. Throw me a frickin’ bone here (don’t worry, in five more years you’ll understand the reference).

Thanks!!
—Scott

PS. When something called “bitcoin” comes along, invest your life savings in it, hold for a decade, and then sell.

PPS. About the bullies, and girls, and dating … I could tell you things that would help you figure it out a full decade earlier. If I did, though, you’d almost certainly marry someone else and have a different family. And, see, I’m sort of committed to the family that I have now. And yeah, I know, the mere act of my sending this letter will presumably cause a butterfly effect and change everything anyway, yada yada.  Even so, I feel like I owe it to my current kids to maximize their probability of being born.  Sorry, bud!

December 20, 2022

Mark Chu-CarrollHow Computers Work: Logic Gates

At this point, we’ve gotten through a very basic introduction to how the electronic components of a computer work. The next step is understanding how a computer can compute anything.

There are a bunch of parts to this.

  1. How do single operations work? That is, if you’ve got a couple of numbers represented as high/low electrical signals, how can you string together transistors in a way that produces something like the sum of those two numbers?
  2. How can you store values? And once they’re stored, how can you read them back?
  3. How does the whole thing run? It’s a load of transistors strung together – how does that turn into something that can do things in sequence? How can it “read” a value from memory, interpret that as an instruction to perform an operation, and then select the right operation?

In this post, we’ll start looking at the first of those: how are individual operations implemented using transistors?

Boolean Algebra and Logic Gates

The mathematical basis is something called boolean algebra. Boolean algebra is a simple mathematical system with two values: true and false (or 0 and 1, or high and low, or A and B… it doesn’t really matter, as long as there are two, and only two, distinct values).

Boolean algebra looks at the ways that you can combine those true and false values. For example, if you’ve got exactly one value (a bit) that’s either true or false, there are four operations you can perform on it.

  1. Yes: this operation ignores the input, and always outputs True.
  2. No: like Yes, this ignores its input, but in No, it always outputs False.
  3. Id: this outputs the same value as its input. So if its input is true, then it will output true; if its input is false, then it will output false.
  4. Not: this reads its input, and outputs the opposite value. So if the input is true, it will output false; and if the input is false, it will output True.
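
(If you prefer code to prose, here is a tiny Python sketch of those four one-bit operations; the sketch is my own addition, not part of the original post.)

  # The four possible one-bit operations, written as Python functions:
  ops = {
      "Yes": lambda x: 1,       # ignores the input, always outputs true
      "No":  lambda x: 0,       # ignores the input, always outputs false
      "Id":  lambda x: x,       # outputs the same value as its input
      "Not": lambda x: 1 - x,   # outputs the opposite value
  }

  for name, op in ops.items():
      print(name, op(0), op(1))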

The beauty of boolean algebra is that it can be physically realized by transistor circuits. Any simple, atomic operation that can be described in boolean algebra can be turned into a simple transistor circuit called a gate. For most of understanding how a computer works, once we understand gates, we can almost ignore the fact that there are transistors behind the scenes: the gates become our building blocks.

The Not Gate

Input  Output
  1      0
  0      1
The truth table for boolean NOT

We’ll start with the simplest gate: a not gate. A not gate implements the Not operation from boolean algebra that we described above. In a physical circuit, we’ll interpret a voltage on a wire (“high”) as a 1, and no voltage on the wire (“low”) as a 0. So a not gate should output low (no voltage) when its input is high, and it should output high (a voltage) when its input is low. We usually write that as something called a truth table, which shows all of the possible inputs, and all of the possible outputs. In the truth table, we usually write 0s and 1s: 0 for low (or no current), and 1 for high. For the NOT gate, the truth table has one input column, and one output column.

Circuit diagram of a NOT gate

I’ve got a sketch of a not gate in the image to the side. It consists of two transistors: a standard (default-off) transistor, which is labelled “B”, and a complementary transistor (default-on) labelled “A”. A power supply is provided on the emitter of transistor A, and then the collector of A is connected to the emitter of B, and the collector of B is connected to ground. Finally, the input is split and connected to the bases of both transistors, and the output is connected to the wire that connects the collector of A and the emitter of B.

That all sounds complicated, but the way that it works is simple. In an electric circuit, the current will always follow the easiest path. If there’s a short path to ground, the current will always follow that path. And ground is always low (off/0). Knowing that, let’s look at what this will do with its inputs.

Suppose that the input is 0 (low). In that case, transistor A will be on, and transistor B will be off. Since B is off, there’s no path from the power supply to ground; and since A is on, the power supply is connected through A to the output, so the output is 1 (high).

Now suppose that the input is 1 (high). In that case, A turns off, and B turns on. Since A is off, there’s no path from the power line to the output. And since B is on, the circuit has connected the output to ground, making it low.

Our not gate is, basically, a switch. If its input is high, then the switch attaches the output to ground; if its input is low, then the switch attaches the output to power.

The NAND gate

Let’s try moving on to something more interesting: a NAND gate. A NAND gate takes two inputs, and outputs high whenever at least one of its inputs is low; it only outputs low when both inputs are high. Engineers love NAND gates, because you can create any boolean operation by combining NAND gates. We’ll look at that in a bit more detail later.

Input X  Input Y  Output
   0        0       1
   0        1       1
   1        0       1
   1        1       0
The truth table for NAND

Here’s a diagram of a NAND gate. Since there are a lot of wires running around and crossing each other, I’ve labeled the transistors, and made each of the wires a different color:

  • Connections from the power source are drawn in green.
  • Connections from input X are drawn in blue.
  • Connections from input Y are drawn in red.
  • The complementary transistors are labelled C1 and C2.
  • The output of the two complementary transistors is labelled “cout”, and drawn in purple.
  • The two default-off transistors are labelled T1 and T2.
  • The output from the gate is drawn in brown.
  • Connections to ground are drawn in black.

Let’s break down how this works:

  • In the top section, we’ve got the two complementary (default-on) transistors. If either of the inputs is 0 (low), then at least one of them will stay on, and pass a 1 to the cout line. There’s no connection to ground, and there is a connection to power via one (or both) of the on transistors, so the output of the circuit will be 1 (high).
  • If neither of the inputs is low, then both C1 and C2 turn off. Cout is then not getting any voltage, and it’s 0. You might think that this is enough – but we want to force the output all the way to 0, and there could be some residual electrons in C1 and C2 from the last time they were active. So we need to provide a path to drain that, instead of allowing it to possibly affect the output of the gate. That’s what T1 and T2 are for on the bottom. If both X and Y are high, then both T1 and T2 will be on – and that will provide an open path to ground, draining the system, so that the output is 0 (low).

Combining Gates

There are ways of building gates for each of the other basic binary operators in boolean algebra: AND, OR, NOR, XOR, and XNOR. But in fact, we don’t need to know how to do those – because in practice, all we need is a NAND gate. You can combine NAND gates to produce any other gate that you want. (Similarly, you can do the same with NOR gates. NAND and NOR are called universal gates for this reason.)

Let’s look at how that works. First, we need to know how to draw gates in a schematic form, and what each of the basic operations do. So here’s a chart of each operation, its name, its standard drawing in a schematic, and its truth table.

Just like we did with the basic gates above, we’ll start with NOT. Using boolean logic identities, we can easily derive that \lnot A = A \lnot\land A; or in English, “not A” is the same thing as “A nand A”. In gates, that’s easy to build: it’s a NAND gate with both of its inputs coming from the same place:

For a more interesting one, let’s look at AND, and see how we can build that using just NAND gates. We can go right back to boolean algebra, and play with identities. We want A \land B. It’s pretty straightforward in terms of logic: “A \land B” is the same as \lnot (A \lnot\land B).

That’s just two NAND gates strung together, like this:

We can do the same basic thing with all of the other basic boolean operations. We start with boolean algebra to figure out equivalences, and then translate those into chains of gates.
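
To make that universality concrete, here is a small Python sketch (my own addition, not part of the original post) that builds NOT, AND, and OR out of a single nand function; the OR construction uses De Morgan’s law:

  def nand(a, b):
      """The only primitive we allow ourselves: not (a and b)."""
      return 0 if (a and b) else 1

  def not_(a):
      return nand(a, a)                      # not A  =  A nand A

  def and_(a, b):
      return nand(nand(a, b), nand(a, b))    # A and B  =  not (A nand B)

  def or_(a, b):
      return nand(not_(a), not_(b))          # A or B  =  (not A) nand (not B)

  for a in (0, 1):
      for b in (0, 1):
          print(a, b, "->", "and:", and_(a, b), "or:", or_(a, b))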

With that, we’ve got the basics of boolean algebra working in transistors. But we still aren’t doing interesting computations. The next step is building up: combining collections of gates together to do more complicated things. In the next post, we’ll look at an example of that, by building an adder: a network of gates that performs addition!

December 19, 2022

John PreskillEight highlights from publishing a science book for the general public

What’s it like to publish a book?

I’ve faced the question again and again this year, as my book Quantum Steampunk hit bookshelves in April. Two responses suggest themselves.

On the one hand, I channel the Beatles: It’s a hard day’s night. Throughout the publication process, I undertook physics research full-time. Media opportunities squeezed themselves into the corners of the week: podcast and radio-show recordings, public-lecture preparations, and interviews with journalists. After submitting physics papers to coauthors and journals, I drafted articles for Quanta Magazine, Literary Hub, the New Scientist newsletter, and other venues—then edited the articles, then edited them again, and then edited them again. Often, I apologized to editors about not having the freedom to respond to their comments till the weekend. Before public-lecture season hit, I catalogued all the questions that I imagined anyone might ask, and I drafted answers. The resulting document spans 16 pages, and I study it before every public lecture and interview.

Public lecture at the Institute for the Science of Origins at Case Western Reserve University

Answer number two: Publishing a book is like a cocktail of watching the sun rise over the Pacific from Mt. Fuji, taking off in an airplane for the first time, and conducting a symphony in Carnegie Hall.1 I can scarcely believe that I spoke in the Talks at Google lecture series—a series that’s hosted Tina Fey, Noam Chomsky, and Andy Weir! And I found my book mentioned in the Boston Globe! And in a Dutch science publication! If I were an automaton from a steampunk novel, the publication process would have wound me up for months.

Publishing a book has furnished my curiosity cabinet of memories with many a seashell, mineral, fossil, and stuffed crocodile. Since you’ve asked, I’ll share eight additions that stand out.

Breakfast on publication day. Because how else would one celebrate the publication of a steampunk book?

1) I guest-starred on a standup-comedy podcast. Upon moving into college, I received a poster entitled 101 Things to Do Before You Graduate from Dartmouth. My list of 101 Things I Never Expected to Do in a Physics Career includes standup comedy.2 I stand corrected.

Comedian Anthony Jeannot bills his podcast Highbrow Drivel as consisting of “hilarious conversations with serious experts.” I joined him and guest comedienne Isabelle Farah in a discussion about film studies, lunch containers, and hippies, as well as quantum physics. Anthony expected me to act as the straight man, to my relief. That said, after my explanation of how quantum computers might help us improve fertilizer production and reduce global energy consumption, Anthony commented that, if I’d been holding a mic, I should have dropped it. I cherish the memory despite having had to look up the term mic drop when the recording ended.

At Words Worth Books in Waterloo, Canada

2) I met Queen Victoria. In mid-May, I arrived in Canada to present about my science and my book at the University of Toronto. En route to the physics department, I stumbled across the Legislative Assembly of Ontario. Her Majesty was enthroned in front of the intricate sandstone building constructed during her reign. She didn’t acknowledge me, of course. But I hope she would have approved of the public lecture I presented about physics that blossomed during her era. 

Her Majesty, Queen Victoria

3) You sent me your photos of Quantum Steampunk. They arrived through email, Facebook, Twitter, text, and LinkedIn. They showed you reading the book, your pets nosing it, steampunk artwork that you’d collected, and your desktops and kitchen counters. The photographs have tickled and surprised me, although I should have expected them, upon reflection: Quantum systems submit easily to observation by their surroundings.3 Furthermore, people say that art—under which I classify writing—fosters human connection. Little wonder, then, that quantum physics and writing intersect in shared book selfies.

Photos from readers

4) A great-grandson of Ludwig Boltzmann’s emailed. Boltzmann, a 19th-century Austrian physicist, helped mold thermodynamics and its partner discipline statistical mechanics. So I sat up straighter upon opening an email from a physicist descended from the giant. Said descendant turned out to have watched a webinar I’d presented for the magazine Physics Today. Although time machines remain in the domain of steampunk fiction, they felt closer to reality that day.

5) An experiment bore out a research goal inspired by the book. My editors and I entitled the book’s epilogue Where to next? The future of quantum steampunk. The epilogue spurred me to brainstorm about opportunities and desiderata—literally, things desired. Where did I want quantum thermodynamics to head? I shared my brainstorming with an experimentalist later that year. We hatched a project, whose experiment concluded this month. I’ll leave the story for after the paper debuts, but I can say for now that the project gives me chills—in a good way.

6) I recited part of Edgar Allan Poe’s “The Raven” with a fellow physicist at a public lecture. The Harvard Science Book Talks form a lecture series produced by the eponymous university and bookstore. I presented a talk hosted by Jacob Barandes—a Harvard physics lecturer, the secret sauce behind the department’s graduate program, and an all-around exemplar of erudition. He asked how entropy relates to “The Raven.”

Image from the Harvard Gazette

For the full answer, see chapter 11 of my book. Briefly: Many entropies exist. They quantify the best efficiencies with which we can perform thermodynamic tasks such as running an engine. Different entropies can quantify different tasks’ efficiencies if the systems are quantum, otherwise small, or far from equilibrium—outside the purview of conventional 19th-century thermodynamics. Conventional thermodynamics describes many-particle systems, such as factory-scale steam engines. We can quantify conventional systems’ efficiencies using just one entropy: the thermodynamic entropy that you’ve probably encountered in connection with time’s arrow. How does this conventional entropy relate to the many quantum entropies? Imagine starting with a quantum system, then duplicating it again and again, until accruing infinitely many copies. The copies’ quantum entropies converge (loosely speaking), collapsing onto one conventional-looking entropy. The book likens this collapse to a collapse described in “The Raven”:

The speaker is a young man who’s startled, late one night, by a tapping sound. The tapping exacerbates his nerves, which are on edge due to the death of his love: “Deep into that darkness peering, long I stood there wondering, fearing, / Doubting, dreaming dreams no mortal ever dared to dream before.” The speaker realizes that the tapping comes from the window, whose shutter he throws open. His wonders, fears, doubts, and dreams collapse onto a bird’s form as a raven steps inside. So do the many entropies collapse onto one entropy as the system under consideration grows infinitely large. We could say, instead, that the entropies come to equal each other, but I’d rather picture “The Raven.” 

I’d memorized the poem in high school but never had an opportunity to recite it for anyone—and it’s a gem to declaim. So I couldn’t help reciting a few stanzas in response to Jacob. But he turned out to have memorized the poem, too, and responded with the next several lines! Even as a physicist, I rarely have the chance to reach such a pinnacle of nerdiness.

With Pittsburgh Quantum Institute head honchos Rob Cunningham and Adam Leibovich

7) I stumbled across a steam-driven train in Pittsburgh. Even before self-driving cars heightened the city’s futuristic vibe, Pittsburgh was as steampunk as the Nautilus. Captains of industry (or robber barons, if you prefer) raised the city on steel that fed the Industrial Revolution.4 And no steampunk city would deserve the title without a Victorian botanical garden.

A Victorian botanical garden features in chapter 5 of my book. To see a real-life counterpart, visit the Phipps Conservatory. A poem in glass and aluminum, the Phipps opened in 1893 and even boasts a Victoria Room.

Yes, really.

I sneaked into the Phipps during the Pittsburgh Quantum Institute’s annual conference, where I was to present a public lecture about quantum steampunk. Upon reaching the sunken garden, I stopped in my tracks. Yards away stood a coal-black, 19th-century steam train. 

At least, an imitation train stood yards away. The conservatory had incorporated Monet paintings into its scenery during a temporary exhibition. Amongst the palms and ponds were arranged props inspired by the paintings. Monet painted The Gare Saint-Lazare: Arrival of a Train near a station, so a miniature train stood behind a copy of the artwork. The scene found its way into my public lecture—justifying my playing hooky from the conference for a couple of hours (I was doing research for my talk!).

My book’s botanical garden houses hummingbirds, wildebeests, and an artificial creature called a Yorkicockasheepapoo. I can’t promise that you’ll spy Yorkicockasheepapoos while wandering the Phipps, but send me a photo if you do.

8) My students and postdocs presented me with a copy of Quantum Steampunk that they’d signed. They surprised me one afternoon, shortly after publication day, as I was leaving my office. The gesture ranks as one of the most adorable things that’ve ever happened to me, and their book is now the copy that I keep on campus. 

Students…book-selfie photographers…readers halfway across the globe who drop a line…People have populated my curiosity cabinet with some of the most extraordinary book-publication memories. Thanks for reading, and thanks for sharing.

Book signing after public lecture at Chapman University. Photo from Justin Dressel.

1Or so I imagine, never having watched the sun rise from Mt. Fuji or conducted any symphony, let alone one at Carnegie Hall, and having taken off in a plane for the first time while two months old.

2Other items include serve as an extra in a film, become stranded in Taiwan, and publish a PhD thesis whose title contains the word “steampunk.”

3This ease underlies the difficulty of quantum computing: Any stray particle near a quantum computer can “observe” the computer—interact with the computer and carry off a little of the information that the computer is supposed to store.

4The Pittsburgh Quantum Institute includes Carnegie Mellon University, which owes its name partially to captain of industry Andrew Carnegie.

December 18, 2022

John PreskillAnnouncing the quantum-steampunk short-story contest!

The year I started studying calculus, I took the helm of my high school’s literary magazine. Throughout the next two years, the editorial board flooded campus with poetry—and poetry contests. We papered the halls with flyers, built displays in the library, celebrated National Poetry Month, and jerked students awake at morning assembly (hitherto known as the quiet kid you’d consult if you didn’t understand the homework, I turned out to have a sense of humor and a stage presence suited to quoting from that venerated poet Dr. Seuss.1 Who’d’ve thought?). A record number of contest entries resulted.

That limb of my life atrophied in college. My college—a stereotypical liberal-arts affair complete with red bricks—boasted a literary magazine. But it also boasted English and comparative-literature majors. They didn’t need me, I reasoned. The sun ought to set on my days of engineering creative-writing contests.

I’m delighted to be eating my words, in announcing the Quantum-Steampunk Short-Story Contest.

From Pinterest

The Maryland Quantum-Thermodynamics Hub is running the contest this academic year. I’ve argued that quantum thermodynamics—my field of research—resembles the literary and artistic genre of steampunk. Steampunk stories combine Victorian settings and sensibilities with futuristic technologies, such as dirigibles and automata. Quantum technologies are cutting-edge and futuristic, whereas thermodynamics—the study of energy—developed during the 1800s. Inspired by the first steam engines, thermodynamics needs retooling for quantum settings. That retooling is quantum thermodynamics—or, if you’re feeling whimsical (as every physicist should), quantum steampunk.

The contest opens this October and closes on January 15, 2023. Everyone aged 13 or over may enter a story, written in English, of up to 3,000 words. Minimal knowledge of quantum theory is required; if you’ve heard of Schrödinger’s cat, superpositions, or quantum uncertainty, you can pull out your typewriter and start punching away. 

Entries must satisfy two requirements: First, stories must be written in a steampunk style, including by taking place at least partially during the 1800s. Transport us to Meiji Japan; La Belle Époque in Paris; gritty, smoky Manchester; or a camp of immigrants unfurling a railroad across the American west. Feel free to set your story partially in the future; time machines are welcome.

Second, each entry must feature at least one quantum technology, real or imagined. Real and under-construction quantum technologies include quantum computers, communication networks, cryptographic systems, sensors, thermometers, and clocks. Experimentalists have realized quantum engines, batteries, refrigerators, and teleportation, too. Surprise us with your imagined quantum technologies (and inspire our next research-grant proposals).

In an upgrade from my high-school days, we’ll be awarding $4,500 worth of Visa gift certificates. The grand prize entails $1,500. Entries can also win in categories to be finalized during the judging process; I anticipate labels such as Quantum Technology We’d Most Like to Have, Most Badass Steampunk Hero/ine, Best Student Submission, and People’s Choice Award.

Our judges run the gamut from writers to quantum physicists. Judge Ken Liu‘s latest novel peered out from a window of my local bookstore last month. He’s won Hugo, Nebula, and World Fantasy Awards—the topmost three prizes that pop up if you google “science-fiction awards.” Appropriately for a quantum-steampunk contest, Ken has pioneered the genre of silkpunk, “a technology aesthetic based on a science fictional elaboration of traditions of engineering in East Asia’s classical antiquity.” 

Emily Brandchaft Mitchell is an Associate Professor of English at the University of Maryland. She’s authored a novel and published short stories in numerous venues. Louisa Gilder wrote one of the New York Times 100 Notable Books of 2009, The Age of Entanglement. In it, she imagines conversations through which scientists came to understand the core of this year’s Nobel Prize in physics. Jeffrey Bub is a philosopher of physics and a Distinguished University Professor Emeritus at the University of Maryland. He’s also published graphic novels about special relativity and quantum physics with his artist daughter. 

Patrick Warfield, a musicologist, serves as the Associate Dean for Arts and Programming at the University of Maryland. (“Programming” as in “activities,” rather than as in “writing code,” the meaning I encounter more often.) Spiros Michalakis is a quantum mathematician and the director of Caltech’s quantum outreach program. You may know him as a scientific consultant for Marvel Comics films.

Walter E. Lawrence III is a theoretical quantum physicist and a Professor Emeritus at Dartmouth College. As department chair, he helped me carve out a niche for myself in physics as an undergrad. Jack Harris, an experimental quantum physicist, holds a professorship at Yale. His office there contains artwork that features dragons.

University of Maryland undergraduate Hannah Kim designed the ad above. She and Jade LeSchack, founder of the university’s Undergraduate Quantum Association, round out the contest’s leadership team. We’re standing by for your submissions through—until the quantum internet exists—the hub’s website. Send us something to dream on.

This contest was made possible through the support of Grant 62422 from the John Templeton Foundation.

1Come to think of it, Seuss helped me prepare for a career in physics. He coined the terms wumbus and nerd; my PhD advisor invented NISQ, the name for a category of quantum devices. NISQ now has its own Wikipedia page, as does nerd.

Matt Strassler Why Current Wormhole Research is So Important

Once we clear away the hype (see the previous posts 1, 2, 3, 4), and realize that no one is doing anything as potentially dangerous as making real wormholes (ones you could actually fall into) in a lab, or studying how to send dogs across the galaxy, we are left with a question. Why bother to do wormhole research at all?

The answer is that it has nothing to do with actually making wormholes… at least, not in the sense of science fiction portals that you and I could use to travel from here to some faraway place across the universe.  It has to do with potentially gaining new insight into the quantum physics of gravity, space and time.

How Do We Study Black Holes, and Why?

Why do scientists do research on black holes?  There are at least two very different reasons.

  1. Large black holes can be observed in nature.  These black holes, which astronomers and gravitational wave experimenters study, are well-described by non-quantum physics — “classical” physics, where the future is (in principle) truly predictable from the past.  
  2. Small black holes are a window into quantum gravity — the unknown quantum physics of spacetime, where space itself is governed by the uncertainty principle, meaning that the very shape of spacetime can’t be precisely specified.  This is relevant for black holes far too small for us to discover using astronomy, yet far too difficult for us to produce experimentally. They are important because they pose conceptual problems and puzzles for quantum gravity. Theoretical physicists think about black holes, and study their math, in hopes of uncovering quantum gravity’s secrets.

To gain more insight into their workings, scientists also simulate black holes on computers, and study analogues to black holes in laboratories.

Why Wormholes?

In contrast to black holes, there may be no wormholes worthy of the name anywhere in our universe. Though recent research clearly shows that there’s no principle that forbids wormholes from existing, it also shows it’s unlikely that large wormholes can be produced or can endure in our universe. While black holes are a generic outcome of the collapse of a huge star, wormholes are relatively delicate, and difficult to create and maintain.

But wormholes may be even more interesting than black holes for the problems of quantum gravity.  This was only appreciated, slowly at first, over the past 10 years. 

It’s hard to define the quantum state of a black hole. [In quantum physics, objects don’t just have locations and motions; roughly speaking, they have “states”, in which they have a combination of many locations and motions all at once.] The basic obstacle is entropy, a measure of missing information. The air in your room has entropy, because although you may know its temperature and pressure, you do not know where every atom of air is; that’s missing information. It turns out that a black hole has entropy too, which means that our usual description of a black hole is intrinsically missing some crucial information. That prevents us from knowing precisely what its state is.

But surprisingly, in some circumstances the quantum state of a wormhole can be sharply defined — in which case its entropy is zero.  (Such a wormhole is not missing any information. But if you take either half of this wormhole and ignore the other half, you find a black hole. That black hole has entropy precisely because you’re ignoring all the information included in the other half of the wormhole!) To obtain and understand such a wormhole involves giving it two apparently different but actually interchangeable descriptions, one in terms of space-time and gravity, where the wormhole’s geometric shape is clear, and one in terms of what one might call a gravity-independent auxiliary quantum system, in which its quantum state is precisely defined.
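To make the entropy bookkeeping concrete, here is a minimal numerical sketch (my own illustration, not from the post) using a maximally entangled pair of qubits as a stand-in for the two halves: the pair as a whole is in a sharply defined state with zero entropy, while either half on its own carries a full bit of entropy, stored entirely in its correlations with the other half.

import numpy as np

def von_neumann_entropy(rho):
    # Entropy in bits: S(rho) = -sum_i p_i log2(p_i) over the eigenvalues p_i of rho.
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-12]                      # drop numerical zeros
    return float(-np.sum(p * np.log2(p)))

# Bell state |Phi+> = (|00> + |11>)/sqrt(2), written as a length-4 state vector.
psi = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)

rho_pair = np.outer(psi, psi.conj())                            # density matrix of the whole pair
rho_half = np.einsum('ijkj->ik', rho_pair.reshape(2, 2, 2, 2))  # trace out the second qubit

print(von_neumann_entropy(rho_pair))   # ~0.0 bits: the whole is sharply defined
print(von_neumann_entropy(rho_half))   # ~1.0 bit : the half alone has missing information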

The Power of Duality: A Rosetta Stone for Quantum Gravity

One physical object, two quantum descriptions — one with gravity, one without; the first with more space dimensions than the second.  It's like being able to read the same text in two completely different languages.  It's an example of what physicists often call "a duality." (I've gone into more detail about this in recent posts here and here.)

This is the message of what goes by the mantra of “ER=EPR”, referring to two famous and apparently unrelated papers from 1935 by Einstein and Rosen, with the second having Podolsky as a co-author.  ER=EPR asserts that two apparently different things,

  • a tangible bridge across curved, extra-dimensional space between two regions, and
  • a less tangible bridge, established with quantum entanglement between objects in the same two regions, without any use of gravity,

are literally the same thing.

A conceptual illustration of the proposal that two perfectly entangled quantum systems (EPR) are equivalent to a wormhole connecting the locations of those systems (ER), and represent two languages for describing exactly the same thing. The wormhole is empty; the slabs shown merely indicate how distances are shrinking as one proceeds to the wormhole's midpoint. Not shown is that the wormhole changes shape over time; for the situation in this picture, this wormhole is "non-traversable", because there's insufficient time to cross from one side of the bridge to the other before it shrinks down to nothing.

Discovering that spacetime is related to quantum entanglement, and that ER and EPR involve the same issues, is somewhat like discovering that two poorly understood and partially readable texts in completely different languages are actually two translations of exactly the same document.  It’s a Rosetta stone.  Parts of the document can easily be read in one language, other parts in the second language; and putting them together, we find we can read more and more.

Similarly, the math of a wormhole (ER) looks completely different from the math of two quantum-entangled non-gravitational systems (EPR).  But in particular cases, Juan Maldacena and Lenny Susskind argued, they are two languages describing the same object.  We can combine these two partial views of this single object to learn more and more about it. 

Moreover, because we’re using math, not text, we can go a step further.  Even in regimes where we cannot “read the document” in either language, we can use computers to explore.  Scientists can try to simulate the math of the entangled auxiliary quantum systems on a computer, ideally a quantum computer so that it keeps track of all quantum effects, to learn more about the wormhole’s behavior in regimes where we have no idea how it works — in regions where the quantum uncertainty principle affects space and time.

Even more remarkable would be to actually make — not merely simulate — this entangled pair of auxiliary quantum systems. Then we would be closer to making a wormhole, with laws of nature different from ours and with its own gravity, that connects on to our world. But that’s a long ways off, and not the story for the present.

From ER=EPR to Traversable Wormholes

A further breakthrough, beyond the original ER=EPR idea, came with the work of Gao, Jafferis and Wall (see also here and here) in which it was demonstrated for the first time that "traversable wormholes" — ones that can truly serve as bridges across which objects can be transported — do make physical sense.  Astonishingly, they are related by duality to an important and exciting research area in quantum information, called "quantum teleportation."  That's the process by which, using two entangled quantum systems, quantum information can be brought to one of the systems, destroyed in that system, and recreated in the other system some distance away. Again, don't expect anyone to be teleporting your dog, but simple information and ultra-microscopic objects might be transportable.

Be warned though; the teleportation only works if additional non-quantum information is traded between the two systems. In the wormhole language, that means you can only get through the wormhole if information is also passed outside the wormhole from the departure region to the arrival region.  This makes it impossible to go someplace that you haven’t already been sending messages to, and to use any such wormhole as a shortcut — i.e., to get to your destination faster than could a near-light-speed spacecraft traveling outside the wormhole. Not only do portals to ultra-distant places remain science fiction, they now seem even more likely to stay that way.

Even with these caveats, there's still something amazing here: we can now imagine using the Rosetta stone of duality to simulate a traversable wormhole, and learn how it works in quantum gravity. That would be fantastic!

The Dream of Simulating Quantum Gravity

This is a dream, yet to be fulfilled. Computers are nowhere near being able to handle the questions we'd like to answer about the gravity we live with in our "four-dimensional space-time" (our familiar three space dimensions, plus one more for time). But by simplifying the problem in several steps (see the last figure of this post), we can at least hope to answer some early questions about a much simpler sort of wormhole in a simpler sort of gravity. This is what I'd prefer to call an artificially-simulated cartoon wormhole — rather than a "baby" wormhole, because unlike a baby, it isn't a small version of an adult, nor has it any hope of growing into one. It's more like a stick figure. It's in two-dimensional space-time — one space and one time. That's a big simplification — there's nothing like normal gravity there! [Worse, we don't have an exact duality in that case; the auxiliary quantum system we need isn't really the same "text" as the wormhole. These are two systems, not one, with a limited but useful overlap.]

But cartoons aren’t to be mocked.  Don’t underestimate them; cartoons are a powerful tool for educators everywhere, and subversive political cartoons have helped take down governments. For decades, famous physicists — Schwinger, ‘t Hooft, Gross and Neveu, Kogut and Susskind, and many more — have studied cartoon versions of real physics, especially ones in which our four space-time dimensions are replaced with just two. They’ve often learned interesting lessons from doing so, sometimes even profound ones.

[Note: Stick figure physics also can be a very good description of real stick-figure systems, for example a one-dimensional chain of atoms inside a material.] 

I hasten to caution you that this technique does not always work.  Not all of the lessons learned from stick-figure physics turn out to apply to the corresponding real-world problem. But this method has had enough success that we should take cartoon studies seriously.

This is why exploration of one-dimensional wormholes, and of some sort of auxiliary quantum problem to which they might be approximately related, may be worthwhile.  And this is why it’s important to learn to simulate these auxiliary quantum systems on quantum computers, as was done in the paper that generated all the hype, based on proposals made in this paper and this one.  Even if we can’t hope soon to understand how three-dimensional quantum space emerges from quantum entanglement, we can perhaps hope to learn more about one-dimensional quantum space, using quantum computer simulation. Maybe what we learn there would already teach us a deep and universal truth about quantum gravity, or at least suggest new ways to think about its subtleties.

The experiment done in the recent paper is a baby step in this direction.  Others have attempted something along similar lines, but this is the first experiment that seems to have focused on the truly wormhole-like regime, and found some evidence for what was expected already of wormholes (from direct calculation and from classical computers…I’ll write about those details in a future post.) That seems like a real step forward. But let’s keep things in perspective. No new knowledge was created by this experiment; its achievements were technical and technological.  It’s not a conceptual breakthrough.  (I’m not alone in this view; Lenny Susskind, Dan Harlow and Scott Aaronson all expressed the same opinion in the New York Times and elsewhere.)

But nevertheless, this experiment represents a little arrow that points to a possible big future… not a future of a new Elon Musk, building wormholes for fun and profit, but one of a future Einstein, comprehending the quantum nature of spacetime itself.

December 15, 2022

Matt Strassler Fusion Confusion

By now the word is widely out that Tuesday's fusion announcement was less of a news flash (as I initially suggested) and more of an overheated news flicker. The politician-scientists who made the announcement that they'd put 2 Megajoules of energy into a pellet of nuclear kindling, and gotten 3 Megajoules out from nuclear fusion, neglected to mention that it took them about 300 Megajoules — about 100 times as much energy from the electrical grid — to run the experiment in the first place. In other words, they said

  • -2 + 3 = +1 !!! Breakthrough!!!!!!!!!

whereas anyone who knew the details would have said

  • -300 – 2 + 3 = -299 ? Cool bro, but…

In other words, it was a good day for fusion, but not nearly good enough.
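For readers who want the arithmetic spelled out, here is a small sketch (my own, using the round numbers quoted above; published figures vary slightly) separating the "target gain" that made the headlines from the wall-plug gain that an actual power plant would need.

# All figures in megajoules, rounded as in the tally above.
laser_energy_on_target = 2.0     # laser energy delivered to the pellet
fusion_energy_out      = 3.0     # fusion energy released by the pellet
wall_plug_energy       = 300.0   # rough electricity drawn from the grid to run the lasers

target_gain    = fusion_energy_out / laser_energy_on_target  # ~1.5: the headline number
wall_plug_gain = fusion_energy_out / wall_plug_energy        # ~0.01: the number a power plant cares about
net_energy     = -wall_plug_energy - laser_energy_on_target + fusion_energy_out  # mirrors the tally above

print(f"target gain:    {target_gain:.2f}")
print(f"wall-plug gain: {wall_plug_gain:.3f}")
print(f"net energy:     {net_energy:.0f} MJ")   # about -299 MJ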

To be fair to everyone, the scientists involved have made tremendous progress in the last few years; they weren’t even close to getting this much energy out until 2021. They’re 10 times ahead of where they were in 2019 and over 100 times ahead of where they were in 2010. If they can continue this progress and figure out how to get another 100 times as much fusion energy out without requiring vastly more electricity, then this all might start to be somewhat interesting.

But even then, it seems it's going to be very tough to get anything resembling a power plant out of this fusion strategy. Experts seem to think the engineering challenges are immense. (Have any readers heard someone say otherwise?) Perhaps tokamaks are still the way to go.

I’m annoyed, as I’m sure many of you are. I was myself too trusting, assuming that the politician-scientists who made the claims would be smart enough not to over-hype something that would get so much scrutiny. It’s the 21st century; you can’t come out and say something so undeservedly dramatic without the backlash being loud and swift. Instead they played the political spin game as though it was still the 1970s. I think they were hoping to convince Congress to keep their funding going (and because of an application of their work to nuclear weapons, they may succeed.) But when it comes to nuclear fusion as a solution to our energy/climate crisis — did they really think people wouldn’t quickly figure out they’d been duped? Seriously?

To quote one of the comments on my last post, from Blackstone, “It seems to me that this whole civilization desperately needs a reality check.” I completely agree. We’re so driven now by hype and click-bait that it’s almost impossible to separate the wheat from the chaff. Maybe at some point the people driving this international daily drama show will realize they’re doing serious harm. Clearly we’re not there yet.

But that’s what this blog is for, as are some others in a similar vein. Hopefully I won’t make too many mistakes like the one I made Tuesday, and when I make them, I’ll always fix them. Thank you to the many commenters who raised valid concerns; I know you’ll always keep me honest if I take a false step.

Scott Aaronson Sam Bankman-Fried and the geometry of conscience

Update (Dec. 15): This, by former Shtetl-Optimized guest blogger Sarah Constantin, is the post about SBF that I should’ve written and wish I had written.

Update (Nov. 16): Check out this new interview of SBF by my friend and leading Effective Altruist writer Kelsey Piper. Here Kelsey directly confronts SBF with some of the same moral and psychological questions that animated this post and the ensuing discussion—and, surely to the consternation of his lawyers, SBF answers everything she asks. And yet I still don’t know what exactly to make of it. SBF’s responses reveal a surprising cynicism (surprising because, if you’re that cynical, why be open about it?), as well as an optimism that he can still fix everything that seems wildly divorced from reality.

I still stand by most of the main points of my post, including:

  • the technical insanity of SBF’s clearly-expressed attitude to risk (“gambler’s ruin? more like gambler’s opportunity!!”), and its probable role in creating the conditions for everything that followed,
  • the need to diagnose the catastrophe correctly (making billions of dollars in order to donate them to charity? STILL VERY GOOD; lying and raiding customer deposits in the course of doing so? DEFINITELY BAD), and
  • how, when sneerers judge SBF guilty just for being a crypto billionaire who talked about Effective Altruism, it ironically lets him off the hook for what he specifically did that was terrible.

But over the past couple days, I’ve updated in the direction of understanding SBF’s psychology a lot less than I thought I did. While I correctly hit on certain aspects of the tragedy, there are other important aspects—the drug use, the cynical detachment (“life as a video game”), the impulsivity, the apparent lying—that I neglected to touch on and about which we’ll surely learn more in the coming days, weeks, and years. –SA


Several readers have asked me for updated thoughts on AI safety, now that I’m 5 months into my year at OpenAI—and I promise, I’ll share them soon! The thing is, until last week I’d entertained the idea of writing up some of those thoughts for an essay competition run by the FTX Future Fund, which (I was vaguely aware) was founded by the cryptocurrency billionaire Sam Bankman-Fried, henceforth SBF.

Alas, unless you’ve been tucked away on some Caribbean island—or perhaps, especially if you have been—you’ll know that the FTX Future Fund has ceased to exist. In the course of 2-3 days last week, SBF’s estimated net worth went from ~$15 billion to a negative number, possibly the fastest evaporation of such a vast personal fortune in all human history. Notably, SBF had promised to give virtually all of it away to various worthy causes, including mitigating existential risk and helping Democrats win elections, and the worldwide Effective Altruist community had largely reoriented itself around that pledge. That’s all now up in smoke.

I've never met SBF, although he was a physics undergraduate at MIT while I taught CS there. What little I knew of SBF before this week came mostly from reading Gideon Lewis-Kraus's excellent New Yorker article about Effective Altruism this summer. The details of what happened at FTX are at once hopelessly complicated and—it would appear—damningly simple, involving the misuse of billions of dollars' worth of customer deposits to place risky bets that failed. SBF has, in any case, tweeted that he "fucked up and should have done better."

You’d think none of this would directly impact me, since SBF and I inhabit such different worlds. He ran a crypto empire from the Bahamas, sharing a group house with other twentysomething executives who often dated each other. I teach at a large state university and try to raise two kids. He made his first fortune by arbitraging bitcoin between Asia and the West. I own, I think, a couple bitcoins that someone gave me in 2016, but have no idea how to access them anymore. His hair is large and curly; mine is neither.

Even so, I’ve found myself obsessively following this story because I know that, in a broader sense, I will be called to account for it. SBF and I both grew up as nerdy kids in middle-class Jewish American families, and both had transformative experiences as teenagers at Canada/USA Mathcamp. He and I know many of the same people. We’ve both been attracted to the idea of small groups of idealistic STEM nerds using their skills to help save the world from climate change, pandemics, and fascism.

Aha, the sneerers will sneer! Hasn’t the entire concept of “STEM nerds saving the world” now been utterly discredited, revealed to be just a front for cynical grifters and Ponzi schemers? So if I’m also a STEM nerd who’s also dreamed of helping to save the world, then don’t I stand condemned too?

I’m writing this post because, if the Greek tragedy of SBF is going to be invoked as a cautionary tale in nerd circles forevermore—which it will be—then I think it’s crucial that we tell the right cautionary tale.

It’s like, imagine the Apollo 11 moon mission had almost succeeded, but because of a tiny crack in an oxygen tank, it instead exploded in lunar orbit, killing all three of the astronauts. Imagine that the crack formed partly because, in order to hide a budget overrun, Wernher von Braun had secretly substituted a cheaper material, while telling almost none of his underlings.

There are many excellent lessons that one could draw from such a tragedy, having to do with, for example, the construction of oxygen tanks, the procedures for inspecting them, Wernher von Braun as an individual, or NASA safety culture.

But there would also be bad lessons to not draw. These include: “The entire enterprise of sending humans to the moon was obviously doomed from the start.” “Fate will always punish human hubris.” “All the engineers’ supposed quantitative expertise proved to be worthless.”

From everything I’ve read, SBF’s mission to earn billions, then spend it saving the world, seems something like this imagined Apollo mission. Yes, the failure was total and catastrophic, and claimed innocent victims. Yes, while bad luck played a role, so did, shall we say, fateful decisions with a moral dimension. If it’s true that, as alleged, FTX raided its customers’ deposits to prop up the risky bets of its sister organization Alameda Research, multiple countries’ legal systems will surely be sorting out the consequences for years.

To my mind, though, it’s important not to minimize the gravity of the fateful decision by conflating it with everything that preceded it. I confess to taking this sort of conflation extremely personally. For eight years now, the rap against me, advanced by thousands (!) on social media, has been: sure, while by all accounts Aaronson is kind and respectful to women, he seems like exactly the sort of nerdy guy who, still bitter and frustrated over high school, could’ve chosen instead to sexually harass women and hinder their scientific careers. In other words, I stand condemned by part of the world, not for the choices I made, but for choices I didn’t make that are considered “too close to me” in the geometry of conscience.

And I don’t consent to that. I don’t wish to be held accountable for the misdeeds of my doppelgängers in parallel universes. Therefore, I resolve not to judge anyone else by their parallel-universe doppelgängers either. If SBF indeed gambled away his customers’ deposits and lied about it, then I condemn him for it utterly, but I refuse to condemn his hypothetical doppelgänger who didn’t do those things.

Granted, there are those who think all cryptocurrency is a Ponzi scheme and a scam, and that for that reason alone, it should’ve been obvious from the start that crypto-related plans could only end in catastrophe. The “Ponzi scheme” theory of cryptocurrency has, we ought to concede, a substantial case in its favor—though I’d rather opine about the matter in (say) 2030 than now. Like many technologies that spend years as quasi-scams until they aren’t, maybe blockchains will find some compelling everyday use-cases, besides the well-known ones like drug-dealing, ransomware, and financing rogue states.

Even if cryptocurrency remains just a modern-day tulip bulb or Beanie Baby, though, it seems morally hard to distinguish a cryptocurrency trader from the millions who deal in options, bonds, and all manner of other speculative assets. And a traditional investor who made billions on successful gambles, or arbitrage, or creating liquidity, then gave virtually all of it away to effective charities, would seem, on net, way ahead of most of us morally.

To be sure, I never pursued the “Earning to Give” path myself, though certainly the concept occurred to me as a teenager, before it had a name. Partly I decided against it because I seem to lack a certain brazenness, or maybe just willingness to follow up on tedious details, needed to win in business. Partly, though, I decided against trying to get rich because I’m selfish (!). I prioritized doing fascinating quantum computing research, starting a family, teaching, blogging, and other stuff I liked over devoting every waking hour to possibly earning a fortune only to give it all to charity, and more likely being a failure even at that. All told, I don’t regret my scholarly path—especially not now!—but I’m also not going to encase it in some halo of obvious moral superiority.

If I could go back in time and give SBF advice—or if, let’s say, he’d come to me at MIT for advice back in 2013—what could I have told him? I surely wouldn’t talk about cryptocurrency, about which I knew and know little. I might try to carve out some space for deontological ethics against pure utilitarianism, but I might also consider that a lost cause with this particular undergrad.

On reflection, maybe I'd just try to convince SBF to weight money logarithmically when calculating expected utility (as in the Kelly criterion), to forsake the linear weighting that SBF explicitly advocated and that he seems to have put into practice in his crypto ventures. Or if not logarithmic weighting, I'd try to sell him on some concave utility function—something that makes, let's say, a mere $1 billion in hand seem better than $15 billion that has a 50% probability of vanishing and leaving you, your customers, your employees, and the entire Effective Altruism community with less than nothing.
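To see how much this choice matters, here is a toy calculation (my own sketch; the dollar amounts are the stylized ones above, and the baseline wealth is an arbitrary assumption of mine) comparing the sure billion with the 50/50 gamble under linear and logarithmic utility.

import math

baseline   = 1e5    # assumed pre-existing wealth in dollars (hypothetical)
sure_thing = 1e9    # keep $1 billion for certain
jackpot    = 15e9   # or gamble: 50% chance of $15 billion...
ruin       = 0.0    # ...and 50% chance of (nearly) nothing

def expected_utility(outcomes, utility):
    # outcomes is a list of (payoff, probability) pairs
    return sum(prob * utility(payoff) for payoff, prob in outcomes)

linear_u = lambda w: w
log_u    = lambda w: math.log(baseline + w)   # Kelly-style logarithmic weighting

sure   = [(sure_thing, 1.0)]
gamble = [(jackpot, 0.5), (ruin, 0.5)]

for name, u in [("linear", linear_u), ("log", log_u)]:
    print(name, "sure:", expected_utility(sure, u), "gamble:", expected_utility(gamble, u))
# Linear utility prefers the gamble (7.5e9 vs 1e9); logarithmic utility prefers the
# sure billion, because the possibility of ruin is weighted very heavily.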

At any rate, I’d try to impress on him, as I do on anyone reading now, that the choice between linear and concave utilities, between risk-neutrality and risk-aversion, is not bloodless or technical—that it’s essential to make a choice that’s not only in reflective equilibrium with your highest values, but that you’ll still consider to be such regardless of which possible universe you end up in.

December 14, 2022

Matt Strassler Fusion’s First Good Day on Earth

The fusing of small atomic nuclei into larger ones, with the associated release of particles carrying a lot of motion-energy, is the mechanism that powers the Sun's furnace, and that of other stars. This was first suspected in the 1920s, and confirmed in the 1930s.

Nuclear fission (the breaking of larger atomic nuclei into smaller pieces) was discovered in the 1930s, and used to generate energy in 1942. Work on fission in settings both uncontrolled (i.e. bombs) and controlled (i.e. power plants) proceeded rapidly; bombs unfortunately were quickly designed and built during World War II, while useful power plants were already operating by 1951. Meanwhile work on fusion also proceeded rapidly; in the uncontrolled setting, the first bomb using fusion (triggered by a fission bomb!) was already made in 1951, and in a flash of a decade, huge numbers of hydrogen bombs filled the arsenals of superpowers large and small. But controlled fusion for power plants… Ah.

Had it been as easy to control fusion as it was to control fission, we’d have fusion plants everywhere; fossil fuels would be consigned only to certain forms of transportation, and the climate crisis would be far less serious than it is right now. But unfortunately, it has been 70 years of mostly bad news — tragic news, really, for the planet.

But finally we have a little glimmer of hope. On December 5th, somebody finally managed, without using a bomb, to get more fusion-generated energy out of an object than the energy they had to put into it.

[UPDATE: Not really. Though this was a success and a milestone, it wasn’t nearly as good as advertised. Yes, more energy came out of the fusing material than was put into the fusing material. But it took far more energy to make the necessary laser light in the first place — 300 megajoules of energy off the electricity grid, compared to a gain from the fusing material of about 1 megajoule. So overall it was still a big net loss, even though locally, at the fusing material, it was a net gain. See this link, in particular the third figure, which shows that the largest energy cost was electricity from the grid to run the lasers. In short, well, it’s still a good day for fusion, but we are even further from power plants than we were led to believe today.]

Poster Child for Particle Physics

In the Sun and similar stars, fusion proceeds through several processes in which protons (the nuclei of the simplest form of hydrogen) are converted to neutrons and combine with other protons to form mainly helium nuclei (two protons and two neutrons). Other important nuclei are deuterium D (a version of hydrogen with a proton and neutron stuck together), tritium T (another version with a proton plus two neutrons — which is unstable, typically lasting about 12 years), and Helium-3 (two protons plus one neutron.)

Fusion is a fascinating process, because all four of the famous forces of nature are needed. [The fifth, the Higgs force, plays no role, though as is so often the case, the Higgs field is secretly crucial.] In a sense, it’s a poster child for our understanding of how the cosmos works. Consider sunshine:

  1. We need gravity to hold the Sun together, and to crush its center to the point that its temperature reaches well over ten million degrees.
  2. We need electromagnetism to produce the light that carries energy to the Sun’s surface and sunshine to Earth.
  3. We need the strong nuclear force to make protons and neutrons, and to combine them into other simple nuclei such as deuterium, tritium and helium.
  4. We need the weak nuclear force to convert the abundant protons into neutrons (along with a positron [i.e. an anti-electron] and a neutrino.)

How can we be sure this really happens inside the Sun? There are quite a few ways, but perhaps the most direct is that we observe the neutrinos, which (unlike everything else that’s made in the process) escape from the Sun’s core in vast numbers. Though very difficult to detect on Earth, they are occasionally observed. By now, studies of these neutrinos, as here by the Borexino experiment, are definitive. Everything checks out.

In the recent experiment on Earth, gravity’s role is a little more indirect — obviously we wouldn’t have a planet on which to live and laboratories in which to do experiments without it. But it’s electromagnetism which does the holding and crushing of the material. The role of the strong and weak nuclear forces is similar, though instead of starting with mostly protons, the method that made fusion this week uses the weak nuclear force long before the experiment to make the neutrons needed in deuterium and tritium. The actual moment of fusion involves the strong nuclear force, in which

  • D + T → He + n

i.e. one deuterium nucleus plus one tritium nucleus (a total of two protons and three neutrons) are recombined to make one helium nucleus and one neutron, which come out with more motion-energy than the initial D and T nuclei start with.
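To see where that extra motion-energy comes from, here is a back-of-envelope sketch (my own, using standard tabulated atomic masses quoted from memory, so treat the last digits as approximate): the helium nucleus and neutron are slightly lighter than the original D and T, and the missing mass appears as roughly 17.6 MeV of kinetic energy via E = mc².

# Standard atomic masses in atomic mass units (approximate values quoted from memory).
m_D  = 2.014102   # deuterium
m_T  = 3.016049   # tritium
m_He = 4.002602   # helium-4
m_n  = 1.008665   # neutron

amu_to_MeV = 931.494   # energy equivalent of one atomic mass unit

delta_m = (m_D + m_T) - (m_He + m_n)       # mass that "goes missing" in the reaction
print(f"mass defect: {delta_m:.6f} u")
print(f"energy released per D+T fusion: {delta_m * amu_to_MeV:.1f} MeV")   # about 17.6 MeV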

The Promise of Endless Cheap Safe[r] Power?

The breakthrough this week? Finally, after decades of promises and disappointments, workers at a US lab, Lawrence Livermore Laboratory in California, working at the National Ignition Facility, have gotten significantly more energy out of fusion than they put in. How this works is described by the lab here. The steps are: make a pellet stocked with D and T; fire up a set of lasers and amplify them to enormous power; aim them into a chamber containing the pellet, heating the chamber to millions of degrees and causing it to emit X-rays (high-energy photons); the blast of X-rays blows off the outer layer of the pellet, which [action-reaction!] causes the inner core of the pellet to greatly compress; in the high temperature and density of the pellet’s core, fusion spontaneously begins and heats the rest of the pellet, causing even more fusion.

Not as easy as it sounds. For a long time they’ve been getting a dud, or just a little fusion. But finally, the energy from fusion has exceeded the energy of the initial lasers by a substantial amount — 50%.

This one momentary success is far from a power plant. But you can’t make a power plant without first making power. So December 5th, eighty years and three days after fission’s first good day, was a good day for fusion on Earth, maybe the first one ever.

If this strategy for making fusion is ever to lead to a power plant, this process will have to be repeated over and over very rapidly, with the high-energy particles that are created along the way being directed somewhere where they can heat water and turn a steam turbine, from which electric current can be created as it is in many power plants. Leaving aside the major technical challenges, one should understand that this does not come without radioactive pollution; the walls of the container vessel in which the nuclear reactions take place, and other materials inside, will become radioactive over time, and will have to be disposed of with care, as with any radioactive waste. But it's still vastly safer than a fission power plant, such as are widespread today. Why?

First, the waste from a fission plant is suitable for making nuclear weapons; it has to be not only buried safely but also guarded. Waste from a fusion plant, though still radioactive, is not useful for that purpose.

Second, if a fission plant malfunctions, its nuclear chain-reaction can start running away, getting hotter and hotter until the fuel melts, breaks through the vessel that contains it, and contaminates ground, air and water. By contrast, if a fusion plant malfunctions, its nuclear reactions just… stop.

And third, mining for uranium is bad for the environment (and uranium itself can be turned into a fuel for nuclear weapons.) Mining for hydrogen involves taking some water and passing electric current through it. Admittedly it's a bit more complicated than that to get the deuterium and especially the tritium you need — the tritium must be obtained from lithium, which does require mining — but still, less digging giant holes into mountains and contaminating groundwater with heavy metals.

Meanwhile, both forms of nuclear power have the advantage that they don’t dump loads of carbon into the atmosphere, and avoid the kind of oil spills we saw this week in Kansas.

So even though we are a long way from having nuclear fusion as a power source, and even though there will be some nuclear waste to deal with, there are good reasons to note this day. Someday we might look back on it as the beginning of a transformed economy, a cleaner atmosphere, and a saved planet.

Peter Rohde New track – Naso del Liskamm

My new track “Naso del Liskamm” is now available on Spotify, SoundCloud and other major streaming services.


December 13, 2022

Terence TaoAn improvement to Bennett’s inequality for the Poisson distribution

If {\lambda>0}, a Poisson random variable {{\bf Poisson}(\lambda)} with mean {\lambda} is a random variable taking values in the natural numbers with probability distribution

\displaystyle  {\bf P}( {\bf Poisson}(\lambda) = k) = e^{-\lambda} \frac{\lambda^k}{k!}.

One is often interested in bounding upper tail probabilities

\displaystyle  {\bf P}( {\bf Poisson}(\lambda) \geq \lambda(1+u))

for {u \geq 0}, or lower tail probabilities

\displaystyle  {\bf P}( {\bf Poisson}(\lambda) \leq \lambda(1+u))

for {-1 < u \leq 0}. A standard tool for this is Bennett’s inequality:

Proposition 1 (Bennett’s inequality) One has

\displaystyle  {\bf P}( {\bf Poisson}(\lambda) \geq \lambda(1+u)) \leq \exp(-\lambda h(u))

for {u \geq 0} and

\displaystyle  {\bf P}( {\bf Poisson}(\lambda) \leq \lambda(1+u)) \leq \exp(-\lambda h(u))

for {-1 < u \leq 0}, where

\displaystyle  h(u) := (1+u) \log(1+u) - u.

From the Taylor expansion {h(u) = \frac{u^2}{2} + O(u^3)} for {u=O(1)} we conclude Gaussian type tail bounds in the regime {u = o(1)} (and in particular when {u = O(1/\sqrt{\lambda})}), in the spirit of the Chernoff, Bernstein, and Hoeffding inequalities. But in the regime where {u} is large and positive one obtains a slight gain over these other classical bounds (of {\exp(- \lambda u \log u)} type, rather than {\exp(-\lambda u)}).

Proof: We use the exponential moment method. For any {t \geq 0}, we have from Markov’s inequality that

\displaystyle  {\bf P}( {\bf Poisson}(\lambda) \geq \lambda(1+u)) \leq e^{-t \lambda(1+u)} {\bf E} \exp( t {\bf Poisson}(\lambda) ).

A standard computation shows that the moment generating function of the Poisson distribution is given by

\displaystyle  {\bf E} \exp( t {\bf Poisson}(\lambda) ) = \exp( (e^t - 1) \lambda )

and hence

\displaystyle  {\bf P}( {\bf Poisson}(\lambda) \geq \lambda(1+u)) \leq \exp( (e^t - 1)\lambda - t \lambda(1+u) ).

For {u \geq 0}, it turns out that the right-hand side is optimized by setting {t = \log(1+u)}, in which case the right-hand side simplifies to {\exp(-\lambda h(u))}. This proves the first inequality; the second inequality is proven similarly (but now {u} and {t} are non-positive rather than non-negative). \Box
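As a quick numerical sanity check (my own, not part of the original argument), one can compare the exact Poisson upper tail with the Bennett bound, for instance using scipy; the bound always sits above the exact tail, and the gap it leaves is what Proposition 3 below recovers.

import numpy as np
from scipy.stats import poisson

def bennett_upper(lam, u):
    h = (1 + u) * np.log(1 + u) - u
    return np.exp(-lam * h)

lam = 100.0
for u in [0.1, 0.5, 1.0, 2.0]:
    k = lam * (1 + u)
    exact = poisson.sf(np.ceil(k) - 1, lam)   # P( Poisson(lam) >= lam(1+u) )
    print(f"u={u:3.1f}   exact={exact:.3e}   Bennett={bennett_upper(lam, u):.3e}")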

Remark 2 Bennett's inequality also applies for (suitably normalized) sums of bounded independent random variables. In some cases there are direct comparison inequalities available to relate those variables to the Poisson case. For instance, suppose {S = X_1 + \dots + X_n} is the sum of independent Boolean variables {X_1,\dots,X_n \in \{0,1\}} of total mean {\sum_{j=1}^n {\bf E} X_j = \lambda} and with {\sup_i {\bf P}(X_i = 1) \leq \varepsilon} for some {0 < \varepsilon < 1}. Then for any natural number {k}, we have

\displaystyle  {\bf P}(S=k) = \sum_{1 \leq i_1 < \dots < i_k \leq n} {\bf P}(X_{i_1}=1) \dots {\bf P}(X_{i_k}=1)

\displaystyle  \prod_{i \neq i_1,\dots,i_k} {\bf P}(X_i=0)

\displaystyle  \leq \frac{1}{k!} (\sum_{i=1}^n \frac{{\bf P}(X_i=1)}{{\bf P}(X_i=0)})^k \times \prod_{i=1}^n {\bf P}(X_i=0)

\displaystyle  \leq \frac{1}{k!} (\frac{\lambda}{1-\varepsilon})^k \prod_{i=1}^n \exp( - {\bf P}(X_i = 1))

\displaystyle  \leq e^{-\lambda} \frac{\lambda^k}{(1-\varepsilon)^k k!}

\displaystyle  \leq e^{\frac{\varepsilon}{1-\varepsilon} \lambda} {\bf P}( \mathbf{Poisson}(\frac{\lambda}{1-\varepsilon}) = k).

As such, for {\varepsilon} small, one can efficiently control the tail probabilities of {S} in terms of the tail probability of a Poisson random variable of mean close to {\lambda}; this is of course very closely related to the well known fact that the Poisson distribution emerges as the limit of sums of many independent boolean variables, each of which is non-zero with small probability. See this paper of Bentkus and this paper of Pinelis for some further useful (and less obvious) comparison inequalities of this type.
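Here is a small numerical illustration of this comparison (my own sketch, not taken from the references above): the exact law of a sum of independent Boolean variables, computed by convolution, stays below the rescaled Poisson bound at every value of {k}.

import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(0)
p = rng.uniform(0.01, 0.05, size=200)   # P(X_i = 1), each at most eps
lam, eps = p.sum(), p.max()

# Exact pmf of S = X_1 + ... + X_n (a Poisson binomial) by repeated convolution.
pmf = np.array([1.0])
for pi in p:
    pmf = np.convolve(pmf, [1 - pi, pi])

k = np.arange(len(pmf))
bound = np.exp(eps / (1 - eps) * lam) * poisson.pmf(k, lam / (1 - eps))
print(np.all(pmf <= bound))   # True: the bound from Remark 2 holds at every k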

In this note I wanted to record the observation that one can improve the Bennett bound by a small polynomial factor once one leaves the Gaussian regime {u = O(1/\sqrt{\lambda})}, in particular gaining a factor of {1/\sqrt{\lambda}} when {u \sim 1}. This observation is not difficult and is implicitly in the literature (one can extract it for instance from the much more general results of this paper of Talagrand, and the basic idea already appears in this paper of Glynn), but I was not able to find a clean version of this statement in the literature, so I am placing it here on my blog. (But if a reader knows of a reference that basically contains the bound below, I would be happy to know of it.)

Proposition 3 (Improved Bennett’s inequality) One has

\displaystyle  {\bf P}( {\bf Poisson}(\lambda) \geq \lambda(1+u)) \ll \frac{\exp(-\lambda h(u))}{\sqrt{1 + \lambda \min(u, u^2)}}

for {u \geq 0} and

\displaystyle  {\bf P}( {\bf Poisson}(\lambda) \leq \lambda(1+u)) \ll \frac{\exp(-\lambda h(u))}{\sqrt{1 + \lambda u^2 (1+u)}}

for {-1 < u \leq 0}.

Proof: We begin with the first inequality. We may assume that {u \geq 1/\sqrt{\lambda}}, since otherwise the claim follows from the usual Bennett inequality. We expand out the left-hand side as

\displaystyle  e^{-\lambda} \sum_{k \geq \lambda(1+u)} \frac{\lambda^k}{k!}.

Observe that for {k \geq \lambda(1+u)} we have

\displaystyle  \frac{\lambda^{k+1}}{(k+1)!} \leq \frac{1}{1+u} \frac{\lambda^k}{k!}.

Thus the sum is dominated by the first term times a geometric series {\sum_{j=0}^\infty \frac{1}{(1+u)^j} = 1 + \frac{1}{u}}. We can thus bound the left-hand side by

\displaystyle  \ll e^{-\lambda} (1 + \frac{1}{u}) \sup_{k \geq \lambda(1+u)} \frac{\lambda^k}{k!}.

By the Stirling approximation, this is

\displaystyle  \ll e^{-\lambda} (1 + \frac{1}{u}) \sup_{k \geq \lambda(1+u)} \frac{1}{\sqrt{k}} \frac{(e\lambda)^k}{k^k}.

The expression inside the supremum is decreasing in {k} for {k > \lambda}, thus we can bound it by

\displaystyle  \ll e^{-\lambda} (1 + \frac{1}{u}) \frac{1}{\sqrt{\lambda(1+u)}} \frac{(e\lambda)^{\lambda(1+u)}}{(\lambda(1+u))^{\lambda(1+u)}},

which simplifies to

\displaystyle  \ll \frac{\exp(-\lambda h(u))}{\sqrt{1 + \lambda \min(u, u^2)}}

after a routine calculation.

Now we turn to the second inequality. As before we may assume that {u \leq -1/\sqrt{\lambda}}. We first dispose of a degenerate case in which {\lambda(1+u) < 1}. Here the left-hand side is just

\displaystyle  {\bf P}( {\bf Poisson}(\lambda) = 0 ) = e^{-\lambda}

and the right-hand side is comparable to

\displaystyle  e^{-\lambda} \exp( - \lambda (1+u) \log (1+u) + \lambda(1+u) ) / \sqrt{\lambda(1+u)}.

Since {-\lambda(1+u) \log(1+u)} is non-negative and {0 < \lambda(1+u) < 1}, we see that the right-hand side is {\gg e^{-\lambda}}, and the estimate holds in this case.

It remains to consider the regime where {u \leq -1/\sqrt{\lambda}} and {\lambda(1+u) \geq 1}. The left-hand side expands as

\displaystyle  e^{-\lambda} \sum_{k \leq \lambda(1+u)} \frac{\lambda^k}{k!}.

The sum is dominated by its largest term (the one with {k} maximal) times a geometric series {\sum_{j=-\infty}^0 \frac{1}{(1+u)^j} = \frac{1}{|u|}}. The maximal {k} is comparable to {\lambda(1+u)}, so we can bound the left-hand side by

\displaystyle  \ll e^{-\lambda} \frac{1}{|u|} \sup_{k \asymp \lambda(1+u)} \frac{\lambda^k}{k!}.

Using the Stirling approximation as before we can bound this by

\displaystyle  \ll e^{-\lambda} \frac{1}{|u|} \frac{1}{\sqrt{\lambda(1+u)}} \frac{(e\lambda)^{\lambda(1+u)}}{(\lambda(1+u))^{\lambda(1+u)}},

which simplifies to

\displaystyle  \ll \frac{\exp(-\lambda h(u))}{\sqrt{1 + \lambda u^2 (1+u)}}

after a routine calculation. \Box

The same analysis can be reversed to show that the bounds given above are basically sharp up to constants, at least when {\lambda} (and {\lambda(1+u)}) are large.
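A quick numerical comparison (my own sketch) makes the polynomial gain visible: for moderately large {\lambda} and {u \sim 1}, the original Bennett bound overshoots the exact tail by an extra factor on the order of {\sqrt{\lambda}}, while the improved bound stays within a bounded constant of it.

import numpy as np
from scipy.stats import poisson

lam, u = 100.0, 1.0
k = lam * (1 + u)
h = (1 + u) * np.log(1 + u) - u

exact    = poisson.sf(k - 1, lam)                              # P( Poisson(lam) >= lam(1+u) )
bennett  = np.exp(-lam * h)
improved = np.exp(-lam * h) / np.sqrt(1 + lam * min(u, u**2))

print(f"exact     {exact:.3e}")
print(f"Bennett   {bennett:.3e}")    # overshoots by an extra factor on the order of sqrt(lam)
print(f"improved  {improved:.3e}")   # within a modest constant factor of the exact tail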

December 12, 2022

Mark Chu-CarrollHow Computers Work, Part 2: Transistors

From my previous post, we understand what a diode is, and how it works. In this post, we’re going to move on to the transistor. Once we’ve covered that, then we’ll start to move up a layer, and stop looking at the behavior of individual subcomponents, and focus on the more complicated (and interesting) components that are built using the basic building blocks of transistors.

By a fortuitous coincidence, today is actually the anniversary of the first transistor, which was first successfully tested on December 16th 1947, at Bell Labs.

Transistors are amazing devices. It's hard to really grasp just how important they are. Virtually everything you interact with in 21st century life depends on transistors. Just to give you a sense, here's a sampling of my day so far.

  • Get woken up by the alarm on my phone. The phone has billions of transistors. But even if I wasn’t using my phone, any bedside clock that I’ve had in my lifetime has been built on transistors.
  • Brush my teeth and shave. Both my electric razor and toothbrush use transistors for a variety of purposes.
  • Put on my glasses. These days, I wear lenses with a prismatic correction. The prism in the lens is actually a microscopic set of grids, carved into the lens by an ultraprecise CNC machine. Aka, a computer – a ton of transistors, controlling a robotic cutting tool by using transistors as switches.
  • Come downstairs and have breakfast. The refrigerator where I get my milk uses transistors for managing the temperature, turning the compressor for the cooling on and off.
  • Grind some coffee. Again, transistor based electronics controlling the grinder.
  • Boil water, in a digitally controlled gooseneck kettle. Again, transistors.

There’s barely a step of my day that doesn’t involve something with transistors. But most of us have next to no idea what they actually do, much less how. That’s what we’re going to look at in this post. This very much builds on the discussion of diodes in last weeks post, so if you haven’t read that, now would be a good time to jump back.

What is a transistor? At the simplest, it's an electronic component that can be used in two main ways. It works like an on-off switch, or like a volume control – but without any moving parts. In a computer, it's almost always used as a switch – but, crucially, a switch that doesn't need to move to turn on or off. The way that it works is a bit tricky, but if we don't really get too deep and do the math, it's not too difficult to understand.

We'll start with the one my father explained to me first, because he thought it was the simplest kind of transistor to understand. It's called a junction transistor. A junction transistor is, effectively, two diodes stacked back to back. There's two kinds – the PNP and the NPN. We'll look at the NPN type, which acts like a switch that's off by default.

An NPN junction transistor

Each diode consists of a piece of N type silicon connected to a piece of P type silicon. By joining them back to back, we get a piece of P type silicon in the middle, with N type silicon on either side. That means that on the left, we've got an NP boundary – a diode that wants to let current flow right-to-left, and block current left-to-right; and on the right, we've got a PN boundary – a diode that wants to let current flow from left to right, and block right to left.

If we only have the two outer contacts, we’ve got something that simply won’t conduct electricity. But if we add a third contact to the P region in the middle, then suddenly things change!

Let's give things some names – it'll help with making the explanation easier to follow. We'll call the left-hand contact of the transistor the emitter, and the right-hand contact the collector. The contact that we added to the middle, we'll call the base.

(Quick aside: these names are very misleading, but sadly they're so standardized that we're stuck with them. Electrical engineering got started before we really knew which charges were moving in a circuit. By convention, circuits were computed as if it was the positive charges that move. So the names of the transistor come from that "conventional" current flow tradition. The "collector" receives positive charges, and the emitter emits them. Yech.)

If we don’t do anything with the base, but we attach the emitter to the negative side of a battery, and the collector to the positive, what happens is nothing. The two diodes that make up the transistor block any current from flowing. It’s still exactly the same as in the diodes – there’s a depletion region around the NP and PN boundaries. So while current could flow from the emitter to the base, it can’t flow into the collector and out of the circuit; and the base isn’t connected to anything. So all that it can do is increase the size of the depletion zone around the PN boundary on the right.

What if we apply some voltage to the base?

Then we’ve got a negative charge coming into the P type silicon from the base contact, filling holes and creating a negative charge in the P-type silicon away from the NP boundary. This creates an electric field that pushes electrons out of the holes along the NP-boundary, essentially breaking down the depletion zone. By applying a voltage to the base, we’ve opened a connection between the emitter and the collector, and current will flow through the transistor.

Another way of thinking about this is in terms of electrons and holes. If you have a solid field of free electrons, and you apply a voltage, then current will flow. But if you have electron holes, then the voltage will push some electrons into holes, creating a negatively charged region without free electrons that effectively blocks any current from flowing. By adding a voltage at the base, we’re attracting holes to the base, which means that they’re not blocking current from flowing from the emitter to the collector!

The transistor is acting like a switch controlled by the base. If there’s a voltage at the base, the switch is on, and current can flow through the transistor; if there’s no voltage at the base, then the switch is off, and current won’t flow.

I said before that it can also act like a volume control, or an amplifier. That’s because it’s not strictly binary. The amount of voltage at the base matters. If you apply a small voltage, it will allow a small current to flow from the emitter to the collector. As you increase the voltage at the base, the amount of current that flows through the transistor also increases. You can amplify things by putting a high voltage at the emitter, and then the signal you want to amplify at the base. When the signal is high, the amount of voltage passing will be high. If the voltage at the emitter is significantly higher than the voltage of the signal, then what comes out of the collector is the same signal, but at a higher voltage. So it’s an amplifier!
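If it helps to see numbers, here is a crude first-order model (my own illustration, not from this post; the current gain and saturation limit are made-up values) of the two regimes just described: small inputs at the base get amplified by a roughly constant factor, while large inputs simply drive the transistor fully on, which is the switching regime.

def collector_current(base_current, beta=100.0, i_c_max=0.05):
    # base_current and the returned collector current are in amperes;
    # beta is the current gain, i_c_max the saturation limit set by the rest of the circuit.
    if base_current <= 0:
        return 0.0                               # "off": no base drive, no collector current
    return min(beta * base_current, i_c_max)     # amplification, clipped at saturation

for i_b in [0.0, 1e-5, 1e-4, 5e-4, 2e-3]:
    print(f"I_base = {i_b:.4f} A  ->  I_collector = {collector_current(i_b):.4f} A")
# The small inputs come out ~100x larger (the volume-control/amplifier regime);
# the large ones all produce the same fully-on current (the switch regime).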

There’s a bunch of other types of transistors – I’m not going to go through all of them. But I will look at one more, because it’s just so important. It’s called a MOSFET – metal oxide semiconductor field effect transistor. Pretty much all of our computers are built on an advanced version of MOSFET called CMOS.

Just to be annoying, the terminology changes for the names of the contacts on a MOSFET. In a MOSFET, the negative terminal is called the source, and the positive terminal is the drain. The control terminal is still called the gate. In theory, there’s a fourth terminal called the body, but in practice, that’s usually connected to the source.

The way a field effect transistor works is similar to a junction transistor – but the big difference is, no current ever flows through the gate, because it's not actually electrically connected to the P-type silicon of the body. It's shielded by a metal oxide layer (thus the "metal oxide" part of MOSFET).

An NPN MOSFET transistor in the off state, with depletion regions

In a MOSFET, we've got a bulky section of P-type silicon, called the substrate. On top of it, we've got two small N-type regions for the source and the drain. Between the source and the drain, on the surface of the MOSFET, there's a thin layer of non-conductive metal oxide (or, sometimes, silicon dioxide – aka glass), and then on top of that metal oxide shield is the gate terminal. Underneath the gate is P-type silicon, in an area called the channel region.

Normally, if there's a voltage at the drain, but no voltage on the gate, it behaves like a junction transistor. Along the boundaries of the N-type terminals, you get electrons moving from the N-type terminals to the P-type holes, creating a depletion region. The current can't flow – the negatively charged P-side of the depletion region blocks electrons from flowing in to fill more holes, and the open holes in the rest of the P-type region prevent electrons from flowing through.

An NPN MOSFET in the on state, with a positive voltage at the gate producing an inversion zone.

If we apply a positive voltage (that is, a positive charge) at the gate, then you start to build up a (relative) positive charge near the gate and a negative charge near the body terminal. The resulting field pushes the positively charged holes away from the gate, and pulls free electrons towards the gate. If the voltage is large enough, it eventually creates what’s called an inversion region – a region which has effectively become N-type silicon because the holes have been pushed away, and free electrons have been pulled in. Now there’s a path of free electrons from source to drain, and current can flow across the transistor.

That's what we call an N-type MOSFET transistor, because the source and drain are N-type silicon. There's also a version where the source and drain are P-type, and the body is N type, called a P-type transistor. A P-type MOSFET transistor conducts current when there is no voltage on the gate, and stops doing so when voltage is applied.
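Here is a toy model (my own sketch; the supply and threshold voltages are illustrative, not taken from any real device) of that complementary behavior, treating each MOSFET as a voltage-controlled switch: the N-type conducts when its gate is driven high, the P-type when its gate is held low.

V_SUPPLY   = 1.0   # "high" gate voltage
V_THRESH_N = 0.4   # N-type turns on when the gate is above this
V_THRESH_P = 0.4   # P-type turns on when the gate is pulled this far below the supply

def nmos_conducts(v_gate):
    return v_gate > V_THRESH_N

def pmos_conducts(v_gate):
    return (V_SUPPLY - v_gate) > V_THRESH_P

for v_gate in [0.0, V_SUPPLY]:
    print(f"gate = {v_gate:.1f} V   N-type on: {nmos_conducts(v_gate)}   "
          f"P-type on: {pmos_conducts(v_gate)}")
# For either steady gate voltage, exactly one of the pair conducts -- never both.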

There’s an advanced variant of MOSFET called CMOS – complementary metal oxide semiconductor. It’s an amazing idea that pairs P-type and N-type transistors together to produce a circuit that doesn’t draw power when it isn’t being switched. I’m not going to go into depth about it here – I may write something about it later. You can see an article about it at the computer history museum.

On a personal note, in that article, you’ll see how “RCA Research Laboratories and the Somerville manufacturing operation pioneered the production of CMOS technology (under the trade name COS/MOS) for very low-power integrated circuits, first in aerospace and later in commercial applications.” One of the team at RCA Somerville semiconductor manufacturing center who worked on the original CMOS manufacturing process was my father. He was a semiconductor physicist who worked on manufacturing processes for aerospace systems.

While doing that, my father met William Shockley. Shockley was the lead of the team at Bell Labs that developed the first transistor. He was, without doubt, one of the most brilliant physicists of the 20th century. He was also a total asshole of absolutely epic proportions. Based on his interactions with Shockley, my dad developed his own theory of human intelligence: "Roughly speaking, everyone is equally intelligent. If you're a genius in one field, that means that you must be an idiot in all others". I think of all the people he met in his life, my dad thought Shockley was, by far, the worst.

If you don't know about Shockley, well… Like I said, the guy was a stunningly brilliant physicist and a stunningly awful person. He was a coinventor of the transistor, and pretty much created Silicon Valley. But he also regularly told anyone who'd listen about how his children were "genetic regressions" on his intellectual quality (due of course, he would happily explain, to the genetic inferiority of his first wife). After collecting his Nobel prize for the invention of the transistor, he dedicated the rest of his life to promoting eugenics and the idea that non-white people are genetically inferior to whites. You can read more about his turn to pseudo-scientific racism and eugenics in this article by the SPLC.

References

A few of the sources I looked at while writing this. (As usual, I’m a bit scattered, so I’m sure there were other places I looked, but these are the links I remembered.)

December 09, 2022

Mark Chu-CarrollTwittering about Twitter by a Former Twitter Engineer

Since I used to work at twitter, a few folks have asked what I think of what’s going on with Twitter now that Elon Musk bought it.

The answer is a little bit complicated. I worked for Twitter back in 2013, so it’s quite a while ago, and an awful lot has changed in that time.

I went to Twitter out of idealism. I believed it was a platform that would change the world in a good way. At the time, there was a lot going on to back up that idea. Twitter was being used in the Arab Spring uprisings, it was giving underrepresented voices like Black Lives Matter a platform, and it really looked like it was going to democratize communication in a wonderful way. I was so thrilled to be able to become a part of that.

But what I found working there was a really dysfunctional company. The company seemed to be driven by a really odd kind of fear. Even though it seemed (at least from my perspective) like no one ever got penalized, everyone in management was terrified of making a decision that might fail. Failure was a constant spectre, which haunted everything, but which was never acknowledged, because if anyone admitted that something had failed, they might be held responsible for the failure. I’ve never seen anything like it anywhere else I worked, but people at Twitter were just terrified of making decisions that they could be held responsible for.

A concrete example of that was something called Highline. When I got to Twitter, the company was working on a new interface, based on the idea that a user could have multiple timelines. You’d go to Twitter and look at your News timeline to find out what was going on in your world, and all you’d see was news. Then you’d switch to your Music timeline to read what your favorite artists were talking about. Or your Friends timeline to chat with your friends.

At first, Highline was everything. Every team at the company was being asked to contribute. It was the only topic at the first two monthly company-wide all-hands. It was the future of Twitter.

And then it disappeared, like it had never happened. Suddenly there was no Highline, no new interface in the plans. No one ever said it was cancelled. It was just gone.

At the next company all-hands, Dick Costolo, the CEO, gave his usual spiel about the stuff we’d been working on, and never mentioned Highline. He talked about the work of the last few months and covered everything except Highline. If I hadn’t been there, and didn’t know about it, I would never have guessed that nearly everyone in that room had been working heads-down on a huge high-profile effort for months. It was just gone, disappeared down the memory hole. It hadn’t failed, and no one was responsible, because it had never happened.

There was more nonsense like that – rewriting history to erase things that no one wanted to be responsible for, and avoiding making any actual decisions about much of anything. The only things that ever got decided were things where we were copying Facebook, because you couldn’t be blamed for doing the same thing as the most successful social media company, right?

After about a year, we got (another) new head of engineering, who wanted to get rid of distributed teams. I was the sole New Yorker on an SF-based team. I couldn’t move to SF, so I left.

I came away from it feeling really depressed about the company and its future. A company that can’t make decisions, that can’t take decisive actions, that can’t own up to and learn from its mistakes isn’t a company with a bright future. The abuse problem had also grown dramatically during my year there, and it was clear that the company had no idea what to do about it and was terrified of trying anything that might hurt the user numbers. So it really seemed like the company was heading into trouble, both as a business and as a platform.

Looking at the Musk acquisition: I’ll be up front before I get into it. I think Elon is a jackass. I’ll say more about why, but to be clear, I think he’s an idiot with no idea of what he’s gotten himself into.

That said: the company really needed to be shaken up. They hired far too many people – largely because of that same old indecisiveness. You can’t move existing staff off of what they’re doing and on to something new unless you’re willing to actually cancel the thing that they were working on. But cancelling a stream of in-progress work takes some responsibility, and you have to account for the now wasted sunk cost of what they were doing. So instead of making sure that everyone was working on something that was really important and valuable to the company, they just hired more people to do new things. As a result, they were really bloated.

So trimming down was something they really needed. Twitter is a company that, run well, could probably chug along making a reasonable profit, but it was never going to be a massive advertising juggernaut like Google or Facebook. So keeping tons of people on the staff when they aren’t really contributing to the bottom line just doesn’t work. Twitter couldn’t afford to pay that many people given the amount of money it was bringing in, but no one wanted to be responsible for deciding who to cut. So as sad as it is to see so many smart, hard-working people lose their jobs, it was pretty inevitable that it would happen eventually.

But the way that Elon did it was absolutely mind-numbingly stupid.

He started by creating a bullshit metric for measuring productivity. Then he stack-ranked engineers based on that bullshit metric, and fired everyone below a threshold. Obviously a brilliant way to do it, right? After all, metrics are based on data, and data is the basis of good decisions. So deciding who to fire and who to keep based on a metric will work out well!

Sadly, his metric didn’t include trivial, unimportant things like whether the people he was firing were essential to running the business. Because he chose the metric before he understood anything about what the different engineers did. An SRE might not commit as many lines of code to git, but god help your company if your service discovery system goes down and you don’t have anyone who knows how to spin it back up without causing a massive DDoS across your infrastructure.
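
To show the failure mode, here’s a deliberately silly sketch of that kind of stack-ranking (entirely made up by me; the names, numbers, and threshold are hypothetical, not anything Twitter actually ran): rank people by a single activity number, cut everyone below a threshold, and the person who keeps production alive is the first one out the door.

engineers = [
    # Made-up names and numbers, just to show the shape of the problem.
    {"name": "feature dev A", "lines_committed": 12000, "keeps_prod_running": False},
    {"name": "feature dev B", "lines_committed": 9000,  "keeps_prod_running": False},
    {"name": "SRE",           "lines_committed": 800,   "keeps_prod_running": True},
]

# Stack-rank by the metric and cut everyone below an arbitrary threshold.
ranked = sorted(engineers, key=lambda e: e["lines_committed"], reverse=True)
kept, cut = ranked[:2], ranked[2:]

print("kept:", [e["name"] for e in kept])
print("cut: ", [e["name"] for e in cut])
# The one person who can bring service discovery back up lands in the "cut"
# list, because the metric never measured what actually keeps the service running.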

Then came the now-infamous loyalty email. Without telling any of the employees what he was planning, he demanded that they commit themselves to a massive, uncompensated increase in their workload. If they wouldn’t commit to it, the company would take that as a resignation. (And just to add insult to injury, he did it with a Google Forms email.)

Dumb, dumb, dumb.

Particularly because, again, it didn’t bother to consider that people aren’t interchangeable. There are some people who are really necessary, whom the company can’t do without. What if they decide to leave rather than sign? (Which is exactly what happened, leading to managers making desperate phone calls begging people to come back.)

So dumb. As my friend Mike would say, “Oh fuck-a-duck”.

Elon is supposed to be a smart guy. So why is he doing such dumb stuff at twitter?

Easy. He’s not really that smart.

This is something I’ve learned from working in tech, where you deal with a lot of people who became very wealthy. Really wealthy people tend to lose touch with the reality of what’s going on around them. They get surrounded by sycophants who tell them how brilliant, how wonderful, how kind and generous they are. They say these things not because they’re true, but because it’s their job to say them.

But if you’re a person who hears nothing, from dawn to dusk, but how brilliant you are, you’ll start to believe it. I’ve seen, several times, how this can change people: people who used to be reasonable and down to earth completely lose touch after a few years of being surrounded by yes-men and sycophants.

Elon is an extreme case of that. He grew up in a rich family, and from the time he was a tiny child, he’s been surrounded by people whose job it is to tell him how smart, how brilliant, how insightful, how wonderful he is. And he genuinely believes it. In his head, he’s a combination of Einstein, Plato, and Gandhi – one of the finest fruits of the entire human race. Anyone who dares to say otherwise gets fired and kicked out of the great presence of the mighty Musk.

Remember that this is a guy who went to Stanford and dropped out because he didn’t like dealing with professors who thought they knew more than he did. He’s a guy who believes that he’s a brilliant rocket scientist, but who stumbles every time he tries to talk about it. But he believes that he’s an expert – because (a) he’s the smartest guy in the world, and (b) all of the engineers in his company tell him how brilliant he is and how much he’s contributing to their work. He’d never even consider the possibility that they say that because he’d fire them if he didn’t. And, besides, as I’m sure his fanboys are going to say: if he’s not so smart, then why does he have so much more money than me?

Take that kind of clueless arrogance, and you see exactly why he’s making the kinds of decisions that he is at Twitter. Why the stupid heavy-handed layoffs? He’s the smartest guy in the world, and he invented a metric which is obviously correct for picking which engineers to keep. Why pull the stupid “promise me your children or you’re fired” scam? Because they should feel privileged to work for the Great Elon. They don’t need to know what he’s planning: they should just know that because it’s Elon, it’s going to be the most brilliant plan they can imagine, and they should be willing to put their entire lives on the line for the privilege of fulfilling it.

It’s the stupidity of arrogance from start to finish. Hell, just look at the whole acquisition process.

He started off with the whole acquisition as a dick-swinging move: “How dare you flag my posts and point out that I’m lying? Do you know who I am? I can buy your company with spare change and fire your sorry ass!”

Then he lined up some financing, and picked a ridiculously over-valued price for his buyout offer based on a shitty pot joke. (Seriously, that’s how he decided how much to pay for Twitter: he wanted the per-share price to be a pot joke.)

Once the deal was set, he realized how ridiculous it was, so he tried to back out. Only the deal that he demanded, and then agreed to, didn’t leave him any way to back out! He’d locked himself into paying this ridiculous price for a shitty company. But he went to court to try to get out of it, by making up a bullshit story about how Twitter had lied to him.

When it became clear that the courts were going to rule against his attempt to back out, he reversed course and claimed that he really did want to buy Twitter, despite having just spent months trying to get out of it. It had nothing to do with the fact that he was going to lose the case – perish the thought! Elon never loses! He just changed his mind, because he’s such a wonderful person, and because he believes Twitter is important and only he can save it.

It’s such stupid shit from start to finish.

So I don’t hold out much hope for the future of Twitter. It’s being run by a thin-skinned, egotistical asshole who doesn’t understand the business that he bought. I’m guessing that he’s got some scheme where he’ll come out of it with more money than he started with, while leaving the investors who backed his acquisition holding the bag. That’s how money guys like Elon roll. But Twitter is probably going to be burned to the ground along the way.

But hey, he bought all of my Twitter stock for way more than it was worth, so that was nice.

Matt Strassler Send Your Dog Through a Wormhole?

A wormhole! What an amazing concept — a secret tunnel that connects two different regions of space! Could real ones exist? Could we — or our dogs — travel through them, and visit other galaxies billions of light years away, and come back to tell everyone all about it?

I bring up dogs because of a comment, quoted in the Guardian and elsewhere, by my friend and colleague, experimentalist Maria Spiropulu. Spiropulu is a senior author on the wormhole-related paper that has gotten so much attention in the past week, and she was explaining what it was all about.

  • “People come to me and they ask me, ‘Can you put your dog in the wormhole?’ So, no,” Spiropulu told reporters during a video briefing. “… That’s a huge leap.”

For this, I can’t resist teasing Spiropulu a little. She’s done many years of important work at the Large Hadron Collider and previously at the Tevatron, before taking on quantum computing and the simulation of wormholes. But, oh my! The idea that this kind of research could ever lead to a wormhole that a dog could traverse… that’s more than a huge leap of imagination. It’s a huge leap straight out of reality!

I’ve been trying to train our dog, Phoebe, to fetch a ball through a wormhole. She seems eager but nervous.

What’s the problem?

Decades ago there was a famous comedian by the name of Henny Youngman. He told the following joke — which, being no comedian myself, I will paraphrase.

  • I know a guy who wanted to set a mousetrap but had no cheese in his fridge. So he cut a picture of a piece of cheese from a magazine, and used that instead. Just before bed, he heard the trap snap shut, so he went to look. In the trap was a picture of a mouse.

Well, with that in mind, consider this:

  • Imaginary cheese can’t catch a real mouse, and an imaginary wormhole can’t transport a real dog!

As I explained in my last post, the recent wormhole-related paper is about an artificial simulation of a wormhole… hence the title, “Traversable wormhole dynamics on a quantum processor”, rather than “First creation of a wormhole.” Actually, they’re not even simulating the wormhole directly. As I described, the simulation is of some stationary particles — not actual particles, just simulated ones, represented in a computer — and the (simulated) interactions of those particles create a special effect which acts, in some ways, like a (simulated) wormhole. [The math of this is called the SYK model, or a simplified version of it.]
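
For the curious, the SYK model mentioned above can be written down compactly. A standard form (conventions vary from paper to paper, and the experiment used a heavily simplified, “sparsified” version with only a handful of fermions) is

H_{\mathrm{SYK}} \;=\; \sum_{1 \le i < j < k < l \le N} J_{ijkl}\,\chi_i\,\chi_j\,\chi_k\,\chi_l,
\qquad \overline{J_{ijkl}^{\,2}} \;\sim\; \frac{J^2}{N^3},

where the \chi_i are N Majorana fermion operators and the couplings J_{ijkl} are independent Gaussian random numbers.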

This is a very cool trick for artificially simulating a wormhole, one that can be crossed from one side to the other before it collapses. The trick was invented by theorists in this paper (see also this one), following on this pioneering idea. But it is not a trick for making a real wormhole. Moreover, this is a simulated wormhole in one spatial dimension, not the three we live in. In this sense, it is a cartoon of a wormhole, like a stick figure, with no flesh and blood.

Even if this were a real one-dimensional wormhole, you cannot hope to send a three-dimensional dog through it. You could not even send a three-dimensional atom through a one-dimensional wormhole. Dimensions don’t work that way.

I thought maybe I could ease her into the idea by starting with a wormhole that was simpler and less scary — just one-dimensional instead of three-dimensional. But she wouldn’t budge.

Remember, this wormhole does not exist in the real world; it is being represented by the bits of the computer. In a sense, it is being thought — represented in the computer’s crude memory. Try it: imagine a wormhole (it doesn’t matter how accurately). Imagine a dog now going through it. OK, you have just done a simulation of a dog going through a wormhole… an imagined dog moving through an imagined wormhole. Naturally, your brain didn’t do a very accurate simulation. It lacks all the fancy math. Armed with that math, the computer can do a professional-quality artificial simulation.

But just as you cannot take your real dog, the one you pet and play fetch with, and have it travel through the wormhole you imagined in your brain, you cannot take a real dog and pass it through a computer simulation of a wormhole. That would be true even if that wormhole were three-dimensional, rather than the one-dimensional cartoon. Nor can you take a real atom, or even a real photon [a particle of light], and send it through an imaginary, artificially simulated wormhole. Only an artificially simulated photon, atom or dog can pass through an artificially simulated wormhole.

So then, thinking it might reassure her, I tried showing her how wormholes work by simulating one on a computer. She watched attentively, and then she licked her paw.

Wormholes in nature are about real gravity. Wormholes in a computer are about mathematically simulated gravity. Real gravity pulls real things and might or might not make real wormholes; it has to obey the laws of nature of our universe. Imaginary gravity pulls imaginary things and can create imaginary wormholes; it is far less constrained, because the person doing the simulation can have the computer consider all sorts of imaginary universes in which the laws of nature might be very different from ours. Imaginary wormholes might behave in all sorts of ways that are impossible in the real world. For instance, the real world has (at least) three dimensions of space, but on a computer there’s no problem to simulate a universe with just one dimension of space… and that’s effectively what was done by Spiropulu and her colleagues, following the proposals of this paper and others by quantum gravity experts.

So let’s not confuse what’s real with what’s artificially simulated. And by the way, just because a quantum computer was used instead of an ordinary one doesn’t change what’s real and what is not. Real dogs are quantum; quantum computers are real; both have to obey the laws of the real world. But anything simulated on a quantum computer is not real, and need not obey those laws.

“Maybe she needs to see that it’s not dangerous,” I thought, so I showed her a simulation of a one-dimensional dog safely passing through my simulated one-dimensional wormhole. She looked at me, then wagged her tail and lay down in a bored three-dimensional heap. I guess the one-dimensional ball wasn’t enticing enough.

What about real wormholes?

Setting aside these simulated wormholes — could real wormholes exist, and could you send your dog through one?

Until recently there was a lot of debate as to whether wormholes actually make sense; maybe, it was thought, they violate some deep principles and are forbidden in nature. But in the last few years this debate has subsided. I’ll discuss this in more detail in my next post. But here are a few things to keep in mind:

It has been shown (most directly here, by Maldacena and Milekhin) that in some imaginary universes that are not so different from our own, it is possible for wormholes to exist that are large enough for dogs and humans to travel through. BUT:

  1. A person in that universe could not use them to travel faster than light from point A to point B — i.e., there is no chance that these wormholes could be used to go instantly halfway across the universe, and thus communicate faster than a message sent by radio waves outside the wormhole from A to B. Nor could they be used for time travel to the past.
  2. To avoid travelers being torn apart by tidal forces, the openings to these wormholes must be immense — far, far larger than a human. They’re not like the round doorways you see in science fiction movies.
  3. Although the wormhole traveler would feel the trip to be short, the travel time from the point of view of those outside the wormhole would be spectacularly long. If you did a round trip through the wormhole and back, your friends and family would all be long dead when you returned.
  4. The region inside the wormhole could easily become very dangerous; any photons that leak in from the other side will become extreme gamma rays bombarding the traveler passing through. To avoid this and other similar problems, the wormhole’s huge openings must be kept isolated and absolutely pristine.
  5. It’s hard to understand how to produce stable wormholes like this in a universe whose temperature is as high as ours (2.7 kelvin above absolute zero).
  6. It is hard to imagine how such a wormhole could be created through any natural or artificial process. (I wrote here about why real wormholes, even if they can exist in our universe, are extremely difficult to create or manage; and that’s true not only for macroscopic ones large enough for a dog but for microscopic ones as well. The same is true for black holes, which definitely do exist in our universe.)
  7. For these and/or other reasons, large traversable wormholes of this sort may not be possible in our universe; the specific laws of nature we live in may not allow wormholes worthy of the name, or at least not large ones. This is an open question and may depend on facts about our universe that we don’t yet know.

So you will not be sending your dog on any such journey. It’s wildly unrealistic.

One final note — if it becomes possible, decades or centuries from now, to attempt the fabrication of real, microscopic wormholes in an Earth-bound lab, it should not be attempted without a thorough safety review. Real black holes and wormholes aren’t easily handled and can potentially be very dangerous if anything goes wrong. It would be a terrible thing if one got away from you and ate your dog.

Jordan EllenbergLittle did I know (real analysis edition)

I just finished teaching Math 521, undergraduate real analysis. I first took on this course as an emergency pandemic replacement, and boy did I not know how much I would like teaching it! You get a variety of students — our second and third year math majors, econ majors aiming for top Ph.D. programs, financial math people, CS people — students learning analysis for all kinds of reasons.

A fun thing about teaching outside my research area is encountering weird little facts I don’t know at all — facts which, were they of equal importance and obscurity and size and about algebra, I imagine I would just know. For instance, I was talking about the strategy of the Riemann integral, before launching into the formal definition, as “you are trying to find a sequence of step functions which are getting closer and closer to f, because step functions are the ones you a priori know how to integrate.” But do Riemann-integrable functions actually have sequences of step functions converging to them uniformly? No! It turns out the class of functions which are uniform limits of step functions is called the regulated functions and an equivalent characterization of regulated functions is that the right and left limits f(x+) and f(x-) exist for any x.
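
For reference, here is the statement being alluded to, in my own words, together with the standard example showing that Riemann integrability does not imply being regulated:

Theorem. For $f\colon [a,b] \to \mathbb{R}$, the following are equivalent:
(i) there are step functions $s_n$ with $\sup_{x \in [a,b]} |f(x) - s_n(x)| \to 0$, i.e. $f$ is a uniform limit of step functions;
(ii) the one-sided limits $f(x^-)$ exist for every $x \in (a,b]$ and $f(x^+)$ exist for every $x \in [a,b)$.

Such $f$ are called regulated, and every regulated function is Riemann integrable. The converse fails: the function $f(x) = \sin(1/x)$ for $x \ne 0$, with $f(0) = 0$, is bounded and continuous except at a single point, so it is Riemann integrable on $[0,1]$; but $f(0^+)$ does not exist, so it is not regulated.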