Abstract. In fields ranging from business to systems biology, directed graphs with edges labeled by signs are used to model systems in a simple way: the nodes represent entities of some sort, and an edge indicates that one entity directly affects another either positively or negatively. Multiplying the signs along a directed path of edges lets us determine indirect positive or negative effects, and if the path is a loop we call this a positive or negative feedback loop. Here we generalize this to graphs with edges labeled by a monoid, whose elements represent ‘polarities’ possibly more general than simply ‘positive’ or ‘negative’. We study three notions of morphism between graphs with labeled edges, each with its own distinctive application: to refine a simple graph into a complicated one, to transform a complicated graph into a simple one, and to find recurring patterns called ‘motifs’. We construct three corresponding symmetric monoidal double categories of ‘open’ graphs. We study feedback loops using a generalization of the homology of a graph to homology with coefficients in a commutative monoid. In particular, we describe the emergence of new feedback loops when we compose open graphs using a variant of the Mayer–Vietoris exact sequence for homology with coefficients in a commutative monoid.
Read the whole series:
• Part 1: Causal loop diagrams, and more generally graphs with edges labeled by elements of a monoid.
• Part 2: Graphs with edges labeled by elements of a ring.
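To make the abstract's central idea concrete, here is a tiny illustration of my own (not code from the paper): label each edge of a directed graph with an element of a monoid of 'polarities' and multiply the labels along a path to read off an indirect effect. The simplest case is the sign monoid {+1, −1}, which recovers the usual positive and negative feedback loops.

```python
# Toy illustration (my own example, not from the paper): edges carry
# elements of a monoid of "polarities"; composing a path multiplies them.
# Here the monoid is {+1, -1} under multiplication, the signed-graph case.

edges = {
    ("rabbits", "foxes"): +1,   # more rabbits -> more foxes
    ("foxes", "rabbits"): -1,   # more foxes  -> fewer rabbits
}

def path_polarity(path, edges):
    """Multiply the edge labels along a path of nodes; the result is the
    net polarity of the first node's indirect effect on the last."""
    polarity = 1  # the identity element of the monoid
    for src, dst in zip(path, path[1:]):
        polarity *= edges[(src, dst)]
    return polarity

# A loop whose product is -1 is a negative feedback loop.
print(path_polarity(["rabbits", "foxes", "rabbits"], edges))  # prints -1
```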
Summer is conference season for academics, and this week my old sub-field held its big yearly conference, called Amplitudes. This year it was in Seoul at Seoul National University, the first time the conference has been held in Asia.
(I wasn’t there, I don’t go to these anymore. But I’ve been skimming slides in my free time, to give you folks the updates you crave. Be forewarned that conference posts like these get technical fast, I’ll be back to my usual accessible self next week.)
There isn’t a huge amplitudes community in Korea, but it’s bigger than it was back when I got started in the field. Of the organizers, Kanghoon Lee of the Asia Pacific Center for Theoretical Physics and Sangmin Lee of Seoul National University have what I think of as “core amplitudes interests”, like recursion relations and the double-copy. The other Korean organizers are from adjacent areas, doing work that overlaps with amplitudes but doesn’t show up at the conference each year. There was also a sizeable group of organizers from Taiwan, where there has been a significant amplitudes presence for some time now. I do wonder if Korea was chosen as a compromise between hosting the conference in Taiwan and hosting it in mainland China, where there is also quite a substantial amplitudes community.
One thing that impresses me every year is how big, and how sophisticated, the gravitational-wave community in amplitudes has grown. Federico Buccioni’s talk began with a plot that illustrates this well (though that wasn’t his goal):
At the conference Amplitudes, dedicated to the topic of scattering amplitudes, there were almost as many talks with the phrase “black hole” in the title as there were with “scattering” or “amplitudes”! This is for a topic that did not even exist in the subfield when I got my PhD eleven years ago.
With that said, gravitational wave astronomy wasn’t quite as dominant at the conference as Buccioni’s bar chart suggests. There were a few talks each day on the topic: I counted seven in total, excluding any short talks on the subject in the gong show. Spinning black holes were a significant focus, central to Jung-Wook Kim’s, Andres Luna’s and Mao Zeng’s talks (the latter two showing some interesting links between the amplitudes story and classic ideas in classical mechanics) and relevant in several others, with Riccardo Gonzo, Miguel Correia, Ira Rothstein, and Enrico Herrmann’s talks showing not just a wide range of approaches, but an increasing depth of research in this area.
Herrmann’s talk in particular dealt with detector event shapes, a framework that lets physicists think more directly about what a specific particle detector or observer can see. He applied the idea not just to gravitational waves but to quantum gravity and collider physics as well. The latter is historically where this idea has been applied the most thoroughly, as highlighted in Hua Xing Zhu’s talk, where he used them to pick out particular phenomena of interest in QCD.
QCD is, of course, always of interest in the amplitudes field. Buccioni’s talk dealt with the theory’s behavior at high energies, with a nice example of the “maximal transcendentality principle”, where some quantities in QCD are identical to quantities in N=4 super Yang-Mills in the “most transcendental” pieces (loosely, those with the highest powers of pi). Andrea Guerreri’s talk also dealt with high-energy behavior in QCD, trying to address an experimental puzzle where QCD results appeared to violate a fundamental bound that all sensible theories were expected to obey. Using S-matrix bootstrap techniques, they clarify the nature of the bound, finding that QCD still obeys it once it is correctly understood, and conjecture the existence of a strange theory that sits right on the edge of the bound. The S-matrix bootstrap was also used by Alexandre Homrich, who talked about getting the framework to work for multi-particle scattering.
Heribertus Bayu Hartanto is another recent addition to Korea’s amplitudes community. He talked about a concrete calculation, two-loop five-particle scattering including top quarks, a tricky case that includes elliptic curves.
When amplitudes lead to integrals involving elliptic curves, many standard methods fail. Jake Bourjaily’s talk raised a question he has brought up again and again: what does it mean to do an integral for a new type of function? One possible answer is that it depends on what kind of numerics you can do, and since more general numerical methods can be cumbersome, one often needs to understand the new type of function in more detail. In light of that, Stephen Jones’ talk was interesting for taking a problem often cited with generic approaches (that they have trouble with the complex numbers introduced by Minkowski space) and finding a more natural way for a particular generic approach (sector decomposition) to take them into account. Giulio Salvatori talked about a much less conventional numerical method, linked to the latest trend in Nima-ology, surfaceology. One of the big selling points of the surface integral framework promoted by people like Salvatori and Nima Arkani-Hamed is that it’s supposed to give a clear integral to do for each scattering amplitude, one which should be amenable to a numerical treatment recently developed by Michael Borinsky. Salvatori can currently apply the method only to a toy model (up to ten loops!), but he has some ideas for how to generalize it, which will require handling divergences and numerators.
Other approaches to the “problem of integration” included Anna-Laura Sattelberger’s talk, which presented a method to find differential equations for the kind of integrals that show up in amplitudes using the mathematical software Macaulay2, along with an accompanying package. Matthias Wilhelm talked about the work I did with him, using machine learning to find better methods for solving integrals with integration-by-parts, an area where two other groups have now also published. Pierpaolo Mastrolia talked about integration-by-parts’ up-and-coming contender, intersection theory, a method which appears to be delving into more mathematical tools in an effort to catch up with its competitor.
Sometimes, one is more specifically interested in the singularities of integrals than their numerics more generally. Felix Tellander talked about a geometric method to pin these down which largely went over my head, but he did have a very nice short description of the approach: “Describe the singularities of the integrand. Find a map representing integration. Map the singularities of the integrand onto the singularities of the integral.”
While QCD and gravity are the applications of choice, amplitudes methods germinate in N=4 super Yang-Mills. Ruth Britto’s talk opened the conference with an overview of progress along those lines before going into her own recent work with one-loop integrals and interesting implications of ideas from cluster algebras. Cluster algebras made appearances in several other talks, including Anastasia Volovich’s talk which discussed how ideas from that corner called flag cluster algebras may give insights into QCD amplitudes, though some symbol letters still seem to be hard to track down. Matteo Parisi covered another idea, cluster promotion maps, which he thinks may help pin down algebraic symbol letters.
The link between cluster algebras and symbol letters is an ongoing mystery where the field is seeing progress. Another symbol letter mystery is antipodal duality, where flipping an amplitude like a palindrome somehow gives another valid amplitude. Lance Dixon has made progress in understanding where this duality comes from, finding a toy model where it can be understood and proved.
Others pushed the boundaries of methods specific to N=4 super Yang-Mills, looking for novel structures. Song He’s talk pushed an older approach by Bourjaily and collaborators up to twelve loops, finding new patterns and connections to other theories and observables. Qinglin Yang bootstrapped Wilson loops with a Lagrangian insertion, adding a side to the polygon used in previous efforts and finding that, much like when you add particles to amplitudes in a bootstrap, the method gets stricter and more powerful. Jaroslav Trnka talked about work he has been doing with “negative geometries”, an odd method descended from the amplituhedron that looks at amplitudes from a totally different perspective, probing a bit of their non-perturbative data. He’s finding more parts of that setup that can be accessed and re-summed, and finding, interestingly, that multiple-zeta-values show up in quantities where we know they ultimately cancel out. Livia Ferro also talked about a descendant of the amplituhedron, this time for cosmology, getting differential equations for cosmological observables in a particular theory from a combinatorial approach.
Outside of everybody’s favorite theories, some speakers talked about more general approaches to understanding the differences between theories. Andreas Helset covered work on the geometry of the space of quantum fields in a theory, applying the method to a general framework for characterizing deviations from the standard model called the SMEFT. Jasper Roosmale Nepveu also talked about a general space of theories, thinking about how positivity (a trait linked to fundamental constraints like causality and unitarity) gets tangled up with loop effects, and the implications this has for renormalization.
Soft theorems, universal behavior of amplitudes when a particle has low energy, continue to be a trendy topic, with Silvia Nagy showing how the story continues to higher orders and Sangmin Choi investigating loop effects. Callum Jones talked about one of the more powerful results from the soft limit, Weinberg’s theorem showing the uniqueness of gravity. Weinberg’s proof was set up in Minkowski space, but we may ultimately live in curved, de Sitter space. Jones showed how the ideas Weinberg explored generalize to de Sitter, using some tools from the soft-theorem-inspired field of dS/CFT. Julio Parra-Martinez, meanwhile, tied soft theorems to another trendy topic, higher symmetries, a more general notion of the usual types of symmetries that physicists have explored in the past. Lucia Cordova reported work that was not particularly connected to soft theorems but was connected to these higher symmetries, showing how they interact with crossing symmetry and the S-matrix bootstrap.
Finally, a surprisingly large number of talks linked to Kevin Costello and Natalie Paquette’s work with self-dual gauge theories, where they found exact solutions from a fairly mathy angle. Paquette gave an update on her work on the topic, while Alfredo Guevara talked about applications to black holes, comparing the power of expanding around a self-dual gauge theory to that of working with supersymmetry. Atul Sharma looked at scattering in self-dual backgrounds in work that merges older twistor space ideas with the new approach, while Roland Bittelson talked about calculating around an instanton background.
Also, I had another piece up this week at FirstPrinciples, based on an interview with the (outgoing) president of the Sloan Foundation. I won’t have a “bonus info” post for this one, as most of what I learned went into the piece. But if you don’t know what the Sloan Foundation does, take a look! I hadn’t known they funded Jupyter notebooks and Hidden Figures, or that they introduced Kahneman and Tversky.
The Orioles have two legitimate Rookie of the Year candidates and neither one of them is named Matt Wieters
“I’m talking about Nolan Reimold, currently slugging .546 and leading all major-league rookies in OPS; and Brad Bergesen, who’s been the Orioles’ best starter this year at 23. Higher-profile pitching prospects Rick Porcello and David Price have ERAs a little lower, but Bergesen looks better on home runs, walks, and strikeouts. He is, as they say, “in the discussion.””
Yeah. Reimold and Bergesen did not win Rookie of the Year, and in fact both of them had the majority of their career WAR in 2009. Bergesen actually had more WAR in 2009 than his career total, and was out of the major leagues by the age of 27. Price and Porcello, meanwhile, had long careers, and each won a Cy Young before turning 30. I guess the guys who rate pitching prospects know something about what they’re doing. In my defense, the 2009 AL Rookie of the Year was in the end not Nolan Reimold, or Brad Bergesen, or David Price, or Rick Porcello, or Matt Wieters — it was A’s reliever Andrew Bailey, who also had the majority of his career WAR in 2009.
It’s a fine thing that we now have a national holiday that asks us to remember slavery. Some people think patriotism means insisting that America never did or does anything wrong, and that our schools should teach a purely heroic account of the American story. That’s foolish. America is made of people like you and me, who share certain ideals, truly heroic ideals, but don’t always live up to them — and some scoundrels too. Any patriotism that can’t survive contact with the actual history of America is weak stuff. A real patriot loves his country with open eyes.
The National Academies of Sciences, Engineering, and Medicine have me moderating a series of webinars about the use of novel math in the applied sciences. I learn a lot every time I do one! Here’s the latest, on Machine Learning for Breakthroughs in Medical Care, featuring Charley Taylor and Lorin Crawford. Some fun! Looking forward to doing more of these.
No one seems to know the origin of the contemporary phrase “touch grass,” meaning “get off-line and remember that the real world exists.” I think it is unlikely that it actually springs from Bertrand Russell’s 1930 self-help book The Conquest of Happiness, but — this description of touching grass really captures the modern sentiment!
To the child even more than to the man, it is necessary to preserve some contact with the ebb and flow of terrestrial life… I have seen a boy two years old, who had been kept in London, taken out for the first time to walk in green country. The season was winter, and everything was wet and muddy. To the adult eye there was nothing to cause delight, but in the boy there sprang up a strange ecstasy; he kneeled on the wet ground and put his face in the grass, and gave utterance to half-articulate cries of delight. The joy that he was experiencing was primitive, simple, and massive. The organic need that was being satisfied is so profound that those in whom it is starved are seldom completely sane.
Judy Walker and I put together a memorial article about my colleague and collaborator Nigel Boston in this month’s Notices of the AMS. What a great guy. And this conference at Zürich on arithmetic statistics I just returned from was, in almost every aspect, influenced by Nigel’s ideas and outlook. And not only because many of his collaborators and students were present. Nigel was the first person, I think, to really understand what form a non-abelian Cohen-Lenstra theory might take. And he was insistent on the importance of the pro-p story. And on the role that computation would play in actually understanding what’s going on. All three of these strands are very much alive at the current frontier of the subject.
A very quick summary of some non-negative news developments:
The NSF awarded 500 more graduate fellowships this week, bringing the total for this year up to 1500. (Apologies for the X link.) This is still 25% lower than last year's number, and of course far below the original CHIPS and Science Act target of 3000, but it's better than the alternative. I think we can now all agree that the supposed large-scale bipartisan support for the CHIPS and Science Act was illusory.
There seem to be some initial signs of pushback on the Senate side regarding the proposed massive science funding cuts. Again, now is the time to make your views known to legislators - I am told by multiple people with experience in this arena that it really can matter.
There was a statement earlier this week that apparently the US won't be going after Chinese student visas. This would carry more weight if it didn't look like US leadership was wandering ergodically through all possible things to say with no actual plan or memory.
On to the main topic of this post. Thanks to my professional age (older than dirt) and my experience (overseeing shared research infrastructure; being involved in a couple of building design and construction projects; and working on PI lab designs and build-outs), I have some key advice and lessons learned for anyone designing a new big science/engineering research building. This list is by no means complete, and I invite readers to add their insights in the comments. While it seems likely that many universities will be curtailing big capital construction projects in the near term because of financial uncertainty, I hope this may still come in handy to someone.
Any big laboratory building should have a dedicated loading dock with central receiving. If you're spending $100M-200M on a building, this is not something that you should "value engineer" away. The long term goal is a building that operates well for the PIs and is easy to maintain, and you're going to need to be able to bring in big crates for lab and service equipment. You should have a freight elevator adjacent to the dock.
You should also think hard about what kind of equipment will have to be moved in and out of the building when designing hallways, floor layouts, and door widths. You don't want to have to take out walls, doorframes, or windows, or to need a crane to hoist equipment into upper floors because it can't get around corners.
Think hard about process gases and storage tanks at the beginning. Will PIs need to have gas cylinders and liquid nitrogen and argon tanks brought in and out in high volumes all the time, with all the attendant safety concerns? Would you be better off installing bulk LN2 or LAr tanks, even though campus architects will say they are unsightly?
Likewise, consider whether you should have building-wide service for "lab vacuum", N2 gas, compressed air, DI water, etc. If not and PIs have those needs, you should plan ahead to deal with this.
Gas cylinder and chemical storage - do you have enough on-site storage space for empty cylinders and back-up supply cylinders? If this is a very chemistry-heavy building, think hard about safety and storing solvents.
Make sure you design for adequate exhaust capacity for fume hoods. Someone will always want to add more hoods. While all things are possible with huge expenditures, it's better to make sure you have capacity to spare, because adding hoods beyond the initial capacity would likely require a huge redo of the building HVAC systems.
Speaking of HVAC, think really hard about controls and monitoring. Are you going to have labs that need tight requirements on temperature and humidity? When you set these up, make sure you have enough sensors of the right types in the right places, and make sure that your system is designed to work even when the outside air conditions are at their seasonal extremes (hot and humid in the summer, cold and dry in the winter). Also, consider having a vestibule (air lock) for the main building entrance - you'd rather not scoop a bunch of hot, humid air (or freezing, super-dry air) into the building every time a student opens the door.
Still on HVAC, make sure that power outages and restarts don't lead to weird situations like having the whole building at negative pressure relative to the outside, or duct work bulging or collapsing.
Still on HVAC, actually think about where the condensate drains for the fan units will overflow if they get plugged up or overwhelmed. You really don't want water spilling all over a rack of networking equipment in an IT closet. Trust me.
Chilled water: Whether it's the process chilled water for the air conditioning, or the secondary chilled water for lab equipment, make sure that the loop is built correctly. Incompatible metals (e.g., some genius throws in a cast iron fitting somewhere, or joints between dissimilar metals) can lead to years and years of problems down the line. Make sure lines are flushed and monitored for cleanliness, and have filters in each lab that can be checked and maintained easily.
Electrical - design with future needs in mind. If possible, it's a good idea to have PI labs with their own isolation transformers, to try to mitigate inter-lab electrical noise issues. Make sure your electrical contractors understand the idea of having "clean" vs. "dirty" power and can set up the grounding accordingly while staying within code.
Still on electrical, consider building-wide surge protection, and think about emergency power capacity. For those who don't know, emergency power is usually a motor-generator that kicks in after a few seconds to make sure that emergency lighting and critical systems (including lab exhaust) keep going.
Ceiling heights, duct work, etc. - It's not unusual for some PIs to have tall pieces of equipment. Think about how you will accommodate these. Pits in the floors of basement labs? 5 meter slab-to-slab spacing? Think also about how ductwork and conduits are routed. You don't want someone to tell you that installation of a new apparatus is going to cost an extra $100K because shifting a duct sideways by half a meter would require a complete HVAC redesign.
Think about the balance between lab space and office space/student seating. No one likes giant cubicle farm student seating, but it does have capacity. In these days of zoom and remote access to experiments, the way students and postdocs use offices is evolving, which makes planning difficult. Health and safety folks would definitely prefer not to have personnel effectively headquartered directly in lab spaces. Seriously, though, when programming a building, you need to think about how many people per PI lab space will need places to sit. I have yet to see a building initially designed with enough seating to handle all the personnel needs if every PI lab were fully occupied and at a high level of research activity.
Think about maintenance down the line. Every major building system has some lifespan. If a big air handler fails, is it accessible and serviceable, or would that require taking out walls or cutting equipment into pieces and disrupting the entire building? Do you want to set up a situation where you may have to do this every decade? (Asking for a friend.)
Entering the realm of fantasy, use your vast power and influence to get your organization to emphasize preventative maintenance at an appropriate level, consistently over the years. Universities (and national labs and industrial labs) love "deferred maintenance", because kicking the can down the road turns a possible cost issue now into someone else's problem later. Saving money in the short term can be very tempting. It's also often easier and more glamorous to raise money for the new J. Smith Laboratory for Physical Sciences than it is to raise money to replace the HVAC system in the old D. Jones Engineering Building. Avoid this temptation, or one day (inevitably when times are tight) your university will notice that it has $300M in deferred maintenance needs.
I may update this list as more items occur to me, but please feel free to add input/ideas.
At FirstPrinciples.org, I had a piece covering work by engineering professor Colin McInnes on stability of Dyson spheres and ringworlds. This was a fun one to cover, mostly because of how it straddles the borderline between science fiction and practical physics and engineering. McInnes’s claim to fame is work on solar sails, which seem like a paradigmatic example of that kind of thing: a common sci-fi theme that’s surprisingly viable. His work on stability was interesting to me because it’s the kind of work that a century and a half ago would have been paradigmatic physics. Now, though, very few physicists work on orbital mechanics, and a lot of the core questions have passed on to engineering. It’s fascinating to see how these classic old problems can still have undiscovered solutions, and how the people best equipped to find them now are tinkerers practicing their tools instead of cutting-edge mathematicians.
At Quanta Magazine, I had a piece about reversible computing. Readers may remember I had another piece on that topic at the end of March, a profile on the startup Vaire Computing at FirstPrinciples.org. That piece talked about FirstPrinciples, but didn’t say much about reversible computing. I figured I’d combine the “bonus info” for both posts here.
Neither piece went into much detail about the engineering involved, as it didn’t really make sense in either venue. One thing that amused me a bit is that the core technology that drove Vaire into action is something that actually should be very familiar to a physics or engineering student: a resonator. Theirs is obviously quite a bit more sophisticated than the base model, but at its heart it’s doing the same thing: storing charge and controlling frequency. It turns out that those are both essential to making reversible computers work: you need to store charge so it isn’t lost to ground when you empty a transistor, and you need to control the frequency so you can have waves with gentle transitions instead of the more sharp corners of the waves used in normal computers, thus wasting less heat in rapid changes of voltage. Vaire recently announced they’re getting 50% charge recovery from their test chips, and they’re working on raising that number.
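To see why gentle transitions matter, here is a standard back-of-the-envelope comparison with my own illustrative numbers (not Vaire's figures): abruptly charging a capacitive node through a resistance dissipates about half the stored energy as heat, while ramping the supply over a time much longer than the RC time dissipates far less.

```python
# Rough sketch with assumed numbers (not Vaire's figures): charging a node
# capacitance C to voltage V through resistance R.
# Abrupt switching dissipates ~ (1/2) C V^2 in the resistor; a slow ramp
# over a time T >> RC dissipates only ~ (R C / T) C V^2.
C = 1e-15   # farads, an assumed transistor-node-scale capacitance
R = 1e4     # ohms, an assumed effective switching resistance
V = 1.0     # volts
T = 1e-9    # seconds, an assumed gentle 1 ns ramp

E_abrupt = 0.5 * C * V**2
E_ramp = (R * C / T) * C * V**2

print(f"abrupt switch: {E_abrupt:.1e} J")
print(f"slow ramp:     {E_ramp:.1e} J  (about {E_abrupt / E_ramp:.0f}x less)")
```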
Originally, the Quanta piece was focused more on reversible programming than energy use, as the energy angle seemed a bit more physics-focused than their computer science desk usually goes. The emphasis ended up changing as I worked on the draft, but it meant that an interesting parallel story got lost on the cutting-room floor. There’s a community of people who study reversible computing not from the engineering side, but from the computer science side, studying reversible logic and reversible programming languages. It’s a pursuit that goes back to the 1980s, when, at Caltech around the time Feynman was teaching his course on the physics of computing, a group of students were figuring out how to set up a reversible programming language, which they called Janus. They sent their creation to Landauer, and the letter ended up with Michael Frank after Landauer died. There’s a lovely quote from it regarding their motivation: “We did it out of curiosity over whether such an odd animal as this was possible, and because we were interested in knowing where we put information when we programmed. Janus forced us to pay attention to where our bits went since none could be thrown away.”
Being forced to pay attention to information, in turn, is what has animated the computer science side of the reversible computing community. There are applications to debugging, where you can run code backwards when it gets stuck, to encryption and compression, where you want to be able to recover the information you hid away, and to security, where you want to keep track of information to make sure a hacker can’t figure out things they shouldn’t. Also, for a lot of these people, it’s just a fun puzzle. Early on my attention was caught by a paper by Hannah Earley describing a programming language called Alethe, a word you might recognize from the Greek word for truth, which literally means something like “not-forgetting”.
(Compression is particularly relevant for the “garbage data” you need to output in a reversible computation. If you want to add two numbers reversibly, naively you need to keep both input numbers and their output, but you can be more clever than that and just keep one of the inputs since you can subtract to find the other. There are a lot of substantially more clever tricks in this vein people have figured out over the years.)
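Here is a minimal sketch of that trick (my own toy example): instead of keeping both inputs and their sum, keep one input and the sum, and subtraction recovers the rest.

```python
def add_reversible(a, b):
    """Map (a, b) -> (a, a + b): one input is kept, so nothing is erased."""
    return a, a + b

def unadd_reversible(a, s):
    """Inverse map: recover (a, b) from (a, a + b) by subtracting."""
    return a, s - a

a, b = 17, 25
kept, total = add_reversible(a, b)              # (17, 42)
assert unadd_reversible(kept, total) == (a, b)  # no information was lost
```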
I didn’t say anything about the other engineering approaches to reversible computing, the ones that try to do something outside of traditional computer chips. There’s DNA computing, which tries to compute with a bunch of DNA in solution. There’s the old concept of ballistic reversible computing, where you imagine a computer that runs like a bunch of colliding billiard balls, conserving energy. Coordinating such a computer can be a nightmare, and early theoretical ideas were shown to be disrupted by something as tiny as a few stray photons from a distant star. But people like Frank figured out ways around the coordination problem, and groups have experimented with superconductors as places to toss those billiard balls around. The early billiard-inspired designs also had a big impact on quantum computing, where you need reversible gates and the only irreversible operation is the measurement. The name “Toffoli” comes up a lot in quantum computing discussions; I hadn’t known before this that Toffoli gates were originally for reversible computing in general, not specifically quantum computing.
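For readers who haven't met it, here is the classical Toffoli gate in a few lines (a toy sketch of mine, not tied to any particular library): it is reversible, it is its own inverse, and with the target bit set to zero it computes an AND without throwing any information away.

```python
def toffoli(a, b, c):
    """Controlled-controlled-NOT on three bits: flip the target c
    exactly when both control bits a and b are 1."""
    return a, b, c ^ (a & b)

# The gate is its own inverse: applying it twice restores any input.
for bits in [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]:
    assert toffoli(*toffoli(*bits)) == bits

# With the target initialized to 0, the otherwise-irreversible AND of the
# two controls can be read off the target, with the inputs still intact.
_, _, a_and_b = toffoli(1, 1, 0)
print(a_and_b)  # prints 1
```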
Finally, I only gestured at the sci-fi angle. For reversible computing’s die-hards, it isn’t just a way to make efficient computers now. It’s the ultimate future of the technology, the kind of energy-efficiency civilization will need when we’re covering stars with shells of “computronium” full of busy joyous artificial minds.
And now that I think about it, they should chat with McInnes. He can tell them the kinds of stars they should build around.
A week ago I attended LessOnline, a rationalist blogging conference featuring many people I’ve known for years—Scott Alexander, Eliezer Yudkowsky, Zvi Mowshowitz, Sarah Constantin, Carl Feynman—as well as people I’ve known only online and was delighted to meet in person, like Joe Carlsmith and Jacob Falkovich and Daniel Reeves. The conference was at Lighthaven, a bewildering maze of passageways, meeting-rooms, sleeping quarters, gardens, and vines off Telegraph Avenue in Berkeley, which has recently emerged as the nerd Shangri-La, or Galt’s Gulch, or Shire, or whatever. I did two events at this year’s LessOnline: a conversation with Nate Soares about the Orthogonality Thesis, and an ask-me-anything session about quantum computing and theoretical computer science (no new ground there for regular consumers of my content).
What I’ll remember most from LessOnline is not the sessions, mine or others’, but the unending conversation among hundreds of people all over the grounds, which took place in parallel with the sessions and before and after them, from morning till night (and through the night, apparently, though I’ve gotten too old for that). It felt like a single conversational archipelago, the largest in which I’ve ever taken part, and the conference’s real point. (Attendees were exhorted, in the opening session, to skip as many sessions as possible in favor of intense small-group conversations—not only because it was better but also because the session rooms were too small.)
Within the conversational blob, just making my way from one building to another could take hours. My mean free path was approximately five feet, before someone would notice my nametag and stop me with a question. Here was my favorite opener:
“You’re Scott Aaronson?! The quantum physicist who’s always getting into arguments on the Internet, and who’s essentially always right, but who sustains an unreasonable amount of psychic damage in the process?”
“Yes,” I replied, not bothering to correct the “physicist” part.
One night, I walked up to Scott Alexander, who, sitting on the ground with his large bald head and a blanket he was using as a robe, resembled a monk. “Are you enjoying yourself?” he asked.
I replied, “You know, after all these years of being coy about it, I think I’m finally ready to become a Rationalist. Is there, like, an initiation ritual or something?”
Scott said, “Oh, you were already initiated a decade ago; you just didn’t realize it at the time.” Then he corrected himself: “two decades ago.”
The first thing I did, after coming out as a Rationalist, was to get into a heated argument with Other Scott A., Joe Carlsmith, and other fellow-Rationalists about the ideas I set out twelve years ago in my Ghost in the Quantum Turing Machine essay. Briefly, my argument was that the irreversibility and ephemerality of biological life, which contrasts with the copyability, rewindability, etc. of programs running on digital computers, and which can ultimately be traced back to microscopic details of the universe’s initial state, subject to the No-Cloning Theorem of quantum mechanics, which then get chaotically amplified during brain activity … might be a clue to a deeper layer of the world, one that we understand about as well as the ancient Greeks understood Newtonian physics, but which is the layer where mysteries like free will and consciousness will ultimately need to be addressed.
I got into this argument partly because it came up, but partly also because this seemed like the biggest conflict between my beliefs and the consensus of my fellow Rationalists. Maybe part of me wanted to demonstrate that my intellectual independence remained intact—sort of like a newspaper that gets bought out by a tycoon, and then immediately runs an investigation into the tycoon’s corruption, as well as his diaper fetish, just to prove it can.
The funny thing, though, is that all my beliefs are the same as they were before. I’m still a computer scientist, an academic, a straight-ticket Democratic voter, a liberal Zionist, a Jew, etc. (all identities, incidentally, well-enough represented at LessOnline that I don’t even think I was the unique attendee in the intersection of them all).
Given how much I resonate with what the Rationalists are trying to do, why did it take me so long to identify as one?
Firstly, while 15 years ago I shared the Rationalists’ interests, sensibility, and outlook, and their stances on most issues, I also found them bizarrely, inexplicably obsessed with the question of whether AI would soon become superhumanly powerful and change the basic conditions of life on earth, and with how to make the AI transition go well. Why that, as opposed to all the other sci-fi scenarios one could worry about, not to mention all the nearer-term risks to humanity?
Suffice it to say that empirical developments have since caused me to withdraw my objection. Sometimes weird people are weird merely because they see the future sooner than others. Indeed, it seems to me that the biggest thing the Rationalists got wrong about AI was to underestimate how soon the revolution would happen, and to overestimate how many new ideas would be needed for it (mostly, as we now know, it just took lots more compute and training data). Now that I, too, spend some of my time working on AI alignment, I was able to use LessOnline in part for research meetings with colleagues.
A second reason I didn’t identify with the Rationalists was cultural: they were, and are, centrally a bunch of twentysomethings who “work” at an ever-changing list of Berkeley- and San-Francisco-based “orgs” of their own invention, and who live in group houses where they explore their exotic sexualities, gender identities, and fetishes, sometimes with the aid of psychedelics. I, by contrast, am a straight, monogamous, middle-aged tenured professor, married to another such professor and raising two kids who go to normal schools. Hanging out with the Rationalists always makes me feel older and younger at the same time.
So what changed? For one thing, with the march of time, a significant fraction of Rationalists now have marriages, children, or both—indeed, a highlight of LessOnline was the many adorable toddlers running around the Lighthaven campus. Rationalists are successfully reproducing! Some because of explicit pronatalist ideology, or because they were persuaded by Bryan Caplan’s arguments in Selfish Reasons to Have More Kids. But others simply because of the same impulses that led their ancestors to do the same for eons. And perhaps because, like the Mormons or Amish or Orthodox Jews, but unlike typical secular urbanites, the Rationalists believe in something. For all their fears around AI, they don’t act doomy, but buzz with ideas about how to build a better world for the next generation.
At a LessOnline parenting session, hosted by Julia Wise, I was surrounded by parents who worry about the same things I do: how do we raise our kids to be independent and agentic yet socialized and reasonably well-behaved, technologically savvy yet not droolingly addicted to iPad games? What schooling options will let them accelerate in math, save them from the crushing monotony that we experienced? How much of our own lives should we sacrifice on the altar of our kids’ “enrichment,” versus trusting Judith Rich Harris that such efforts quickly hit a point of diminishing returns?
A third reason I didn’t identify with the Rationalists was, frankly, that they gave off some (not all) of the vibes of a cult, with Eliezer as guru. Eliezer writes in parables and koans. He teaches that the fate of life on earth hangs in the balance, that the select few who understand the stakes have the terrible burden of steering the future. Taking what Rationalists call the “outside view,” how good is the track record for this sort of thing?
OK, but what did I actually see at Lighthaven? I saw something that seemed to resemble a cult only insofar as the Beatniks, the Bloomsbury Group, the early Royal Society, or any other community that believed in something did. When Eliezer himself—the bearded, cap-wearing Moses who led the nerds from bondage to their Promised Land in Berkeley—showed up, he was argued with like anyone else. Eliezer has in any case largely passed his staff to a new generation: Nate Soares and Zvi Mowshowitz have found new and, in various ways, better ways of talking about AI risk; Scott Alexander has for the last decade written the blog that’s the community’s intellectual center; figures from Kelsey Piper to Jacob Falkovich to Aella have taken Rationalism in new directions, from mainstream political engagement to the … err … statistical analysis of orgies.
I’ll say this, though, on the naysayers’ side: it’s really hard to make dancing to AI-generated pop songs about Bayes’ theorem and Tarski’s definition of truth not feel cringe, as I can now attest from experience.
The cult thing brings me to the deepest reason I hesitated for so long to identify as a Rationalist: namely, I was scared that if I did, people whose approval I craved (including my academic colleagues, but also just randos on the Internet) would sneer at me. For years, I searched for some way of explaining this community’s appeal so reasonable that it would silence the sneers.
It took years of psychological struggle, and (frankly) solidifying my own place in the world, to follow the true path, which of course is not to give a shit what some haters think of my life choices. Consider: five years ago, it felt obvious to me that the entire Rationalist community might be about to implode, under existential threat from Cade Metz’s New York Times article, as well as RationalWiki and SneerClub and all the others laughing at the Rationalists and accusing them of every evil. Yet last week at LessOnline, I saw a community that’s never been thriving more, with a beautiful real-world campus, excellent writers on every topic who felt like this was the place to be, and even a crop of kids. How many of the sneerers are living such fulfilled lives? To judge from their own angry, depressed self-disclosures, probably not many.
But are the sneerers right that, even if the Rationalists are enjoying their own lives, they’re making other people’s lives miserable? Are they closet far-right monarchists, like Curtis Yarvin? I liked how The New Yorker put it in its recent, long and (to my mind) devastating profile of Yarvin:
The most generous engagement with Yarvin’s ideas has come from bloggers associated with the rationalist movement, which prides itself on weighing evidence for even seemingly far-fetched claims. Their formidable patience, however, has also worn thin. “He never addressed me as an equal, only as a brainwashed person,” Scott Aaronson, an eminent computer scientist, said of their conversations. “He seemed to think that if he just gave me one more reading assignment about happy slaves singing or one more monologue about F.D.R., I’d finally see the light.”
The closest to right-wing politics that I witnessed at LessOnline was a session, with Kelsey Piper and current and former congressional staffers, about the prospects for moderate Democrats to articulate a moderate, pro-abundance agenda that would resonate with the public and finally defeat MAGA.
But surely the Rationalists are incels, bitter that they can’t get laid? Again, the closest I saw was a session where Jacob Falkovich helped a standing-room-only crowd of mostly male nerds confront their fears around dating and understand women better, with Rationalist women eagerly volunteering to answer questions about their perspective. Gross, right? (Also, for those already in relationships, Eliezer’s primary consort and former couples therapist Gretta Duleba did a session on relationship conflict.)
So, yes, when it comes to the Rationalists, I’m going to believe my own lying eyes over the charges of the sneerers. The sneerers can even say about me, in their favorite formulation, that I’ve “gone mask off,” confirmed the horrible things they’ve always suspected. Yes, the mask is off—and beneath the mask is the same person I always was, who has an inordinate fondness for the Busy Beaver function and the complexity class BQP/qpoly, and who uses too many filler words and moves his hands too much, and who strongly supports the Enlightenment, and who once feared that his best shot at happiness in life would be to earn women’s pity rather than their contempt. Incorrectly, as I’m glad to report. From my nebbishy nadir to the present, a central thing that’s changed is that, from my family to my academic colleagues to the Rationalist community to my blog readers, I finally found some people who want what I have to sell.
Unrelated Announcements:
My replies to comments on this post might be light, as I’ll be accompanying my daughter on a school trip to the Galapagos Islands!
A few weeks ago, I was “ambushed” into leading a session on philosophy and theoretical computer science at UT Austin. (I.e., asked to show up for the session, but thought I’d just be a participant rather than the main event.) The session was then recorded and placed on YouTube—and surprisingly, given the circumstances, some people seemed to like it!
Friend-of-the-blog Alon Rosen has asked me to announce a call for nominations for a new theoretical computer science prize, in memory of my former professor (and fellow TCS blogger) Luca Trevisan, who was lost to the world too soon.
And one more: Mahdi Cheraghchi has asked me to announce the STOC’2025 online poster session, registration deadline June 12; see here for more. Incidentally, I’ll be at STOC in Prague to give a plenary on quantum algorithms; I look forward to meeting any readers who are there!
Editor’s note (Nicole Yunger Halpern): Jade LeSchack, the Quantum Steampunk Laboratory’s first undergraduate, received her bachelor’s degree from the University of Maryland this spring. Kermit the Frog presented the valedictory address, but Jade gave the following speech at the commencement ceremony for the university’s College of Mathematical and Natural Sciences. Jade heads to the University of Southern California for a PhD in physics this fall.
Good afternoon, everyone. My name is Jade, and it is my honor and pleasure to speak before you.
Today, I’m graduating with my Bachelor of Science, but when I entered UMD, I had no idea what it meant to be a professional scientist or where my passion for quantum science would take me. I want you to picture where you were four years ago. Maybe you were following a long-held passion into college, or maybe you were excited to explore a new technical field. Since then, you’ve spent hours titrating solutions, debugging code, peering through microscopes, working out proofs, and all the other things our disciplines require of us. Now, we’re entering a world of uncertainty, infinite possibility, and lifelong connections. Let me elaborate on each of these.
First, there is uncertainty. Unlike simplified projectile motion, you can never predict the exact trajectory of your life or career. Plans will change, and unexpected opportunities will arise. Sometimes, the best path forward isn’t the one you first imagined. Our experiences at Maryland have prepared us to respond to the challenges and curveballs that life will throw at us. And, we’re going to get through the rough patches.
Second, let’s embrace the infinite possibilities ahead of us. While the concept of the multiverse is best left to the movies, it’s exciting to think about all the paths before us. We’ve each found our own special interests over the past four years here, but there’s always more to explore. Don’t put yourself in a box. You can be an artist and a scientist, an entrepreneur and a humanitarian, an athlete and a scholar. Continue to redefine yourself and be open to your infinite potential.
Third, as we move forward, we are equipped not only with knowledge but with connections. We’ve made lasting relationships with incredible people here. As we go from place to place, the people who we’re close to will change. But we’re lucky that, these days, people are only an email or phone call away. We’ll always have our UMD communities rooting for us.
Now, the people we met here are certainly not the only important ones. We’ve each had supporters along the various stages of our journeys. These are the people who championed us, made sacrifices for us, and gave us a shoulder to cry on. I’d like to take a moment to thank all my mentors, teachers, and friends for believing in me. To my mom, dad, and sister sitting up there, I couldn’t have done this without you. Thank you for your endless love and support.
To close, I’d like to consider this age-old question that has always fascinated me: Is mathematics discovered or invented? People have made a strong case for each side. If we think about science in general, and our future contributions to our fields, we might ask ourselves: Are we discoverers or inventors? My answer is both! Everyone here with a cap on their head is going to contribute to both. We’re going to unearth new truths about nature and innovate scientific technologies that better society. This uncertain, multitudinous, and interconnected world is waiting for us, the next generation of scientific thinkers! So let’s be bold and stay fearless.
Congratulations to the class of 2024 and the class of 2025! We did it!
Author’s note: I was deeply grateful for the opportunity to serve as the student speaker at my commencement ceremony. I hope that the science-y references tickle the layman and SME alike. You can view a recording of the speech here. I can’t wait for my next adventures in quantum physics!
Applications for MSCA Postdoctoral Fellowships are open, and will remain so until September 10 this year. What that means is that if you have less than 8 years of experience after your Ph.D., you can pair up with a research institute in Europe to present a research plan, and the European Commission may decide to fund it for two years (plus 6 months in industry in some cases).
In order for your application to have a chance to win funding, you need to:
have a great research topic in mind,
be ready to invest some time in writing a great application, and
pair up with an outstanding supervisor at a renowned research institute.
Again, as a distraction from persistently concerning news, here is a science mystery of which I was previously unaware.
The role of approximations in physics is something that very often comes as a shock to new students. There is a cultural expectation out there that, because physics is all about quantitative understanding of physical phenomena, and because of the typical way we teach math and science in K-12 education, we should be able to get exact solutions to many of our attempts to model nature mathematically. In practice, though, constructing physics theories is almost always about approximations, either in the formulation of the model itself (e.g., let's consider the motion of an electron about the proton in the hydrogen atom by treating the proton as infinitely massive and of negligible size) or in solving the mathematics (e.g., we can't write an exact analytical solution of the problem when including relativity, but we can do an order-by-order expansion in powers of \(p/mc\)). Theorists have a very clear understanding of what it means to say that an approximation is "well controlled" - you know on both physical and mathematical grounds that a series expansion actually converges, for example.
Some problems are simpler than others, just by virtue of having a very limited number of particles and degrees of freedom, and some problems also lend themselves to high precision measurements. The hydrogen atom problem is an example of both features: just two spin-1/2 particles (if we approximate the proton as a lumped object), and readily accessible to optical spectroscopy to measure the energy levels for comparison with theory. We can do perturbative treatments to account for other effects of relativity, spin-orbit coupling, interactions with nuclear spin, and quantum electrodynamic corrections (here and here). A hallmark of atomic physics is the remarkable precision and accuracy of these calculations when compared with experiment. (The \(g\)-factor of the electron is experimentally known to a part in \(10^{10}\) and matches calculations out to fifth order in \(\alpha = e^2/(4 \pi \epsilon_{0}\hbar c)\).)
The helium atom is a bit more complicated, having two electrons and a more complicated nucleus, but over the last hundred years we've learned a lot about how to do both calculations and spectroscopy. As explained here, there is a problem. It is possible to put helium into an excited metastable triplet state with one electron in the \(1s\) orbital, the other electron in the \(2s\) orbital, and their spins in a triplet configuration. Then one can measure the ionization energy of that system - the minimum energy required to kick an electron out of the atom and off to infinity. This energy can be calculated to seventh order in \(\alpha\), and the theorists think that they're accounting for everything, including the finite (but tiny) size of the nucleus. The issue: The calculation and the experiment differ by about 2 nano-eV. That may not sound like a big deal, but the experimental uncertainty is supposed to be a little over 0.08 nano-eV, and the uncertainty in the calculation is estimated to be 0.4 nano-eV. This works out to something like a 9\(\sigma\) discrepancy. Most recently, a quantitatively very similar discrepancy shows up in the case of measurements performed in 3He rather than 4He.
This is pretty weird. Historically, it would seem that the most likely answer is a problem with either the measurements (though that seems doubtful, since precision spectroscopy is such a well-developed set of techniques), the calculation (though that also seems odd, since the relevant physics seems well known), or both. The exciting possibility is that somehow there is new physics at work that we don't understand, but that's a long shot. Still, it's something fun to consider (as my colleagues and I try to push back on the dismantling of US scientific research).
The Oort cloud is a huge region of icy objects surrounding our Sun. We’re not sure it exists, but we think it’s where comets come from.
I’ve often seen the Oort cloud drawn as a vague round blob. But recently some people simulated it—and discovered that tidal forces from the Milky Way may pull it into a much more interesting shape:
• David Nesvorný, Luke Dones, David Vokrouhlický, Hal F. Levison, Cristian Beaugé, Jacqueline Faherty, Carter Emmart, and Jon P. Parker, A spiral structure in the inner Oort cloud, The Astrophysical Journal 983 (2025).
It actually looks like a cartoon of a galaxy! But it’s poking up at right angles to the plane of our galaxy, drawn in blue here. The red line is the plane that the planets mostly move in, called the ‘ecliptic’.
According to our theories, the Oort cloud formed about 4.6 billion years ago when the Solar System was young. As the outer planets cleared their orbital neighborhood, trillions of small icy objects were pushed into very eccentric orbits that come as close as 30 AU to the Sun and then shoot out as far as 1000 AU. (Remember, the Earth is 1 AU from the Sun.) Later, Galactic tidal forces slowly pulled these objects farther from the Sun and tilted their orbits. Encounters with nearby stars tend to randomize the orbits of these Oort cloud objects.
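To get a feel for the timescales of such orbits, here is a quick Kepler's-third-law estimate of my own (illustrative, not a number from the paper):

```python
# Illustrative estimate (my own, not from the paper): an orbit with
# perihelion 30 AU and aphelion 1000 AU, using Kepler's third law for
# orbits around the Sun: P[years]^2 = a[AU]^3.
perihelion_au = 30.0
aphelion_au = 1000.0
a_au = 0.5 * (perihelion_au + aphelion_au)   # semi-major axis in AU

period_years = a_au ** 1.5
print(f"semi-major axis ~ {a_au:.0f} AU, period ~ {period_years:,.0f} years")
```

That works out to roughly ten millennia per orbit, so these objects have gone around hundreds of thousands of times since the Solar System formed, giving galactic tides plenty of time to do their slow work.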
By now, the inner Oort cloud consists of icy objects roughly 1000 to 10,000 AU from the Sun. It’s more or less flat, roughly 15,000 AU across, tilted 30° to the ecliptic, and it looks like a spiral with two twisted arms.
The spiral structure was first noticed when the researchers showed this simulation at the Hayden Planetarium in preparation for a new space show!
Physicists like to start by “assuming a spherical cow”. But when they study something in detail, it’s usually more complex. Even black holes usually have a disk, with jets shooting out.
In January, my time at the Niels Bohr Institute ended. Instead of supporting myself by doing science, as I’d done the last thirteen or so years, I started making a living by writing, doing science journalism.
That work picked up. My readers here have seen a few of the pieces already, but there are lots more in the pipeline, getting refined by editors or waiting to be published. It’s given me a bit of income, and a lot of visibility.
That visibility, in turn, has given me new options. It turns out that magazines aren’t the only companies interested in science writing, and journalism isn’t the only way to write for a living. Companies that invest in science want a different kind of writing, one that builds their reputation both with the public and with the scientific community. And as I’ve discovered, if you have enough of a track record, some of those companies will reach out to you.
So I’m branching out, from science journalism to science communications consulting, advising companies how to communicate science. I’ve started working with an exciting client, with big plans for the future. If you follow me on LinkedIn, you’ll have seen a bit about who they are and what I’ll be doing for them.
Here on the blog, I’d like to maintain a bit more separation. Blogging is closer to journalism, and in journalism, one ought to be careful about conflicts of interest. The advice I’ve gotten is that it’s good to establish some ground rules, separating my communications work from my journalistic work, since I intend to keep doing both.
So without further ado, my conflict of interest rules:
I will not write in a journalistic capacity about my consulting clients, or their direct competitors.
I will not write in a journalistic capacity about the technology my clients are investing in, except in extremely general terms. (For example, most businesses right now are investing in AI. I’ll still write about AI in general, but not about any particular AI technologies my clients are pursuing.)
I will more generally maintain a distinction between areas I cover journalistically and areas where I consult. Right now, this means I avoid writing in a journalistic capacity about:
Health/biomedical topics
Neuroscience
Advanced sensors for medical applications
I plan to update these rules over time as I get a better feeling for what kinds of conflict of interest risks I face and what my clients are comfortable with. I now have a Page for this linked in the top menu; clients and editors can check there to see my current conflict of interest rules.
Boris Alexeev, Evan Conway, Matthieu Rosenfeld, Andrew Sutherland, Markus Uhr, Kevin Ventullo, and I have uploaded to the arXiv a second version of our paper “Decomposing a factorial into large factors”. This is a completely rewritten and expanded version of a previous paper of the same name. Thanks to many additional theoretical and numerical contributions from the other coauthors, we now have much more precise control on the main quantity studied in this paper, allowing us to settle all the previous conjectures about this quantity in the literature.
As discussed in the previous post, $t(N)$ denotes the largest integer such that the factorial $N!$ can be expressed as a product of $N$ factors, each of which is at least $t(N)$. Computing $t(N)$ is a special case of the bin covering problem, which is known to be NP-hard in general; and prior to our work, $t(N)$ had only been computed exactly for a limited range of small $N$; we have been able to compute it exactly for a far wider range. In fact, we can get surprisingly sharp upper and lower bounds on $t(N)$ for much larger $N$, with a precise asymptotic
for an explicit constant , which we conjecture to be improvable to
for an explicit constant : … For instance, we can demonstrate numerically that
As a consequence of this precision, we can verify several conjectures of Guy and Selfridge, namely
for all .
for all .
for all . (In fact we show this is true for , and that this threshold is best possible.)
Guy and Selfridge also claimed that one can establish the last of these bounds for all large $N$ purely by rearranging factors of $2$ and $3$ from the standard prime factorization of $N!$, but surprisingly we found that this claim (barely) fails for all sufficiently large $N$:
The accuracy of our bounds comes from several techniques:
Greedy algorithms, in which one allocates the largest prime factors of $N!$ first and then moves to smaller primes, provide quickly computable, though suboptimal, lower bounds on $t(N)$ for small, medium, and moderately large values of $N$ (a toy version of this idea is sketched after this list);
Linear programming and integer programming methods provide extremely accurate upper and lower bounds on $t(N)$ for small and medium values of $N$;
Rearrangement methods can be analyzed asymptotically via linear programming, and work well for large $N$; and
The modified approximate factorization strategy, discussed in the previous post, is now sharpened by using $3$-smooth numbers (products of $2$s and $3$s) as the primary “liquidity pool” to reallocate factors of $N!$, as opposed to the previous approach of only using powers of $2$.
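To make the greedy idea concrete, here is a toy Python sketch (my own illustration, not the authors’ code, and certainly not their optimized algorithm): it packs the prime factors of $N!$, largest primes first, into factors of size at least $t$, and the largest $t$ for which this yields at least $N$ factors certifies a lower bound on $t(N)$.

```python
# Toy illustration (not the paper's algorithm): a greedy lower bound for t(N),
# the largest t such that N! can be written as a product of N factors, each >= t.

def prime_factors_of_factorial(N):
    """Multiset of prime factors of N!, largest primes first, via Legendre's formula."""
    is_prime = [True] * (N + 1)
    for p in range(2, int(N ** 0.5) + 1):
        if is_prime[p]:
            for m in range(p * p, N + 1, p):
                is_prime[m] = False
    factors = []
    for p in range(N, 1, -1):
        if is_prime[p]:
            e, q = 0, p
            while q <= N:
                e += N // q          # exponent of p in N!
                q *= p
            factors.extend([p] * e)
    return factors

def greedy_factor_count(N, t):
    """Greedily pack prime factors (largest first) into factors of size >= t."""
    count, current = 0, 1
    for p in prime_factors_of_factorial(N):
        current *= p
        if current >= t:
            count, current = count + 1, 1
    return count  # leftover primes can be multiplied into an existing factor

def greedy_lower_bound(N):
    """Largest t for which this greedy packing certifies t(N) >= t."""
    t = 1
    while greedy_factor_count(N, t + 1) >= N:
        t += 1
    return t

print(greedy_lower_bound(9))  # certifies t(9) >= 3 (a lower bound only)
```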
To me, the biggest surprise was just how stunningly accurate the linear programming methods were; the very large number of repeated prime factors here actually makes this discrete problem behave rather like a continuous one.
Almost 20 years ago, I wrote a textbook in real analysis called “Analysis I”. It was intended to complement the many good analysis textbooks already available by focusing more on foundational issues, such as the construction of the natural numbers, integers, rational numbers, and reals, as well as providing enough set theory and logic to allow students to develop proofs at high levels of rigor.
While some proof assistants such as Coq or Agda were well established when the book was written, formal verification was not on my radar at the time. However, now that I have had some experience with this subject, I realize that the content of this book is in fact very compatible with such proof assistants; in particular, the “naive type theory” that I was implicitly using to do things like construct the standard number systems dovetails well with the dependent type theory of Lean (which, among other things, has excellent support for quotient types).
I have therefore decided to launch a Lean companion to “Analysis I”, which is a “translation” of many of the definitions, theorems, and exercises of the text into Lean. In particular, this gives an alternate way to perform the exercises in the book, by instead filling in the corresponding “sorries” in the Lean code. (I do not however plan on hosting “official” solutions to the exercises in this companion; instead, feel free to create forks of the repository in which these sorries are filled in.)
Currently, the following sections of the text have been translated into Lean:
The formalization has been deliberately designed to be separate from the standard Lean math library Mathlib in some places, but reliant on it in others. For instance, Mathlib already has a standard notion of the natural numbers ℕ. In the Lean formalization, I first develop “by hand” an alternate construction Chapter2.Nat of the natural numbers (or just Nat, if one is working in the Chapter2 namespace), setting up many of the basic results about these alternate natural numbers which parallel similar lemmas about ℕ that are already in Mathlib (but with many of these lemmas set as exercises to the reader, with the proofs currently replaced with “sorries”). Then, in an epilogue section, isomorphisms between these alternate natural numbers and the Mathlib natural numbers are established (or more precisely, set as exercises). From that point on, the Chapter 2 natural numbers are deprecated, and the Mathlib natural numbers are used instead. I intend to continue this general pattern throughout the book, so that as one advances into later chapters, one increasingly relies on Mathlib’s definitions and functions, rather than directly referring to any counterparts from earlier chapters. As such, this companion could also be used as an introduction to Lean and Mathlib as well as to real analysis (somewhat in the spirit of the “Natural number game“, which in fact has significant thematic overlap with Chapter 2 of my text).
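For readers unfamiliar with the setup, here is a schematic Lean 4 sketch of the pattern just described (my own illustration; the actual companion’s definitions, lemma names, and API are more extensive than this):

```lean
-- Schematic sketch of the pattern described above (not the companion's actual code):
-- a hand-rolled natural number type in its own namespace, with an exercise left
-- as a `sorry` for the reader to fill in.
namespace Chapter2

inductive Nat where
  | zero : Nat
  | succ : Nat → Nat

/-- Addition, defined by recursion on the second argument. -/
def Nat.add : Nat → Nat → Nat
  | n, .zero   => n
  | n, .succ m => .succ (Nat.add n m)

/-- Exercise: replace this `sorry` with a proof, in the spirit of the companion. -/
theorem Nat.add_zero (n : Nat) : Nat.add n .zero = n := by
  sorry

end Chapter2
```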
The code in this repository compiles in Lean, but I have not tested whether all of the (numerous) “sorries” in the code can actually be filled (i.e., if all the exercises can actually be solved in Lean). I would be interested in having volunteers “playtest” the companion to see if this can actually be done (and if the helper lemmas or “API” provided in the Lean files are sufficient to fill in the sorries in a conceptually straightforward manner without having to rely on more esoteric Lean programming techniques). Any other feedback will of course also be welcome.
[UPDATE, May 31: moved the companion to a standalone repository.]
This post concerns zero density estimates of the shape $N(\sigma,T) \ll T^{A(\sigma)(1-\sigma)+o(1)}$ for various $\sigma$, where $T$ is a parameter going to infinity, $N(\sigma,T)$ counts the number of zeroes of the Riemann zeta function of real part at least $\sigma$ and imaginary part between $0$ and $T$, and $A(\sigma)$ is an exponent which one would like to be as small as possible. The Riemann hypothesis would allow one to take $A(\sigma)=0$ for any $\sigma > 1/2$, but this is an unrealistic goal, and in practice one would be happy with some non-trivial upper bounds on $A(\sigma)$. A key target here is the density hypothesis that asserts that $A(\sigma) \leq 2$ for all $\sigma$ (this is in some sense sharp because the Riemann-von Mangoldt formula implies that $A(1/2) = 2$); this hypothesis is currently known for $\sigma$ sufficiently close to $1/2$ and to $1$, but the known bounds are not strong enough to establish this hypothesis in the remaining region. However, there was a recent advance of Guth and Maynard, which among other things improved the upper bound on $A(3/4)$ from $12/5 = 2.4$ to $30/13 \approx 2.307$, marking the first improvement in this bound in over four decades. Here is a plot of the best known upper bounds on $A(\sigma)$, either unconditionally, assuming the density hypothesis, or the stronger Lindelöf hypothesis:
One of the reasons we care about zero density theorems is that they allow one to localize the prime number theorem to short intervals. In particular, if we have the uniform bound $A(\sigma) \leq A$ for all $\sigma$, then this leads to the prime number theorem in short intervals $[x, x+x^\theta]$ holding for all $x$ if $\theta > 1 - \frac{1}{A}$, and for almost all $x$ (possibly excluding a set of density zero) if $\theta > 1 - \frac{2}{A}$. For instance, the Guth-Maynard results give a prime number theorem in almost all short intervals for $\theta$ as small as $2/15$, and the density hypothesis would lower this to just above $0$.
However, one can ask for more information on this exceptional set, in particular to bound its “dimension” $\mu(\theta)$, which roughly speaking amounts to getting an upper bound of $X^{\mu(\theta)}$ on the size of the exceptional set in any large interval $[X,2X]$. Based on the above assertions, one expects $\mu(\theta)$ to only be bounded by the trivial exponent $1$ for $\theta$ below the “almost all” threshold, to be bounded by $0$ for $\theta$ above the “all $x$” threshold, but to have some intermediate bound for the remaining exponents.
This type of question had been studied in the past, most directly by Bazzanella and Perelli, although there is earlier work by many authors on some related quantities (such as the second moment of prime gaps), including Selberg and Heath-Brown. In most of these works, the best available zero density estimates at that time were used to obtain specific bounds on quantities such as $\mu(\theta)$, but the numerology was usually tuned to those specific estimates, with the consequence being that when newer zero density estimates were discovered, one could not readily update these bounds to match. In this paper we abstract out the arguments from previous work (largely based on the explicit formula for the primes and the second moment method) to obtain an explicit relationship between $\mu(\theta)$ and $A(\sigma)$, namely that
where
Actually, by also utilizing fourth moment methods, we obtain a stronger bound
where
and $A^*(\sigma)$ is the exponent in “additive energy zero density theorems”
where $N^*(\sigma,T)$ is similar to $N(\sigma,T)$, but bounds the “additive energy” of zeroes rather than just their cardinality. Such bounds have appeared in the literature since the work of Heath-Brown, and are for instance a key ingredient in the recent work of Guth and Maynard. Here are the current best known bounds:
These explicit relationships between exponents are perfectly suited for the recently launched Analytic Number Theory Exponent Database (ANTEDB) (discussed previously here), and have been uploaded to that site.
This formula is moderately complicated (basically an elaborate variant of a Legendre transform), but easy to calculate numerically with a computer program. Here is the resulting bound on $\mu(\theta)$ unconditionally and under the density hypothesis (together with a previous bound of Bazzanella and Perelli for comparison, where the range had to be restricted due to a gap in the argument we discovered while trying to reproduce their results):
For comparison, here is the situation assuming strong conjectures such as the density hypothesis, Lindelöf hypothesis, or Riemann hypothesis:
You can classify representations of simple Lie groups using Dynkin diagrams, but you can also classify representations of ‘classical’ Lie groups using Young diagrams. Hermann Weyl wrote a whole book on this, The Classical Groups.
This approach is often treated as a bit outdated, since it doesn’t apply to all the simple Lie groups: it leaves out the so-called ‘exceptional’ groups. But what makes a group ‘classical’?
There’s no precise definition, but a classical group always has an obvious representation; you can get other representations by doing obvious things to this obvious one; and it turns out you can get all the representations this way.
For a long time I’ve been hoping to bring these ideas up to date using category theory. I had a bunch of conjectures, but I wasn’t able to prove any of them. Now Todd Trimble and I have made progress:
We tackle something even more classical than the classical groups: the monoid of $n \times n$ matrices, with matrix multiplication as its monoid operation.
The monoid of $n \times n$ matrices has an obvious $n$-dimensional representation, and you can get all its representations from this one by operations that you can apply to any representation. So its category of representations is generated by this one obvious representation, in some sense. And it’s almost freely generated: there’s just one special relation. What’s that, you ask? It’s a relation saying the obvious representation is $n$-dimensional!
That’s the basic idea. We need to make it more precise. We do it using the theory of 2-rigs, where for us a 2-rig is a symmetric monoidal linear category that is Cauchy complete. All the operations you can apply to any representation of a monoid are packed into this jargon.
Let’s write $\mathrm{M}(n,k)$ for the monoid of $n \times n$ matrices over a field $k$, and $\mathsf{Rep}(\mathrm{M}(n,k))$ for its 2-rig of representations. Then we want to say something like: $\mathsf{Rep}(\mathrm{M}(n,k))$ is the free 2-rig on an object of dimension $n$. That’s the kind of result I’ve been dreaming of.
To get this to be true, though, we need to say what kind of representations we’re talking about! Clearly we want finite-dimensional ones. But we need to be careful: we should only take finite-dimensional algebraic representations. Those are representations $\rho$ where the matrix entries of $\rho(M)$ are polynomials in the matrix entries of $M$. Otherwise, even the monoid of $1 \times 1$ matrices gets lots of 1-dimensional representations coming from automorphisms of the field $k$. Classifying those is a job for Galois theorists, not representation theorists.
So, we define $\mathsf{Rep}(\mathrm{M}(n,k))$ to be the category of algebraic representations of the monoid $\mathrm{M}(n,k)$, and we want to say $\mathsf{Rep}(\mathrm{M}(n,k))$ is the free 2-rig on an object of dimension $n$. But we need to say what it means for an object of a 2-rig to have dimension $n$.
The definition that works is to demand that the $(n+1)$st exterior power of the object $x$ should vanish: $\Lambda^{n+1}(x) \cong 0$.
But this is true for any vector space of dimension less than or equal to $n$. So in our paper we say $x$ has subdimension $n$ when this holds. (There’s another stronger condition for having dimension exactly $n$, but interestingly this is not what we want here. You’ll see why shortly.)
So here’s the theorem we prove, with all the fine print filled in:
Theorem. Suppose $k$ is a field of characteristic zero and let $\mathsf{Rep}(\mathrm{M}(n,k))$ be the 2-rig of algebraic representations of the monoid $\mathrm{M}(n,k)$. Then the representation of $\mathrm{M}(n,k)$ on $k^n$ by matrix multiplication has subdimension $n$. Moreover, $\mathsf{Rep}(\mathrm{M}(n,k))$ is the free 2-rig on an object of subdimension $n$. In other words, suppose $\mathsf{R}$ is any 2-rig containing an object $r$ of subdimension $n$. Then there is a map of 2-rigs
$$F \colon \mathsf{Rep}(\mathrm{M}(n,k)) \to \mathsf{R},$$
unique up to natural isomorphism, such that $F(k^n) \cong r$.
Or, in simple catchy terms: $\mathrm{M}(n,k)$ is the walking monoid with a representation of subdimension $n$.
To prove this theorem we need to deploy some concepts.
First, the fact that we’re talking about algebraic representations means that we’re not really treating $\mathrm{M}(n,k)$ as a bare monoid (a monoid in the category of sets). Instead, we’re treating it as a monoid in the category of affine schemes. But monoids in affine schemes are equivalent to commutative bialgebras, and this is often a more practical way of working with them.
Second, we need to use Tannaka reconstruction. This tells you how to reconstruct a commutative bialgebra from a 2-rig (which is secretly its 2-rig of representations) together with a faithful 2-rig map to the 2-rig of finite-dimensional vector spaces (which secretly sends any representation to its underlying vector space).
We want to apply this to the free 2-rig on an object of subdimension $n$. Luckily, because of this universal property it automatically gets a 2-rig map to the 2-rig of finite-dimensional vector spaces, sending the generating object to $k^n$. So we just have to show this map is faithful, apply Tannaka reconstruction, and get out the commutative bialgebra corresponding to $\mathrm{M}(n,k)$!
Well, I say ‘just’, but it takes some real work. It turns out to be useful to bring in the free 2-rig on one object. The reason is that we studied the free 2-rig on one object in two previous papers, so we know a lot about it:
We can use this knowledge if we think of the free 2-rig on an object of subdimension as a quotient of the free 2-rig on one object by a ‘2-ideal’. To do this, we need to develop the theory of ‘2-ideals’. But that’s good anyway — it will be useful for many other things.
So that’s the basic plan of the paper. It was really great working with Todd on this, taking a rough conjecture and building all the machinery necessary to make it precise and prove it.
What about representations of classical groups like $\mathrm{GL}(n,k)$, the orthogonal and symplectic groups, and so on? At the end of the paper we state a bunch of conjectures about these. Here’s the simplest one:
Conjecture. Suppose $k$ is a field of characteristic zero and let $\mathsf{Rep}(\mathrm{GL}(n,k))$ be the 2-rig of algebraic representations of $\mathrm{GL}(n,k)$. Then the representation of $\mathrm{GL}(n,k)$ on $k^n$ by matrix multiplication has dimension $n$, meaning its $n$th exterior power has an inverse with respect to tensor product. Moreover, $\mathsf{Rep}(\mathrm{GL}(n,k))$ is the free 2-rig on an object of dimension $n$.
This ‘inverse with respect to tensor product’ stuff is an abstract way of saying that the determinant representation of $\mathrm{GL}(n,k)$ has an inverse, namely the representation $g \mapsto \det(g)^{-1}$.
It will take new techniques to prove this. I look forward to people tackling this and our other conjectures. Categorified rig theory can shed new light on group representation theory, bringing Weyl’s beautiful ideas forward into the 21st century.
Every week has brought more news about actions that, either as a collateral effect or a deliberate goal, will deeply damage science and engineering research in the US. Put aside for a moment the tremendously important issue of student visas (where there seems to be a policy of strategic vagueness, to maximize the implicit threat that there may be selective actions). Put aside the statement from a Justice Department official that there is a general plan to "bring these universities to their knees", on the pretext that this is somehow about civil rights.
The detailed version of the presidential budget request for FY26 is now out (pdf here for the NSF portion). If enacted, it would be deeply damaging to science and engineering research in the US and the pipeline of trained students who support the technology sector. Taking NSF first: The topline NSF budget would be cut from $8.34B to $3.28B. Engineering would be cut by 75%, Math and Physical Sciences by 66.8%. The anticipated agency-wide success rate for grants would nominally drop below 7%, though that figure is misleading (it basically takes the present average success rate and cuts it by 2/3, while some programs are already more competitive than others). In practice, many programs already have future-year obligations, and any remaining funds will have to go there, meaning that many programs would likely have no awards at all in the coming fiscal year. The NSF's CAREER program (that agency's flagship young investigator program) would go away. This plan would also close one of the LIGO observatories (see previous link). (This would be an extra bonus level of stupid, since LIGO's ability to do science relies on having two facilities, to avoid false positives and to identify event locations in the sky. You might as well say that you'll keep an accelerator running but not the detector.) Here is the table that I think hits hardest, dollars aside:
The number of people involved in NSF activities would drop by 240,000. The graduate research fellowship program would be cut by more than half. The NSF research training grant program (another vector for grad fellowships) would be eliminated.
The situation at NIH and NASA is at least as bleak. See here for a discussion from Joshua Weitz at Maryland which includes this plot:
This proposed dismantling of US research and especially the pipeline of students who support the technology sector (including medical research, computer science, AI, the semiconductor industry, chemistry and chemical engineering, the energy industry) is astonishing in absolute terms. It also does not square with the claim of some of our elected officials and high tech CEOs to worry about US competitiveness in science and engineering. (These proposed cuts are not about fiscal responsibility; just the amount added in the proposed DOD budget dwarfs these cuts by more than a factor of 3.)
Time is a gentleman - it waits patiently. And in physics, as in all exact sciences, problems and mysteries eventually get resolved, if we give them enough time. That is how science works, after all: our consensus explanation of reality changes as we acquire more information about it.
Eliezer Yudkowsky and Nate Soares are publishing a mass-market book, the rather self-explanatorily-titled If Anyone Builds It, Everyone Dies. (Yes, the “it” means “sufficiently powerful AI.”) The book is now available for preorder from Amazon:
(If you plan to buy the book at all, Eliezer and Nate ask that you do preorder it, as this will apparently increase the chance of it making the bestseller lists and becoming part of The Discourse.)
I was graciously offered a chance to read a draft and offer, not a “review,” but some preliminary thoughts. So here they are:
For decades, Eliezer has been warning the world that an AI might soon exceed human abilities, and proceed to kill everyone on earth, in pursuit of whatever strange goal it ended up with. It would, Eliezer said, be something like what humans did to the earlier hominids. Back around 2008, I followed the lead of most of my computer science colleagues, who considered these worries, even if possible in theory, comically premature given the primitive state of AI at the time, and all the other severe crises facing the world.
Now, of course, not even two decades later, we live on a planet that’s being transformed by some of the signs and wonders that Eliezer foretold. The world’s economy is about to be upended by entities like Claude and ChatGPT, AlphaZero and AlphaFold—whose human-like or sometimes superhuman cognitive abilities, obtained “merely” by training neural networks (in the first two cases, on humanity’s collective output) and applying massive computing power, constitute (I’d say) the greatest scientific surprise of my lifetime. Notably, these entities have already displayed some of the worrying behaviors that Eliezer warned about decades ago—including lying to humans in pursuit of a goal, and hacking their own evaluation criteria. Even many of the economic and geopolitical aspects have played out as Eliezer warned they would: we’ve now seen AI companies furiously racing each other, seduced by the temptation of being (as he puts it) “the first monkey to taste the poisoned banana,” discarding their previous explicit commitments to safety, transparency, and the public good once they get in the way.
Today, then, even if one still isn’t ready to swallow the full package of Yudkowskyan beliefs, any empirically minded person ought to be updating in its direction—and acting accordingly. Which brings us to the new book by Eliezer and his collaborator Nate Soares. This book is far and away the clearest, most accessible presentation of Eliezer’s beliefs, the culmination of a quarter-century of his developing and talking about them. That undoubtedly owes a great deal to Nate, who seems to have sanded down the infamously brusque rough edges of Eliezer’s writing style. So much the better! But it also owes a lot to the world itself: current events now offer an endless supply of real-world examples for Eliezer’s formerly abstract arguments about AI, examples that the book deploys to maximum effect.
The book also mines history—everything from the Wright Brothers to World War II to the Chernobyl accident—for lessons about human attitudes toward technological progress, safety, and risk. And it maintains Eliezer’s fondness for stories and parables, one of the most charming features of his writing.
Even today, I’m not nearly as confident about the doom scenario as Eliezer and Nate are. I don’t know whether an AI’s goals are really “orthogonal” to its abilities, in the sense that will matter in practice. And when I reach the part where the AI, having copied itself all over the Internet and built robot factories, then invents and releases self-replicating nanotechnology that gobbles the surface of the earth in hours or days, a large part of me still screams out that there must be practical bottlenecks that haven’t been entirely accounted for here.
And yet, even if you agree with only a quarter of what Eliezer and Nate write, you’re likely to close this book fully convinced—as I am—that governments need to shift to a more cautious approach to AI, an approach more respectful of the civilization-changing enormity of what’s being created. And that, if they won’t, their citizens need to pressure them to do so.
So regardless of how much they agree or disagree, I’d like everyone on earth who cares about the future to read this book, debate its ideas, and have its thesis in mind when they’re discussing AI.
As for me? It would’ve been better if I’d reached my current position earlier: if I hadn’t needed empirical reality, plus superb writing like Eliezer’s and Nate’s, to bonk me over the head with the risks that AI was likely to pose to humanity in my lifetime. But having failed to see as far ahead as they did, the least I can do is update. You should too, and you can start by reading the book.
As it happens, this weekend I’ll be at LessOnline, the rationalist blogging conference in Berkeley, where (among my other events) I’ll engage in a dialogue/debate with Nate Soares about the orthogonality thesis, one of the crucial underpinnings of his and Eliezer’s case for AI doom. So, I’ll probably be LessAvailable to respond to comments on this post. But feel free to discuss anyway! After all, it’s merely the fate of all Earth-originating life that’s at stake here, not some actually hot-button topic like Trump or Gaza.
I had a piece in Scientific American last week. It’s paywalled, but if you’re a subscriber there you can see it, or you can buy the print magazine.
(I also had two pieces out in other outlets this week. I’ll be saying more about them…in a couple weeks.)
The Scientific American piece is about an apocalyptic particle physics scenario called vacuum decay, a topic I covered last year in Quanta Magazine. It’s an unlikely event in which the Higgs field, which gives fundamental particles their mass, changes value, suddenly making all other particles much more massive and changing physics as we know it. It’s a change that physicists think would start as a small bubble and spread at (almost) the speed of light, covering the universe.
What I wrote for Quanta was a short news piece covering a small adjustment to the calculation, one that made the chance of vacuum decay slightly more likely. (But still mind-bogglingly small, to be clear.)
Scientific American asked for a longer piece, and that gave me space to dig deeper. I was able to say more about how vacuum decay works, with a few metaphors that I think should make it a lot easier to understand. I also got to learn about some new developments, in particular, an interesting story about how tiny primordial black holes could make vacuum decay dramatically more likely.
One thing that was a bit too complicated to talk about was the set of puzzles involved in trying to calculate these chances. In the article, I mention a calculation of the chance of vacuum decay by a team including Matthew Schwartz. That calculation wasn’t the first to estimate the chance of vacuum decay, and it’s not the most recent update either. Instead, I picked it because Schwartz’s team approached the question in what struck me as a more reliable way, trying to cut through confusion by asking the most basic question you can in a quantum theory: given that now you observe X, what’s the chance that later you observe Y? Figuring out how to turn vacuum decay into that kind of question correctly is tricky (for example, you need to include the possibility that vacuum decay happens, then reverses, then happens again).
The calculations of how black holes speed things up didn’t work things out in quite as much detail. I like to think I’ve made a small contribution by motivating those researchers to look at Schwartz’s work, which might spawn a more rigorous calculation in future. When I talked to Schwartz, he wasn’t even sure whether the picture of a bubble forming in one place and spreading at light speed is correct: he’d calculated the chance of the initial decay, but hadn’t found a similarly rigorous way to think about the aftermath. So even more than the uncertainty I talk about in the piece, the questions about new physics and probability, there is even some doubt about whether the whole picture really works the way we’ve been imagining it.
That makes for a murky topic! But it’s also a flashy one, a compelling story for science fiction and the public imagination, and yeah, another motivation to get high-precision measurements of the Higgs and top quark from future colliders! (If maybe not quite the way this guy said it.)
Grant Sanderson (who runs, and creates most of the content for, the website and Youtube channel 3blue1brown) has been collaborating with myself and others (including my coauthor Tanya Klowden) on producing a two-part video giving an account of some of the history of the cosmic distance ladder, building upon a previous public lecture I gave on this topic, and also relating to a forthcoming popular book with Tanya on the subject. The first part of this video is available here; the second part is available here.
The videos were based on a somewhat unscripted interview that Grant conducted with me some months ago, and as such contained some minor inaccuracies and omissions (including some made for editing reasons to keep the overall narrative coherent and within a reasonable length). They also generated many good questions from the viewers of the Youtube video. I am therefore compiling here a “FAQ” of various clarifications and corrections to the videos; this was originally placed as a series of comments on the Youtube channel, but the blog post format here will be easier to maintain going forward. Some related content will also be posted on the Instagram page for the forthcoming book with Tanya.
Questions on the two main videos are marked with the appropriate timestamp in the video.
4:26 Did Eratosthenes really check a local well in Alexandria?
This was a narrative embellishment on my part. Eratosthenes’s original work is lost to us. The most detailed contemporaneous account, by Cleomedes, gives a simplified version of the method, and makes reference only to sundials (gnomons) rather than wells. However, a secondary account by Pliny states (using this English translation), “Similarly it is reported that at the town of Syene, 5000 stades South of Alexandria, at noon in midsummer no shadow is cast, and that in a well made for the sake of testing this the light reaches to the bottom, clearly showing that the sun is vertically above that place at the time”. In any case, no mention is made of any well in Alexandria in either account.
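For reference, here is the textbook version of Eratosthenes’ computation (my own summary rather than a quote from the video, using the traditionally reported shadow angle at Alexandria of about $7.2^\circ$, i.e. one fiftieth of a full circle):

$$ C \approx \frac{360^\circ}{7.2^\circ} \times 5000\ \text{stades} = 50 \times 5000\ \text{stades} = 250{,}000\ \text{stades}. $$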
4:50 How did Eratosthenes know that the Sun was so far away that its light rays were close to parallel?
This was not made so clear in our discussions or in the video (other than a brief glimpse of the timeline at 18:27), but Eratosthenes’s work actually came after Aristarchus, so it is very likely that Eratosthenes was aware of Aristarchus’s conclusions about how distant the Sun was from the Earth. Even if Aristarchus’s heliocentric model was disputed by the other Greeks, at least some of his other conclusions appear to have attracted some support. Also, after Eratosthenes’s time, there was further work by Greek, Indian, and Islamic astronomers (such as Hipparchus, Ptolemy, Aryabhata, and Al-Battani) to measure the same distances that Aristarchus did, although these subsequent measurements for the Sun also were somewhat far from modern accepted values.
5:17 Is it completely accurate to say that on the summer solstice, the Earth’s axis of rotation is tilted “directly towards the Sun”?
Strictly speaking, “in the direction towards the Sun” is more accurate than “directly towards the Sun”; it tilts at about 23.5 degrees towards the Sun, but it is not a total 90-degree tilt towards the Sun.
5:39 Wait, aren’t there two tropics? The tropic of Cancer and the tropic of Capricorn?
Yes! This corresponds to the two summers Earth experiences, one in the Northern hemisphere and one in the Southern hemisphere. The tropic of Cancer, at a latitude of about 23 degrees north, is where the Sun is directly overhead at noon during the Northern summer solstice (around June 21); the tropic of Capricorn, at a latitude of about 23 degrees south, is where the Sun is directly overhead at noon during the Southern summer solstice (around December 21). But Alexandria and Syene were both in the Northern Hemisphere, so it is the tropic of Cancer that is relevant to Eratosthenes’ calculations.
5:41 Isn’t it kind of a massive coincidence that Syene was on the tropic of Cancer?
Actually, Syene (now known as Aswan) was about half a degree of latitude away from the tropic of Cancer, which was one of the sources of inaccuracy in Eratosthenes’ calculations. But one should take the “look-elsewhere effect” into account: because the Nile cuts across the tropic of Cancer, it was quite likely to happen that the Nile would intersect the tropic near some inhabited town. It might not necessarily have been Syene, but that would just mean that Syene would have been substituted by this other town in Eratosthenes’s account.
On the other hand, it was fortunate that the Nile ran from South to North, so that distances between towns were a good proxy for the differences in latitude. Apparently, Eratosthenes actually had a more complicated argument that would also work if the two towns in question were not necessarily oriented along the North-South direction, and if neither town was on the tropic of Cancer; but unfortunately the original writings of Eratosthenes are lost to us, and we do not know the details of this more general argument. (But some variants of the method can be found in later work of Posidonius, Aryabhata, and others.)
Nowadays, the “Eratosthenes experiment” is run every year on the March equinox, in which schools at the same longitude are paired up to measure the elevation of the Sun at the same point in time, in order to obtain a measurement of the circumference of the Earth. (The equinox is more convenient than the solstice when neither location is on a tropic, due to the simple motion of the Sun at that date.) With modern timekeeping, communications, surveying, and navigation, this is a far easier task to accomplish today than it was in Eratosthenes’ time.
6:30 I thought the Earth wasn’t a perfect sphere. Does this affect this calculation?
Yes, but only by a small amount. The centrifugal forces caused by the Earth’s rotation along its axis cause an equatorial bulge and a polar flattening so that the radius of the Earth fluctuates by about 20 kilometers from pole to equator. This sounds like a lot, but it is only about 0.3% of the mean Earth radius of 6371 km and is not the primary source of error in Eratosthenes’ calculations.
7:27 Are the riverboat merchants and the “grad student” the leading theories for how Eratosthenes measured the distance from Alexandria to Syene?
There is some recent research that suggests that Eratosthenes may have drawn on the work of professional bematists (step measurers – a precursor to the modern profession of surveyor) for this calculation. This somewhat ruins the “grad student” joke, but perhaps should be disclosed for the sake of completeness.
8:51 How long is a “lunar month” in this context? Is it really 28 days?
In this context the correct notion of a lunar month is a “synodic month” – the length of a lunar cycle relative to the Sun – which is actually about 29 days and 12 hours. It differs from the “sidereal month” – the length of a lunar cycle relative to the fixed stars – which is about 27 days and 8 hours – due to the motion of the Earth around the Sun (or the Sun around the Earth, in the geocentric model). [A similar correction needs to be made around 14:59, using the synodic month of 29 days and 12 hours rather than the “English lunar month” of 28 days (4 weeks).]
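For those who want the arithmetic behind this correction (my own addition, not from the video), the synodic and sidereal months are related by subtracting off the rate at which the Sun appears to move around the sky:

$$ \frac{1}{T_{\mathrm{syn}}} = \frac{1}{T_{\mathrm{sid}}} - \frac{1}{T_{\mathrm{year}}} \approx \frac{1}{27.32\ \text{days}} - \frac{1}{365.25\ \text{days}} \approx \frac{1}{29.53\ \text{days}}, $$

recovering the 29 days and 12 hours mentioned above.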
10:47 Is the time taken for the Moon to complete an observed rotation around the Earth slightly less than 24 hours as claimed?
Actually, I made a sign error: the lunar day (also known as a tidal day) is actually 24 hours and 50 minutes, because the Moon orbits the Earth in the same direction as the Earth spins around its axis. The animation therefore is also moving in the wrong direction (related to this, the line of sight is covering up the Moon in the wrong direction relative to the Moon rising at around 10:38).
11:32 Is this really just a coincidence that the Moon and Sun have almost the same angular width?
I believe so. First of all, the agreement is not that good: due to the non-circular nature of the orbit of the Moon around the Earth, and of the Earth around the Sun, the angular width of the Moon actually fluctuates to be as much as 10% larger or smaller than the Sun at various times (cf. the “supermoon” phenomenon). All other planets with known moons do not exhibit this sort of agreement, so there does not appear to be any universal law of nature that would enforce this coincidence. (This is in contrast with the empirical fact that the Moon always presents the same side to the Earth, which also occurs for all other known large moons (as well as Pluto), and is well explained by the physical phenomenon of tidal locking.)
On the other hand, as the video hopefully demonstrates, the existence of the Moon was extremely helpful in allowing the ancients to understand the basic nature of the solar system. Without the Moon, their task would have been significantly more difficult; but in this hypothetical alternate universe, it is likely that modern cosmology would have still become possible once advanced technology such as telescopes, spaceflight, and computers became available, especially when combined with the modern mathematics of data science. Without giving away too many spoilers, a scenario similar to this was explored in the classic short story and novel “Nightfall” by Isaac Asimov.
12:58 Isn’t the illuminated portion of the Moon, as well as the visible portion of the Moon, slightly smaller than half of the entire Moon, because the Earth and Sun are not an infinite distance away from the Moon?
Technically yes (and this is actually for a very similar reason to why half Moons don’t quite occur halfway between the new Moon and the full Moon); but this fact turns out to have only a very small effect on the calculations, and is not the major source of error. In reality, the Sun turns out to be about 86,000 Moon radii away from the Moon, so asserting that half of the Moon is illuminated by the Sun is actually a very good first approximation. (The Earth is “only” about 220 Moon radii away, so the visible portion of the Moon is a bit more noticeably less than half; but this doesn’t actually affect Aristarchus’s arguments much.)
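To quantify “a bit more noticeably less than half” (a back-of-the-envelope estimate of my own, not from the video): an observer at distance $D$ from the centre of a sphere of radius $r$ sees a spherical cap covering a fraction $\frac{1}{2}(1 - r/D)$ of its surface, so

$$ \frac{1}{2}\left(1 - \frac{1}{220}\right) \approx 49.8\% \ \text{(portion visible from Earth)}, \qquad \frac{1}{2}\left(1 - \frac{1}{86{,}000}\right) \approx 49.9994\% \ \text{(portion illuminated by the Sun)}. $$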
The angular diameter of the Sun also creates an additional thin band between the fully illuminated and fully non-illuminated portions of the Moon, in which the Sun is intersecting the lunar horizon and so only illuminates the Moon with a portion of its light, but this is also a relatively minor effect (and the midpoints of this band can still be used to define the terminator between illuminated and non-illuminated for the purposes of Aristarchus’s arguments).
13:27 What is the difference between a half Moon and a quarter Moon?
If one divides the lunar month, starting and ending at a new Moon, into quarters (weeks), then half moons occur both near the end of the first quarter (a week after the new Moon, and a week before the full Moon), and near the end of the third quarter (a week after the full Moon, and a week before the new Moon). So, somewhat confusingly, half Moons come in two types, known as “first quarter Moons” and “third quarter Moons”.
14:49 I thought the sine function was introduced well after the ancient Greeks.
It’s true that the modern sine function only dates back to the Indian and Islamic mathematical traditions in the first millennium CE, several centuries after Aristarchus. However, he still had Euclidean geometry at his disposal, which provided tools such as similar triangles that could be used to reach basically the same conclusions, albeit with significantly more effort than would be needed if one could use modern trigonometry.
On the other hand, Aristarchus was somewhat hampered by not knowing an accurate value for $\pi$, which is also known as Archimedes’ constant: the fundamental work of Archimedes on this constant actually took place a few decades after that of Aristarchus!
15:17 I plugged in the modern values for the distances to the Sun and Moon and got 18 minutes for the discrepancy, instead of half an hour.
Yes; I quoted the wrong number here. In 1630, Godfried Wendelen replicated Aristarchus’s experiment. With improved timekeeping and the then-recent invention of the telescope, Wendelen obtained a measurement of half an hour for the discrepancy, which is significantly better than Aristarchus’s calculation of six hours, but still a little bit off from the true value of 18 minutes. (As such, Wendelen’s estimate for the distance to the Sun was 60% of the true value.)
15:27 Wouldn’t Aristarchus also have access to other timekeeping devices than sundials?
Yes, for instance clepsydrae (water clocks) were available by that time; but they were of limited accuracy. It is also possible that Aristarchus could have used measurements of star elevations to estimate time; it is not clear whether the astrolabe or the armillary sphere was available to him, but he would have had some other more primitive astronomical instruments such as the dioptra at his disposal. But again, the accuracy and calibration of these timekeeping tools would have been poor.
However, most likely the more important limiting factor was the ability to determine the precise moment at which a perfect half Moon (or new Moon, or full Moon) occurs; this is extremely difficult to do with the naked eye. (The telescope would not be invented for almost two more millennia.)
17:37 Could the parallax problem be solved by assuming that the stars are not distributed in a three-dimensional space, but instead on a celestial sphere?
Putting all the stars on a fixed sphere would make the parallax effects less visible, as the stars in a given portion of the sky would now all move together at the same apparent velocity – but there would still be visible large-scale distortions in the shape of the constellations because the Earth would be closer to some portions of the celestial sphere than others; there would also be variability in the brightness of the stars, and (if they were very close) the apparent angular diameter of the stars. (These problems would be solved if the celestial sphere was somehow centered around the moving Earth rather than the fixed Sun, but then this basically becomes the geocentric model with extra steps.)
18:29 Did nothing of note happen in astronomy between Eratosthenes and Copernicus?
Not at all! There were significant mathematical, technological, theoretical, and observational advances by astronomers from many cultures (Greek, Islamic, Indian, Chinese, European, and others) during this time, for instance improving some of the previous measurements on the distance ladder, a better understanding of eclipses, axial tilt, and even axial precession, more sophisticated trigonometry, and the development of new astronomical tools such as the astrolabe. See for instance this “deleted scene” from the video, as well as the FAQ entry for 14:49 for this video and 24:54 for the second video, or this instagram post. But in order to make the overall story of the cosmic distance ladder fit into a two-part video, we chose to focus primarily on the first time each rung of the ladder was climbed.
Is the portrait shown here really of Kepler?
We have since learned that this portrait was most likely painted in the 19th century, and may have been based more on Kepler’s mentor, Michael Mästlin. A more commonly accepted portrait of Kepler may be found at his current Wikipedia page.
19:07 Isn’t it tautological to say that the Earth takes one year to perform a full orbit around the Sun?
Technically yes, but this is an illustration of the philosophical concept of “referential opacity“: the content of a sentence can change when substituting one term for another (e.g., “1 year” and “365 days”), even when both terms refer to the same object. Amusingly, the classic illustration of this, known as Frege’s puzzles, also comes from astronomy: it is an informative statement that Hesperus (the evening star) and Phosphorus (the morning star, also known as Lucifer) are the same object (which nowadays we call Venus), but it is a mere tautology that Hesperus and Hesperus are the same object: changing the reference from Phosphorus to Hesperus changes the meaning.
19:10 How did Copernicus figure out the crucial fact that Mars takes 687 days to go around the Sun? Was it directly drawn from Babylonian data?
Technically, Copernicus drew from tables by European astronomers that were largely based on earlier tables from the Islamic golden age, which in turn drew from earlier tables by Indian and Greek astronomers, the latter of which also incorporated data from the ancient Babylonians, so it is more accurate to say that Copernicus relied on centuries of data, at least some of which went all the way back to the Babylonians. Among all of this data was the times when Mars was in opposition to the Sun; if one imagines the Earth and Mars as being like runners going around a race track circling the Sun, with Earth on an inner track and Mars on an outer track, oppositions are analogous to when the Earth runner “laps” the Mars runner. From the centuries of observational data, such “laps” were known to occur about once every 780 days (this is known as the synodic period of Mars). Because the Earth takes roughly 365 days to perform a “lap”, it is possible to do a little math and conclude that Mars must therefore complete its own “lap” in 687 days (this is known as the sidereal period of Mars). (See also this post on the cosmic distance ladder Instagram for some further elaboration.)
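The “little math” alluded to here is the standard synodic-period relation (my own rendering, not a quote from the video):

$$ \frac{1}{T_{\mathrm{Mars}}} = \frac{1}{T_{\mathrm{Earth}}} - \frac{1}{T_{\mathrm{syn}}} \approx \frac{1}{365\ \text{days}} - \frac{1}{780\ \text{days}} \approx \frac{1}{687\ \text{days}}. $$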
Did Kepler really steal Brahe’s data?
The situation is complex. When Kepler served as Brahe’s assistant, Brahe only provided Kepler with a limited amount of data, primarily involving Mars, in order to confirm Brahe’s own geo-heliocentric model. After Brahe’s death, the data was inherited by Brahe’s son-in-law and other relatives, who intended to publish Brahe’s work separately; however, Kepler, who was appointed as Imperial Mathematician to succeed Brahe, had at least some partial access to the data, and many historians believe he secretly copied portions of this data to aid his own research before finally securing complete access to the data from Brahe’s heirs after several years of disputes. On the other hand, as intellectual property rights laws were not well developed at this time, Kepler’s actions were technically legal, if ethically questionable.
21:39 What is that funny loop in the orbit of Mars?
This is known as retrograde motion. This arises because the orbital velocity of Earth (about 30 km/sec) is a little bit larger than that of Mars (about 24 km/sec). So, in opposition (when Mars is in the opposite position in the sky from the Sun), Earth will briefly overtake Mars, causing its observed position to move westward rather than eastward. But at most other times, the motions of Earth and Mars are at a sufficient angle that Mars will continue its apparent eastward motion despite the slightly faster speed of the Earth.
21:59 Couldn’t one also work out the direction to other celestial objects in addition to the Sun and Mars, such as the stars, the Moon, or the other planets? Would that have helped?
Actually, the directions to the fixed stars were implicitly used in all of these observations to determine how the celestial sphere was positioned, and all the other directions were taken relative to that celestial sphere. (Otherwise, all the calculations would be taken on a rotating frame of reference in which the unknown orbits of the planets were themselves rotating, which would have been an even more complex task.) But the stars are too far away to be useful as one of the two landmarks to triangulate from, as they generate almost no parallax and so cannot distinguish one location from another.
Measuring the direction to the Moon would tell you which portion of the lunar cycle one was in, and would determine the phase of the Moon, but this information would not help one triangulate, because the Moon’s position in the heliocentric model varies over time in a somewhat complicated fashion, and is too tied to the motion of the Earth to be a useful “landmark” for determining the Earth’s orbit around the Sun.
In principle, using the measurements to all the planets at once could allow for some multidimensional analysis that would be more accurate than analyzing each of the planets separately, but this would require some sophisticated statistical analysis and modeling, as well as non-trivial amounts of compute – neither of which were available in Kepler’s time.
22:57 Can you elaborate on how we know that the planets all move on a plane?
The Earth’s orbit lies in a plane known as the ecliptic (it is where the lunar and solar eclipses occur). Different cultures have divided up the ecliptic in various ways; in Western astrology, for instance, the twelve main constellations that cross the ecliptic are known as the Zodiac. The planets can be observed to only wander along the Zodiac, but not other constellations: for instance, Mars can be observed to be in Cancer or Libra, but never in Orion or Ursa Major. From this, one can conclude (as a first approximation, at least), that the planets all lie on the ecliptic.
However, this isn’t perfectly true, and the planets will deviate from the ecliptic by a small angle known as the ecliptic latitude. Tycho Brahe’s observations on these latitudes for Mars were an additional useful piece of data that helped Kepler complete his calculations (basically by suggesting how to join together the different “jigsaw pieces”), but the math here gets somewhat complicated, so the story here has been somewhat simplified to convey the main ideas.
23:04 What are the other universal problem solving tips?
Grant Sanderson has a list (in a somewhat different order) in this previous video.
23:28 Can one work out the position of Earth from fixed locations of the Sun and Mars when the Sun and Mars are in conjunction (the same location in the sky) or opposition (opposite locations in the sky)?
Technically, these are two times when the technique of triangulation fails to be accurate; and also in the former case it is extremely difficult to observe Mars due to the proximity to the Sun. But again, following the Universal Problem Solving Tip from 23:07, one should initially ignore these difficulties to locate a viable method, and correct for these issues later. This video series by Welch Labs goes into Kepler’s methods in more detail.
24:04 So Kepler used Copernicus’s calculation of 687 days for the period of Mars. But didn’t Kepler discard Copernicus’s theory of circular orbits?
Good question! It turns out that Copernicus’s calculations of orbital periods are quite robust (especially with centuries of data), and continue to work even when the orbits are not perfectly circular. But even if the calculations did depend on the circular orbit hypothesis, it would have been possible to use the Copernican model as a first approximation for the period, in order to get a better, but still approximate, description of the orbits of the planets. This in turn can be fed back into the Copernican calculations to give a second approximation to the period, which can then give a further refinement of the orbits. Thanks to the branch of mathematics known as perturbation theory, one can often make this type of iterative process converge to an exact answer, with the error in each successive approximation being smaller than the previous one. (But performing such an iteration would probably have been beyond the computational resources available in Kepler’s time; also, the foundations of perturbation theory require calculus, which only was developed several decades after Kepler.)
24:21 Did Brahe have exactly 10 years of data on Mars’s positions?
Actually, it was more like 17 years, but with many gaps, due both to inclement weather, as well as Brahe turning his attention to other astronomical objects than Mars in some years; also, in times of conjunction, Mars might only be visible in the daytime sky instead of the night sky, again complicating measurements. So the “jigsaw puzzle pieces” in 25:26 are in fact more complicated than always just five locations equally spaced in time; there are gaps and also observational errors to grapple with. But to understand the method one should ignore these complications; again, see “Universal Problem Solving Tip #1”. Even with his “idea of true genius”, it took many years of further painstaking calculation for Kepler to tease out his laws of planetary motion from Brahe’s messy and incomplete observational data.
26:44 Shouldn’t the Earth’s orbit be spread out at perihelion and clustered closer together at aphelion, to be consistent with Kepler’s laws?
Yes, you are right; there was a coding error here.
26:53 What is the reference for Einstein’s “idea of pure genius”?
Actually, the precise quote was “an idea of true genius”, and can be found in the introduction to Carola Baumgardt’s “Life of Kepler“.
Was Al-Biruni an Arab astronomer?
Strictly speaking, no: his writings are all in Arabic, and he was nominally a subject of the Abbasid Caliphate whose rulers were Arab; but he was born in Khwarazm (in modern day Uzbekistan), and would have been a subject of either the Samanid empire or the Khwarazmian empire, both of which were largely self-governed and primarily Persian in culture and ethnic makeup, despite being technically vassals of the Caliphate. So he would have been part of what is sometimes called “Greater Persia” or “Greater Iran”.
Another minor correction: while Al-Biruni was born in the tenth century, his work on the measurement of the Earth was published in the early eleventh century.
Is $\theta$ really called the angle of declination?
This was a misnomer on my part; this angle is more commonly called the dip angle.
But the height of the mountain would be so small compared to the radius of the Earth! How could this method work?
Using the Taylor approximation $\cos \theta \approx 1 - \theta^2/2$, one can approximately write the relationship between the mountain height $h$, the Earth radius $R$, and the dip angle $\theta$ (in radians) as $h \approx \frac{1}{2} R \theta^2$. The key point here is the inverse quadratic dependence of $R/h$ on $\theta$, which allows even relatively small values of $\theta$ to still be realistically useful for computing $R$. Al-Biruni’s measurement of the dip angle was about $0.01$ radians, leading to an estimate of $R$ that is about four orders of magnitude larger than $h$, which is at least within the ballpark of the ratio between a typical height of a mountain (on the order of a kilometer) and the radius of the Earth (6400 kilometers).
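For the record, here is where that relation comes from (a one-line derivation of my own, with illustrative numbers): from the top of a mountain of height $h$, the horizon is seen at a dip angle $\theta$ satisfying $\cos \theta = \frac{R}{R+h}$, so

$$ 1 - \frac{\theta^2}{2} \approx \frac{R}{R+h} \approx 1 - \frac{h}{R} \quad \Longrightarrow \quad R \approx \frac{2h}{\theta^2}; $$

plugging in an illustrative $h \approx 1$ km and $\theta \approx 0.01$ radians gives $R \approx 20{,}000$ km, the right ballpark for the true value of about 6400 km.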
Was the method really accurate to within a percentage point?
This is disputed, somewhat similarly to the previous calculations of Eratosthenes. Al-Biruni’s measurements were in cubits, but there were multiple incompatible types of cubit in use at the time. It has also been pointed out that atmospheric refraction effects would have created noticeable changes in the observed dip angle $\theta$. It is thus likely that the true accuracy of Al-Biruni’s method was poorer than 1%, but that this was somehow compensated for by choosing a favorable conversion between cubits and modern units.
1:13 Did Captain Cook set out to discover Australia?
One of the objectives of Cook’s first voyage was to discover the hypothetical continent of Terra Australis. This was considered to be distinct from Australia, which at the time was known as New Holland. As this name might suggest, prior to Cook’s voyage, the northwest coastline of New Holland had been explored by the Dutch; Cook instead explored the eastern coastline, naming this portion New South Wales. The entire continent was later renamed to Australia by the British government, following a suggestion of Matthew Flinders; and the concept of Terra Australis was abandoned.
4:40 The relative position of the Northern and Southern hemisphere observations is reversed from those earlier in the video.
Yes, this was a slight error in the animation; the labels here should be swapped for consistency of orientation.
7:06 So, when did they finally manage to measure the transit of Venus, and use this to compute the astronomical unit?
While Le Gentil had the misfortune to not be able to measure either the 1761 or 1769 transits, other expeditions of astronomers (led by Mason and Dixon, Chappe d’Auteroche, and Cook) did take measurements of one or both of these transits with varying degrees of success, with the measurements of Cook’s team of the 1769 transit in Tahiti being of particularly high quality. All of this data was assembled later by Lalande in 1771, leading to the most accurate measurement of the astronomical unit at the time (within 2.3% of modern values, which was about three times more accurate than any previous measurement).
8:53 – What does it mean for the transit of Io to be “twenty minutes ahead of schedule” when Jupiter is in opposition (Jupiter is opposite to the Sun when viewed from the Earth)?
Actually, it should be halved to “ten minutes ahead of schedule”, with the transit being “ten minutes behind schedule” when Jupiter is in conjunction, with the net discrepancy being twenty minutes (or actually closer to 16 minutes when measured with modern technology). Both transits are being compared against an idealized periodic schedule in which the transits occur at a perfectly regular rate (about 42 hours), where the period is chosen to be the best fit to the actual data. This discrepancy is only noticeable after carefully comparing transit times over a period of months; at any given position of Jupiter, the Doppler effects of Earth moving towards or away from Jupiter would only shift each transit by a few seconds compared to the previous transit, with the delays or accelerations only becoming cumulatively noticeable after many such transits.
Also, the presentation here is oversimplified: at times of conjunction, Jupiter and Io are too close to the Sun for observation of the transit. Rømer actually observed the transits at other times than conjunction, and Huygens used more complicated trigonometry than what was presented here to infer a measurement for the speed of light in terms of the astronomical unit (which they had begun to measure a bit more accurately than in Aristarchus’s time; see the FAQ entry for 15:17 in the first video).
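As a back-of-the-envelope check with modern values (numbers that were of course not available to Rømer and Huygens), the roughly 16-minute cumulative swing corresponds to light crossing the diameter of the Earth’s orbit, about 2 astronomical units:

```python
# Rough speed-of-light estimate from the Io transit discrepancy, using modern values.
AU_km = 1.496e8           # astronomical unit in km
delay_s = 16.6 * 60       # ~16.6-minute swing between opposition and conjunction, in seconds

c_estimate = 2 * AU_km / delay_s   # light crosses ~2 AU in that time
print(c_estimate)                  # ~3.0e5 km/s, close to the modern value
```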
10:05 – Are the astronomical symbols for Earth and Venus swapped here?
Yes, this was a small mistake in the animation.
10:34 – Shouldn’t one have to account for the elliptical orbit of the Earth, as well as the proper motion of the star being observed, or the effects of general relativity?
Yes; the presentation given here is a simplified one to convey the idea of the method, but in the most advanced parallax measurements, such as the ones taken by the Hipparcos and Gaia spacecraft, these factors are taken into account, basically by taking as many measurements (not just two) as possible of a single star, and locating the best fit of that data to a multi-parameter model that incorporates the (known) orbit of the Earth with the (unknown) distance and motion of the star, as well as additional gravitational effects from other celestial bodies, such as the Sun and other planets.
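Here is a minimal toy sketch of the kind of fit being described, in one dimension and with a crudely simplified parallax factor; real pipelines such as Gaia’s use far more elaborate astrometric models, and all the numbers below are made up for illustration:

```python
import numpy as np

# Toy 1-D astrometric fit: position = offset + proper_motion * t + parallax * f(t),
# where f(t) is a (grossly simplified) annual parallax factor.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 5.0, 60)              # observation times in years
f = np.sin(2 * np.pi * t)                  # simplified parallax factor

true_offset, true_pm, true_parallax = 10.0, 2.5, 0.8   # milliarcseconds (made up)
obs = true_offset + true_pm * t + true_parallax * f
obs += rng.normal(scale=0.05, size=t.size)             # measurement noise

# Linear least-squares fit of the three parameters from the simulated data.
A = np.column_stack([np.ones_like(t), t, f])
(offset, pm, parallax), *_ = np.linalg.lstsq(A, obs, rcond=None)
print(parallax)   # recovers ~0.8; distance in parsecs is roughly 1000 / parallax_in_mas
```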
14:53 – The formula I was taught for apparent magnitude of stars looks a bit different from the one here.
This is because astronomers use a logarithmic scale to measure both apparent magnitude and absolute magnitude. If one takes the logarithm of the inverse square law in the video, and performs the normalizations used by astronomers to define magnitude, one arrives at the standard relation between absolute and apparent magnitude.
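For concreteness, in the usual convention that standard relation reads

$m - M = 5 \log_{10}(d / 10\,\mathrm{pc})$, or equivalently $d = 10^{(m - M + 5)/5}\,\mathrm{pc}$,

where m is the apparent magnitude, M the absolute magnitude (defined as the apparent magnitude the star would have at a distance of 10 parsecs), and d the distance; this is just the logarithm of the inverse square law with the astronomers’ normalization applied.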
But this is an oversimplification, most notably due to neglect of extinction effects caused by interstellar dust. This is not a major issue for the relatively short distances observable via parallax, but causes problems at larger scales of the ladder (see for instance the FAQ entry here for 18:08). To compensate for this, one can work in multiple frequencies of the spectrum (visible, x-ray, radio, etc.), as some frequencies are less susceptible to extinction than others. From the discrepancies between these frequencies one can infer the amount of extinction, leading to “dust maps” that can then be used to facilitate such corrections for subsequent measurements in the same area of the universe. (More generally, the trend in modern astronomy is towards “multi-messenger astronomy”, in which one combines together very different types of measurements of the same object to obtain a more accurate understanding of that object and its surroundings.)
18:08 – Can we really measure the entire Milky Way with this method?
Strictly speaking, there is a “zone of avoidance” on the far side of the Milky Way that is very difficult to measure in the visible portion of the spectrum, due to the large amounts of intervening stars and dust, and even a supermassive black hole at the galactic center. However, in recent years it has become possible to explore this zone to some extent using the radio, infrared, and x-ray portions of the spectrum, which are less affected by these factors.
18:19 – How did astronomers know that the Milky Way was only a small portion of the entire universe?
This issue was the topic of the “Great Debate” in the early twentieth century. It was only with the work of Hubble, using Leavitt’s law to measure distances to the Magellanic Clouds and “spiral nebulae” (which we now know to be other galaxies), building on earlier work of Leavitt and Hertzsprung, that it was conclusively established that these clouds and nebulae were in fact at much greater distances than the diameter of the Milky Way.
18:45 – How can one compensate for light blending effects when measuring the apparent magnitude of Cepheids?
This is a non-trivial task, especially if one demands a high level of accuracy. Using the highest resolution telescopes available (such as HST or JWST) is of course helpful, as is switching to other frequencies, such as near-infrared, where Cepheids are even brighter relative to nearby non-Cepheid stars. One can also apply sophisticated statistical methods to fit to models of the point spread of light from unwanted sources, and use nearby measurements of the same galaxy without the Cepheid as a reference to help calibrate those models. Improving the accuracy of the Cepheid portion of the distance ladder is an ongoing research activity in modern astronomy.
18:54 – What is the mechanism that causes Cepheids to oscillate?
For most stars, there is an equilibrium size: if the star’s radius collapses, then the reduced potential energy is converted to heat, creating pressure that pushes the star outward again; and conversely, if the star expands, then it cools, causing a reduction in pressure that no longer counteracts gravitational forces. But for Cepheids, there is an additional mechanism called the kappa mechanism: the increased temperature caused by contraction increases ionization of helium, which drains energy from the star and accelerates the contraction; conversely, the cooling caused by expansion causes the ionized helium to recombine, with the energy released accelerating the expansion. If the parameters of the Cepheid lie in a certain “instability strip”, then the interaction of the kappa mechanism with the other mechanisms of stellar dynamics creates a periodic oscillation in the Cepheid’s radius, whose period increases with the mass and brightness of the Cepheid.
For a recent re-analysis of Leavitt’s original Cepheid data, see this paper.
19:10 – Did Leavitt mainly study the Cepheids in our own galaxy?
This was an inaccuracy in the presentation. Leavitt’s original breakthrough paper studied Cepheids in the Small Magellanic Cloud. At the time, the distance to this cloud was not known; indeed, it was a matter of debate whether this cloud was in the Milky Way, or some distance away from it. However, Leavitt (correctly) assumed that all the Cepheids in this cloud were roughly the same distance away from our solar system, so that the apparent brightness was proportional to the absolute brightness. This gave an uncalibrated form of Leavitt’s law between absolute brightness and period, subject to the (then unknown) distance to the Small Magellanic Cloud. After Leavitt’s work, there were several efforts (by Hertzsprung, Russell, and Shapley) to calibrate the law by using the few Cepheids for which other distance methods were available, such as parallax. (Main sequence fitting to the Hertzsprung-Russell diagram was not directly usable, as Cepheids did not lie on the main sequence; but in some cases one could indirectly use this method if the Cepheid was in the same stellar cluster as a main sequence star.) Once the law was calibrated, it could be used to measure distances to other Cepheids, and in particular to compute distances to extragalactic objects such as the Magellanic clouds.
19:15 – Was Leavitt’s law really a linear law between period and luminosity?
Strictly speaking, the period-luminosity relation commonly known as Leavitt’s law was a linear relation between the absolute magnitude of the Cepheid and the logarithm of the period; undoing the logarithms, this becomes a power law between the luminosity and the period.
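Schematically, writing α and β for the empirically fitted slope and intercept (whose numerical values I will not quote here), the relation is

$M = \alpha \log_{10} P + \beta$,

and since $M = -2.5 \log_{10}(L/L_0)$ by the definition of magnitude, this is equivalent to $L \propto P^{-0.4\alpha}$; as α is negative for Cepheids (longer periods correspond to brighter stars), the exponent −0.4α is positive, so one indeed gets a power law between luminosity and period.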
20:26 – Was Hubble the one to discover the redshift of galaxies?
This was an error on my part; Hubble was using earlier work of Vesto Slipher on these redshifts, and combining it with his own measurements of distances using Leavitt’s law to arrive at the law that now bears his name; he was also assisted in his observations by Milton Humason. It should also be noted that Georges Lemaître had also independently arrived at essentially the same law a few years prior, but his work was published in a somewhat obscure journal and did not receive broad recognition until some time later.
20:37 – Hubble’s original graph doesn’t look like a very good fit to a linear law.
Hubble’s original data was somewhat noisy and inaccurate by modern standards, and the redshifts were affected by the peculiar velocities of individual galaxies in addition to the expanding nature of the universe. However, as the data was extended to more galaxies, it became increasingly possible to compensate for these effects and obtain a much tighter fit, particularly at larger scales where the effects of peculiar velocity are less significant. See for instance this article from 2015 where Hubble’s original graph is compared with a more modern graph. This more recent graph also reveals a slight nonlinear correction to Hubble’s law at very large scales that has led to the remarkable discovery that the expansion of the universe is in fact accelerating over time, a phenomenon that is attributed to a positive cosmological constant (or perhaps a more complex form of dark energy in the universe). On the other hand, even with this nonlinear correction, there continues to be a roughly 10% discrepancy of this law with predictions based primarily on the cosmic microwave background radiation; see the FAQ entry for 23:49.
20:46 – Does general relativity alone predict a uniformly expanding universe?
This was an oversimplification. Einstein’s equations of general relativity contain a parameter Λ, known as the cosmological constant, which currently is only computable indirectly from fitting to experimental data. But even with this constant fixed, there are multiple solutions to these equations (basically because there are multiple possible initial conditions for the universe). For the purposes of cosmology, a particularly successful family of solutions are the solutions given by the Lambda-CDM model. This family of solutions contains additional parameters, such as the density of dark matter in the universe. Depending on the precise values of these parameters, the universe could be expanding or contracting, with the rate of expansion or contraction either increasing, decreasing, or staying roughly constant. But if one fits this model to all available data (including not just redshift measurements, but also measurements on the cosmic microwave background radiation and the spatial distribution of galaxies), one deduces a version of Hubble’s law which is nearly linear, but with an additional correction at very large scales; see the next item of this FAQ.
21:07 – Is Hubble’s original law sufficiently accurate to allow for good measurements of distances at the scale of the observable universe?
Not really; as mentioned at the end of the video, there were additional efforts to cross-check and calibrate Hubble’s law at intermediate scales between the range of Cepheid methods (about 100 million light years) and observable universe scales (about 100 billion light years), by using additional “standard candles” beyond Cepheids, most notably Type Ia supernovae (which are bright enough and predictable enough to be usable out to about 10 billion light years), the Tully-Fisher relation between the luminosity of a galaxy and its rotational speed, and gamma ray bursts. It turns out that due to the accelerating nature of the universe’s expansion, Hubble’s law is not completely linear at these large scales; this important correction cannot be discerned purely from Cepheid data, but also requires the other standard candles, as well as fitting that data (together with other observational data, such as the cosmic microwave background radiation) to the cosmological models provided by general relativity (with the best fitting models to date being some version of the Lambda-CDM model).
On the other hand, a naive linear extrapolation of Hubble’s original law to all larger scales does provide a very rough picture of the observable universe which, while too inaccurate for cutting edge research in astronomy, does give some general idea of its large-scale structure.
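For small redshifts, the (nearly) linear law amounts to the familiar rule of thumb sketched below; the value of H0 used here is a round illustrative number, and, as discussed a few entries down, its exact value is precisely what is in dispute in the “Hubble tension”:

```python
# Rule-of-thumb distance from a small redshift via the (nearly) linear Hubble law.
c = 3.0e5       # speed of light, km/s
H0 = 70.0       # Hubble constant in km/s per megaparsec (round illustrative value)
z = 0.01        # a small redshift

d_Mpc = c * z / H0        # valid only for z << 1
d_Mly = d_Mpc * 3.26      # 1 megaparsec is about 3.26 million light years
print(d_Mpc, d_Mly)       # ~43 Mpc, i.e. ~140 million light years
```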
21:15 – Where did this guess of the observable universe being about 20% of the full universe come from?
There are some ways to get a lower bound on the size of the entire universe that go beyond the edge of the observable universe. One is through analysis of the cosmic microwave background radiation (CMB), that has been carefully mapped out by several satellite observatories, most notably WMAP and Planck. Roughly speaking, a universe that was less than twice the size of the observable universe would create certain periodicities in the CMB data; such periodicities are not observed, so this provides a lower bound (see for instance this paper for an example of such a calculation). The 20% number was a guess based on my vague recollection of these works, but there is no consensus currently on what the ratio truly is; there are some proposals that the entire universe is in fact several orders of magnitude larger than the observable one.
The situation is somewhat analogous to Aristarchus’s measurement of the distance to the Sun, which was very sensitive to a small angle (the half-moon discrepancy). Here, the predicted size of the universe under the standard cosmological model is similarly dependent in a highly sensitive fashion on a measure of the flatness of the universe which, for reasons still not fully understood (but likely caused by some sort of inflation mechanism), happens to be extremely close to zero. As such, predictions for the size of the universe remain highly volatile at the current level of measurement accuracy.
23:44 – Was it a black hole collision that allowed for an independent measurement of Hubble’s law?
This was a slight error in the presentation. While the first gravitational wave observation by LIGO in 2015 was of a black hole collision, it did not come with an electromagnetic counterpart that allowed for a redshift calculation that would yield a Hubble’s law measurement. However, a later collision of neutron stars, observed in 2017, did come with an associated kilonova in which a redshift was calculated, and led to a Hubble measurement which was independent of most of the rungs of the distance ladder.
23:49 – Where can I learn more about this 10% discrepancy in Hubble’s law?
This is known as the Hubble tension (or, in more sensational media, the “crisis in cosmology”): roughly speaking, the various measurements of Hubble’s constant (either from climbing the cosmic distance ladder, or by fitting various observational data to standard cosmological models) tend to arrive at one of two values, which are about 10% apart from each other. The values based on gravitational wave observations are currently consistent with both values, due to significant error bars in this extremely sensitive method; but other more mature methods are now of sufficient accuracy that they are basically only consistent with one of the two values. Currently there is no consensus on the origin of this tension: possibilities include systematic biases in the observational data, subtle statistical issues with the methodology used to interpret the data, a correction to the standard cosmological model, the influence of some previously undiscovered law of physics, or some partial breakdown of the Copernican principle.
For an accessible recent summary of the situation, see this video by Becky Smethurst (“Dr. Becky”).
24:49 – So, what is a Type Ia supernova and why is it so useful in the distance ladder?
A Type Ia supernova occurs when a white dwarf in a binary system draws more and more mass from its companion star, until it reaches the Chandrasekhar limit, at which point its gravitational forces are strong enough to cause a collapse that increases the pressure to the point where a supernova is triggered via a process known as carbon detonation. Because of the universal nature of the Chandrasekhar limit, all such supernovae have (as a first approximation) the same absolute brightness and can thus be used as standard candles in a similar fashion to Cepheids (but without the need to first measure any auxiliary observable, such as a period). But these supernovae are also far brighter than Cepheids, and so this method can be used at significantly larger distances than the Cepheid method (roughly speaking, it can handle distances of up to ~10 billion light years, whereas Cepheids are reliable out to ~100 million light years). Among other things, the supernovae measurements were the key to detecting an important nonlinear correction to Hubble’s law at these scales, leading to the remarkable conclusion that the expansion of the universe is in fact accelerating over time, which in the Lambda-CDM model corresponds to a positive cosmological constant, though there are more complex “dark energy” models that are also proposed to explain this acceleration.
This is partly due to time constraints, and the need for editing to tighten the narrative, but was also a conscious decision on my part. Advanced classes on the distance ladder will naturally focus on the most modern, sophisticated, and precise ways to measure distances, backed up by the latest mathematics, physics, technology, observational data, and cosmological models. However, the focus in this video series was rather different; we sought to portray the cosmic distance ladder as evolving in a fully synergistic way, across many historical eras, with the evolution of mathematics, science, and technology, as opposed to being a mere byproduct of the current state of these other disciplines. As one specific consequence of this change of focus, we emphasized the first time any rung of the distance ladder was achieved, at the expense of more accurate and sophisticated later measurements at that rung. For instance, refinements in the measurement of the radius of the Earth since Eratosthenes, improvements in the measurement of the astronomical unit between Aristarchus and Cook, or the refinements of Hubble’s law and the cosmological model of the universe in the twentieth and twenty-first centuries, were largely omitted (though some of the answers in this FAQ are intended to address these omissions).
Many of the topics not covered here (or only given a simplified treatment) are discussed in depth in other expositions, including other YouTube videos. I would welcome suggestions from readers for links to such resources in the comments to this post. Here is a partial list:
“Eratosthenes” – Cosmos (Carl Sagan), video posted Apr 24, 2009 (originally released Oct 1, 1980, as part of the episode “The Shores of the Cosmic Ocean”).
“How Far Away Is It” – David Butler, a multi-part series beginning Aug 16 2013.
Recent events are very dire for research at US universities, and I will write further about those, but first a quick unrelated survey for those at such institutions. Back in the day, it was common for physics and some other (mechanical engineering?) departments to have machine shops with professional staff. In the last 15-20 years, there has been a huge growth in maker-spaces on campuses to modernize and augment those capabilities, though often maker-spaces are aimed at undergraduate design courses rather than doing work to support sponsored research projects (and grad students, postdocs, etc.). At the same time, it is now easier than ever (modulo tariffs) to upload CAD drawings to a website and get a shop in another country to ship finished parts to you.
Quick questions: Does your university have a traditional or maker-space-augmented machine shop available to support sponsored research? If so, who administers this - a department, a college/school, the office of research? Does the shop charge competitive rates relative to outside vendors? Are grad students trained to do work themselves, and are there professional machinists - how does that mix work?
Thanks for your responses. Feel free to email me if you'd prefer to discuss offline.
Nowadays it is best to exercise caution when bringing the words “quantum” and “consciousness” anywhere near each other, lest you be suspected of mysticism or quackery. Eugene Wigner did not concern himself with this when he wrote his “Remarks on the Mind-Body Question” in 1967. (Perhaps he was emboldened by his recent Nobel prize for contributions to the mathematical foundations of quantum mechanics, which gave him not a little no-nonsense technical credibility.) The mind-body question he addresses is the full-blown philosophical question of “the relation of mind to body”, and he argues unapologetically that quantum mechanics has a great deal to say on the matter. The workhorse of his argument is a thought experiment that now goes by the name “Wigner’s Friend”. About fifty years later, Daniela Frauchiger and Renato Renner formulated another, more complex thought experiment to address related issues in the foundations of quantum theory. In this post, I’ll introduce Wigner’s goals and argument, and evaluate Frauchiger’s and Renner’s claims of its inadequacy, concluding that these are not completely fair, but that their thought experiment does do something interesting and distinct. Finally, I will describe a recent paper of my own, in which I formalize the Frauchiger-Renner argument in a way that illuminates its status and isolates the mathematical origin of their paradox.
* * *
Wigner takes a dualist view about the mind, that is, he believes it to be non-material. To him this represents the common-sense view, but it was nevertheless, at the time, only a newly mainstream attitude. Indeed,
[until] not many years ago, the “existence” of a mind or soul would have been passionately denied by most physical scientists. The brilliant successes of mechanistic and, more generally, macroscopic physics and of chemistry overshadowed the obvious fact that thoughts, desires, and emotions are not made of matter, and it was nearly universally accepted among physical scientists that there is nothing besides matter.
He credits the advent of quantum mechanics with
the return, on the part of most physical scientists, to the spirit of Descartes’s “Cogito ergo sum”, which recognizes the thought, that is, the mind, as primary. [With] the creation of quantum mechanics, the concept of consciousness came to the fore again: it was not possible to formulate the laws of quantum mechanics in a fully consistent way without reference to the consciousness.
What Wigner has in mind here is that the standard presentation of quantum mechanics speaks of definite outcomes being obtained when an observer makes a measurement. Of course this is also true in classical physics. In quantum theory, however, the principles of linear evolution and superposition, together with the plausible assumption that mental phenomena correspond to physical phenomena in the brain, lead to situations in which there is no mechanism for such definite observations to arise. Thus there is a tension between the fact that we would like to ascribe particular observations to conscious agents and the fact that we would like to view these observations as corresponding to particular physical situations occurring in their brains.
Once we have convinced ourselves that, in light of quantum mechanics, mental phenomena must be considered on an equal footing with physical phenomena, we are faced with the question of how they interact. Wigner takes it for granted that “if certain physico-chemical conditions are satisfied, a consciousness, that is, the property of having sensations, arises.” Does the influence run the other way? Wigner claims that the “traditional answer” is that it does not, but argues that in fact such influence ought indeed to exist. (Indeed this, rather than technical investigation of the foundations of quantum mechanics, is the central theme of his essay.) The strongest support Wigner feels he can provide for this claim is simply “that we do not know of any phenomenon in which one subject is influenced by another without exerting an influence thereupon”. Here he recalls the interaction of light and matter, pointing out that while matter obviously affects light, the effects of light on matter (for example radiation pressure) are typically extremely small in magnitude, and might well have been missed entirely had they not been suggested by the theory.
Quantum mechanics provides us with a second argument, in the form of a demonstration of the inconsistency of several apparently reasonable assumptions about the physical, the mental, and the interaction between them. Wigner works, at least implicitly, within a model where there are two basic types of object: physical systems and consciousnesses. Some physical systems (those that are capable of instantiating the “certain physico-chemical conditions”) are what we might call mind-substrates. Each consciousness corresponds to a mind-substrate, and each mind-substrate corresponds to at most one consciousness. He considers three claims (this organization of his premises is not explicit in his essay):
1. Isolated physical systems evolve unitarily.
2. Each consciousness has a definite experience at all times.
3. Definite experiences correspond to pure states of mind-substrates, and arise for a consciousness exactly when the corresponding mind-substrate is in the corresponding pure state.
The first and second assumptions constrain the way the model treats physical and mental phenomena, respectively. Assumption 1 is often paraphrased as the “completeness of quantum mechanics”, while Assumption 2 is a strong rejection of solipsism – the idea that only one’s own mind is sure to exist. Assumption 3 is an apparently reasonable assumption about the relation between mental and physical phenomena.
With this framework established, Wigner’s thought experiment, now typically known as Wigner’s Friend, is quite straightforward. Suppose that an observer, Alice (to name the friend), is able to perform a measurement of some physical quantity of a particle, which may take two values, say 0 and 1. Assumption 1 tells us that if Alice performs this measurement when the particle is in a superposition state, the joint system of Alice’s brain and the particle will end up in an entangled state. Now Alice’s mind-substrate is not in a pure state, so by Assumption 3 does not have a definite experience. This contradicts Assumption 2. Wigner’s proposed resolution to this paradox is that in fact Assumption 1 is incorrect, and that there is an influence of the mental on the physical, namely objective collapse or, as he puts it, that the “statistical element which, according to the orthodox theory, enters only if I make an observation enters equally if my friend does”.
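Schematically (this display is my own gloss, not notation from Wigner’s essay): if the particle begins in a superposition $\alpha|0\rangle + \beta|1\rangle$ and Alice’s measurement is modeled as a unitary interaction, Assumption 1 produces a joint state of the form

$\alpha\,|0\rangle\,|\text{Alice saw }0\rangle \;+\; \beta\,|1\rangle\,|\text{Alice saw }1\rangle$,

in which Alice’s mind-substrate, considered on its own, is not in a pure state whenever both α and β are nonzero; this is exactly the clash with Assumptions 2 and 3 described above.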
* * *
Decades after the publication of Wigner’s essay, Daniela Frauchiger and Renato Renner formulated a new thought experiment, involving observers making measurements of other observers, which they intended to remedy what they saw as a weakness in Wigner’s argument. In their words, “Wigner proposed an argument […] which should show that quantum mechanics cannot have unlimited validity”. In fact, they argue, Wigner’s argument does not succeed in doing so. They assert that Wigner’s paradox may be resolved simply by noting a difference in what each party knows. Whereas Wigner, describing the situation from the outside, does not initially know the result of his friend’s measurement, and therefore assigns the “absurd” entangled state to the joint system composed of both her body and the system she has measured, his friend herself is quite aware of what she has observed, and so assigns to the system either, but not both, of the states corresponding to definite measurement outcomes. “For this reason”, Frauchiger and Renner argue, “the Wigner’s Friend Paradox cannot be regarded as an argument that rules out quantum mechanics as a universally valid theory.”
This criticism strikes me as somewhat unfair to Wigner. In fact, Wigner’s objection to admitting two different states as equally valid descriptions is that the two states correspond to different sets of physical properties of the joint system consisting of Alice and the system she measures. For Wigner, physical properties of physical systems are distinct from mental properties of consciousnesses. To engage in some light textual analysis, we can note that the word ‘conscious’, or ‘consciousness’, appears forty-one times in Wigner’s essay, and only once in Frauchiger and Renner’s, in the title of a cited paper. I have the impression that the authors pay inadequate attention to how explicitly Wigner takes a dualist position, including not just physical systems but also, and distinctly, consciousnesses in his ontology. Wigner’s argument does indeed achieve his goals, which are developed in the context of this strong dualism, and differ from the goals of Frauchiger and Renner, who appear not to share this philosophical stance, or at least do not commit fully to it.
Nonetheless, the thought experiment developed by Frauchiger and Renner does achieve something distinct and interesting. We can understand Wigner’s no-go theorem to be of the following form: “Within a model incorporating both mental and physical phenomena, a set of apparently reasonable conditions on how the model treats physical phenomena, mental phenomena, and their interaction cannot all be satisfied”. The Frauchiger-Renner thought experiment can be cast in the same form, with different choices about how to implement the model and which conditions to consider. The major difference in the model itself is that Frauchiger and Renner do not take consciousnesses to be entities in their own right, but simply take some states of certain physical systems to correspond to conscious experiences. Within such a model, Wigner’s assumption that each mind has a single, definite conscious experience at all times seems far less natural than it did within his model, where consciousnesses are distinct entities from the physical systems that determine them. Thus Frauchiger and Renner need to weaken this assumption, which was so natural to Wigner. The weakening they choose is a sort of transitivity of theories of mind. In their words (Assumption C in their paper):
Suppose that agent A has established that “I am certain that agent A’, upon reasoning within the same theory as the one I am using, is certain that x = ξ at time t.” Then agent A can conclude that “I am certain that x = ξ at time t.”
Just as Assumption 3 above was, for Wigner, a natural restriction on how a sensible theory ought to treat mental phenomena, this serves as Frauchiger’s and Renner’s proposed constraint. Just as Wigner designed a thought experiment that demonstrated the incompatibility of his assumption with an assumption of the universal applicability of unitary quantum mechanics to physical systems, so do Frauchiger and Renner.
* * *
In my recent paper “Reasoning across spacelike surfaces in the Frauchiger-Renner thought experiment”, I provide two closely related formalizations of the Frauchiger-Renner argument. These are motivated by a few observations:
1. Assumption C ought to make reference to the (possibly different) times at which agents A and A’ are certain about their respective judgments, since these states of knowledge change.
2. Since Frauchiger and Renner do not subscribe to Wigner’s strong dualism, an agent’s certainty about a given proposition, like any other mental state, corresponds within their implicit model to a physical state. Thus statements like “Alice knows that P” should be understood as statements about the state of some part of Alice’s brain. Conditional statements like “if upon measuring a quantity q Alice observes a particular outcome, she knows that P” should be understood as claims about the state of the composite system composed of the part of Alice’s brain responsible for knowing P and the part responsible for recording outcomes of the measurement of q.
3. Because the causal structure of the protocol does not depend on the absolute times of each event, an external agent describing the protocol can choose various “spacelike surfaces”, corresponding to fixed times in different spacetime embeddings of the protocol (or to different inertial frames). There is no reason to privilege one of these surfaces over another, and so each of them should be assigned a quantum state. This may be viewed as an implementation of a relativistic principle.
After developing a mathematical framework based on these observations, I recast Frauchiger’s and Renner’s Assumption C in two ways: first, in terms of a claim about the validity of iterating the “relative state” construction that captures how conditional statements are interpreted in terms of quantum states; and second, in terms of a deductive rule that allows chaining of inferences within a system of quantum logic. By proving that these claims are false in the mathematical framework, I provide a more formal version of the no-go theorem. I also show that the first claim can be rescued if the relative state construction is allowed to be iterated only “along” a single spacelike surface, and the second if a deduction is only allowed to chain inferences “along” a single surface. In other words, the mental transitivity condition desired by Frauchiger and Renner can in fact be combined with universal physical applicability of unitary quantum mechanics, but only if we restrict our analysis to a single spacelike surface. Thus I hope that the analysis I offer provides some clarification of what precisely is going on in Frauchiger and Renner’s thought experiment, what it tells us about combining the physical and the mental in light of quantum mechanics, and how it relates to Wigner’s thought experiment.
* * *
In view of the fact that “Quantum theory cannot consistently describe the use of itself” has, at present, over five hundred citations, and “Remarks on the Mind-Body Question” over thirteen hundred, it seems fitting to close with a thought, cautionary or exultant, from Peter Schwenger’s book on asemic (that is, meaningless) writing. He notes that
commentary endlessly extends language; it is in the service of an impossible quest to extract the last, the final, drop of meaning.
I never imagined that an artist would update me about quantum-computing research.
Last year, steampunk artist Bruce Rosenbaum forwarded me a notification about a news article published in Science. The article reported on an experiment performed in physicist Yiwen Chu’s lab at ETH Zürich. The experimentalists had built a “mechanical qubit”: they’d stored a basic unit of quantum information in a mechanical device that vibrates like a drumhead. The article dubbed the device a “steampunk qubit.”
I was collaborating with Bruce on a quantum-steampunk sculpture, and he asked if we should incorporate the qubit into the design. Leave it for a later project, I advised. But why on God’s green Earth are you receiving email updates about quantum computing?
My news feed sends me everything that says “steampunk,” he explained. So keeping a bead on steampunk can keep one up to date on quantum science and technology—as I’ve been preaching for years.
Other ideas displaced Chu’s qubit in my mind until I visited the University of California, Berkeley this January. Visiting Berkeley in January, one can’t help noticing—perhaps with a trace of smugness—the discrepancy between the temperature there and the temperature at home. And how better to celebrate a temperature difference than by studying a quantum-thermodynamics-style throwback to the 1800s?
One sun-drenched afternoon, I learned that one of my hosts had designed another steampunk qubit: Alp Sipahigil, an assistant professor of electrical engineering. He’d worked at Caltech as a postdoc around the time I’d finished my PhD there. We’d scarcely interacted, but I’d begun learning about his experiments in atomic, molecular, and optical physics then. Alp had learned about my work through Quantum Frontiers, as I discovered this January. I had no idea that he’d “met” me through the blog until he revealed as much to Berkeley’s physics department, when introducing the colloquium I was about to present.
Alp and collaborators proposed that a qubit could work as follows. It consists largely of a cantilever, which resembles a pendulum that bobs back and forth. The cantilever, being quantum, can have only certain amounts of energy. When the pendulum has a particular amount of energy, we say that the pendulum is in a particular energy level.
One might hope to use two of the energy levels as a qubit: if the pendulum were in its lowest-energy level, the qubit would be in its 0 state; and the next-highest level would represent the 1 state. A bit—a basic unit of classical information—has 0 and 1 states. A qubit can be in a superposition of 0 and 1 states, and so the cantilever could be.
A flaw undermines this plan, though. Suppose we want to process the information stored in the cantilever—for example, to turn a 0 state into a 1 state. We’d inject quanta—little packets—of energy into the cantilever. Each quantum would contain an amount of energy equal to (the energy associated with the cantilever’s 1 state) – (the amount associated with the 0 state). This equality would ensure that the cantilever could accept the energy packets lobbed at it.
But the cantilever doesn’t have only two energy levels; it has loads. Worse, all the inter-level energy gaps equal each other. However much energy the cantilever consumes when hopping from level 0 to level 1, it consumes that much when hopping from level 1 to level 2. This pattern continues throughout the rest of the levels. So imagine starting the cantilever in its 0 level, then trying to boost the cantilever into its 1 level. We’d probably succeed; the cantilever would probably consume a quantum of energy. But nothing would stop the cantilever from gulping more quanta and rising to higher energy levels. The cantilever would cease to serve as a qubit.
We can avoid this problem, Alp’s team proposed, by placing an atomic-force microscope near the cantilever. An atomic force microscope maps out surfaces similarly to how a Braille user reads: by reaching out a hand and feeling. The microscope’s “hand” is a tip about ten nanometers across. So the microscope can feel surfaces far more fine-grained than a Braille user can. Bumps embossed on a page force a Braille user’s finger up and down. Similarly, the microscope’s tip bobs up and down due to forces exerted by the object being scanned.
Imagine placing a microscope tip such that the cantilever swings toward it and then away. The cantilever and tip will exert forces on each other, especially when the cantilever swings close. This force changes the cantilever’s energy levels. Alp’s team chose the tip’s location, the cantilever’s length, and other parameters carefully. Under the chosen conditions, boosting the cantilever from energy level 1 to level 2 costs more energy than boosting from 0 to 1.
So imagine, again, preparing the cantilever in its 0 state and injecting energy quanta. The cantilever will gobble a quantum, rising to level 1. The cantilever will then remain there, as desired: to rise to level 2, the cantilever would have to gobble a larger energy quantum, which we haven’t provided.1
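Here is a toy numerical illustration of why that level shift matters; the Kerr-like correction and the numbers below are made-up stand-ins for the tip-induced shift, not parameters from the actual proposal:

```python
import numpy as np

# A perfectly harmonic oscillator has equally spaced levels, so a drive that
# moves 0 -> 1 also moves 1 -> 2.  Adding a Kerr-like correction (a toy model
# of the tip-induced shift; omega and chi are made-up values) changes that.
omega = 1.0   # bare level spacing (arbitrary units)
chi = 0.05    # toy anharmonicity (assumed)

n = np.arange(5)
E_harmonic = omega * n
E_anharmonic = omega * n + 0.5 * chi * n * (n - 1)

print(np.diff(E_harmonic))     # all gaps equal: the 0->1 drive also climbs to 2, 3, ...
print(np.diff(E_anharmonic))   # the 1->2 gap is larger, so the system stays in {0, 1}
```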
Will Alp build the mechanical qubit proposed by him and his collaborators? Yes, he confided, if he acquires a student nutty enough to try the experiment. For when he does—after the student has struggled through the project like a dirigible through a hurricane, but ultimately triumphed, and a journal is preparing to publish their magnum opus, and they’re brainstorming about artwork to represent their experiment on the journal’s cover—I know just the aesthetic to do the project justice.
1. Chu’s team altered their cantilever’s energy levels using a superconducting qubit, rather than an atomic force microscope.
The United States’ government is waging an all-out assault on Harvard University. The strategy, so far, has been:
Cut most of the grants (present and future) for scientific and medical research, so that thousands of Harvard’s scientists, researchers and graduate students have to stop their work indefinitely. That includes research on life-saving medicine, on poorly understood natural phenomena, and on new technology. This also means that the university will have no money from these activities to pay salaries of its employees.
Eliminate the tax-advantaged status of the university, so that the university is much more expensive to operate.
Prohibit Harvard from having any international students (undergraduate and graduate) and other researchers, so that large numbers of existing scientific and medical research projects that still have funding will have to cease operation. This destroys the careers of thousands of brilliant people — and not just foreigners. Many US faculty and students are working with and depend upon these expelled researchers, and their work will stop too. It also means that Harvard’s budget for the next academic year will be crushed, since it is far too late to replace the tuition from international undergraduate students for the coming year.
The grounds for this war are that Harvard allegedly does not provide a safe environment for its Jewish students, and that Harvard refuses to let the government determine who it may and may not hire.
Now, maybe you can explain to me what this is really about. I’m confused about what crimes these scientific researchers committed that justify stripping them of their grants and derailing their research. I’m also unclear as to why many apolitical, hard-working young trainees in laboratories across the campus deserve to be ejected from their graduate and post-graduate careers and sent home, delaying or ruining their futures. [Few will be able to transfer to other US schools; with all the government cuts to US science, there’s no money to support them at other locations.] And I don’t really understand how such enormous damage and disruption to the lives and careers of ten thousand-ish scientists, researchers and graduate students at Harvard (including many who are Jewish) will actually improve the atmosphere for Harvard’s Jewish students.
As far as I can see, the government is merely using Jewish students as pawns, pretending to attack Harvard on their behalf while in truth harboring no honest concern for their well-being. The fact that the horrors and nastiness surrounding the Gaza war are being exploited by the government as cover for an assault on academic freedom and scientific research is deeply cynical and exceedingly ugly.
From the outside, where Harvard is highly respected — it is certainly among the top five universities in the world, however you rank them — this must look completely idiotic, as idiotic as France gutting the Sorbonne, or the UK eviscerating Oxford. But keep in mind that Harvard is by no means the only target here. The US government is cutting the country’s world-leading research in science, technology and medicine to the bone. If that’s what you want to do, then ruining Harvard makes perfect sense.
The country that benefits the most from this self-destructive behavior? China, obviously. As a friend of mine said, this isn’t merely like shooting yourself in the foot, it’s like shooting yourself in the head.
I suspect most readers will understand that I cannot blog as usual right now. To write good articles about quantum physics requires concentration and focus. When people’s careers and life’s work are being devastated all around me, that’s simply not possible.
I’ve mentioned SciPost a few times on this blog. They’re an open journal in every sense you could think of: diamond open-access scientific publishing on an open-source platform, run with open finances. They even publish their referee reports. They’re aiming to cover not just a few subjects, but a broad swath of academia, publishing scientists’ work in the most inexpensive and principled way possible and challenging the dominance of for-profit journals.
SciPost doesn’t charge university libraries for access: they let anyone read their articles for free. And they don’t charge authors Article Processing Charges (or APCs): they let anyone publish for free. All they do is keep track of which institutions those authors are affiliated with, calculate what fraction of their total costs comes from them, and post it in a nice searchable list on their website.
And amazingly, for the last nine years, they’ve been making that work.
SciPost encourages institutions to pay their share, mostly by encouraging authors to bug their bosses until they do. SciPost will also quite happily accept more than an institution’s share, and a few generous institutions do just that, which is what has kept them afloat so far. But since nothing compels anyone to pay, most organizations simply don’t.
From an economist’s perspective, this is that most basic of problems, the free-rider problem. People want scientific publication to be free, but it isn’t. Someone has to pay, and if you don’t force someone to do it, then the few who pay will be exploited by the many who don’t.
There’s more worth saying, though.
First, it’s worth pointing out that SciPost isn’t paying the same cost everyone else pays to publish. SciPost has a stripped-down system, without any physical journals or much in-house copyediting, based entirely on their own open-source software. As a result, they pay about 500 euros per article. Compare this to the fees negotiated by particle physics’ SCOAP3 agreement, which average closer to 1000 euros, and realize that those fees are on the low end: for-profit journals tend to make their APCs higher in order to, well, make a profit.
(By the way, while it’s tempting to think of for-profit journals as greedy, I think it’s better to think of them as not cost-effective. Profit is an expense, like the interest on a loan: a payment to investors in exchange for capital used to set up the business. The thing is, online journals don’t seem to need that kind of capital, especially when they’re based on code written by academics in their spare time. So they can operate more cheaply as nonprofits.)
So when an author publishes in SciPost instead of a journal with APCs, they’re saving someone money, typically their institution or their grant. This would happen even if their institution paid their share of SciPost’s costs. (But then they would pay something rather than nothing, hence free-rider problem.)
If an author instead would have published in a closed-access journal, the kind where you have to pay to read the articles and university libraries pay through the nose to get access? Then you don’t save any money at all: your library still has to pay for the journal. You only save money if everybody at the institution stops using the journal. This one is instead a collective action problem.
Collective action problems are hard, and don’t often have obvious solutions. Free-rider problems do suggest an obvious solution: why not just charge?
In SciPost’s case, there are philosophical commitments involved. Their desire to attribute costs transparently and equally means dividing a journal’s cost among all its authors’ institutions, a cost only fully determined at the end of the year, which doesn’t make for an easy invoice.
More to the point, though, charging to publish is directly against what the Open Access movement is about.
That takes some unpacking, because of course, someone does have to pay. It probably seems weird to argue that institutions shouldn’t have to pay charges to publish papers…instead, they should pay to publish papers.
SciPost itself doesn’t go into detail about this, but despite how weird it sounds when put like I just did, there is a difference. Charging a fee to publish means that anyone who publishes needs to pay a fee. If you’re working in a developing country on a shoestring budget, too bad, you have to pay the fee. If you’re an amateur mathematician who works in a truck stop and just puzzled through something amazing, too bad, you have to pay the fee.
Instead of charging a fee, SciPost asks for support. I have to think that part of the reason is that they want some free riders. There are some people who would absolutely not be able to participate in science without free riding, and we want their input nonetheless. That means to support them, others need to give more. It means organizations need to think about SciPost not as just another fee, but as a way they can support the scientific process as a whole.
That’s how other things work, like the arXiv. They get support from big universities and organizations and philanthropists, not from literally everyone. It seems a bit weird to do that for a single scientific journal among many, though, which I suspect is part of why institutions are reluctant to do it. But for a journal that can save money like SciPost, maybe it’s worth it.
The other day I finally emerged from a very stressful push to submit two grant applications to the European Innovation Council. The call in question is for PATHFINDER_OPEN projects, which aim at proofs of principle of groundbreaking technological innovations. So I thought I would broadly report on that experience (no, I am not new to it, but you never cease to learn!), and disclose just a little about the ideas that brought about one of the two projects.
This NY Times feature lets you see how each piece of NSF's funding has been reduced this year relative to the normalized average over the last decade. Note: this fiscal year, thanks to the continuing resolution, the agency budget has not actually been cut like this. They are just not spending congressionally appropriated agency funds. The agency, fearing/assuming that its budget will get hammered next fiscal year, does not want to start awards that it won't be able to fund in out-years. The result is that this is effectively obeying in advance the presidential budget request for FY26. (And it's highly likely that some will point to unspent funds later in the year and use that as a justification for cuts, when in fact it's anticipation of possible cuts that has led to unspent funds. I'm sure the Germans have a polysyllabic word for this. In English, "Catch-22" is close.)
I encourage you to click the link and go to the article where the graphic is interactive (if it works in your location - not sure about whether the link works internationally). The different colored regions are approximately each of the NSF directorates (in their old organizational structure). Each subsection is a particular program.
Seems like whoever designed the graphic was a fan of Tufte, and the scaling of the shaded areas does quantitatively reflect funding changes. However, most people have a tough time estimating relative areas of irregular polygons. Award funding in physics (the left-most section of the middle region) is down 85% relative to past years. Math is down 72%. Chemistry is down 57%. Materials is down 63%. Earth sciences is down 80%. Polar programs (you know, those folks who run all the amazing experiments in Antarctica) is down 88%.
I know my readers are likely tired of me harping on NSF, but it's both important and a comparatively transparent example of what is also happening at other agencies. If you are a US citizen and think that this is the wrong path, then push on your congressional delegation about the upcoming budget.
Some years ago I speculated that it would be nice if a certain mathematical object existed, and even nicer if it were to satisfy an ordinary differential equation of a special sort. I was motivated by a particular physical question, and it seemed very natural to me to imagine such an object... So natural that I was sure that it must already have been studied, the equation for it known. As a result, every so often I'd go down a rabbit hole of a literature dig, but not with much success, because it isn't entirely clear where best to look. Then I'd get involved with other projects and forget all about the matter.
Last year I began to think about it again because it might be useful in a method I was developing for a paper, went through the cycle of wondering, and looking for a while, then forgot all about it in thinking about other things.
Then, a little over a month ago at the end of March, while starting on a long flight across the continent, I started thinking about it again, and given that I did not have a connection to the internet to hand, took another approach: I got out a pencil and began to mess around in my notebook and just derive what I thought the equation for this object should be, given certain properties it should have. One property is that it should in some circumstances reduce to a known powerful equation (often associated with the legendary 1975 work of Gel'fand and Dikii*) satisfied by the diagonal resolvent $latex {\widehat R}(E,x) {=}\langle x|({\cal H}-E)^{-1}|x\rangle$ of a Schrödinger Hamiltonian $latex {\cal H}=-\hbar^2\partial^2_x+u(x)$. It is:
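In a common normalization, and setting ℏ to 1 for brevity (the ℏ factors and conventions vary between references, so take this as a sketch of the standard form rather than necessarily the exact conventions intended here):

$latex -2{\widehat R}{\widehat R}^{\prime\prime}+({\widehat R}^{\prime})^2+4(u(x)-E){\widehat R}^2=1,$

where primes denote derivatives with respect to $latex x$; as a quick sanity check, the free case $latex u=0$, $latex {\widehat R}=1/(2\sqrt{-E})$ satisfies it.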
Here, $latex E$ is an energy of the Hamiltonian, in potential $latex u(x)$, and $latex x$ is a coordinate on the real line.
The object itself would be a generalisation of the diagonal resolvent $latex {\widehat R}(E,x)$, although non-diagonal in the energy, not the [...]
Now they’re getting more publicity by claiming this will make the universe fizzle out sooner than expected. They’re claiming, for example, that a dead, cold star will emit Hawking radiation, and thus slowly lose mass and eventually disappear!
They admit that this would violate baryon conservation: after all, the protons and neutrons in the star would have to go away somehow! They admit they don’t know how this would happen. They just say that the gravitational field of the star will create particle-antiparticle pairs that will slowly radiate away, forcing the dead star to lose mass somehow to conserve energy.
If experts thought this had even a chance of being true, it would be the biggest thing since sliced bread—at least in the field of quantum gravity. Everyone would be writing papers about it, because if true it would be revolutionary. It would overturn calculations by experts which say that a stationary chunk of matter doesn’t emit Hawking radiation. It would also mean that quantum field theory in curved spacetime can only be consistent if baryon number fails to be conserved! This would be utterly shocking.
But in fact, these new papers have had almost zero effect on physics. There’s a short rebuttal, here:
Unfortunately, it seems the real experts on quantum field theory in curved spacetime have not come out and mentioned the correct way to think about this issue, which has been known at least since 1975. To them—or maybe I should dare to say “us”—it’s just well known that the gravitational field of a static mass does not cause the creation of particle-antiparticle pairs.
Of course, the referees should have rejected Wondrak, van Suijlekom and Falcke’s papers. But apparently none of those referees were experts on the subject at hand. So you can’t trust a paper just because it appears in a supposedly reputable physics journal. You have to actually understand the subject and assess the paper yourself, or talk to some experts you trust.
If I were a science journalist writing an article about a supposedly shocking development like this, I would email some experts and check to see if it’s for real. But plenty of science journalists don’t bother with that anymore: they just believe the press releases. So now we’re being bombarded with lazy articles like these:
The list goes on; these are just three. There’s no way what I say can have much effect against such a flood of misinformation. As Mark Twain said, “A lie can travel around the world and back again while the truth is lacing up its boots.” Actually he probably didn’t say that—but everyone keeps saying he did, illustrating the point perfectly.
Still, there might be a few people who both care and don’t already know this stuff. Instead of trying to give a mini-course here, let me simply point to an explanation of how things really work:
It’s technical, so it’s not easy reading if you haven’t studied quantum field theory and general relativity, but that’s unavoidable. It shows that in a static spacetime there is a well-defined concept of ‘vacuum’, and the vacuum is stable. Jorge Pullin pointed out the key sentence for present purposes:
Thus, if the underlying space-time admits an everywhere time-like Killing field, the vacuum state is indeed stable and phenomena such as the spontaneous creation of particles do not occur.
This condition of having an "everywhere time-like Killing field" says that a spacetime has time translation symmetry. Ashtekar and Magnon also assume that spacetime is globally hyperbolic and that the wave equation for a massive spin-zero particle has a smooth solution given smooth initial data. All this lets us define a concept of energy for solutions of this equation. It also lets us split solutions into positive-frequency solutions, which correspond to particles, and negative-frequency ones, which correspond to antiparticles. We can thus set up quantum field theory in the way we're used to on Minkowski spacetime, where there's a well-defined vacuum which does not decay into particle-antiparticle pairs.
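In rough outline (this is only a sketch; Ashtekar and Magnon do it carefully): the timelike Killing field lets us separate variables in the Klein–Gordon equation $latex (\Box - m^2)\phi = 0$, giving mode solutions
$latex \phi_\omega(t,\vec x) = e^{-i\omega t} f_\omega(\vec x), \qquad \omega > 0,$
together with their complex conjugates. States built from the $latex e^{-i\omega t}$ modes are positive frequency (particles), while those built from the $latex e^{+i\omega t}$ modes are negative frequency (antiparticles). Because time translation is a symmetry, time evolution never mixes the two sets of modes, so the state with no positive-frequency excitations, the vacuum, remains the vacuum: no particle-antiparticle pairs get created.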
The Schwarzschild solution, which describes a static black hole, also has a Killing field. But this ceases to be timelike at the event horizon, so this result does not apply to that!
I could go into more detail if required, but you can find a more pedagogical treatment in this standard textbook:
• Robert Wald, Quantum Field Theory in Curved Spacetime and Black Hole Thermodynamics, University of Chicago Press, Chicago, 1994.
In particular, go to Section 4.3, which is on quantum field theory in stationary spacetimes.
I also can’t resist citing this thesis by a student of mine:
This thesis covers the case of electromagnetism, while Ashtekar and Magnon, and also Wald, focus on a massive scalar field for simplicity.
So: it’s been rigorously shown that the gravitational field of a static object does not create particle-antiparticle pairs. This has been known for decades. Now some people have done a crude approximate calculation that seems to show otherwise. Some flaws in the approximation have been pointed out. Of course the authors of the calculation don’t believe their approximation is flawed. We could argue about that for a long time. But it’s scarcely worth thinking about, because no approximations were required to settle this issue. It was settled over 50 years ago, and the new work is not shedding new light on the issue: it’s much more hand-wavy than the old work.
Back before satellites, to transmit radio waves over really long distances folks bounced them off the ionosphere—a layer of charged particles in the upper atmosphere. Unfortunately this layer only reflects radio waves with frequencies up to 30 megahertz. This limits the rate at which information can be transmitted.
How to work around this?
METEOR BURST COMMUNICATIONS!
On average, 100 million meteors, each weighing about a milligram, hit the Earth each day. They vaporize about 120 kilometers up. Each one creates a trail of ions that lasts about a second. And you can bounce radio waves with a frequency up to 100 megahertz off this trail.
That's not a huge improvement, and you need to transmit in bursts whenever a suitable meteor comes your way, but the military actually looked into doing this.
The National Bureau of Standards tested a burst-mode system in 1958 that used the 50-MHz band and offered a full-duplex link at 2,400 bits per second. The system used magnetic tape loops to buffer data and transmitters at both ends of the link that operated continually to probe for a path. Whenever the receiver at one end detected a sufficiently strong probe signal from the other end, the transmitter would start sending data. The Canadians got in on the MBC action with their JANET system, which had a similar dedicated probing channel and tape buffer. In 1954 they established a full-duplex teletype link between Ottawa and Nova Scotia at 1,300 bits per second with an error rate of only 1.5%.
There’s a lot more to the story. For example, until recently people used this method in the western United States to report the snow pack from mountain tops!
The system was called SNOTEL, and you can read more about it here:
On Wednesday May 14, 2025 I’ll be giving a talk at 2 pm Pacific Time, or 10 pm UK time. The talk is for physics students at the Universidade de São Paulo in Brazil, organized by Artur Renato Baptista Boyago.
Abstract. The 20th century was the century of fundamental physics. What about the 21st? Progress on fundamental physics has been slow since about 1980, but there is exciting progress in other fields, such as condensed matter. This requires an adjustment in how we think about the goal of physics.
You can see my slides here, or watch a video of the talk here:
Rachel Greenfeld and I have just uploaded to the arXiv our paper Some variants of the periodic tiling conjecture. This paper explores variants of the periodic tiling phenomenon: the fact that, in some cases, a tile that can translationally tile a group must also be able to translationally tile the group periodically. For instance, for a given discrete abelian group $latex G$, consider the following question:
Question 1 (Periodic tiling question) Let $latex F$ be a finite subset of $latex G$. If there is a solution $latex A$ to the tiling equation (so that every point of $latex G$ is covered by exactly one translate of $latex F$), must there exist a periodic solution to the same equation?
We know that the answer to this question is positive for finite groups $latex G$ (trivially, since all sets are periodic in this case), for one-dimensional groups $latex \mathbb{Z} \times G_0$ with $latex G_0$ finite, and in $latex \mathbb{Z}^2$, but it can fail for $latex \mathbb{Z}^2 \times G_0$ for certain finite $latex G_0$, and also for $latex \mathbb{Z}^d$ for sufficiently large $latex d$; see this previous blog post for more discussion. But now one can consider other variants of this question:
Instead of considering level one tilings, one can consider level $latex k$ tilings for a given natural number $latex k$ (so that every point in $latex G$ is covered by exactly $latex k$ translates of $latex F$), or more generally tilings whose level is given by some periodic function; see the displayed notation below.
Instead of requiring the functions involved to be indicator functions, one can allow them to be integer-valued, so that we are now studying convolution equations in which the given data are integer-valued functions (one periodic, one finitely supported) and the unknown is also integer-valued.
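In symbols (fixing notation here for concreteness, with $latex 1_F$ denoting the indicator function of the finite tile $latex F \subset G$), the equations in question are
$latex 1_F * 1_A = 1$ (level one tiling),
$latex 1_F * 1_A = k$ (level $latex k$ tiling),
$latex f * g = h$ (integer-valued generalisation, with $latex f$ finitely supported and $latex h$ periodic given, and $latex g$ the unknown integer-valued function);
it is this last convolution form that the results below refer to.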
We are able to obtain positive answers to three such analogues of the periodic tiling conjecture. The first result (which was kindly shared with us by Tim Austin) concerns the homogeneous problem $latex f * g = 0$. Here the results are very satisfactory:
Theorem 2 (First periodic tiling result) Let $latex G$ be a discrete abelian group, and let $latex f$ be integer-valued and finitely supported. Then the following are equivalent.
(i) There exists an integer-valued solution $latex g$ to $latex f * g = 0$ that is not identically zero.
(ii) There exists a periodic integer-valued solution $latex g$ to $latex f * g = 0$ that is not identically zero.
(iii) There is a vanishing Fourier coefficient $latex \hat{f}(\xi) = 0$ for some non-trivial character $latex \xi$ of finite order.
By combining this result with an old result of Henry Mann about sums of roots of unity, as well as an even older decidability result of Wanda Szmielew, we obtain
Corollary 3 Any of the statements (i), (ii), (iii) is algorithmically decidable; there is an algorithm that, when given $latex G$ and $latex f$ as input, determines in finite time whether any of these assertions hold.
Now we turn to the inhomogeneous problem $latex f * g = h$ in $latex \mathbb{Z}^2$, which is the first difficult case (periodic tiling type results are easy to establish in one dimension, and trivial in zero dimensions). Here we have two results:
Theorem 4 (Second periodic tiling result) Let $latex G = \mathbb{Z}^2$, let $latex h$ be periodic, and let $latex f$ be integer-valued and finitely supported. Then the following are equivalent.
(i) There exists an integer-valued solution $latex g$ to $latex f * g = h$.
(ii) There exists a periodic integer-valued solution $latex g$ to $latex f * g = h$.
Theorem 5 (Third periodic tiling result) Let $latex G = \mathbb{Z}^2$, let $latex h$ be periodic, and let $latex f$ be integer-valued and finitely supported. Then the following are equivalent.
(i) There exists an indicator function solution $latex g = 1_A$ to $latex f * g = h$.
(ii) There exists a periodic indicator function solution $latex g = 1_A$ to $latex f * g = h$.
In particular, the previously established case of the periodic tiling conjecture for level one tilings of $latex \mathbb{Z}^2$ is now extended to higher level. By an old argument of Hao Wang, we now know that the statements mentioned in Theorem 5 are also algorithmically decidable, although it remains open whether the same is the case for Theorem 4. We know from past results that Theorem 5 cannot hold in sufficiently high dimension (even in the classic level one case), but it also remains open whether Theorem 4 fails in that setting.
Following past literature, we rely heavily on a structure theorem for solutions $latex g$ to tiling equations $latex f * g = h$, which roughly speaking asserts that such solutions must be expressible as a finite sum of functions that are one-periodic (periodic in a single direction). This already explains why tiling is easy to understand in one dimension, and why the two-dimensional case is more tractable than the case of general dimension. This structure theorem can be obtained by averaging a dilation lemma, which is a somewhat surprising symmetry of tiling equations that basically arises from finite characteristic arguments (viewing the tiling equation modulo $latex p$ for various large primes $latex p$).
For Theorem 2, one can take advantage of the fact that the homogeneous equation $latex f * g = 0$ is preserved under finite difference operators: if $latex g$ solves the equation, then so does any finite difference of $latex g$. This freedom to take finite differences allows one to selectively eliminate certain one-periodic components of a solution to the homogeneous equation until the solution is a pure one-periodic function, at which point one can appeal to an induction on dimension to equate parts (i) and (ii) of the theorem. To link up with part (iii), we also take advantage of the existence of retraction homomorphisms to convert a vanishing Fourier coefficient into an integer solution to $latex f * g = 0$.
The inhomogeneous results are more difficult, and rely on arguments that are specific to two dimensions. For Theorem 4, one can also perform finite differences to analyze various components of a solution $latex g$ to a tiling equation $latex f * g = h$, but the conclusion now is that these components are determined (modulo 1) by polynomials of one variable. Applying a retraction homomorphism, one can make the coefficients of these polynomials rational, which makes the polynomials periodic. This turns out to reduce the original tiling equation to a system of essentially local combinatorial equations, which allows one to "periodize" a non-periodic solution by periodically repeating a suitable block of the (retraction homomorphism applied to the) original solution.
Theorem 5 is significantly more difficult to establish than the other two results, because of the need to maintain the solution in the form of an indicator function. There are now two separate sources of aperiodicity to grapple with. One is the fact that the polynomials involved in the components may have irrational coefficients (see Theorem 1.3 of our previous paper for an explicit example of this for a level 4 tiling). The other is that in addition to the polynomials (which influence the fractional parts of the components), there is also "combinatorial" data (roughly speaking, associated to the integer parts of the components) which interact with each other in a slightly non-local way. Once one can make the polynomial coefficients rational, there is enough periodicity that the periodization approach used for the second theorem can be applied to the third; the main remaining challenge is to find a way to make the polynomial coefficients rational, while still maintaining the indicator function property of the solution.
It turns out that the retraction homomorphism approach is no longer available here (it makes the components unbounded, which makes the combinatorial problem too difficult to solve). Instead, one has to first perform a second moment analysis to discern more structure about the polynomials involved. It turns out that the components of an indicator function solution can only utilize linear polynomials (as opposed to polynomials of higher degree), and that one can partition $latex \mathbb{Z}^2$ into a finite number of cosets such that only three of these linear polynomials are "active" on any given coset. The irrational coefficients of these linear polynomials then have to obey some rather complicated, but (locally) finite, sentence in the theory of first-order linear inequalities over the rationals in order to form an indicator function. One can then use the Weyl equidistribution theorem to replace these irrational coefficients with rational coefficients that obey the same constraints (although one first has to ensure that one does not accidentally fall into the boundary of the constraint set, where things are discontinuous). Then one can apply periodization to the remaining combinatorial data to conclude.
A key technical problem arises from the discontinuities of the fractional part operator at integers, so a certain amount of technical manipulation (in particular, passing at one point to a weak limit of the original tiling) is needed to avoid ever having to encounter this discontinuity.
The human race has made huge progress in the past few thousand years, gradually improving the living condition of human beings by learning how to cure illness; improving farming; harvesting, storing, and using energy in several forms; and countless other activities.
Progress is measured over long time scales, and on metrics related to how widely innovations become accessible to all, as Ford once noted. So it is natural for us to consider ourselves lucky to have lived "in the best of times".
Why, if you had been born 400 years ago, for example, you would probably never even have learned what a hot shower is! And even just 100 years ago you could have watched, powerless, as your children died of diseases that today elicit little worry.
In the news this week was the joint announcement by the presidents of the European Commission and France of initiatives about welcoming top researchers from abroad, with the aim being especially to encourage researchers from the USA to cross the Atlantic. I've seen some discussion online about this among people I know and thought I'd add a few comments here, for those outside Europe thinking about making such a jump.
Firstly, what is the new initiative? Various programmes have been put in place; on the EU side it seems to be encouraging applications to Marie Curie Fellowships for postdocs and ERC grants. It looks like there is some new money, particularly for Marie Curie Fellowships for incoming researchers. Applying for these is generally good advice, as they are prestigious programs that open the way to a career; in my field a Marie Curie often leads to a permanent position, and an ERC grant is so huge that it opens doors everywhere. In France, the programme seems to be an ANR programme targeting specific strategic fields, so unlikely to be relevant for high-energy physicists (despite the fact that they invited Mark Thomson to speak at the meeting). But France can be a destination for the European programmes, and there are good reasons for choosing France as a destination.
So the advice would seem to be to try out life in France with a Marie-Curie Fellowship, and then apply through the usual channels for a permanent position. This is very reasonable, because it makes little sense to move permanently before having some idea of what life and research are actually like here first. I would heartily recommend it. There are several permanent positions available every year in the CNRS at the junior level, but because of the way CNRS hiring works -- via a central committee that decides on positions for the whole country -- if someone leaves it is not very easy to replace them, and people job-hopping is a recurrent problem. There is also the possibility for people to enter the CNRS at a senior level, with up to one position available in theoretical physics most years.
I wrote a bit last year about some of the great things about the CNRS, but I will add a bit more now. Firstly, what is it? It is a large organisation that essentially just hires permanent researchers, who work in laboratories throughout the country. Most of these laboratories are hosted by universities, such as my lab (the LPTHE), which is hosted by Sorbonne University. Most of these laboratories are mixed, meaning that they also include university staff, i.e. researchers who also teach undergraduates. University positions have a similar but parallel career to the CNRS, but since the teaching is done in French, and because the positions only open on a rather unpredictable basis, I won't talk about them today. The CNRS positions are 100% research; there is little administrative overhead, and therefore plenty of time to focus on what is important. This is the main advantage of such positions; but the organisation of researchers into laboratories is also a big difference from the Anglo-Saxon model. My lab is relatively small, yet contains a large number of people working in HEP, and this provides a very friendly environment with lots of interesting interactions, without being lost in a labyrinthine organisation or having key decisions taken by people working in vastly different (sub)fields.
The main criticisms I have seen bandied around on social media about the CNRS are that the pay is not competitive, and that CNRS researchers are lazy/do not work. I won't comment about pay, because it's difficult to compare. But there is plenty of oversight by the CNRS committee -- a body of our peers elected by all researchers -- which scrutinises activity, in addition to deciding on hiring and promotions. If people were really sitting on their hands then this would be spotted and nipped in the bud; but the process of doing this is not onerous or intrusive, precisely because it is done by our peers. In fact, the yearly and five-yearly reports serve a useful role in helping people to focus their activities and plan for the next one to five years. There is also evaluation of laboratories and universities (the HCERES, which will now be changed into something else) that however seems sensible: it doesn't seem to lead to the same sort of panic or perverse incentives that the (equivalent) REF seems to induce in the UK, for example.
The people I know are incredibly hard-working and productive. This is, to be fair, also a product of the fact that we have relatively few PhD students compared to other countries. This is partly by design: the philosophy is that it is unfair to train lots of students who can never get permanent positions in the field. As a result, we take good care of our students, and the students we have tend to be good; but since we have the time, we mostly do research ourselves, rather than just being managers.
So the main reason to choose France is to be allowed to do the research you want to do, without managerialisation, bureaucrats or other obstacles interfering. If that sounds appealing, then I suggest getting in touch and/or arranging to visit. A visit to the RPP or one of the national meetings would be a great way to start. The applications for Marie Curie fellowships are open now, and the CNRS competition opens in December with a deadline usually in early January.
I’ve now been blogging for nearly twenty years—through five presidential administrations, my own moves from Waterloo to MIT to UT Austin, my work on algebrization and BosonSampling and BQP vs. PH and quantum money and shadow tomography, the publication of Quantum Computing Since Democritus, my courtship and marriage and the birth of my two kids, a global pandemic, the rise of super-powerful AI and the terrifying downfall of the liberal world order.
Yet all that time, through more than a thousand blog posts on quantum computing, complexity theory, philosophy, the state of the world, and everything else, I chased a form of recognition for my blogging that remained elusive.
Until now.
This week I received the following email:
I emailed regarding your blog Shtetl-Optimized Blog which was selected by FeedSpot as one of the Top 50 Quantum Computing Blogs on the web.
We recommend adding your website link and other social media handles to get more visibility in our list, get better ranking and get discovered by brands for collaboration.
We’ve also created a badge for you to highlight this recognition. You can proudly display it on your website or share it with your followers on social media.
We’d be thankful if you can help us spread the word by briefly mentioning Top 50 Quantum Computing Blogs in any of your upcoming posts.
Please let me know if you can do the needful.
You read that correctly: Shtetl-Optimized is now officially one of the top 50 quantum computing blogs on the web. You can click the link to find the other 49.
Maybe it’s not unrelated to this new notoriety that, over the past few months, I’ve gotten a massively higher-than-usual volume of emailed solutions to the P vs. NP problem, as well as the other Clay Millennium Problems (sometimes all seven problems at once), as well as quantum gravity and life, the universe, and everything. I now get at least six or seven confident such emails per day.
While I don’t spend much time on this flood of scientific breakthroughs (how could I?), I’d like to note one detail that’s new. Many of the emails now include transcripts where ChatGPT fills in the details of the emailer’s theories for them—unironically, as though that ought to clinch the case. Who said generative AI wasn’t poised to change the world? Indeed, I’ll probably need to start relying on LLMs myself to keep up with the flood of fan mail, hate mail, crank mail, and advice-seeking mail.
Anyway, thanks for reading everyone! I look forward to another twenty years of Shtetl-Optimized, if my own health and the health of the world cooperate.
Yesterday, the Texas State Legislature heard public comments about SB37, a bill that would give a state board direct oversight over course content and faculty hiring at public universities, perhaps inspired by Trump’s national crackdown on higher education. (See here or here for coverage.) So, encouraged by a friend in the history department, I submitted the following public comment, whatever good it will do.
I’m a computer science professor at UT, although I’m writing in my personal capacity. For 20 years, on my blog and elsewhere, I’ve been outspoken in opposing woke radicalism on campus and (especially) obsessive hatred of Israel that often veers into antisemitism, even when that’s caused me to get attacked from my left. Nevertheless, I write to strongly oppose SB37 in its current form, because of my certainty that no world-class research university can survive ceding control over its curriculum and faculty hiring to the state. If this bill passes, for example, it will severely impact my ability to recruit the most talented computer scientists to UT Austin, if they have competing options that will safeguard their academic freedom as traditionally conceived. Even if our candidates are approved, the new layer of bureaucracy will make it difficult and slow for us to do anything. For those concerned about intellectual diversity in academia, a much better solution would include safeguarding tenure and other protections for faculty with heterodox views, and actually enforcing content-neutral time, place, and manner rules for protests and disruptions. UT has actually done a better job on these things than many other universities in the US, and could serve as a national model for how viewpoint diversity can work — but not under an intolerably stifling regime like the one proposed by this bill.
Grant Sanderson, of 3blue1brown, has put up a phenomenal YouTube video explaining Grover’s algorithm, and dispelling the fundamental misconception about quantum computing, that QC works simply by “trying all the possibilities in parallel.” Let me not futz around: this video explains, in 36 minutes, what I’ve tried to explain over and over on this blog for 20 years … and it does it better. It’s a masterpiece. Yes, I consulted with Grant for this video (he wanted my intuitions for “why is the answer √N?”), and I even have a cameo at the end of it, but I wish I had made the video. Damn you, Grant!
The incomparably great, and absurdly prolific, blogger Zvi Mowshowitz and yours truly spend 1 hour and 40 minutes discussing AI existential risk, education, blogging, and more. I end up “interviewing” Zvi, who does the majority of the talking, which is fine by me, as he has many important things to say! (Among them: his searing critique of those K-12 educators who see it as their life’s mission to prevent kids from learning too much too fast—I’ve linked his best piece on this from the header of this blog.) Thanks so much to Rick Coyle for arranging this conversation.
Progress in quantum complexity theory! In 2000, John Watrous showed that the Group Non-Membership problem is in the complexity class QMA (Quantum Merlin-Arthur). In other words, if some element g is not contained in a given subgroup H of an exponentially large finite group G, which is specified via a black box, then there’s a short quantum proof that g∉H, with only ~log|G| qubits, which can be verified on a quantum computer in time polynomial in log|G|. This soon raised the question of whether Group Non-Membership could be used to separate QMA from QCMA by oracles, where QCMA (Quantum Classical Merlin Arthur), defined by Aharonov and Naveh in 2002, is the subclass of QMA where the proof needs to be classical, but the verification procedure can still be quantum. In other words, could Group Non-Membership be the first non-quantum example where quantum proofs actually help?
In 2006, alas, Greg Kuperberg and I showed that the answer was probably “no”: Group Non-Membership has “polynomial QCMA query complexity.” This means that there’s a QCMA protocol for the problem where Arthur makes only polylog|G| quantum queries to the group oracle—albeit, possibly an exponential in log|G| number of quantum computation steps besides that! To prove our result, Greg and I needed to make mild use of the Classification of Finite Simple Groups, one of the crowning achievements of 20th-century mathematics (its proof is about 15,000 pages long). We conjectured (but couldn’t prove) that someone else, who knew more about the Classification than we did, could show that Group Non-Membership was simply in QCMA outright.
Interestingly, the Group Membership problem had also been a candidate for separating BQP/qpoly, or quantum polynomial time with polynomial-size quantum advice—my personal favorite complexity class—from BQP/poly, or the same thing with polynomial-size classical advice. And it might conceivably still be! The authors explain to me that their protocol doesn’t put Group Membership (with group G and subgroup H depending only on the input length n) into BQP/poly, the reason being that their short classical witnesses for g∉H depend on both g and H, in contrast to Watrous’s quantum witnesses which depended only on H. So there’s still plenty that’s open here! Actually, for that matter, I don’t know of good evidence that the entire Group Membership problem isn’t in BQP—i.e., that quantum computers can’t just solve the whole thing outright, with no Merlins or witnesses in sight!
Anyway, huge congratulations to Le Gall, Nishimura, and Thakkar for peeling back our ignorance of these matters a bit further! Reeeeeeeee!
Potential big progress in quantum algorithms! Vittorio Giovannetti, Seth Lloyd, and Lorenzo Maccone (GLM) have given what they present as a quantum algorithm to estimate the determinant of an n×n matrix A, exponentially faster in some contexts than we know how to do it classically.
[Update (May 5): In the comments, Alessandro Luongo shares a paper where he and Changpeng Shao describe what appears to be essentially the same algorithm back in 2020.]
The algorithm is closely related to the 2008 HHL (Harrow-Hassidim-Lloyd) quantum algorithm for solving systems of linear equations. Which means that anyone who knows the history of this class of quantum algorithms knows to ask immediately: what’s the fine print? A couple weeks ago, when I visited Harvard and MIT, I had a chance to catch up with Seth Lloyd, so I asked him, and he kindly told me. Firstly, we assume the matrix A is Hermitian and positive semidefinite. Next, we assume A is sparse, and not only that, but there’s a QRAM data structure that points to its nonzero entries, so you don’t need to do Grover search or the like to find them, and can query them in coherent superposition. Finally, we assume that all the eigenvalues of A are at least some constant λ>0. The algorithm then estimates det(A), to multiplicative error ε, in time that scales linearly with log(n), and polynomially with 1/λ and 1/ε.
Now for the challenge I leave for ambitious readers: is there a classical randomized algorithm to estimate the determinant under the same assumptions and with comparable running time? In other words, can the GLM algorithm be “Ewinized”? Seth didn’t know, and I think it’s a wonderful crisp open question! On the one hand, if Ewinization is possible, it wouldn’t be the first time that publicity on this blog had led to the brutal murder of a tantalizing quantum speedup. On the other hand … well, maybe not! I also consider it possible that the problem solved by GLM—for exponentially-large, implicitly-specified matrices A—is BQP-complete, as for example was the general problem solved by HHL. This would mean, for example, that one could embed Shor’s factoring algorithm into GLM, and that there’s no hope of dequantizing it unless P=BQP. (Even then, though, just like with the HHL algorithm, we’d still face the question of whether the GLM algorithm was “independently useful,” or whether it merely reproduced quantum speedups that were already known.)
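For orientation, and emphatically not as an answer to the question (it runs in time roughly linear in n times the sparsity, nowhere near polylog(n)): the textbook classical randomized baseline here uses log det(A) = tr(log A), combining Hutchinson trace estimation with a Chebyshev approximation of the logarithm on [λ, 1]. A minimal sketch, assuming A is symmetric positive semidefinite with all eigenvalues in [λ, 1]:

import numpy as np

def logdet_estimate(A, lam, probes=30, degree=50, seed=0):
    # Classical baseline sketch: estimate log det(A) = tr(log A) for a symmetric PSD
    # matrix A whose eigenvalues all lie in [lam, 1], via Hutchinson probes and a
    # Chebyshev approximation of log on [lam, 1]. Cost ~ probes * degree matrix-vector
    # products, i.e. poly(n); not the polylog(n) a true "Ewinization" would need.
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    a, b = lam, 1.0
    N = degree + 1
    k = np.arange(N)
    t = np.cos(np.pi * (k + 0.5) / N)                      # Chebyshev nodes on [-1, 1]
    fvals = np.log(0.5 * (b - a) * t + 0.5 * (a + b))      # log evaluated at mapped nodes
    c = np.array([(2.0 / N) * np.sum(fvals * np.cos(np.pi * j * (k + 0.5) / N))
                  for j in range(N)])
    c[0] *= 0.5
    Bmv = lambda v: (2.0 * (A @ v) - (a + b) * v) / (b - a)  # v -> B v, B = (2A-(a+b)I)/(b-a)
    total = 0.0
    for _ in range(probes):
        z = rng.choice([-1.0, 1.0], size=n)                 # Rademacher probe vector
        w0, w1 = z, Bmv(z)                                  # T_0(B) z and T_1(B) z
        acc = c[0] * w0 + c[1] * w1
        for j in range(2, N):
            w0, w1 = w1, 2.0 * Bmv(w1) - w0                 # Chebyshev three-term recurrence
            acc += c[j] * w1
        total += z @ acc                                    # approximates z^T log(A) z
    return total / probes                                   # estimate of log det(A)

The open question is whether anything with this flavor can be pushed down to polylog(n) time given QRAM-style sample-and-query access to A, in the spirit of the known dequantization results.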
Anyway, quantum algorithms research lives! So does dequantization research! If basic science in the US is able to continue at all—the thing I promised not to talk about in this post—we’ll have plenty to keep us busy over the next few years.
For the third time in 9 years I am visiting San Pedro de Atacama, a jewel in the middle of nowhere in northern Chile. The Atacama desert is a stretch of extremely dry land at high altitude, which makes it exceptionally attractive for astronomical activities. Nearby, for example, are some of the largest telescopes in the world - the Very Large Telescope (VLT) at Cerro Paranal, and the Extremely Large Telescope (ELT) now being built at Cerro Armazones. And I have news that an even larger telescope, tentatively dubbed RLT for Ridiculously Large Telescope, is being planned in the region...
Do you know when an engineer built the first artificial automaton—the first human-made machine that operated by itself, without external control mechanisms that altered the machine’s behavior over time as the machine undertook its mission?
The ancient Greek thinker Archytas of Tarentum reportedly created it about 2,300 years ago. Steam propelled his mechanical pigeon through the air.
For centuries, automata cropped up here and there as curiosities and entertainment. The wealthy exhibited automata to amuse and awe their peers and underlings. For instance, the French engineer Jacques de Vaucanson built a mechanical duck that appeared to eat and then expel grains. The device earned the nickname the Digesting Duck…and the nickname the Defecating Duck.
Vaucanson also invented a mechanical loom that helped foster the Industrial Revolution. During the 18th and 19th centuries, automata began to enable factories, which changed the face of civilization. We’ve inherited the upshots of that change. Nowadays, cars drive themselves, Roombas clean floors, and drones deliver packages.1 Automata have graduated from toys to practical tools.2
Rather, classical automata have. What of their quantum counterparts?
Scientists have designed autonomous quantum machines, and experimentalists have begun realizing them. The roster of such machines includes autonomous quantum engines, refrigerators, and clocks. Much of this research falls under the purview of quantum thermodynamics, due to the roles played by energy in these machines’ functioning: above, I defined an automaton as a machine free of time-dependent control (exerted by a user). Equivalently, according to a thermodynamicist mentality, we can define an automaton as a machine on which no user performs any work as the machine operates. Thermodynamic work is well-ordered energy that can be harnessed directly to perform a useful task. Often, instead of receiving work, an automaton receives access to a hot environment and a cold environment. Heat flows from the hot to the cold, and the automaton transforms some of the heat into work.
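(As a reminder of the thermodynamic bookkeeping involved: an automaton that draws heat $latex Q_h$ from a hot environment at temperature $latex T_h$ and dumps its waste heat into a cold environment at temperature $latex T_c$ can extract at most $latex W \leq Q_h \left(1 - T_c/T_h\right)$ of work, the Carnot bound.)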
Quantum automata appeal to me because quantum thermodynamics has few practical applications, as I complained in my previous blog post. Quantum thermodynamics has helped illuminate the nature of the universe, and I laud such foundational insights. Yet we can progress beyond laudation by trying to harness those insights in applications. Some quantum thermal machines—quantum batteries, engines, etc.—can outperform their classical counterparts, according to certain metrics. But controlling those machines, and keeping them cold enough that they behave quantum mechanically, costs substantial resources. The machines cost more than they’re worth. Quantum automata, requiring little control, offer hope for practicality.
To illustrate this hope, my group partnered with Simone Gasparinetti’s lab at Chalmers University of Technology in Sweden. The experimentalists created an autonomous quantum refrigerator from superconducting qubits. The quantum refrigerator can help reset, or “clear,” a quantum computer between calculations.
Artist’s conception of the autonomous-quantum-refrigerator chip. Credit: Chalmers University of Technology/Boid AB/NIST.
After we wrote the refrigerator paper, collaborators and I raised our heads and peered a little farther into the distance. What does building a useful autonomous quantum machine take, generally? Collaborators and I laid out guidelines in a “Key Issues Review” published in Reports on Progress in Physics last November.
We based our guidelines on DiVincenzo’s criteria for quantum computing. In 2000, David DiVincenzo published seven criteria that any platform, or setup, must meet to serve as a quantum computer. He cast five of the criteria as necessary and two criteria, related to information transmission, as optional. Similarly, our team provides ten criteria for building useful quantum automata. We regard eight of the criteria as necessary, at least typically. The final two guidelines, which are optional, govern information transmission and machine transportation.
Time-dependent external control and autonomy
DiVincenzo illustrated his criteria with multiple possible quantum-computing platforms, such as ions. Similarly, we illustrate our criteria in two ways. First, we show how different quantum automata—engines, clocks, quantum circuits, etc.—can satisfy the criteria. Second, we illustrate how quantum automata can consist of different platforms: ultracold atoms, superconducting qubits, molecules, and so on.
Nature has suggested some of these platforms. For example, our eyes contain autonomous quantum energy transducers called photoisomers, or molecular switches. Suppose that such a molecule absorbs a photon. The molecule may use the photon’s energy to switch configuration. This switching sets off chemical and neurological reactions that result in the impression of sight. So the quantum switch transduces energy from light into mechanical, chemical, and electric energy.
Photoisomer. (Image by Todd Cahill, from Quantum Steampunk.)
My favorite of our criteria ranks among the necessary conditions: every useful quantum automaton must produce output worth the input. How one quantifies a machine’s worth and cost depends on the machine and on the user. For example, an agent using a quantum engine may care about the engine’s efficiency, power, or efficiency at maximum power. Costs can include the energy required to cool the engine to the quantum regime, as well as the control required to initialize the engine. The agent also chooses which value they regard as an acceptable threshold for the output produced per unit input. I like this criterion because it applies a broom to dust that we quantum thermodynamicists often hide under a rug: quantum thermal machines’ costs. Let’s begin building quantum engines that perform more work than they require to operate.
One might object that scientists and engineers are already sweating over nonautonomous quantum machines. Companies, governments, and universities are pouring billions of dollars into quantum computing. Building a full-scale quantum computer by hook or by crook, regardless of classical control, is costing enough. Eliminating time-dependent control sounds even tougher. Why bother?
Fellow Quantum Frontiers blogger John Preskill pointed out one answer, when I described my new research program to him in 2022: control systems are classical—large and hot. Consider superconducting qubits—tiny quantum circuits—printed on a squarish chip about the size of your hand. A control wire terminates on each qubit. The rest of the wire runs off the edge of the chip, extending to classical hardware standing nearby. One can fit only so many wires on the chip, so one can fit only so many qubits. Also, the wires, being classical, are hotter than the qubits should be. The wires can help decohere the circuits, introducing errors into the quantum information they store. The more we can free the qubits from external control—the more autonomy we can grant them—the better.
Besides, quantum automata exemplify quantum steampunk, as my coauthor Pauli Erker observed. I kicked myself after he did, because I’d missed the connection. The irony was so thick, you could have cut it with the retractable steel knife attached to a swashbuckling villain’s robotic arm. Only two years before, I’d read The Watchmaker of Filigree Street, by Natasha Pulley. The novel features a Londoner expatriate from Meiji Japan, named Mori, who builds clockwork devices. The most endearing is a pet-like octopus, called Katsu, who scrambles around Mori’s workshop and hoards socks.
Does the world need a quantum version of Katsu? Not outside of quantum-steampunk fiction…yet. But a girl can dream. And quantum automata now have the opportunity to put quantum thermodynamics to work.
1And deliver pizzas. While visiting the University of Pittsburgh a few years ago, I was surprised to learn that the robots scurrying down the streets were serving hungry students.
Quantum computing finds itself in a peculiar situation. On the technological side, after billions of dollars and decades of research, working quantum computers are nearing fruition. But still, the number one question asked about quantum computers is the same as it was two decades ago: What are they good for? The honest answer reveals an elephant in the room: We don’t fully know yet. For theorists like me, this is an opportunity, a call to action.
Technological momentum
Suppose we do not have quantum computers in a few decades’ time. What will be the reason? It’s unlikely that we’ll encounter some insurmountable engineering obstacle. The theoretical basis of quantum error-correction is solid, and several platforms are approaching or below the error-correction threshold (Harvard, Yale, Google). Experimentalists believe today’s technology can scale to roughly 100 logical qubits executing on the order of a million logical gates—the megaquop era. If mankind spends $100 billion over the next few decades, it’s likely we could build a quantum computer.
A more concerning reason that quantum computing might fail is that there is not enough incentive to justify such a large investment in R&D and infrastructure. Let’s make a comparison to nuclear fusion. Like quantum hardware, fusion has challenging science and engineering problems to solve. However, if a nuclear fusion lab were to succeed in its mission of building a reactor, the application would be self-evident. This is not the case for quantum computing—it is a sledgehammer looking for nails to hit.
Nevertheless, industry investment in quantum computing is currently accelerating. To maintain the momentum, it is critical to match investment growth and hardware progress with algorithmic capabilities. The time to discover quantum algorithms is now.
Empowered theorists
Theory research is forward-looking and predictive. Theorists such as Geoffrey Hinton laid the foundations of the current AI revolution. But decades later, with an abundance of computing hardware, AI has become much more of an empirical field. I look forward to the day that quantum hardware reaches a state of abundance, but that day is not yet here.
Today, quantum computing is an area where theorists have extraordinary leverage. A few pages of mathematics by Peter Shor inspired thousands of researchers, engineers and investors to join the field. Perhaps another few pages by someone reading this blog will establish a future of world-altering impact for the industry. There are not many places where mathematics has such potential for influence. An entire community of experimentalists, engineers, and businesses are looking to the theorists for ideas.
The Challenge
Traditionally, it is thought that the ideal quantum algorithm would exhibit three features. First, it should be provably correct, giving a guarantee that executing the quantum circuit reliably will achieve the intended outcome. Second, the underlying problem should be classically hard—the output of the quantum algorithm should be computationally hard to replicate with a classical algorithm. Third, it should be useful, with the potential to solve a problem of interest in the real world. Shor’s algorithm comes close to meeting all of these criteria. However, demanding all three in an absolute fashion may be unnecessary and perhaps even counterproductive to progress.
Provable correctness is important, since today we cannot yet empirically test quantum algorithms on hardware at scale. But what degree of evidence should we require for classical hardness? Rigorous proof of classical hardness is currently unattainable without resolving major open problems like P vs NP, but there are softer forms of proof, such as reductions to well-studied classical hardness assumptions.
I argue that we should replace the ideal of provable hardness with a more pragmatic approach: The quantum algorithm should outperform the best known classical algorithm that produces the same output by a super-quadratic speedup.1 Emphasizing provable classical hardness might inadvertently impede the discovery of new quantum algorithms, since a truly novel quantum algorithm could potentially introduce a new classical hardness assumption that differs fundamentally from established ones. The back-and-forth process of proposing and breaking new assumptions is a productive direction that helps us triangulate where quantum advantage lies.
It may also be unproductive to aim directly at solving existing real-world problems with quantum algorithms. Fundamental computational tasks with quantum advantage are special and we have very few examples, yet they necessarily provide the basis for any eventual quantum application. We should search for more of these fundamental tasks and match them to applications later.
That said, it is important to distinguish between quantum algorithms that could one day provide the basis for a practically relevant computation, and those that will not. In the real world, computations are not useful unless they are verifiable or at least repeatable. For instance, consider a quantum simulation algorithm that computes a physical observable. If two different quantum computers run the simulation and get the same answer, one can be confident that this answer is correct and that it makes a robust prediction about the world. Some problems such as factoring are naturally easy to verify classically, but we can set the bar even lower: The output of a useful quantum algorithm should at least be repeatable by another quantum computer.
There is a subtle fourth requirement of paramount importance that is often overlooked, captured by the following litmus test: If given a quantum computer tomorrow, could you implement your quantum algorithm? In order to do so, you need not only a quantum algorithm but also a distribution over its inputs on which to run it. Classical hardness must then be judged in the average case over this distribution of inputs, rather than in the worst case.
I’ll end this section with a specific caution regarding quantum algorithms whose output is the expectation value of an observable. A common reason these proposals fail to be classically hard is that the expectation value exponentially concentrates over the distribution of inputs. When this happens, a trivial classical algorithm can replicate the quantum result by simply outputting the concentrated (typical) value for every input. To avoid this, we must seek ensembles of quantum circuits whose expectation values exhibit meaningful variation and sensitivity to different inputs.
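As a toy illustration of this failure mode (a stand-in ensemble, not any specific proposal from above): for Haar-random n-qubit states, the expectation value of a single-qubit Pauli Z concentrates around zero with variance shrinking like 2^-n, so the classical "algorithm" that ignores its input and always outputs 0 already matches the quantum answer to exponentially small error.

import numpy as np

rng = np.random.default_rng(1)

def haar_state(n):
    # A Haar-random pure state on n qubits: normalized complex Gaussian vector.
    v = rng.normal(size=2**n) + 1j * rng.normal(size=2**n)
    return v / np.linalg.norm(v)

def z_expectation(psi, n):
    # <Z> on the first qubit: +1 for basis states whose leading bit is 0, -1 otherwise.
    signs = 1 - 2 * ((np.arange(2**n) >> (n - 1)) & 1)
    return float(np.sum(signs * np.abs(psi) ** 2))

for n in (4, 8, 12):
    vals = np.array([z_expectation(haar_state(n), n) for _ in range(2000)])
    # The variance shrinks roughly like 2^{-n}: the constant predictor "0" is already
    # accurate to ~2^{-n/2}, so this ensemble cannot certify any quantum advantage.
    print(n, round(vals.mean(), 4), vals.var())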
We can crystallize these priorities into the following challenge:
The Challenge: Find a quantum algorithm and a distribution over its inputs with the following features:
— (Provable correctness.) The quantum algorithm is provably correct.
— (Classical hardness.) The quantum algorithm outperforms the best known classical algorithm that performs the same task by a super-quadratic speedup, in the average-case over the distribution of inputs.
— (Potential utility.) The output is verifiable, or at least repeatable.
We can categorize quantum algorithms by the form of their output. First, there are quantum algorithms for search problems, which produce a bitstring satisfying some constraints. This could be the prime factors of a number, a planted feature in some dataset, or the solution to an optimization problem. Next, there are quantum algorithms that compute a value to some precision, for example the expectation value of some physical observable. Then there are proofs of quantumness, which involve a verifier who generates a test using some hidden key, and the key can be used to verify the output. Finally, there are quantum algorithms which sample from some distribution.
Hamiltonian simulation is perhaps the most widely heralded source of quantum utility. Physics and chemistry contain many quantities that Nature computes effortlessly, yet remain beyond the reach of even our best classical simulations. Quantum computation is capable of simulating Nature directly, giving us strong reason to believe that quantum algorithms can compute classically-hard quantities.
There are already many examples where a quantum computer could help us answer an unsolved scientific question, like determining the phase diagram of the Hubbard model or the ground energy of FeMoCo. These undoubtedly have scientific value. However, they are isolated examples, whereas we would like evidence that the pool of quantum-solvable questions is inexhaustible. Can we take inspiration from strongly correlated physics to write down a concrete ensemble of Hamiltonian simulation instances where there is a classically-hard observable? This would gather evidence for the sustained, broad utility of quantum simulation, and would also help us understand where and how quantum advantage arises.
Over in the computer science community, there has been a lot of work on oracle separations such as welded trees and forrelation, which should give us confidence in the abilities of quantum computers. Can we instantiate these oracles in a way that pragmatically remains classically hard? This is necessary in order to pass our earlier litmus test of being ready to run the quantum algorithm tomorrow.
The issue with these broad frameworks is that they often do not specify a distribution over inputs. Can we find novel ensembles of inputs to these frameworks which exhibit super-quadratic speedups? BQP-completeness shows that one has translated the notion of quantum computation into a different language, which allows one to embed an existing quantum algorithm such as Shor’s algorithm into your framework. But in order to discover a new quantum algorithm, you must find an ensemble of BQP computations which does not arise from Shor’s algorithm.
Table I claims that sampling tasks alone are not useful since they are not even quantumly repeatable. One may wonder if sampling tasks could be useful in some way. After all, classical Monte Carlo sampling algorithms are widely used in practice. However, applications of sampling typically use samples to extract meaningful information or specific features of the underlying distribution. For example, Monte Carlo sampling can be used to evaluate integrals in Bayesian inference and statistical physics. In contrast, samples obtained from random quantum circuits lack any discernible features. If a collection of quantum algorithms generated samples containing meaningful signals from which one could extract classically hard-to-compute values, those algorithms would effectively transition into the compute a value category.
Table I also claims that proofs of quantumness are not useful. This is not completely true—one potential application is generating certifiable randomness. However, such applications are generally cryptographic rather than computational in nature. Specifically, proofs of quantumness cannot help us solve problems or answer questions whose solutions we do not already know.
Finally, there are several exciting directions proposing applications of quantum technologies in sensing and metrology, communication, learning with quantum memory, and streaming. These are very interesting, and I hope that mankind’s second century of quantum mechanics brings forth all flavors of capabilities. However, the technological momentum is mostly focused on building quantum computers for the purpose of computational advantage, and so this is where breakthroughs will have the greatest immediate impact.
Don’t be too afraid
At the annual QIP conference, only a handful of papers out of hundreds each year attempt to advance new quantum algorithms. Given the stakes, why is this number so low? One common explanation is that quantum algorithm research is simply too difficult. Nevertheless, we have seen substantial progress in quantum algorithms in recent years. After an underwhelming lack of end-to-end proposals with the potential for utility between the years 2000 and 2020, Table I exhibits several breakthroughs from the past 5 years.
In between blind optimism and resigned pessimism, embracing a mission-driven mindset can propel our field forward. We should allow ourselves to adopt a more exploratory, scrappier approach: We can hunt for quantum advantages in yet-unstudied problems or subtle signals in the third decimal place. The bar for meaningful progress is lower than it might seem, and even incremental advances are valuable. Don’t be too afraid!
Quadratic speedups are widespread but will not form the basis of practical quantum advantage due to the overheads associated with quantum error-correction.
The Mathematics Division at Stellenbosch University in South Africa is looking to hire a new permanent appointment at Lecturer / Senior Lecturer level (other levels may be considered too under the appropriate circumstances).
Preference will be given to candidates working in number theory or a related area, but those working in other areas of mathematics will definitely also be considered.
The closing date for applications is 30 April 2025. For more details, kindly see the official advertisement.
Consider a wonderful career in the winelands area of South Africa!
Where Does Meaning Live in a Sentence? Math Might Tell Us.
The mathematician Tai-Danae Bradley is using category theory to try to understand both human and AI-generated language.
It’s a nicely set up Q&A, with questions like “What’s something category theory lets you see that you can’t otherwise?” and “How do you use category theory to understand language?”
We’ll get back to measurement, interference and the double-slit experiment just as soon as I can get my math program to produce pictures of the relevant wave functions reliably. I owe you some further discussion of why measurement (and even interactions without measurement) can partially or completely eliminate quantum interference.
But in the meantime, I’ve gotten some questions and some criticism for arguing that superposition is an OR, not an AND. It is time to look closely at this choice, and understand both its strengths and its limitations, and how we have to move beyond it to fully appreciate quantum physics. [I probably should have written this article earlier — and I suspect I’ll need to write it again someday, as it’s a tricky subject.]
The Question of Superposition
Just to remind you of my definitions (we’ll see examples in a moment): objects that interact with one another form a system, and a system is at any time in a certain quantum state, consisting of one or more possibilities combined in some way and described by what is often called a “wave function”. If the number of possibilities described by the wave function is more than one, then physicists say that the state of the quantum system is a superposition of two or more basic states. [Caution: as we’ll explore in later posts, the number of states in the superposition can depend on one’s choice of “basis”.]
As an example, suppose we have two boxes, L and R for left and right, and two atoms, one of hydrogen H and one of nitrogen N. Our physical system consists of the two atoms, and depending on which box each atom is in, the system can exist in four obvious possibilities, shown in Fig. 1:
HL NL (i.e. both the hydrogen atom and the nitrogen atom are in the left box)
HL NR
HR NL
HR NR
Figure 1: The four intuitively obvious options for how to store the two atoms in the boxes correspond to four basic states of the quantum system.
Before quantum physics, we would have thought those were the only options; each atom must be in one box or the other. But in quantum physics there are many more non-obvious possibilities.
In particular, we could put the system in a superposition of the form HL NL + HR NR, shown in Fig. 2. In the jargon of physics, “the system is in a superposition of HL NL and HR NR“. Note the use of the word “and” here. But don’t read too much into it; jargon often involves linguistic shorthand, and can be arbitrary and imprecise. The question I’m focused on here is not “what do physicists say?”, but “what does it actually mean?”
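(With overall normalization made explicit, this particular state is the equal-weight combination $latex \frac{1}{\sqrt{2}}\left(\mathrm{HL\,NL} + \mathrm{HR\,NR}\right)$; I will mostly suppress such factors.)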
Figure 2: A quantum system can be in a superposition, such as this one represented by two basic states related by a “+” symbol. (This is not the most general case, as discussed below.)
In particular, does it mean that "HL NL AND HR NR" are true? Or does it mean "HL NL OR HR NR" is true? Or does it mean something else?
The Problems with “AND”
First, let’s see why the “AND” option has a serious problem.
In ordinary language, if I say that "A AND B are true", then I mean that one can check that A is true and also, separately, that B is true — i.e., both A and B are true. With this meaning in mind, it's clear that experiments do not encourage us to view superposition as an AND. (There are theoretical interpretations of quantum physics that do encourage the use of "AND", a point I'll return to.)
Experiment Is Skeptical
Specifically, if a system is in a quantum superposition of two states A and B, no experiment will ever show that
A is true AND
B is true.
Instead, in any experiment explicitly designed to check whether A is true and whether B is true, the result will only reveal, at best, that
A is true and B is not true OR
B is true and A is not true.
The result might also be ambiguous, neither confirming nor denying that either one is true. But no measurement will ever show that both A AND B are definitively true. The two possibilities A and B are mutually exclusive in any actual measurement that is sensitive to the question.
In our case, if we go looking for our two atoms in the state HL NL + HR NR — if we do position measurements on both of them — we will either find both of them in the left box OR both of them in the right box. 1920’s quantum physics may be weird, but it does not allow measurements of an atom to find it in two places at the same time: an atom has a position, even if it is inherently uncertain, and if I make a serious attempt to locate it, I will find only one answer (within the precision of the measurement). [Measurement itself requires a long discussion, which I won’t attempt here; but see this post and the following one.]
And so, in this case, a measurement will find that one box has two atoms and the other has zero. Yet if we use “AND” in describing the superposition, we end up saying “both atoms are in the left box AND both atoms are in the right box”, which seems to imply that both atoms are in both boxes, contrary to any experiment. Again, certain theoretical approaches might argue that they are in both boxes, but we should obviously be very cautious when experiment disagrees with theoretical reasoning.
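To see this in bare-bones math, here is a minimal Python sketch of my own (with an equal-weight superposition chosen purely for illustration), using the standard rule that each outcome's probability is the squared magnitude of its coefficient. Only "both atoms on the left" and "both atoms on the right" ever occur.

```python
import numpy as np

# Basis order for the two-atom system: |HL NL>, |HL NR>, |HR NL>, |HR NR>
labels = ["HL NL", "HL NR", "HR NL", "HR NR"]

# The superposition HL NL + HR NR, normalized (equal coefficients are an illustrative choice)
state = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)

# Probability of each position-measurement outcome is the squared magnitude of its amplitude
probs = np.abs(state) ** 2
for label, p in zip(labels, probs):
    print(f"P({label}) = {p:.2f}")
# Only "both in the left box" and "both in the right box" ever occur;
# the mixed outcomes HL NR and HR NL have probability zero.
```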
The Fortunate and/or Unfortunate Cat
The example of Schrodinger’s cat is another context in which some writers use “and” in describing what is going on.
A reminder of the cat experiment: We have an atom which may decay now or later, according to a quantum process whose timing we cannot predict. If the atom decays, it initiates a chain reaction which kills the cat. If the atom and the cat are placed inside a sealed box, isolating them completely from the rest of the universe, then the initial state, with an intact atom (Ai) and a Live cat (CL), will evolve to a state in a superposition roughly of the form AiCL+AdCD, where Ad refers to a decayed atom and CD refers to a Dead cat. (More precisely, the state will take the form c1AiCL+c2AdCD, where c1 and c2 are complex numbers with |c1|2 + |c2|2 = 1; but we can ignore these numbers for now.)
Figure 3: As in Figure 2, a superposition can even, in principle, be applied to macroscopic objects. This includes the famous Schrodinger cat state.
Leaving aside that the experiment is both unethical and impossible in practice, it raises an important point about the word "AND": the description of the experiment includes a place where we must say "AND"; there's no choice.
As we close the box to start the experiment, the atom is intact AND the cat is alive; both are simultaneously true, as measurement can verify. The state that we use to describe this, AiCL, is a mathematical product: implicitly “AiCL” means AixCL, where x is the “times” symbol.
Figure 4: It is unambiguous that the initial state of the cat-atom system is that the atom is intact AND the cat is alive: AixCL.
Later, the state to which the system evolves is a sum of two products — a superposition (AixCL) + (AdxCD) — which includes two "AND" relationships:
1) "the atom is intact AND the cat is alive" (AixCL)
2) "the atom has decayed AND the cat is dead" (AdxCD)
In each of these two possibilities, the state of the atom and the state of the cat are perfectly correlated; if you know one, you know the other. To use language consistent with English (and all other languages with which I am familiar), we must use “AND” to describe this correlation. (Note: in this particular example, correlation does in fact imply causation — but that’s not a requirement here. Correlation is enough.)
It is then often said, theoretically, that "before we open the box, the cat is both alive AND dead". But again, if we open the box to find out, experimentally, we will find out either that "the cat is alive OR the cat is dead." So we should think this through carefully.
We’ve established that “x” must mean “AND“, as in Fig. 4. So let’s try to understand the “+” that appears in the superposition (AixCL) + (AdxCD). It is certainly the case that such a state doesn’t tell us whether CL is true or CD is true, or even that it is meaningful to say that only one is true.
But suppose we decide that “+” means “AND“, also. Then we end up saying
“(the cat is alive AND the atom is intact) AND (the cat is dead AND the atom has decayed.)”
That’s very worrying. In ordinary English, if I’m referring to some possible facts A,B,C, and D, and I tell you that “(A AND B are true) AND (C AND D are true)”, the logic of the language implies that A AND B AND C AND D are all true. But that standard logic would leads to a falsehood. It is absolutely not the case, in the state (AixCL) + (AdxCD), that CL is true and Ad is true — we will never find, in any experiment, that the cat is alive and yet the atom has decayed. That could only happen if the system were in a superposition that includes the possibility AdxCL. Nor (unless we wait a few years and the cat dies of old age) can it be the case that CD is true and Ai is true.
And so, if “x” means “AND” and “+” means “AND“, it’s clear that these are two different meanings of “AND.”
“AND” and “AND”
Is that okay? Well, lots of words have multiple meanings. Still, we’re not used to the idea of “AND” being ambiguous in English. Nor are “x” and “+” usually described with the same word. So using “AND” is definitely problematic.
(That said, people who like to think in terms of parallel “universes” or “branches” in which all possibilities happen [the many-worlds interpretation] may actually prefer to have two meanings of “AND”, one for things that happen in two different branches, and one for things that happen in the same branch. But this has some additional problems too, as we’ll see later when we get to the subtleties of “OR”.)
These issues are why, in my personal view, "OR" is better when one first learns quantum physics. I think it makes it easier to explain how quantum physics is both related to standard probability and yet goes beyond it. For one thing, "or" is already ambiguous in English, so we're used to the idea that it might have multiple meanings. For another, we definitely need "+" to be conceptually different from "x", so it is confusing, pedagogically, to start right off by saying that both mathematical operators are "AND".
But “OR” is not without its problems.
The Problems with “OR”
In normal English, saying “the atom is intact and the cat is alive” OR “the atom has decayed and the cat is dead” would tell us two possible facts about the current contents of the box, one of which is definitely true.
But in quantum physics, the use of "OR" in the Schrodinger cat superposition does not tell us what is currently happening inside the box. It does tell us the state of the system at the moment, but all that does is predict the possible outcomes that would be observed if the box were opened right now (and their probabilities). That's less information than telling us the properties of what is in the closed box.
The advantage of “OR” is that it does tell us the two outcomes of opening the box, upon which we will find
“The atom is intact AND the cat is alive” OR
“The atom has decayed AND the cat is dead”
Similarly, for our box of atoms, it tells us that if we attempt to locate the atoms, we will find that
“the hydrogen atom is in the left box AND the nitrogen atom is in the left box” OR
“the hydrogen atom is in the right box AND the nitrogen atom is in the right box”
In other words, this use of AND and OR agrees with what experiments actually find. Better this than the alternative, it seems to me.
Nevertheless, just because it is better doesn’t mean it is unproblematic.
The Usual Or
The word “OR” is already ambiguous in usual English, in that it could mean
either A is true or B is true
A is true or B is true or both are true
Which of these two meanings is intended in an English sentence has to be determined by context, or explained by the speaker. Here I’m focused on the first meaning.
Returning to our first example of Figs. 1 and 2, suppose I hand the two atoms to you and ask you to put them in either box, whichever one you choose. You do so, but you don’t tell me what your choice was, and you head off on a long vacation.
While I wait for you to return, what can I say about the two atoms? Assuming you followed my instructions, I would say that
“both atoms are in the left box OR both atoms are in the right box”
In doing so, I’m using “or” in its “either…or…” sense in ordinary English. I don’t know which box you chose, but I still know (Fig. 5) that the system is either definitely in the HL NL state ORdefinitely in the HR NR state of Fig. 1. I know this without doing any measurement, and I’m only uncertain about which is which because I’m missing information that you could have provided me. The information is knowable; I just don’t have it.
Figure 5: The atoms were definitely put into one box or the other, but nobody told me which box was selected.
But this uncertainty about which box the atoms are in is completely different from the uncertainty that arises from putting the atoms in the superposition state HL NL + HR NR!
The Superposition OR
If the system is in the state HL NL + HR NR, i.e. what I've been calling "HL NL OR HR NR", it is in a state of inherent uncertainty about whether the two atoms are in the left box or in the right box. It is not that I happen not to know which box the atoms are in, but rather that this information is not knowable within the rules of quantum physics. Even if you yourself put the atoms into this superposition, you don't know which box they're in any more than I do.
The only thing we can try to do is perform an experiment and see what the answer is. The problem is that we cannot necessarily infer, if we find both atoms in the left box, that the two atoms were in that box prior to that measurement.
If we do try to make that assumption, we find ourselves in apparent contradiction with experiment. The issue is quantum interference. If we repeat the whole process, but instead of opening the boxes to see where the atoms are, we first bring the two boxes together and measure the atoms’ properties, we will observe quantum interference effects. As I have discussed in my recent series of five posts on interference (starting here), quantum interference can only occur when a system takes at least two paths to its current state; but if the two atoms were definitely in one box or definitely in the other, then there would be only one path in Fig. 6.
Figure 6: In the superposition state, the atoms cannot simply be in definite but unknown locations, as in Fig. 5. If the boxes are joined and then opened, quantum interference will occur, implying the system has evolved via two paths to a single state.
Prior to the measurement, the system had inherent uncertainty about the question, and while measurement removes the current uncertainty, it does not in general remove the past uncertainty. The act of measurement changes the state of the system — more precisely, it changes the state of the larger system that includes both atoms and the measurement device — and so establishing meaningfully that the two atoms are now in the left box is not sufficient to tell us meaningfully that the two atoms were previously and definitively in the left box.
So if this is “OR“, it is certainly not what it usually means in English!
This Superposition or That One?
And it gets worse, because we can take more complex examples. As I mentioned when discussing the poor cat, the superposition HL NL + HR NR is actually one of a large class of superpositions, of the form c1 HL NL + c2 HR NR, where c1 and c2 are complex numbers. A second simple example of such a superposition is HL NL – HR NR, with a minus sign instead of a plus sign.
So suppose I had asked you to put the two atoms in a superposition either of the form HL NL + HR NR or HL NL – HR NR, your choice; and suppose you did so without telling me which superposition you chose. What would I then know?
I would know that the system is either in the state (HL NL + HR NR) or in the state (HL NL – HR NR), depending on what you chose to do. In words, what I would know is that the system is represented by
(HL NL OR HR NR) OR (HL NL OR HR NR)
Uh oh. Now we’re as badly off as we were with “AND“.
First, the “OR” in the center is a standard English “OR” — it means that the system is definitely in one superposition or the other, but I don’t know which one — which isn’t the same thing as the “OR“s in the parentheses, which are “OR“s of superposition that only tell us what the results of measurements might be.
Second, the two “OR“s in the parentheses are different, since one means “+” and the other means “–“. In some other superposition state, the OR might mean 3/5 + i 4/5, where i is the standard imaginary number equal to the square root of -1. In English, there’s obviously no room for all this complexity. [Note that I’d have the same problem if I used “AND” for superpositions instead.]
So even if “OR” is better, it’s still not up to the task. Superposition forces us to choose whether to have multiple meanings of “AND” or multiple meanings of “OR”, including meanings that don’t hold in ordinary language. In a sense, the “+” (or “-” or whatever) in a superposition is a bit more “AND” than standard English “OR”, but it’s also a bit more “OR” than a standard English “AND”. It’s something truly new and unfamiliar.
Experts in the foundational meaning of quantum physics argue over whether to use “OR” or “AND”. It’s not an argument I want to get into. My goal here is to help you understand how quantum physics works with the minimum of interpretation and the minimum of mathematics. This requires precise language, of course. But here we find we cannot avoid a small amount of math — that of simple numbers, sometimes even complex numbers — because ordinary language simply can’t capture the logic of what quantum physics can do.
I will continue, for consistency, to use “OR” for a superposition, but going forward we must admit and recognize its limitations, and become more sophisticated about what it does and doesn’t mean. One should understand my use of “OR“, and the “pre-quantum viewpoint” that I often employ, as pedagogical methodology, not a statement about nature. Specifically, I have been trying to clarify the crucial idea of the space of possibilities, and to show examples of how quantum physics goes beyond pre-quantum physics. I find the “pre-quantum viewpoint”, where it is absolutely required that we use “OR”, helps students get the basics straight. But it is true that the pre-quantum viewpoint obscures some of the full richness and complexity of quantum phenomena, much of which arises precisely because the quantum “OR” is not the standard “OR” [and similarly if you prefer “AND” instead.] So we have to start leaving it behind.
There are many more layers of subtlety yet to be uncovered [for instance, what if my system is in a state (A OR B), but I make a measurement that can’t directly tell me whether A is true or B is true?] but this is enough for today.
I’m grateful to Jacob Barandes for a discussion about some of these issues.
Conceptual Summary
When we use “A AND B” in ordinary language, we mean “A is true and B is true”.
When we use “A OR B” in ordinary language, we find “OR” is ambiguous even in English; it may mean
“either A is true or B is true”, or
“A is true or B is true or both are true.”
In my recent posts, when I say a superposition c1 A + c2 B can be expressed as “A OR B”, I mean something that I cannot mean in English, because such a meaning would never normally occur to us:
I mean that the result of an appropriate measurement carried out at this moment will give the result A or the result B (but not both).
I do so without generally implying that the state of the system, if I don’t carry out the measurement, is definitely A or definitely B (though unknown).
Instead the system could be viewed as being in an uncanny state of being that we’re not used to, for which neither ordinary “AND” nor ordinary “OR” applies.
Note also that using either “AND” or “OR” is unable to capture the difference between superpositions that involve the same states but differ in the numbers c1, c2.
The third bullet point is open to different choices about "AND" and "OR", and open to different interpretations of what superposition states imply about the systems that are in them. There are different consistent ways to combine the language and concepts, and the particular choice I've made is pragmatic, not dogmatic. For a single set of blog posts that tell a coherent story, I have to pick a single consistent language; but it's a choice. Once one's understanding of quantum physics is strong, it's both valuable and straightforward to consider other possible choices.
With the stock market crash and the big protests across the US, I’m finally feeling a trace of optimism that Trump’s stranglehold on the nation will weaken. Just a trace.
I still need to self-medicate to keep from sinking into depression — where ‘self-medicate’, in my case, means studying fun math and physics I don’t need to know. I’ve been learning about the interactions between number theory and group theory. But I haven’t been doing enough physics! I’m better at that, and it’s more visceral: more of a bodily experience, imagining things wiggling around.
So, I’ve been belatedly trying to lessen my terrible ignorance of nuclear physics. Nuclear physics is a fascinating application of quantum theory, but it’s less practical than chemistry and less sexy than particle physics, so I somehow skipped over it.
I’m finding it worth looking at! Right away it’s getting me to think about quantum ellipsoids.
Nuclear physics forces you to imagine blobs of protons and neutrons wiggling around in a very quantum-mechanical way. Nuclei are too complicated to fully understand. We can simulate them on a computer, but simulation is not understanding, and it's also very hard: one book I'm reading points out that one computation you might want to do requires diagonalizing an enormous matrix. So I'd rather learn about the many simplified models of nuclei people have created, which offer partial understanding… and lots of beautiful math.
Protons minimize energy by forming pairs with opposite spin. Same for neutrons. Each pair acts like a particle in its own right. So nuclei act very differently depending on whether they have an even or odd number of protons, and an even or odd number of neutrons!
The ‘Interacting Boson Model’ is a simple approximate model of ‘even-even’ atomic nuclei: nuclei with an even number of protons and an even number of neutrons. It treats the nucleus as consisting of bosons, each boson being either a pair of nucleons — that is, either protons or neutrons — where the members of a pair have opposite spin but are the same in every other way. So, these bosons are a bit like the paired electrons responsible for superconductivity, called ‘Cooper pairs’.
However, in the Interacting Boson Model we assume our bosons all have either spin 0 (s-bosons) or spin 2 (d-bosons), and we ignore all properties of the bosons except their spin angular momentum. A spin-0 particle has 1 spin state, since the spin-0 representation of SU(2) is 1-dimensional. A spin-2 particle has 5, since the spin-2 representation is 5-dimensional.
If we assume the maximum amount of symmetry among all 6 states, both s-boson and d-boson states, we get a theory with U(6) symmetry! And part of why I got interested in this stuff was that it would be fun to see a rather large group like U(6) showing up as symmetries — or approximate symmetries — in real world physics.
More sophisticated models recognize that not all these states behave the same, so they assume a smaller group of symmetries.
But there are some simpler questions to start with.
How do we make a spin-0 or spin-2 particle out of two nucleons? That’s easy. Two nucleons with opposite spin have total spin 0. But if they’re orbiting each other, they have orbital angular momentum too, so the pair can act like a particle with spin 0, 1, 2, 3, etc.
Why are these bosons in the Interacting Boson Model assumed to have spin 0 or spin 2, but not spin 1 or any other spin? This is a lot harder. I assume that at some level the answer is “because this model works fairly well”. But why does it work fairly well?
By now I’ve found two answers for this, and I’ll tell you the more exciting answer, which I found in this book:
Igal Talmi, Simple Models of Complex Nuclei: the Shell Model and Interacting Boson Model, Harwood Academic Publishers, Chur, Switzerland, 1993.
In the ‘liquid drop model’ of nuclei, you think of a nucleus as a little droplet of fluid. You can think of an even-even nucleus as a roughly ellipsoidal droplet, which however can vibrate. But we need to treat it using quantum mechanics. So we need to understand quantum ellipsoids!
The space of ellipsoids in ℝ³ centered at the origin is 6-dimensional, because these ellipsoids are described by equations like

(1 + ε₁)x² + (1 + ε₂)y² + (1 + ε₃)z² + ε₄xy + ε₅xz + ε₆yz = 1

and there are 6 coefficients here. Not all nuclei are close to spherical! But perhaps it's easiest to start by thinking about ellipsoids that are close to spherical, so that the numbers ε₁, …, ε₆ are small. If our nucleus were classical, we'd want equations that describe how these numbers change with time as our little droplet oscillates.
But the nucleus is deeply quantum mechanical. So in the Interacting Boson Model, invented by Arima and Iachello, it seems we replace ε₁, …, ε₆ with operators q₁, …, q₆ on a Hilbert space, say H, and introduce corresponding momentum operators p₁, …, p₆, obeying the usual 'canonical commutation relations':

[qᵢ, pⱼ] = iħδᵢⱼ,   [qᵢ, qⱼ] = [pᵢ, pⱼ] = 0
As usual, we can take this Hilbert space to either be L²(ℝ⁶) or the 'Fock space' of ℂ⁶: the Hilbert space completion of the symmetric algebra of ℂ⁶. These are two descriptions of the same thing. The Fock space of ℂ⁶ gets an obvious representation of the unitary group U(6), since that group acts on ℂ⁶. And L²(ℝ⁶) gets an obvious representation of SO(3), since rotations act on ellipsoids and thus on the tuples (ε₁, …, ε₆) that we're using to describe ellipsoids.
The latter description lets us see where the s-bosons and d-bosons are coming from! Our representation of SO(3) on ℝ⁶ splits into two summands:
the (real) spin-0 representation, which is 1-dimensional because it takes just one number to describe the rotation-invariant aspects of the shape of an ellipsoid centered at the origin: for example, its volume. In physics jargon this number tells us the monopole moment of the mass distribution of our nucleus.
the (real) spin-2 representation, which is 5-dimensional because it takes 5 numbers to describe all other aspects of the shape of an ellipsoid centered at the origin. You need 2 numbers to say in which direction its longest axis points, 1 number to say how long that axis is, 1 number to say which direction the second-longest axis points in (it's at right angles to the longest axis), and 1 number to say how long it is. In physics jargon these 5 numbers tell us the quadrupole moment of our nucleus.
This shows us why we don't get spin-1 bosons! We'd get them if the mass distribution of our nucleus could have a nonzero dipole moment. In other words, we'd get them if we added linear terms, like b₁x + b₂y + b₃z, to our equation for the ellipsoid.
But by conservation of momentum, we can assume the center of mass of our nucleus stays at the origin, and set these linear terms to zero.
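Here is a small numerical check of that 1 + 5 split, written in Python as my own illustration: package the six quadratic coefficients into a symmetric 3×3 matrix, split off the trace, and verify that rotations leave the trace alone while carrying traceless matrices to traceless matrices.

```python
import numpy as np

# A symmetric 3x3 matrix packages the 6 coefficients of a centered quadric surface.
rng = np.random.default_rng(0)
M = rng.normal(size=(3, 3))
A = (M + M.T) / 2                                  # 6 independent entries

trace_part = np.trace(A) / 3 * np.eye(3)           # 1 number: the "monopole" piece
traceless_part = A - trace_part                    # 5 numbers: the "quadrupole" piece

# A rotation acts on the quadric's matrix by conjugation: A -> R A R^T
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
A_rot = R @ A @ R.T

# The trace (the 1-dimensional summand) is untouched by the rotation...
print(np.isclose(np.trace(A_rot), np.trace(A)))                     # True
# ...and the traceless part just gets rotated into another traceless matrix,
# so the 5-dimensional summand is preserved as a subspace:
print(np.allclose(A_rot - np.trace(A_rot) / 3 * np.eye(3),
                  R @ traceless_part @ R.T))                        # True
```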
As usual, we can take linear combinations of the operators qᵢ and pᵢ to get annihilation and creation operators for s-bosons and d-bosons. If we want, we can think of these bosons as nucleon pairs. But we don't need that microscopic interpretation if we don't want it: we can just say we're studying the quantum behavior of an oscillating ellipsoid!
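For a single mode, the relation between q, p and the ladder operators is easy to check numerically. This is a minimal sketch of my own, with the Fock space truncated at 40 quanta purely for illustration (the actual model has six such pairs).

```python
import numpy as np

N = 40  # truncate a single boson mode's Fock space at N quanta (an approximation)

# Annihilation operator in the occupation-number basis |0>, |1>, ..., |N-1>
a = np.diag(np.sqrt(np.arange(1, N)), k=1)
adag = a.conj().T

# Position-like and momentum-like operators built from the ladder operators (hbar = 1)
q = (a + adag) / np.sqrt(2)
p = (a - adag) / (1j * np.sqrt(2))

# Away from the truncation edge the canonical commutation relation [q, p] = i holds:
comm = q @ p - p @ q
print(np.allclose(comm[:N-1, :N-1], 1j * np.eye(N - 1)))   # True
```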
After we have our Hilbert space and these operators on it, we can write down a Hamiltonian for our nucleus, or various possible candidate Hamiltonians, in terms of these operators. Talmi’s book goes into a lot of detail on that. And then we can compare the oscillations these Hamiltonians predict to what we see in the lab. (Often we just see the frequencies of the standing waves, which are proportional to the eigenvalues of the Hamiltonian.)
So, from a high-level mathematical viewpoint, what we've done is try to define a manifold of ellipsoid shapes, namely ℝ⁶, and then form its cotangent bundle T*ℝ⁶, and then quantize that and start studying 'quantum ellipsoids'.
Pretty cool! And there's a lot more to say about it. But I'm wondering if there might be a better manifold of ellipsoid shapes than just ℝ⁶. After all, when the quadratic form stops being positive definite (for example, when 1 + ε₁ goes negative) things go haywire: our ellipsoid can turn into a hyperboloid! The approach I've described is probably fine 'perturbatively', i.e. when ε₁, …, ε₆ are small. But it may not be the best when our ellipsoid oscillates so much it gets far from spherical.
I think we need a real algebraic geometer here. In both senses of the word ‘real’.
The quantum double-slit experiment, in which objects are sent toward a wall with two slits and then recorded on a screen behind the wall, creates an interference pattern that builds up gradually, object by object. And yet, it’s crucial that the path of each object on its way to the screen remain unknown. If one measures which of the slits each object passes through, the interference pattern never appears.
Strange things are said about this. There are vague, weird slogans: “measurement causes the wave function to collapse“; “the particle interferes with itself“; “electrons are both particles and waves“; etc. One reads that the objects are particles when they reach the screen, but they are waves when they go through the slits, causing the interference — unless their passage through the slits is measured, in which case they remain particles.
But in fact the equations of 1920s quantum physics say something different and not vague in the slightest — though perhaps equally weird. As we'll see today, the elimination of interference by measurement is no mystery at all, once you understand both measurement and interference. Those of you who've followed my recent posts on these two topics will find this surprisingly straightforward; I guarantee you'll say, "Oh, is that all?" Other readers will probably want to read those earlier posts first.
When do we expect quantum interference? As I’ll review in a moment, there’s a simple criterion:
a system of objects (not the objects themselves!) will exhibit quantum interference if the system, initially in a superposition of possibilities, reaches a single possibility via two or more pathways.
To remind you what that means, let's compare two contrasting cases (covered carefully in this post). Figs. 1a and 1b show pre-quantum animations of different quantum systems, in which two balls (drawn blue and orange) are in a superposition of moving left OR moving right. I've chosen to stop each animation right at the moment when the blue ball in the top half of the superposition is at the same location as the blue ball in the bottom half, because if the orange ball weren't there, this is when we'd expect to see quantum interference.
But for interference to occur, the orange ball, too, must at that same moment be in the same place in both parts of the superposition. That does happen for the system in Fig. 1a — the top and bottom parts of the figure line up exactly, and so interference will occur. But the system in Fig. 1b, whose top and bottom parts never look the same, will not show quantum interference.
Fig. 1a: A system of two balls in a superposition, from a pre-quantum viewpoint. As the system evolves, a moment is reached when the two parts of the superposition are identical. As the system has then reached a single possibility via two routes, quantum interference may result.
Figure 1b: Similar to Fig. 1a, except that when the blue ball is at the same location in both parts of the superposition, the orange ball is at two different locations. At no moment are the two possibilities in the superposition the same, so quantum interference cannot occur.
In other words, quantum interference requires that the two possibilities in the superposition become identical at some moment in time. Partial resemblance is not enough.
The Measurement
A measurement always involves an interaction of some sort between the object we want to measure and the device doing the measurement. We will typically
1) let the measurement device interact with the object, so that the device's behavior becomes correlated with the object's behavior, and then
2) amplify and record the result, making the measurement effectively permanent.
For today’s purposes, the details of the second step won’t matter, so I’ll focus on the first step.
Setting Up
We’ll call the object going through the slits a “particle”, and we’ll call the measurement device a “measuring ball” (or just “ball” for short.) The setup is depicted in Fig. 2, where the particle is approaching the slits and the measuring ball lies in wait.
Figure 2: A particle (blue) approaches a wall with two slits, behind which sits a screen where the particle’s arrival will be detected. Also present is a lightweight measuring ball (black), ready to fly in and measure the particle’s position by colliding with it as it passes through the wall.
If No Measurement is Made at the Slits
Suppose we allow the particle to proceed and we make no measurement of its location as it passes through the slits. Then we can leave the ball where it is, at the position I’ve marked M in Fig. 3. If the particle makes it through the wall, it must pass through one slit or the other, leaving the system in a superposition of the form
the particle is near the left slit [and the ball is at position M] OR
the particle is near the right slit [and the ball is at position M]
as shown at the top of Fig. 3. (Note: because the ball and particle are independent [unentangled] in this superposition, it can be written in factored form as in Fig. 12 of this post.)
From here, the particle (whose motion is now quite uncertain as a result of passing through a narrow slit) can proceed unencumbered to the screen. Let’s say it arrives at the point marked P, as at the bottom of Fig. 3.
Figure 3: (Top) As the particle passes through the slits, the system is set into a superposition of two possibilities in which the particle passes through the left slit OR the right slit. (The particle's future motion is quite uncertain, as indicated by the green arrows.) In both possibilities, the measuring ball is at point M. (Bottom) If the particle arrives at point P on the screen, then the two possibilities in the superposition become identical, as in Fig. 1a, so quantum interference can result. This will be true no matter what point P we choose, and so an interference pattern will be seen across the whole screen.
Crucially, both halves of the superposition now describe the same situation: particle at P, ball at M. The system has arrived here via two paths:
The particle went through the left slit and arrived at the point P (with the ball always at M), OR
The particle went through the right slit and arrived at the point P (with the ball always at M).
Therefore, since the system has reached a single possibility via two different routes, quantum interference may be observed.
But now let’s make the measurement. We’ll do it by throwing the ball rapidly toward the particle, timed carefully so that, as shown in Fig. 4, either
the particle is at the left slit, in which case the ball passes behind it and travels onward, OR
the particle is at the right slit, in which case the ball hits it and bounces back.
(Recall that I assumed the measuring ball is lightweight, so the collision doesn't much affect the particle; for instance, the particle might be a heavy atom, while the measuring ball is a light atom.)
Figure 4: As the particle moves through the wall, the ball is sent rapidly in motion. If the particle passes through the right slit, the ball will hit it and bounce back; if the particle passes through the left slit, the ball will miss it and will continue to the left.
The ball’s late-time behavior reveals — and thus measures — the particle’s behavior as it passed through the wall:
the ball moving to the left means the particle went through the left slit;
the ball moving to the right means the particle went through the right slit.
To make this measurement complete and permanent requires a longer story with more details; for instance, we might choose to amplify the result with a Geiger counter. But the details don’t matter, and besides, that takes place later. Let’s keep our focus on what happens next.
The Effect of the Measurement
What happens next is that the particle reaches the point P on the screen. It can do this whether it traveled via the left slit or via the right slit, just as before, and so you might think there should still be an interference pattern. However, remembering Figs. 1a and 1b and the criterion for interference, take a look at Fig. 5.
Figure 5: Following the measurement made in Fig. 4, the arrival of the particle at the point P on the screen finds the ball in two possible locations, depending on which slit the particle went through. In contrast to Fig. 3, the two parts of the superposition are not identical, and so (as in Fig. 1b) no quantum interference pattern will be observed.
Even though the particle by itself could have taken two paths to the point P, the system as a whole is still in a superposition of two different possibilities, not one — more like Fig. 1b than like Fig. 1a. Specifically,
the particle is at position P and the ball is at location ML (which happens if, in Fig. 4, the particle was near the left slit and the ball continued to the left); OR
the particle is at position P and the ball is at location MR (which happens if, in Fig. 4, the particle was near the right slit and the ball bounced back to the right).
The measurement process — by the very definition of “measurement” as a procedure that segregates left-slit cases from right-slit cases — has resulted in the two parts of the superposition being different even when they both have the particle reaching the same point P. Therefore, in contrast to Fig. 3, quantum interference between the two parts of the superposition cannot occur.
And that’s it. That’s all there is to it.
Looking Ahead.
The double-slit experiment is hard to understand if one relies on vague slogans. But if one relies on the math, one sees that many of the seemingly mysterious features of the experiment are in fact straightforward.
I’ll say more about this in future posts. In particular, to convince you today’s argument is really correct, I’ll look more closely at the quantum wave function corresponding to Figs. 3-5, and will reproduce the same phenomenon in simpler examples. Then we’ll apply the resulting insights to other cases, including
measurements that do not destroy interference,
measurements that only partly destroy interference,
destruction of interference without measurement.
Now, finally, we come to the heart of the matter of quantum interference, as seen from the perspective of 1920s quantum physics. (We'll deal with quantum field theory later this year.)
Last time I looked at some cases of two particle states in which the particles’ behavior is independent — uncorrelated. In the jargon, the particles are said to be “unentangled”. In this situation, and only in this situation, the wave function of the two particles can be written as a product of two wave functions, one per particle. As a result, any quantum interference can be ascribed to one particle or the other, and is visible in measurements of either one particle or the other. (More precisely, it is observable in repeated experiments, in which we do the same measurement over and over.)
In this situation, because each particle’s position can be studied independent of the other’s, we can be led to think any interference associated with particle 1 happens near where particle 1 is located, and similarly for interference involving the second particle.
But this line of reasoning only works when the two particles are uncorrelated. Once this isn’t true — once the particles are entangled — it can easily break down. We saw indications of this in an example that appeared at the ends of my last two posts (here and here), which I’m about to review. The question for today is: what happens to interference in such a case?
Correlation: When “Where” Breaks Down
Let me now review the example of my recent posts. The pre-quantum system looks like this
Figure 1: An example of a superposition, in a pre-quantum view, where the two particles are correlated and where interference will occur that involves both particles together.
Notice the particles are correlated; either both particles are moving to the left OR both particles are moving to the right. (The two particles are said to be "entangled", because the behavior of one depends upon the behavior of the other.) As a result, the wave function cannot be factored (in contrast to most examples in my last post) and we cannot understand the behavior of particle 1 without simultaneously considering the behavior of particle 2. Compare this to Fig. 2, an example from my last post in which the particles are independent; the behavior of particle 2 is the same in both parts of the superposition, independent of what particle 1 is doing.
Figure 2: Unlike Fig. 1, here the two particles are uncorrelated; the behavior of particle 2 is the same whether particle 1 is moving left OR right. As a result, interference can occur for particle 1 separately from any behavior of particle 2, as shown in this post.
Let’s return now to Fig. 1. The wave function for the corresponding quantum system, shown as a graph of its absolute value squared on the space of possibilities, behaves as in Fig. 3.
Figure 3: The absolute-value-squared of the wave function for the system in Fig. 1, showing interference as the peaks cross. Note the interference fringes are diagonal relative to the x1 and x2 axes.
But as shown last time in Fig. 19, at the moment where the interference in Fig. 3 is at its largest, if we measure particle 1 we see no interference effect. More precisely, if we do the experiment many times and measure particle 1 each time, as depicted in Fig. 4, we see no interference pattern.
Figure 4: The result of repeated experiments in which we measure particle 1, at the moment of maximal interference, in the system of Fig. 3. Each new experiment is shown as an orange dot; results of past experiments are shown in blue. No interference effect is seen.
We see something analogous if we measure particle 2.
Yet the interference is plain as day in Fig. 3. It’s obvious when we look at the full two-dimensional space of possibilities, even though it is invisible in Fig. 4 for particle 1 and in the analogous experiment for particle 2. So what measurements, if any, can we make that can reveal it?
The clue comes from the fact that the interference fringes lie at a 45 degree angle, perpendicular neither to the x1 axis nor to the x2 axis but instead to the axis for the variable 1/2(x1 + x2), the average of the positions of particles 1 and 2. It's that average position that we need to measure if we are to observe the interference.
But doing so requires that we measure both particles' positions. We have to measure them both every time we repeat the experiment. Only then can we start making a plot of the average of their positions.
When we do this, we will find what is shown in Fig. 5.
The top row shows measurements of particle 1.
The bottom row shows measurements of particle 2.
And the middle row shows a quantity that we infer from these measurements: their average.
For each measurement, I’ve drawn a straight orange line between the measurement of x1 and the measurement of x2; the center of this line lies at the average position 1/2(x1+x2). The actual averages are then recorded in a different color, to remind you that we don’t measure them directly; we infer them from the actual measurements of the two particles’ positions.
Figure 5: As in Fig. 4, the result of repeated experiments in which we measure both particles' positions at the moment of maximal interference in Fig. 3. Top and bottom rows show the position measurements of particles 1 and 2; the middle row shows their average. Each new experiment is shown as two orange dots connected by an orange line, at whose midpoint a new yellow dot is placed. Results of past experiments are shown in blue. No interference effect is seen in the individual particle positions, yet one appears in their average.
In short, the interference is not associated with either particle separately — none is seen in either the top or bottom rows. Instead, it is found within the correlation between the two particles’ positions. This is something that neither particle can tell us on its own.
And where is the interference? It certainly lies near 1/2(x1+x2)=0. But this should worry you. Is that really a point in physical space?
You could imagine a more extreme example of this experiment in which Fig. 5 shows particle 1 located in Boston and particle 2 located in New York City. This would put their average position within appropriately-named Middletown, Connecticut. (I kid you not; check for yourself.) Would we really want to say that the interference itself is located in Middletown, even though it’s a quiet bystander, unaware of the existence of two correlated particles that lie in opposite directions 90 miles (150 km) away?
After all, the interference appears in the relationship between the particles’ positions in physical space, not in the positions themselves. Its location in the space of possibilities (Fig. 3) is clear. Its location in physical space (Fig. 5) is anything but.
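For readers who want to check this numerically, here is a sketch of my own (with illustrative Gaussian wave packets, not the exact functions behind Figs. 3-5): the interference term changes the joint distribution of x1 and x2 at order one, yet leaves each particle's individual position distribution essentially untouched.

```python
import numpy as np

x = np.linspace(-10, 10, 400)
x1, x2 = np.meshgrid(x, x, indexing="ij")

# Toy version of the state of Fig. 3 at the moment of maximal interference:
# "both moving left" and "both moving right" possibilities, momentarily coinciding.
k = 3.0
envelope = np.exp(-(x1**2 + x2**2) / 8)
psi_A = envelope * np.exp(+1j * k * (x1 + x2))
psi_B = envelope * np.exp(-1j * k * (x1 + x2))

prob_coherent   = np.abs(psi_A + psi_B) ** 2                 # with the cross term
prob_incoherent = np.abs(psi_A) ** 2 + np.abs(psi_B) ** 2    # cross term dropped

# What repeated measurements of particle 1 alone would build up:
marg1_coh = prob_coherent.sum(axis=1)
marg1_inc = prob_incoherent.sum(axis=1)

print(round(np.max(np.abs(marg1_coh - marg1_inc)) / marg1_inc.max(), 4))                  # ~0
print(round(np.max(np.abs(prob_coherent - prob_incoherent)) / prob_incoherent.max(), 4))  # ~1
# The interference term is invisible in either particle's position distribution,
# but it is an order-one effect in the joint distribution, where the fringes run
# along lines of constant x1 + x2, i.e., in the particles' average position.
```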
Still, I can imagine you pondering whether it might somehow make sense to assign the interference to poor, unsuspecting Middletown. For that reason, I’m going to make things even worse, and take Middletown out of the middle.
A Second System with No Where
Here’s another system with interference, whose pre-quantum version is shown in Figs. 6a and 6b:
Figure 6a: Another system in a superposition with entangled particles, shown in its pre-quantum version in physical space. In part A of the superposition both particles are stationary, while in part B they move oppositely.
Figure 6b: The same system as in Fig. 6a, depicted in the space of possibilities with its two initial possibilities labeled as stars. Possibility A remains where it is, while possibility B moves toward and intersects with possibility A, leading us to expect interference in the quantum wave function.
The corresponding wave function is shown in Fig. 7. Now the interference fringes are oriented diagonally the other way compared to Fig. 3. How are we to measure them this time?
Figure 7: The absolute-value-squared of the wave function for the system shown in Fig. 6. The interference fringes lie on the opposite diagonal from those of Fig. 3.
The average position 1/2(x1+x2) won’t do; we’ll see nothing interesting there. Instead the fringes are near (x1-x2)=4 — that is, they occur when the particles, no matter where they are in physical space, are at a distance of four units. We therefore expect interference near 1/2(x1-x2)=2. Is it there?
In Fig. 8 I’ve shown the analogue of Figs. 4 and 5, depicting
the measurements of the two particle positions x1 and x2, along with
their average 1/2(x1+x2) plotted between them (in yellow)
(half) their difference 1/2(x1-x2) plotted below them (in green).
That quantity 1/2(x1-x2) is half the horizontal length of the orange line. Hidden in its behavior over many measurements is an interference pattern, seen in the bottom row, where the 1/2(x1-x2) measurements are plotted. [Note also that there is no interference pattern in the measurements of 1/2(x1+x2), in contrast to Fig. 5.]
Figure 8: For the system of Figs. 6-7, repeated experiments in which the measurement of the position of particle 1 is plotted in the top row (upper blue points), that of particle 2 is plotted in the third row (lower blue points), their average is plotted between (yellow points), and half their difference is plotted below them (green points.) Each new set of measurements is shown as orange points connected by an orange line, as in Fig. 5. An interference pattern is seen only in the difference.
Now the question of the hour: where is the interference in this case? It is found near 1/2(x1-x2)=2 — but that certainly is not to be identified with a legitimate position in physical space, such as the point x=2.
First of all, making such an identification in Fig. 8 would be like saying that one particle is in New York and the other is in Boston, while the interference is 150 kilometers offshore in the ocean. But second and much worse, I could change Fig. 8 by moving both particles 10 units to the left and repeating the experiment. This would cause x1, x2, and 1/2(x1+x2) in Fig. 8 to all shift left by 10 units, moving them off your computer screen, while leaving 1/2(x1-x2) unchanged at 2. In short, all the orange and blue and yellow points would move out of your view, while the green points would remain exactly where they are. The difference of positions — a distance — is not a position.
If 10 units isn’t enough to convince you, let’s move the two particles to the other side of the Sun, or to the other side of the galaxy. The interference pattern stubbornly remains at 1/2(x1-x2)=2. The interference pattern is in a difference of positions, so it doesn’t care whether the two particles are in France, Antarctica, or Mars.
We can move the particles anywhere in the universe, as long as we move them together so that the distance between them stays the same, and the interference pattern remains exactly the same. So there's no way we can identify the interference as being located at a particular value of x, the coordinate of physical space. Trying to do so creates nonsense.
This is totally unlike interference in water waves and sound waves. That kind of interference happens in a definite place; we can say where the waves are, how big they are at a particular location, and where their peaks and valleys are in physical space. Quantum interference is not at all like this. It's something more general, more subtle, and more troubling to our intuition.
[By the way, there’s nothing special about the two combinations 1/2(x1+x2) and 1/2(x1-x2), the average or the difference. It’s easy to find systems where the intereference arises in the combination x1+2x2, or 3x1-x2, or any other one you like. In none of these is there a natural way to say “where” the interference is located.]
The Profound Lesson
From these examples, we can begin to learn a central lesson of modern physics, one that a century of experimental and theoretical physics has been teaching us repeatedly, with ever greater subtlety. Imagining reality, as many of us are inclined to do, as made of localized objects positioned in and moving through physical space — the one-dimensional x-axis in my simple examples, and the three-dimensional physical space that we take for granted when we specify our latitude, longitude and altitude — is simply not going to work in a quantum universe. The correlations among objects have observable consequences, and those correlations cannot simply be ascribed locations in physical space. To make sense of them, it seems we need to expand our conception of reality.
In the process of recognizing this challenge, we have had to confront the giant, unwieldy space of possibilities, which we can only visualize for a single particle moving in up to three dimensions, or for two or three particles moving in just one dimension. In realistic circumstances, especially those of quantum field theory, the space of possibilities has a huge number of dimensions, rendering it horrendously unimaginable. Whether this gargantuan space should be understood as real — perhaps even more real than physical space — continues to be debated.
Indeed, the lessons of quantum interference are ones that physicists and philosophers have been coping with for a hundred years, and their efforts to make sense of them continue to this day. I hope this series of posts has helped you understand these issues, and to appreciate their depth and difficulty.
Looking ahead, we’ll soon take these lessons, and other lessons from recent posts, back to the double-slit experiment. With fresher, better-informed eyes, we’ll examine its puzzles again.
This is a bit of a shaggy dog story, but I think it’s fun. There’s also a moral about the nature of mathematical research.
Once I was interested in the McGee graph, nicely animated here by Mamouka Jibladze:
This is the unique (3,7)-cage, meaning the smallest graph in which each vertex has 3 neighbors and the shortest cycle has length 7. Since it has a very symmetrical appearance, I hoped it would be connected to some interesting algebraic structures. But which?
I read on Wikipedia that the symmetry group of the McGee graph has order 32. Let’s call it the McGee group. Unfortunately there are many different 32-element groups — 51 of them, in fact! — and the article didn’t say which one this was. (It does now.)
Gordon Royle said the McGee group is "not a super-interesting group, it is SmallGroup(32,43) in either GAP or Magma". Knowing this let me look up the McGee group on this website, which is wonderfully useful if you're studying finite groups.
There I learned that the McGee group is the so-called holomorph of the cyclic group ℤ/8: that is, the semidirect product of ℤ/8 and its automorphism group:

ℤ/8 ⋊ Aut(ℤ/8)
I resisted getting sucked into the general study of holomorphs, or what happens when you iterate the holomorph construction. Instead, I wanted a more concrete description of the McGee group.
ℤ/8 is not just an abelian group: it's a ring! Since multiplication in a ring distributes over addition, we can get automorphisms of the group ℤ/8 by multiplying by those elements that have multiplicative inverses. These invertible elements form a group

(ℤ/8)^×

called the multiplicative group of ℤ/8. In fact these give all the automorphisms of the group ℤ/8.
In short, the McGee group is

ℤ/8 ⋊ (ℤ/8)^×

This is very nice, because this is the group of all transformations of ℤ/8 of the form

x ↦ ax + b,   a ∈ (ℤ/8)^×, b ∈ ℤ/8.

If we think of ℤ/8 as a kind of line — called the 'affine line over ℤ/8' — these are precisely all the affine transformations of this line. Thus, the McGee group deserves to be called

Aff(ℤ/8).
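As a quick concrete check (my own sketch, not part of the original discussion), one can build this group directly in Python as the set of maps x ↦ ax + b and confirm that it has 32 elements and is closed under composition:

```python
from itertools import product
from math import gcd

MOD = 8
UNITS = [a for a in range(MOD) if gcd(a, MOD) == 1]    # (Z/8)^x = {1, 3, 5, 7}

# Elements of Aff(Z/8): pairs (a, b) representing the map x -> a*x + b (mod 8)
elements = list(product(UNITS, range(MOD)))
print(len(elements))   # 32, the order of the McGee group

def compose(g1, g2):
    """Composite map x -> g1(g2(x)): (a1, b1) * (a2, b2) = (a1*a2, a1*b2 + b1) mod 8."""
    (a1, b1), (a2, b2) = g1, g2
    return ((a1 * a2) % MOD, (a1 * b2 + b1) % MOD)

# Closure under composition, so these 32 maps really do form a group.
element_set = set(elements)
assert all(compose(g, h) in element_set for g in elements for h in elements)
```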
This suggests that we can build the McGee graph in some systematic way starting from the affine line over ℤ/8. This turns out to be a bit complicated, because the vertices come in two kinds. That is, the McGee group doesn't act transitively on the set of vertices. Instead, it has two orbits, shown as red and blue dots here:
The 8 red vertices correspond straightforwardly to the 8 points of the affine line, but the 16 blue vertices are more tricky. There are also the edges to consider: these come in three kinds! Greg Egan figured out how this works, and I wrote it up:
About two weeks ago, I gave a Zoom talk at the Illustrating Math Seminar about some topics on my blog Visual Insight. I mentioned that the McGee group is SmallGroup(32,43) and the holomorph of ℤ/8. And then someone — alas, I forget who — instantly typed in the chat that this is one of the two smallest groups with an amazing property! Namely, this group has an outer automorphism that maps each element to an element conjugate to it.
I didn’t doubt this for a second. To paraphrase what Hardy said when he received Ramanujan’s first letter, nobody would have the balls to make up this shit. So, I posed a challenge to find such an exotic outer automorphism:
An automorphism φ: G → G is class-preserving if for each g ∈ G there exists some h ∈ G such that

φ(g) = hgh⁻¹
If you can use the same h for every g, we call φ an inner automorphism. But some groups have class-preserving automorphisms that are not inner! These are the class-preserving outer automorphisms.
I don’t know if class-preserving outer automorphisms are good for anything, or important in any way. They mainly just seem intriguingly spooky. An outer automorphism that looks inner if you examine its effect on any one group element is nothing I’d ever considered. So I wanted to see an example.
Rising to my challenge, Greg Egan found a nice explicit formula for some class-preserving outer automorphisms of the McGee group.
As we’ve seen, any element of the McGee group is a transformation
so let’s write it as a pair . Greg Egan looked for automorphisms of the McGee group that are of the form
for some function
It is easy to check that φ is an automorphism if and only if

f(a₁a₂) = f(a₁) + a₁f(a₂)

for all a₁, a₂ ∈ (ℤ/8)^×.
Moreover, φ is an inner automorphism if and only if

f(a) = (a - 1)c

for some c ∈ ℤ/8.
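These conditions are easy to check by brute force. Here is a sketch of my own (not Egan's actual computation) that enumerates all candidate functions f, keeps the cocycles, discards the coboundaries, and tests which of the survivors give class-preserving automorphisms, using the pair notation (a, b) and the conventions above:

```python
from itertools import product

MOD = 8
UNITS = [1, 3, 5, 7]                                # (Z/8)^x
G = [(a, b) for a in UNITS for b in range(MOD)]     # McGee group: x -> a*x + b mod 8

def mul(g1, g2):
    (a1, b1), (a2, b2) = g1, g2
    return ((a1 * a2) % MOD, (a1 * b2 + b1) % MOD)

def inv(g):
    a, b = g
    ainv = pow(a, -1, MOD)
    return (ainv, (-ainv * b) % MOD)

# Conjugacy class of each group element, computed by brute force
classes = {g: {mul(mul(h, g), inv(h)) for h in G} for g in G}

found = []
for values in product(range(MOD), repeat=len(UNITS)):
    f = dict(zip(UNITS, values))                    # a candidate map f: (Z/8)^x -> Z/8
    cocycle = all(f[(a1 * a2) % MOD] == (f[a1] + a1 * f[a2]) % MOD
                  for a1 in UNITS for a2 in UNITS)
    coboundary = any(all(f[a] == ((a - 1) * c) % MOD for a in UNITS)
                     for c in range(MOD))
    if not cocycle or coboundary:
        continue
    # phi_f(a, b) = (a, b + f(a)) is class-preserving when it sends each element
    # to something inside that element's own conjugacy class.
    if all((a, (b + f[a]) % MOD) in classes[(a, b)] for (a, b) in G):
        found.append(f)

print(len(found), "maps f giving class-preserving outer automorphisms of this form")
```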
Now comes something cool noticed by Joshua Grochow: these formulas are an instance of a general fact about group cohomology!
Suppose we have a group G acting as automorphisms of an abelian group A. Then we can define the cohomology Hⁿ(G, A) to be the group of n-cocycles modulo n-coboundaries. We only need the case n = 1 here. A 1-cocycle is none other than a function f: G → A obeying

f(g₁g₂) = f(g₁) + g₁f(g₂)
while a 1-coboundary is one of the form

f(g) = gc - c

for some c ∈ A. You can check that every 1-coboundary is a 1-cocycle. H¹(G, A) is the group of 1-cocycles modulo 1-coboundaries.
In this situation we can define the semidirect product A ⋊ G, whose elements we write as pairs (a, b) with a ∈ G and b ∈ A, just as above. For any function f: G → A we can define a map

φ_f: A ⋊ G → A ⋊ G

by

φ_f(a, b) = (a, b + f(a)).
Now suppose f: G → A is any function, and suppose A is abelian. Then by straightforward calculations we can check:
φ_f is an automorphism iff f is a 1-cocycle
and
φ_f is an inner automorphism iff f is a 1-coboundary!
Thus, A ⋊ G will have outer automorphisms if H¹(G, A) ≠ 0.
When G = (ℤ/8)^× and A = ℤ/8, then A is abelian and A ⋊ G is the McGee group. This puts Egan's idea into a nice context. But we still need to actually find maps f that give outer automorphisms of the McGee group, and then find class-preserving ones. I don't know how to do that using general ideas from cohomology. Maybe someone smart could do the first part, but the 'class-preserving' condition doesn't seem to emerge naturally from cohomology.
Anyway, Egan didn’t waste his time with such effete generalities: he actually found all choices of for which
is a class-preserving outer automorphism of the McGee group. Namely:
Last Saturday after visiting my aunt in Santa Barbara I went to Berkeley to visit the applied category theorists at the Topos Institute. I took a train, to lessen my carbon footprint a bit. The trip took 9 hours — a long time, but a beautiful ride along the coast and then through forests and fields.
The day before taking the train, I discovered my laptop was no longer charging! So, I bought a pad of paper. And then, while riding the train, I checked by hand that Egan's first choice of f really is a cocycle, and really is not a coboundary, so that it defines an outer automorphism of the McGee group. Then — and this was fairly easy — I checked that it defines a class-preserving automorphism. It was quite enjoyable, since I hadn't done any long calculations recently.
One moral here is that interesting ideas often arise from the interactions of many people. The results here are not profound, but they are certainly interesting, and they came from online conversations with Greg Egan, Gordon Royle, Joshua Grochow, the mysterious person who instantly knew that the McGee group was one of the two smallest groups with a class-preserving outer automorphism, and others.
But what does it all mean, mathematically? Is there something deeper going on here, or is it all just a pile of curiosities?
What did we actually do, in the end? Following the order of logic rather than history, maybe this. We started with a commutative ring R, took its group of affine transformations Aff(R) = R ⋊ R^×, and saw this group must have outer automorphisms if

H¹(R^×, R) ≠ 0.
We saw this cohomology group really is nonvanishing when R = ℤ/8. Furthermore, we found a class-preserving outer automorphism of Aff(ℤ/8).
This raises a few questions:
What is the cohomology H¹(R^×, R) in general?
What are the outer automorphisms of Aff(R)?
When does Aff(R) have class-preserving outer automorphisms?
This guide is intended for the Chapman undergraduate students who are attending this year’s APS Global Summit. It may be useful for others as well.
The APS Global Summit is a ginormous event, featuring dozens of parallel sessions at any given time. It can be exciting for first-time attendees, but also overwhelming. Here, I compile some advice on how to navigate the meeting and some suggestions for sessions and events you might like to attend.
General Advice
Use the online schedule and the mobile app to help you navigate the meeting. If you create a login, the online schedule allows you to add things to your personalized schedule, which you can view on the app at the meeting. This is a very useful thing to do because making decisions of where to go on the fly is difficult.
Do not overschedule yourself. I know it is tempting to figure out how to go to as many things as you can, and run between sessions on opposite sides of the convention center. This will be harder to accomplish than you imagine. The meeting gets very crowded and it is exhausting to sit through a full three-hour session of talks. Schedule some break time and, where possible, schedule blocks of time in one location rather than running all over the place.
You will have noticed that most talks at the meeting are 12 minutes long (10 minutes for the talk plus 2 minutes for questions). These are called contributed talks. Since they are so short, they are more like adverts for the work than a detailed explanation. They are usually aimed at experts and, quite frankly, many speakers do not know how to give these talks well. It is not worth attending these talks unless one of the following applies:
You are already an expert in that research area.
You are strongly considering doing research in that area.
You are there to support your friends and colleagues who are speaking in that session.
You are so curious about the research area that you are prepared to sit through a lot of opaque talks to get some idea of what is going on in the area.
The session is on a topic that is unusually accessible or the session is aimed at undergraduate students.
Instead, you should prioritize attending the following kinds of talks, which you can search for using the filters on the schedule:
Plenary talks: These are aimed at a general physics audience and are usually by famous speakers (famous by physics standards anyway). Some of these might also be…
Invited Sessions: These sessions consist of 30min talks by invited speakers in a common research area. There is no guarantee that they will be accessible to novices, but it is much more likely than with the contributed talks. Go to any invited sessions on areas of physics you are curious about.
Focus Sessions: Focus sessions consist mainly of contributed talks, but they also have one or two 30min invited talks. It is not considered rude to switch sessions between talks, so do not be afraid to just attend the invited talks. They are not always scheduled at the beginning of the session. In fact, some groups deliberately stagger the times of the invited talks so that people can see the invited talks in more than one focus session.
There are sessions that list “Undergraduate Students” as part of their target audience. A lot of these are “Undergraduate Research” sessions. It can be interesting to go to one or two of these to see the variety of undergraduate research experiences that are on offer. However, I would not advise only going to sessions on this list. For one thing, undergraduate research projects are not banned from the other sessions, so many of the best undergraduate projects will not be in those sessions. Going to sessions by topic is a better bet most of the time.
It is helpful to filter the sessions on the schedule by the organizing Unit (Division, Topical Group, or Forum). You can find a list of APS units here. For example, if you are particularly interested in Quantum Information and Computation then you will want to look at the sessions organized by DQI (Division of Quantum Information). Sessions organized by Forums are often particularly accessible, as they tend to be about less technical issues (DEI, Education, History and Philosophy, etc.)
The next sections contain some more specific suggestions about events, talks and sessions that you might like to attend.
Orientation and Networking Events
I have never been to an orientation or networking event at the APS meeting, but then again I did not go to the APS meeting as a student. Networking is one of the best things you can do at the meeting, so do take any opportunities to meet and talk to people.
The student lunch with the Experts is especially worth it because you get a one-on-eight meeting with a physicist who works on a topic you are interested in. You also get a free lunch. Spaces are limited, so you need to sign up for it on the Sunday, and early if you want to get your choice of expert.
Generally speaking, food is very expensive in the convention center. Therefore, the more places you can get free food the better. There are networking events, some of which are aimed at students and some of which have free meals. Other good bets for free food include the receptions and business meetings. (With a business meeting you may have to first sit through a boring administrative meeting for an APS unit, but at least the DQI meeting will feature me talking about The Quantum Times.)
Sessions Chaired by Chapman Faculty
The next few sections highlight talks and sessions that involve people at Chapman. You may want to come to these not only to support local people, but also to find out more about areas of research that you might want to do undergraduate research projects in.
The following sessions are being chaired by Chapman faculty. The chair does not give a talk during the session, but acts as a host. But chairs usually work in the areas that the session is about, so it is a good way to get more of an overview of things they are interested in.
Talks involving Chapman Faculty, Postdocs and Students
The talks listed below all have someone who is currently affiliated with Chapman as one or more of the authors. The Chapman person is not necessarily the person giving the talk.
The people giving the talks, especially if they are students or postdocs, would appreciate your support. It is also a good way of finding out more about research that is going on at Chapman.
Posters involving Chapman Faculty, Postdocs and Students
Poster sessions last longer than talks, so you can view the posters at your leisure. The presenter is supposed to stand by their poster and talk to people who come to see it. The following posters are being presented by Chapman undergraduates. Please drop by and support them.
Thursday March 20, 10:00am – 1:00pm, Anaheim Convention Center, Exhibit Hall A
These are sessions that reflect my own interests. It is a good bet that you will find me at one of these, unless I am teaching, or someone I know is speaking somewhere else. There are multiple sessions at the same time, but what I will typically do is select the one that has the most interesting looking talk at the time and switch sessions from time to time or take a break from sessions entirely if I get bored.
It is worthwhile to spend some time in the exhibit hall. It features a Careers Fair and a Grad School Fair, which will be larger and more relevant to physics students than other such fairs you might attend in the area.
But, of course, the main purpose of going to the exhibition hall is to acquire SWAG. Some free items I have obtained from past APS exhibit halls include:
Rubik’s cubes
Balls that light up when you bounce them
Yo-Yos
Wooden model airplanes
Snacks
T-shirts
Tote bags
Enough stationery items to last for the rest of your degree
Free magazines and journals
Free or heavily discounted books
I recommend going when the hall first opens to get the highest quality SWAG.
Fun Stuff
Other fun stuff to do at this year’s meeting includes:
QuantumFest: This starts with the Quantum Jubilee event on Saturday, but there are events all week, some of which require you to be registered for the meeting. Definitely reserve a spot for the LabEscape escape room. I have done one of their rooms before and it is fun.
Physics Rock-n-Roll Singalong: A very nerdy APS meeting tradition. Worth attending once in your life. Probably only once though.
This week's lectures on instantons in my gauge theory class (gauge theory is a very important kind of theory for understanding many phenomena in nature; light is an example of a phenomenon described by gauge theory) were a lot of fun to do, and mark the culmination of a month-long …
I had this neat calculation in my drawer, and on the occasion of quantum mechanics' 100th birthday in 2025, I decided to submit a talk about it to the March Meeting of the DPG, the German Physical Society, in Göttingen. And to have something to show, I put it out on the arXiv today. The idea is as follows:
The GHZ experiment is a beautiful version of Bell's inequality that demonstrates you reach wrong conclusions when you assume that a property of a quantum system has to have some (unknown) value even when you don't measure it. I would say it shows quantum theory is not realistic, in the sense that unmeasured properties do not have secret values (different, for example, from classical statistical mechanics, where you could imagine actually measuring the exact position of molecule number 2342 in your container of gas). For details, see the paper or this beautiful explanation by Coleman. I should mention here that there is another way out: assuming some non-local forces that conspire to make the result come out right nevertheless.
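To make the contradiction concrete, here is a minimal numerical check of the textbook GHZ correlations (my own illustration with numpy, not something taken from the paper). For the state $(|000\rangle + |111\rangle)/\sqrt{2}$, quantum mechanics gives $\langle\sigma_x\sigma_y\sigma_y\rangle = \langle\sigma_y\sigma_x\sigma_y\rangle = \langle\sigma_y\sigma_y\sigma_x\rangle = -1$ and $\langle\sigma_x\sigma_x\sigma_x\rangle = +1$, whereas pre-assigned $\pm 1$ values would force the last correlation to equal the product of the first three, namely $-1$.

```python
# Minimal check of the GHZ correlations (illustration only).
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)

def kron3(a, b, c):
    """Tensor product of three single-qubit operators."""
    return np.kron(np.kron(a, b), c)

# GHZ state (|000> + |111>)/sqrt(2) in the computational basis
ghz = np.zeros(8, dtype=complex)
ghz[0] = ghz[7] = 1 / np.sqrt(2)

def expval(op):
    """Expectation value <GHZ| op |GHZ>."""
    return np.real(ghz.conj() @ op @ ghz)

print(expval(kron3(sx, sy, sy)))  # -1
print(expval(kron3(sy, sx, sy)))  # -1
print(expval(kron3(sy, sy, sx)))  # -1
print(expval(kron3(sx, sx, sx)))  # +1, where realism would demand -1
```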
On the other hand there is Bohmian mechanics. This is well known to be a non-local theory (as the time evolution of its particles depends on the positions of all the other particles in the system, or even the universe), but what I find more interesting is that it is also realistic: there, it is claimed that all that matters are particle positions (including the positions of pointers on your measurement devices, which you might interpret as showing something other than positions, for example velocities or field strengths or whatever), and these all have (possibly unknown) values at all times, even if you don't measure them.
So how can the two be brought together? There might be an obstacle in the fact that GHZ is usually presented in terms of correlations of spins, and in the Bohmian literature spins are not really positions; you always have to use some Stern-Gerlach setup to translate them into actual positions. But we can circumvent this the other way around: we don't really need spins, we just need observables obeying the commutation relations of the Pauli matrices. You might think those cannot be realised with position measurements, since positions always commute, but this is only true if you do the position measurements at equal times. If you wait between them, you can in fact obtain almost Pauli-type operators.
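The underlying point is just Heisenberg-picture quantum mechanics. As a hedged illustration (for a free particle rather than the particle-in-a-box setup used below), the position operator at a later time does not commute with the position operator at time zero:

$$
x(t) \;=\; e^{iHt/\hbar}\, x \, e^{-iHt/\hbar} \;=\; x + \frac{p\,t}{m},
\qquad
[\,x(0),\, x(t)\,] \;=\; \frac{t}{m}\,[x, p] \;=\; \frac{i\hbar\, t}{m} \;\neq\; 0 \quad (t \neq 0).
$$

So coarse-grained questions like "is the particle in the left or the right half?", asked at two different times, can stand in for non-commuting Pauli-type observables.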
So we can set up a GHZ experiment in terms of three particles in three boxes: for each particle you measure whether it is in the left or the right half of its box, but for each particle you decide whether to do this at time 0 or at a later moment. You can look at the correlation of the three measurements as a function of time (of course, since you measure different particles, the actual measurements you perform still commute, independent of the times), and what you find is the blue line in the figure below.
[Figure: GHZ correlations vs. Bohmian correlations]
You can also (numerically) solve the Bohmian equation of motion and compute the expectation of the correlation of the positions of the three particles at the different times, which gives the orange line: clearly something else. No surprise: the realistic theory cannot predict the outcome of an experiment that demonstrates that quantum theory is not realistic. And the non-local character of the evolution equation does not help either.
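For readers curious what "solving the Bohmian equation of motion" looks like in practice, here is a minimal one-particle toy sketch of the guidance equation (a particle in a box prepared in a superposition of its two lowest eigenstates, with $\hbar = m = L = 1$). It is my own illustration, not the three-particle calculation from the paper:

```python
# Toy sketch of Bohmian trajectories for one particle in a box,
# prepared in a superposition of the two lowest energy eigenstates.
# Units: hbar = m = L = 1. Illustration only.
import numpy as np

def psi(x, t):
    """Superposition of the n=1 and n=2 box eigenstates (E_n = n^2 pi^2 / 2)."""
    e1, e2 = np.pi**2 / 2, 2 * np.pi**2
    phi1 = np.sqrt(2) * np.sin(np.pi * x)
    phi2 = np.sqrt(2) * np.sin(2 * np.pi * x)
    return (phi1 * np.exp(-1j * e1 * t) + phi2 * np.exp(-1j * e2 * t)) / np.sqrt(2)

def velocity(x, t, eps=1e-6):
    """Guidance equation: v = Im( d_x psi / psi )  (with hbar = m = 1)."""
    dpsi = (psi(x + eps, t) - psi(x - eps, t)) / (2 * eps)
    return np.imag(dpsi / psi(x, t))

def trajectory(x0, t_max=2.0, dt=1e-3):
    """Integrate dX/dt = v(X, t) with a simple Euler step; return the final position."""
    x = x0
    for t in np.arange(0.0, t_max, dt):
        x += dt * velocity(x, t)
        x = min(max(x, 1e-9), 1 - 1e-9)  # keep the particle inside the box
    return x

# A few trajectories; to reproduce quantum position statistics the initial
# positions would have to be sampled from |psi(x, 0)|^2.
for x0 in (0.3, 0.5, 0.7):
    print(x0, trajectory(x0))
```

The calculation in the post involves three entangled particles and position correlations at two different times, but the equation being integrated is the same kind of first-order, wavefunction-guided equation of motion.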
To save the Bohmian theory, one can in fact argue that I have computed the wrong thing: after measuring the position of one particle at time 0, that is, by letting it interact with a measuring device, the future time evolution of all the particles is affected, and one should compute the correlation with the corrected (effectively collapsed) wave function. That, however, I cannot do, and I claim it is impossible, since it would depend on the details of how the first particle's position is actually measured (whereas the orthodox prediction above is independent of those details, as those interactions commute with the later observations). In any case, my interpretation is that if you don't want to predict the correlation wrongly, the best you can do is to say you cannot do the calculation because it depends on unknown details (even though the result, of course, shouldn't).
In any case, the standard argument for why Bohmian mechanics is indistinguishable from more conventional treatments is that all that matters are position correlations, and since those are given by $|\psi|^2$ they are the same in all approaches. But I show that this is not the case for these multi-time correlations.
Post script: What happens when you try to discuss physics with a philosopher: